1*9880d681SAndroid Build Coastguard Worker=========================================== 2*9880d681SAndroid Build Coastguard WorkerKaleidoscope: Implementing a Parser and AST 3*9880d681SAndroid Build Coastguard Worker=========================================== 4*9880d681SAndroid Build Coastguard Worker 5*9880d681SAndroid Build Coastguard Worker.. contents:: 6*9880d681SAndroid Build Coastguard Worker :local: 7*9880d681SAndroid Build Coastguard Worker 8*9880d681SAndroid Build Coastguard WorkerChapter 2 Introduction 9*9880d681SAndroid Build Coastguard Worker====================== 10*9880d681SAndroid Build Coastguard Worker 11*9880d681SAndroid Build Coastguard WorkerWelcome to Chapter 2 of the "`Implementing a language with LLVM in 12*9880d681SAndroid Build Coastguard WorkerObjective Caml <index.html>`_" tutorial. This chapter shows you how to 13*9880d681SAndroid Build Coastguard Workeruse the lexer, built in `Chapter 1 <OCamlLangImpl1.html>`_, to build a 14*9880d681SAndroid Build Coastguard Workerfull `parser <http://en.wikipedia.org/wiki/Parsing>`_ for our 15*9880d681SAndroid Build Coastguard WorkerKaleidoscope language. Once we have a parser, we'll define and build an 16*9880d681SAndroid Build Coastguard Worker`Abstract Syntax 17*9880d681SAndroid Build Coastguard WorkerTree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST). 18*9880d681SAndroid Build Coastguard Worker 19*9880d681SAndroid Build Coastguard WorkerThe parser we will build uses a combination of `Recursive Descent 20*9880d681SAndroid Build Coastguard WorkerParsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and 21*9880d681SAndroid Build Coastguard Worker`Operator-Precedence 22*9880d681SAndroid Build Coastguard WorkerParsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to 23*9880d681SAndroid Build Coastguard Workerparse the Kaleidoscope language (the latter for binary expressions and 24*9880d681SAndroid Build Coastguard Workerthe former for everything else). Before we get to parsing though, lets 25*9880d681SAndroid Build Coastguard Workertalk about the output of the parser: the Abstract Syntax Tree. 26*9880d681SAndroid Build Coastguard Worker 27*9880d681SAndroid Build Coastguard WorkerThe Abstract Syntax Tree (AST) 28*9880d681SAndroid Build Coastguard Worker============================== 29*9880d681SAndroid Build Coastguard Worker 30*9880d681SAndroid Build Coastguard WorkerThe AST for a program captures its behavior in such a way that it is 31*9880d681SAndroid Build Coastguard Workereasy for later stages of the compiler (e.g. code generation) to 32*9880d681SAndroid Build Coastguard Workerinterpret. We basically want one object for each construct in the 33*9880d681SAndroid Build Coastguard Workerlanguage, and the AST should closely model the language. In 34*9880d681SAndroid Build Coastguard WorkerKaleidoscope, we have expressions, a prototype, and a function object. 35*9880d681SAndroid Build Coastguard WorkerWe'll start with expressions first: 36*9880d681SAndroid Build Coastguard Worker 37*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 38*9880d681SAndroid Build Coastguard Worker 39*9880d681SAndroid Build Coastguard Worker (* expr - Base type for all expression nodes. *) 40*9880d681SAndroid Build Coastguard Worker type expr = 41*9880d681SAndroid Build Coastguard Worker (* variant for numeric literals like "1.0". *) 42*9880d681SAndroid Build Coastguard Worker | Number of float 43*9880d681SAndroid Build Coastguard Worker 44*9880d681SAndroid Build Coastguard WorkerThe code above shows the definition of the base ExprAST class and one 45*9880d681SAndroid Build Coastguard Workersubclass which we use for numeric literals. The important thing to note 46*9880d681SAndroid Build Coastguard Workerabout this code is that the Number variant captures the numeric value of 47*9880d681SAndroid Build Coastguard Workerthe literal as an instance variable. This allows later phases of the 48*9880d681SAndroid Build Coastguard Workercompiler to know what the stored numeric value is. 49*9880d681SAndroid Build Coastguard Worker 50*9880d681SAndroid Build Coastguard WorkerRight now we only create the AST, so there are no useful functions on 51*9880d681SAndroid Build Coastguard Workerthem. It would be very easy to add a function to pretty print the code, 52*9880d681SAndroid Build Coastguard Workerfor example. Here are the other expression AST node definitions that 53*9880d681SAndroid Build Coastguard Workerwe'll use in the basic form of the Kaleidoscope language: 54*9880d681SAndroid Build Coastguard Worker 55*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 56*9880d681SAndroid Build Coastguard Worker 57*9880d681SAndroid Build Coastguard Worker (* variant for referencing a variable, like "a". *) 58*9880d681SAndroid Build Coastguard Worker | Variable of string 59*9880d681SAndroid Build Coastguard Worker 60*9880d681SAndroid Build Coastguard Worker (* variant for a binary operator. *) 61*9880d681SAndroid Build Coastguard Worker | Binary of char * expr * expr 62*9880d681SAndroid Build Coastguard Worker 63*9880d681SAndroid Build Coastguard Worker (* variant for function calls. *) 64*9880d681SAndroid Build Coastguard Worker | Call of string * expr array 65*9880d681SAndroid Build Coastguard Worker 66*9880d681SAndroid Build Coastguard WorkerThis is all (intentionally) rather straight-forward: variables capture 67*9880d681SAndroid Build Coastguard Workerthe variable name, binary operators capture their opcode (e.g. '+'), and 68*9880d681SAndroid Build Coastguard Workercalls capture a function name as well as a list of any argument 69*9880d681SAndroid Build Coastguard Workerexpressions. One thing that is nice about our AST is that it captures 70*9880d681SAndroid Build Coastguard Workerthe language features without talking about the syntax of the language. 71*9880d681SAndroid Build Coastguard WorkerNote that there is no discussion about precedence of binary operators, 72*9880d681SAndroid Build Coastguard Workerlexical structure, etc. 73*9880d681SAndroid Build Coastguard Worker 74*9880d681SAndroid Build Coastguard WorkerFor our basic language, these are all of the expression nodes we'll 75*9880d681SAndroid Build Coastguard Workerdefine. Because it doesn't have conditional control flow, it isn't 76*9880d681SAndroid Build Coastguard WorkerTuring-complete; we'll fix that in a later installment. The two things 77*9880d681SAndroid Build Coastguard Workerwe need next are a way to talk about the interface to a function, and a 78*9880d681SAndroid Build Coastguard Workerway to talk about functions themselves: 79*9880d681SAndroid Build Coastguard Worker 80*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 81*9880d681SAndroid Build Coastguard Worker 82*9880d681SAndroid Build Coastguard Worker (* proto - This type represents the "prototype" for a function, which captures 83*9880d681SAndroid Build Coastguard Worker * its name, and its argument names (thus implicitly the number of arguments the 84*9880d681SAndroid Build Coastguard Worker * function takes). *) 85*9880d681SAndroid Build Coastguard Worker type proto = Prototype of string * string array 86*9880d681SAndroid Build Coastguard Worker 87*9880d681SAndroid Build Coastguard Worker (* func - This type represents a function definition itself. *) 88*9880d681SAndroid Build Coastguard Worker type func = Function of proto * expr 89*9880d681SAndroid Build Coastguard Worker 90*9880d681SAndroid Build Coastguard WorkerIn Kaleidoscope, functions are typed with just a count of their 91*9880d681SAndroid Build Coastguard Workerarguments. Since all values are double precision floating point, the 92*9880d681SAndroid Build Coastguard Workertype of each argument doesn't need to be stored anywhere. In a more 93*9880d681SAndroid Build Coastguard Workeraggressive and realistic language, the "expr" variants would probably 94*9880d681SAndroid Build Coastguard Workerhave a type field. 95*9880d681SAndroid Build Coastguard Worker 96*9880d681SAndroid Build Coastguard WorkerWith this scaffolding, we can now talk about parsing expressions and 97*9880d681SAndroid Build Coastguard Workerfunction bodies in Kaleidoscope. 98*9880d681SAndroid Build Coastguard Worker 99*9880d681SAndroid Build Coastguard WorkerParser Basics 100*9880d681SAndroid Build Coastguard Worker============= 101*9880d681SAndroid Build Coastguard Worker 102*9880d681SAndroid Build Coastguard WorkerNow that we have an AST to build, we need to define the parser code to 103*9880d681SAndroid Build Coastguard Workerbuild it. The idea here is that we want to parse something like "x+y" 104*9880d681SAndroid Build Coastguard Worker(which is returned as three tokens by the lexer) into an AST that could 105*9880d681SAndroid Build Coastguard Workerbe generated with calls like this: 106*9880d681SAndroid Build Coastguard Worker 107*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 108*9880d681SAndroid Build Coastguard Worker 109*9880d681SAndroid Build Coastguard Worker let x = Variable "x" in 110*9880d681SAndroid Build Coastguard Worker let y = Variable "y" in 111*9880d681SAndroid Build Coastguard Worker let result = Binary ('+', x, y) in 112*9880d681SAndroid Build Coastguard Worker ... 113*9880d681SAndroid Build Coastguard Worker 114*9880d681SAndroid Build Coastguard WorkerThe error handling routines make use of the builtin ``Stream.Failure`` 115*9880d681SAndroid Build Coastguard Workerand ``Stream.Error``s. ``Stream.Failure`` is raised when the parser is 116*9880d681SAndroid Build Coastguard Workerunable to find any matching token in the first position of a pattern. 117*9880d681SAndroid Build Coastguard Worker``Stream.Error`` is raised when the first token matches, but the rest do 118*9880d681SAndroid Build Coastguard Workernot. The error recovery in our parser will not be the best and is not 119*9880d681SAndroid Build Coastguard Workerparticular user-friendly, but it will be enough for our tutorial. These 120*9880d681SAndroid Build Coastguard Workerexceptions make it easier to handle errors in routines that have various 121*9880d681SAndroid Build Coastguard Workerreturn types. 122*9880d681SAndroid Build Coastguard Worker 123*9880d681SAndroid Build Coastguard WorkerWith these basic types and exceptions, we can implement the first piece 124*9880d681SAndroid Build Coastguard Workerof our grammar: numeric literals. 125*9880d681SAndroid Build Coastguard Worker 126*9880d681SAndroid Build Coastguard WorkerBasic Expression Parsing 127*9880d681SAndroid Build Coastguard Worker======================== 128*9880d681SAndroid Build Coastguard Worker 129*9880d681SAndroid Build Coastguard WorkerWe start with numeric literals, because they are the simplest to 130*9880d681SAndroid Build Coastguard Workerprocess. For each production in our grammar, we'll define a function 131*9880d681SAndroid Build Coastguard Workerwhich parses that production. We call this class of expressions 132*9880d681SAndroid Build Coastguard Worker"primary" expressions, for reasons that will become more clear `later in 133*9880d681SAndroid Build Coastguard Workerthe tutorial <OCamlLangImpl6.html#user-defined-unary-operators>`_. In order to parse an 134*9880d681SAndroid Build Coastguard Workerarbitrary primary expression, we need to determine what sort of 135*9880d681SAndroid Build Coastguard Workerexpression it is. For numeric literals, we have: 136*9880d681SAndroid Build Coastguard Worker 137*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 138*9880d681SAndroid Build Coastguard Worker 139*9880d681SAndroid Build Coastguard Worker (* primary 140*9880d681SAndroid Build Coastguard Worker * ::= identifier 141*9880d681SAndroid Build Coastguard Worker * ::= numberexpr 142*9880d681SAndroid Build Coastguard Worker * ::= parenexpr *) 143*9880d681SAndroid Build Coastguard Worker parse_primary = parser 144*9880d681SAndroid Build Coastguard Worker (* numberexpr ::= number *) 145*9880d681SAndroid Build Coastguard Worker | [< 'Token.Number n >] -> Ast.Number n 146*9880d681SAndroid Build Coastguard Worker 147*9880d681SAndroid Build Coastguard WorkerThis routine is very simple: it expects to be called when the current 148*9880d681SAndroid Build Coastguard Workertoken is a ``Token.Number`` token. It takes the current number value, 149*9880d681SAndroid Build Coastguard Workercreates a ``Ast.Number`` node, advances the lexer to the next token, and 150*9880d681SAndroid Build Coastguard Workerfinally returns. 151*9880d681SAndroid Build Coastguard Worker 152*9880d681SAndroid Build Coastguard WorkerThere are some interesting aspects to this. The most important one is 153*9880d681SAndroid Build Coastguard Workerthat this routine eats all of the tokens that correspond to the 154*9880d681SAndroid Build Coastguard Workerproduction and returns the lexer buffer with the next token (which is 155*9880d681SAndroid Build Coastguard Workernot part of the grammar production) ready to go. This is a fairly 156*9880d681SAndroid Build Coastguard Workerstandard way to go for recursive descent parsers. For a better example, 157*9880d681SAndroid Build Coastguard Workerthe parenthesis operator is defined like this: 158*9880d681SAndroid Build Coastguard Worker 159*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 160*9880d681SAndroid Build Coastguard Worker 161*9880d681SAndroid Build Coastguard Worker (* parenexpr ::= '(' expression ')' *) 162*9880d681SAndroid Build Coastguard Worker | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e 163*9880d681SAndroid Build Coastguard Worker 164*9880d681SAndroid Build Coastguard WorkerThis function illustrates a number of interesting things about the 165*9880d681SAndroid Build Coastguard Workerparser: 166*9880d681SAndroid Build Coastguard Worker 167*9880d681SAndroid Build Coastguard Worker1) It shows how we use the ``Stream.Error`` exception. When called, this 168*9880d681SAndroid Build Coastguard Workerfunction expects that the current token is a '(' token, but after 169*9880d681SAndroid Build Coastguard Workerparsing the subexpression, it is possible that there is no ')' waiting. 170*9880d681SAndroid Build Coastguard WorkerFor example, if the user types in "(4 x" instead of "(4)", the parser 171*9880d681SAndroid Build Coastguard Workershould emit an error. Because errors can occur, the parser needs a way 172*9880d681SAndroid Build Coastguard Workerto indicate that they happened. In our parser, we use the camlp4 173*9880d681SAndroid Build Coastguard Workershortcut syntax ``token ?? "parse error"``, where if the token before 174*9880d681SAndroid Build Coastguard Workerthe ``??`` does not match, then ``Stream.Error "parse error"`` will be 175*9880d681SAndroid Build Coastguard Workerraised. 176*9880d681SAndroid Build Coastguard Worker 177*9880d681SAndroid Build Coastguard Worker2) Another interesting aspect of this function is that it uses recursion 178*9880d681SAndroid Build Coastguard Workerby calling ``Parser.parse_primary`` (we will soon see that 179*9880d681SAndroid Build Coastguard Worker``Parser.parse_primary`` can call ``Parser.parse_primary``). This is 180*9880d681SAndroid Build Coastguard Workerpowerful because it allows us to handle recursive grammars, and keeps 181*9880d681SAndroid Build Coastguard Workereach production very simple. Note that parentheses do not cause 182*9880d681SAndroid Build Coastguard Workerconstruction of AST nodes themselves. While we could do it this way, the 183*9880d681SAndroid Build Coastguard Workermost important role of parentheses are to guide the parser and provide 184*9880d681SAndroid Build Coastguard Workergrouping. Once the parser constructs the AST, parentheses are not 185*9880d681SAndroid Build Coastguard Workerneeded. 186*9880d681SAndroid Build Coastguard Worker 187*9880d681SAndroid Build Coastguard WorkerThe next simple production is for handling variable references and 188*9880d681SAndroid Build Coastguard Workerfunction calls: 189*9880d681SAndroid Build Coastguard Worker 190*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 191*9880d681SAndroid Build Coastguard Worker 192*9880d681SAndroid Build Coastguard Worker (* identifierexpr 193*9880d681SAndroid Build Coastguard Worker * ::= identifier 194*9880d681SAndroid Build Coastguard Worker * ::= identifier '(' argumentexpr ')' *) 195*9880d681SAndroid Build Coastguard Worker | [< 'Token.Ident id; stream >] -> 196*9880d681SAndroid Build Coastguard Worker let rec parse_args accumulator = parser 197*9880d681SAndroid Build Coastguard Worker | [< e=parse_expr; stream >] -> 198*9880d681SAndroid Build Coastguard Worker begin parser 199*9880d681SAndroid Build Coastguard Worker | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e 200*9880d681SAndroid Build Coastguard Worker | [< >] -> e :: accumulator 201*9880d681SAndroid Build Coastguard Worker end stream 202*9880d681SAndroid Build Coastguard Worker | [< >] -> accumulator 203*9880d681SAndroid Build Coastguard Worker in 204*9880d681SAndroid Build Coastguard Worker let rec parse_ident id = parser 205*9880d681SAndroid Build Coastguard Worker (* Call. *) 206*9880d681SAndroid Build Coastguard Worker | [< 'Token.Kwd '('; 207*9880d681SAndroid Build Coastguard Worker args=parse_args []; 208*9880d681SAndroid Build Coastguard Worker 'Token.Kwd ')' ?? "expected ')'">] -> 209*9880d681SAndroid Build Coastguard Worker Ast.Call (id, Array.of_list (List.rev args)) 210*9880d681SAndroid Build Coastguard Worker 211*9880d681SAndroid Build Coastguard Worker (* Simple variable ref. *) 212*9880d681SAndroid Build Coastguard Worker | [< >] -> Ast.Variable id 213*9880d681SAndroid Build Coastguard Worker in 214*9880d681SAndroid Build Coastguard Worker parse_ident id stream 215*9880d681SAndroid Build Coastguard Worker 216*9880d681SAndroid Build Coastguard WorkerThis routine follows the same style as the other routines. (It expects 217*9880d681SAndroid Build Coastguard Workerto be called if the current token is a ``Token.Ident`` token). It also 218*9880d681SAndroid Build Coastguard Workerhas recursion and error handling. One interesting aspect of this is that 219*9880d681SAndroid Build Coastguard Workerit uses *look-ahead* to determine if the current identifier is a stand 220*9880d681SAndroid Build Coastguard Workeralone variable reference or if it is a function call expression. It 221*9880d681SAndroid Build Coastguard Workerhandles this by checking to see if the token after the identifier is a 222*9880d681SAndroid Build Coastguard Worker'(' token, constructing either a ``Ast.Variable`` or ``Ast.Call`` node 223*9880d681SAndroid Build Coastguard Workeras appropriate. 224*9880d681SAndroid Build Coastguard Worker 225*9880d681SAndroid Build Coastguard WorkerWe finish up by raising an exception if we received a token we didn't 226*9880d681SAndroid Build Coastguard Workerexpect: 227*9880d681SAndroid Build Coastguard Worker 228*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 229*9880d681SAndroid Build Coastguard Worker 230*9880d681SAndroid Build Coastguard Worker | [< >] -> raise (Stream.Error "unknown token when expecting an expression.") 231*9880d681SAndroid Build Coastguard Worker 232*9880d681SAndroid Build Coastguard WorkerNow that basic expressions are handled, we need to handle binary 233*9880d681SAndroid Build Coastguard Workerexpressions. They are a bit more complex. 234*9880d681SAndroid Build Coastguard Worker 235*9880d681SAndroid Build Coastguard WorkerBinary Expression Parsing 236*9880d681SAndroid Build Coastguard Worker========================= 237*9880d681SAndroid Build Coastguard Worker 238*9880d681SAndroid Build Coastguard WorkerBinary expressions are significantly harder to parse because they are 239*9880d681SAndroid Build Coastguard Workeroften ambiguous. For example, when given the string "x+y\*z", the parser 240*9880d681SAndroid Build Coastguard Workercan choose to parse it as either "(x+y)\*z" or "x+(y\*z)". With common 241*9880d681SAndroid Build Coastguard Workerdefinitions from mathematics, we expect the later parse, because "\*" 242*9880d681SAndroid Build Coastguard Worker(multiplication) has higher *precedence* than "+" (addition). 243*9880d681SAndroid Build Coastguard Worker 244*9880d681SAndroid Build Coastguard WorkerThere are many ways to handle this, but an elegant and efficient way is 245*9880d681SAndroid Build Coastguard Workerto use `Operator-Precedence 246*9880d681SAndroid Build Coastguard WorkerParsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_. 247*9880d681SAndroid Build Coastguard WorkerThis parsing technique uses the precedence of binary operators to guide 248*9880d681SAndroid Build Coastguard Workerrecursion. To start with, we need a table of precedences: 249*9880d681SAndroid Build Coastguard Worker 250*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 251*9880d681SAndroid Build Coastguard Worker 252*9880d681SAndroid Build Coastguard Worker (* binop_precedence - This holds the precedence for each binary operator that is 253*9880d681SAndroid Build Coastguard Worker * defined *) 254*9880d681SAndroid Build Coastguard Worker let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10 255*9880d681SAndroid Build Coastguard Worker 256*9880d681SAndroid Build Coastguard Worker (* precedence - Get the precedence of the pending binary operator token. *) 257*9880d681SAndroid Build Coastguard Worker let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1 258*9880d681SAndroid Build Coastguard Worker 259*9880d681SAndroid Build Coastguard Worker ... 260*9880d681SAndroid Build Coastguard Worker 261*9880d681SAndroid Build Coastguard Worker let main () = 262*9880d681SAndroid Build Coastguard Worker (* Install standard binary operators. 263*9880d681SAndroid Build Coastguard Worker * 1 is the lowest precedence. *) 264*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '<' 10; 265*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '+' 20; 266*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '-' 20; 267*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *) 268*9880d681SAndroid Build Coastguard Worker ... 269*9880d681SAndroid Build Coastguard Worker 270*9880d681SAndroid Build Coastguard WorkerFor the basic form of Kaleidoscope, we will only support 4 binary 271*9880d681SAndroid Build Coastguard Workeroperators (this can obviously be extended by you, our brave and intrepid 272*9880d681SAndroid Build Coastguard Workerreader). The ``Parser.precedence`` function returns the precedence for 273*9880d681SAndroid Build Coastguard Workerthe current token, or -1 if the token is not a binary operator. Having a 274*9880d681SAndroid Build Coastguard Worker``Hashtbl.t`` makes it easy to add new operators and makes it clear that 275*9880d681SAndroid Build Coastguard Workerthe algorithm doesn't depend on the specific operators involved, but it 276*9880d681SAndroid Build Coastguard Workerwould be easy enough to eliminate the ``Hashtbl.t`` and do the 277*9880d681SAndroid Build Coastguard Workercomparisons in the ``Parser.precedence`` function. (Or just use a 278*9880d681SAndroid Build Coastguard Workerfixed-size array). 279*9880d681SAndroid Build Coastguard Worker 280*9880d681SAndroid Build Coastguard WorkerWith the helper above defined, we can now start parsing binary 281*9880d681SAndroid Build Coastguard Workerexpressions. The basic idea of operator precedence parsing is to break 282*9880d681SAndroid Build Coastguard Workerdown an expression with potentially ambiguous binary operators into 283*9880d681SAndroid Build Coastguard Workerpieces. Consider, for example, the expression "a+b+(c+d)\*e\*f+g". 284*9880d681SAndroid Build Coastguard WorkerOperator precedence parsing considers this as a stream of primary 285*9880d681SAndroid Build Coastguard Workerexpressions separated by binary operators. As such, it will first parse 286*9880d681SAndroid Build Coastguard Workerthe leading primary expression "a", then it will see the pairs [+, b] 287*9880d681SAndroid Build Coastguard Worker[+, (c+d)] [\*, e] [\*, f] and [+, g]. Note that because parentheses are 288*9880d681SAndroid Build Coastguard Workerprimary expressions, the binary expression parser doesn't need to worry 289*9880d681SAndroid Build Coastguard Workerabout nested subexpressions like (c+d) at all. 290*9880d681SAndroid Build Coastguard Worker 291*9880d681SAndroid Build Coastguard WorkerTo start, an expression is a primary expression potentially followed by 292*9880d681SAndroid Build Coastguard Workera sequence of [binop,primaryexpr] pairs: 293*9880d681SAndroid Build Coastguard Worker 294*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 295*9880d681SAndroid Build Coastguard Worker 296*9880d681SAndroid Build Coastguard Worker (* expression 297*9880d681SAndroid Build Coastguard Worker * ::= primary binoprhs *) 298*9880d681SAndroid Build Coastguard Worker and parse_expr = parser 299*9880d681SAndroid Build Coastguard Worker | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream 300*9880d681SAndroid Build Coastguard Worker 301*9880d681SAndroid Build Coastguard Worker``Parser.parse_bin_rhs`` is the function that parses the sequence of 302*9880d681SAndroid Build Coastguard Workerpairs for us. It takes a precedence and a pointer to an expression for 303*9880d681SAndroid Build Coastguard Workerthe part that has been parsed so far. Note that "x" is a perfectly valid 304*9880d681SAndroid Build Coastguard Workerexpression: As such, "binoprhs" is allowed to be empty, in which case it 305*9880d681SAndroid Build Coastguard Workerreturns the expression that is passed into it. In our example above, the 306*9880d681SAndroid Build Coastguard Workercode passes the expression for "a" into ``Parser.parse_bin_rhs`` and the 307*9880d681SAndroid Build Coastguard Workercurrent token is "+". 308*9880d681SAndroid Build Coastguard Worker 309*9880d681SAndroid Build Coastguard WorkerThe precedence value passed into ``Parser.parse_bin_rhs`` indicates the 310*9880d681SAndroid Build Coastguard Worker*minimal operator precedence* that the function is allowed to eat. For 311*9880d681SAndroid Build Coastguard Workerexample, if the current pair stream is [+, x] and 312*9880d681SAndroid Build Coastguard Worker``Parser.parse_bin_rhs`` is passed in a precedence of 40, it will not 313*9880d681SAndroid Build Coastguard Workerconsume any tokens (because the precedence of '+' is only 20). With this 314*9880d681SAndroid Build Coastguard Workerin mind, ``Parser.parse_bin_rhs`` starts with: 315*9880d681SAndroid Build Coastguard Worker 316*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 317*9880d681SAndroid Build Coastguard Worker 318*9880d681SAndroid Build Coastguard Worker (* binoprhs 319*9880d681SAndroid Build Coastguard Worker * ::= ('+' primary)* *) 320*9880d681SAndroid Build Coastguard Worker and parse_bin_rhs expr_prec lhs stream = 321*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 322*9880d681SAndroid Build Coastguard Worker (* If this is a binop, find its precedence. *) 323*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -> 324*9880d681SAndroid Build Coastguard Worker let token_prec = precedence c in 325*9880d681SAndroid Build Coastguard Worker 326*9880d681SAndroid Build Coastguard Worker (* If this is a binop that binds at least as tightly as the current binop, 327*9880d681SAndroid Build Coastguard Worker * consume it, otherwise we are done. *) 328*9880d681SAndroid Build Coastguard Worker if token_prec < expr_prec then lhs else begin 329*9880d681SAndroid Build Coastguard Worker 330*9880d681SAndroid Build Coastguard WorkerThis code gets the precedence of the current token and checks to see if 331*9880d681SAndroid Build Coastguard Workerif is too low. Because we defined invalid tokens to have a precedence of 332*9880d681SAndroid Build Coastguard Worker-1, this check implicitly knows that the pair-stream ends when the token 333*9880d681SAndroid Build Coastguard Workerstream runs out of binary operators. If this check succeeds, we know 334*9880d681SAndroid Build Coastguard Workerthat the token is a binary operator and that it will be included in this 335*9880d681SAndroid Build Coastguard Workerexpression: 336*9880d681SAndroid Build Coastguard Worker 337*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 338*9880d681SAndroid Build Coastguard Worker 339*9880d681SAndroid Build Coastguard Worker (* Eat the binop. *) 340*9880d681SAndroid Build Coastguard Worker Stream.junk stream; 341*9880d681SAndroid Build Coastguard Worker 342*9880d681SAndroid Build Coastguard Worker (* Parse the primary expression after the binary operator *) 343*9880d681SAndroid Build Coastguard Worker let rhs = parse_primary stream in 344*9880d681SAndroid Build Coastguard Worker 345*9880d681SAndroid Build Coastguard Worker (* Okay, we know this is a binop. *) 346*9880d681SAndroid Build Coastguard Worker let rhs = 347*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 348*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd c2) -> 349*9880d681SAndroid Build Coastguard Worker 350*9880d681SAndroid Build Coastguard WorkerAs such, this code eats (and remembers) the binary operator and then 351*9880d681SAndroid Build Coastguard Workerparses the primary expression that follows. This builds up the whole 352*9880d681SAndroid Build Coastguard Workerpair, the first of which is [+, b] for the running example. 353*9880d681SAndroid Build Coastguard Worker 354*9880d681SAndroid Build Coastguard WorkerNow that we parsed the left-hand side of an expression and one pair of 355*9880d681SAndroid Build Coastguard Workerthe RHS sequence, we have to decide which way the expression associates. 356*9880d681SAndroid Build Coastguard WorkerIn particular, we could have "(a+b) binop unparsed" or "a + (b binop 357*9880d681SAndroid Build Coastguard Workerunparsed)". To determine this, we look ahead at "binop" to determine its 358*9880d681SAndroid Build Coastguard Workerprecedence and compare it to BinOp's precedence (which is '+' in this 359*9880d681SAndroid Build Coastguard Workercase): 360*9880d681SAndroid Build Coastguard Worker 361*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 362*9880d681SAndroid Build Coastguard Worker 363*9880d681SAndroid Build Coastguard Worker (* If BinOp binds less tightly with rhs than the operator after 364*9880d681SAndroid Build Coastguard Worker * rhs, let the pending operator take rhs as its lhs. *) 365*9880d681SAndroid Build Coastguard Worker let next_prec = precedence c2 in 366*9880d681SAndroid Build Coastguard Worker if token_prec < next_prec 367*9880d681SAndroid Build Coastguard Worker 368*9880d681SAndroid Build Coastguard WorkerIf the precedence of the binop to the right of "RHS" is lower or equal 369*9880d681SAndroid Build Coastguard Workerto the precedence of our current operator, then we know that the 370*9880d681SAndroid Build Coastguard Workerparentheses associate as "(a+b) binop ...". In our example, the current 371*9880d681SAndroid Build Coastguard Workeroperator is "+" and the next operator is "+", we know that they have the 372*9880d681SAndroid Build Coastguard Workersame precedence. In this case we'll create the AST node for "a+b", and 373*9880d681SAndroid Build Coastguard Workerthen continue parsing: 374*9880d681SAndroid Build Coastguard Worker 375*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 376*9880d681SAndroid Build Coastguard Worker 377*9880d681SAndroid Build Coastguard Worker ... if body omitted ... 378*9880d681SAndroid Build Coastguard Worker in 379*9880d681SAndroid Build Coastguard Worker 380*9880d681SAndroid Build Coastguard Worker (* Merge lhs/rhs. *) 381*9880d681SAndroid Build Coastguard Worker let lhs = Ast.Binary (c, lhs, rhs) in 382*9880d681SAndroid Build Coastguard Worker parse_bin_rhs expr_prec lhs stream 383*9880d681SAndroid Build Coastguard Worker end 384*9880d681SAndroid Build Coastguard Worker 385*9880d681SAndroid Build Coastguard WorkerIn our example above, this will turn "a+b+" into "(a+b)" and execute the 386*9880d681SAndroid Build Coastguard Workernext iteration of the loop, with "+" as the current token. The code 387*9880d681SAndroid Build Coastguard Workerabove will eat, remember, and parse "(c+d)" as the primary expression, 388*9880d681SAndroid Build Coastguard Workerwhich makes the current pair equal to [+, (c+d)]. It will then evaluate 389*9880d681SAndroid Build Coastguard Workerthe 'if' conditional above with "\*" as the binop to the right of the 390*9880d681SAndroid Build Coastguard Workerprimary. In this case, the precedence of "\*" is higher than the 391*9880d681SAndroid Build Coastguard Workerprecedence of "+" so the if condition will be entered. 392*9880d681SAndroid Build Coastguard Worker 393*9880d681SAndroid Build Coastguard WorkerThe critical question left here is "how can the if condition parse the 394*9880d681SAndroid Build Coastguard Workerright hand side in full"? In particular, to build the AST correctly for 395*9880d681SAndroid Build Coastguard Workerour example, it needs to get all of "(c+d)\*e\*f" as the RHS expression 396*9880d681SAndroid Build Coastguard Workervariable. The code to do this is surprisingly simple (code from the 397*9880d681SAndroid Build Coastguard Workerabove two blocks duplicated for context): 398*9880d681SAndroid Build Coastguard Worker 399*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 400*9880d681SAndroid Build Coastguard Worker 401*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 402*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd c2) -> 403*9880d681SAndroid Build Coastguard Worker (* If BinOp binds less tightly with rhs than the operator after 404*9880d681SAndroid Build Coastguard Worker * rhs, let the pending operator take rhs as its lhs. *) 405*9880d681SAndroid Build Coastguard Worker if token_prec < precedence c2 406*9880d681SAndroid Build Coastguard Worker then parse_bin_rhs (token_prec + 1) rhs stream 407*9880d681SAndroid Build Coastguard Worker else rhs 408*9880d681SAndroid Build Coastguard Worker | _ -> rhs 409*9880d681SAndroid Build Coastguard Worker in 410*9880d681SAndroid Build Coastguard Worker 411*9880d681SAndroid Build Coastguard Worker (* Merge lhs/rhs. *) 412*9880d681SAndroid Build Coastguard Worker let lhs = Ast.Binary (c, lhs, rhs) in 413*9880d681SAndroid Build Coastguard Worker parse_bin_rhs expr_prec lhs stream 414*9880d681SAndroid Build Coastguard Worker end 415*9880d681SAndroid Build Coastguard Worker 416*9880d681SAndroid Build Coastguard WorkerAt this point, we know that the binary operator to the RHS of our 417*9880d681SAndroid Build Coastguard Workerprimary has higher precedence than the binop we are currently parsing. 418*9880d681SAndroid Build Coastguard WorkerAs such, we know that any sequence of pairs whose operators are all 419*9880d681SAndroid Build Coastguard Workerhigher precedence than "+" should be parsed together and returned as 420*9880d681SAndroid Build Coastguard Worker"RHS". To do this, we recursively invoke the ``Parser.parse_bin_rhs`` 421*9880d681SAndroid Build Coastguard Workerfunction specifying "token\_prec+1" as the minimum precedence required 422*9880d681SAndroid Build Coastguard Workerfor it to continue. In our example above, this will cause it to return 423*9880d681SAndroid Build Coastguard Workerthe AST node for "(c+d)\*e\*f" as RHS, which is then set as the RHS of 424*9880d681SAndroid Build Coastguard Workerthe '+' expression. 425*9880d681SAndroid Build Coastguard Worker 426*9880d681SAndroid Build Coastguard WorkerFinally, on the next iteration of the while loop, the "+g" piece is 427*9880d681SAndroid Build Coastguard Workerparsed and added to the AST. With this little bit of code (14 428*9880d681SAndroid Build Coastguard Workernon-trivial lines), we correctly handle fully general binary expression 429*9880d681SAndroid Build Coastguard Workerparsing in a very elegant way. This was a whirlwind tour of this code, 430*9880d681SAndroid Build Coastguard Workerand it is somewhat subtle. I recommend running through it with a few 431*9880d681SAndroid Build Coastguard Workertough examples to see how it works. 432*9880d681SAndroid Build Coastguard Worker 433*9880d681SAndroid Build Coastguard WorkerThis wraps up handling of expressions. At this point, we can point the 434*9880d681SAndroid Build Coastguard Workerparser at an arbitrary token stream and build an expression from it, 435*9880d681SAndroid Build Coastguard Workerstopping at the first token that is not part of the expression. Next up 436*9880d681SAndroid Build Coastguard Workerwe need to handle function definitions, etc. 437*9880d681SAndroid Build Coastguard Worker 438*9880d681SAndroid Build Coastguard WorkerParsing the Rest 439*9880d681SAndroid Build Coastguard Worker================ 440*9880d681SAndroid Build Coastguard Worker 441*9880d681SAndroid Build Coastguard WorkerThe next thing missing is handling of function prototypes. In 442*9880d681SAndroid Build Coastguard WorkerKaleidoscope, these are used both for 'extern' function declarations as 443*9880d681SAndroid Build Coastguard Workerwell as function body definitions. The code to do this is 444*9880d681SAndroid Build Coastguard Workerstraight-forward and not very interesting (once you've survived 445*9880d681SAndroid Build Coastguard Workerexpressions): 446*9880d681SAndroid Build Coastguard Worker 447*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 448*9880d681SAndroid Build Coastguard Worker 449*9880d681SAndroid Build Coastguard Worker (* prototype 450*9880d681SAndroid Build Coastguard Worker * ::= id '(' id* ')' *) 451*9880d681SAndroid Build Coastguard Worker let parse_prototype = 452*9880d681SAndroid Build Coastguard Worker let rec parse_args accumulator = parser 453*9880d681SAndroid Build Coastguard Worker | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 454*9880d681SAndroid Build Coastguard Worker | [< >] -> accumulator 455*9880d681SAndroid Build Coastguard Worker in 456*9880d681SAndroid Build Coastguard Worker 457*9880d681SAndroid Build Coastguard Worker parser 458*9880d681SAndroid Build Coastguard Worker | [< 'Token.Ident id; 459*9880d681SAndroid Build Coastguard Worker 'Token.Kwd '(' ?? "expected '(' in prototype"; 460*9880d681SAndroid Build Coastguard Worker args=parse_args []; 461*9880d681SAndroid Build Coastguard Worker 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 462*9880d681SAndroid Build Coastguard Worker (* success. *) 463*9880d681SAndroid Build Coastguard Worker Ast.Prototype (id, Array.of_list (List.rev args)) 464*9880d681SAndroid Build Coastguard Worker 465*9880d681SAndroid Build Coastguard Worker | [< >] -> 466*9880d681SAndroid Build Coastguard Worker raise (Stream.Error "expected function name in prototype") 467*9880d681SAndroid Build Coastguard Worker 468*9880d681SAndroid Build Coastguard WorkerGiven this, a function definition is very simple, just a prototype plus 469*9880d681SAndroid Build Coastguard Workeran expression to implement the body: 470*9880d681SAndroid Build Coastguard Worker 471*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 472*9880d681SAndroid Build Coastguard Worker 473*9880d681SAndroid Build Coastguard Worker (* definition ::= 'def' prototype expression *) 474*9880d681SAndroid Build Coastguard Worker let parse_definition = parser 475*9880d681SAndroid Build Coastguard Worker | [< 'Token.Def; p=parse_prototype; e=parse_expr >] -> 476*9880d681SAndroid Build Coastguard Worker Ast.Function (p, e) 477*9880d681SAndroid Build Coastguard Worker 478*9880d681SAndroid Build Coastguard WorkerIn addition, we support 'extern' to declare functions like 'sin' and 479*9880d681SAndroid Build Coastguard Worker'cos' as well as to support forward declaration of user functions. These 480*9880d681SAndroid Build Coastguard Worker'extern's are just prototypes with no body: 481*9880d681SAndroid Build Coastguard Worker 482*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 483*9880d681SAndroid Build Coastguard Worker 484*9880d681SAndroid Build Coastguard Worker (* external ::= 'extern' prototype *) 485*9880d681SAndroid Build Coastguard Worker let parse_extern = parser 486*9880d681SAndroid Build Coastguard Worker | [< 'Token.Extern; e=parse_prototype >] -> e 487*9880d681SAndroid Build Coastguard Worker 488*9880d681SAndroid Build Coastguard WorkerFinally, we'll also let the user type in arbitrary top-level expressions 489*9880d681SAndroid Build Coastguard Workerand evaluate them on the fly. We will handle this by defining anonymous 490*9880d681SAndroid Build Coastguard Workernullary (zero argument) functions for them: 491*9880d681SAndroid Build Coastguard Worker 492*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 493*9880d681SAndroid Build Coastguard Worker 494*9880d681SAndroid Build Coastguard Worker (* toplevelexpr ::= expression *) 495*9880d681SAndroid Build Coastguard Worker let parse_toplevel = parser 496*9880d681SAndroid Build Coastguard Worker | [< e=parse_expr >] -> 497*9880d681SAndroid Build Coastguard Worker (* Make an anonymous proto. *) 498*9880d681SAndroid Build Coastguard Worker Ast.Function (Ast.Prototype ("", [||]), e) 499*9880d681SAndroid Build Coastguard Worker 500*9880d681SAndroid Build Coastguard WorkerNow that we have all the pieces, let's build a little driver that will 501*9880d681SAndroid Build Coastguard Workerlet us actually *execute* this code we've built! 502*9880d681SAndroid Build Coastguard Worker 503*9880d681SAndroid Build Coastguard WorkerThe Driver 504*9880d681SAndroid Build Coastguard Worker========== 505*9880d681SAndroid Build Coastguard Worker 506*9880d681SAndroid Build Coastguard WorkerThe driver for this simply invokes all of the parsing pieces with a 507*9880d681SAndroid Build Coastguard Workertop-level dispatch loop. There isn't much interesting here, so I'll just 508*9880d681SAndroid Build Coastguard Workerinclude the top-level loop. See `below <#full-code-listing>`_ for full code in the 509*9880d681SAndroid Build Coastguard Worker"Top-Level Parsing" section. 510*9880d681SAndroid Build Coastguard Worker 511*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml 512*9880d681SAndroid Build Coastguard Worker 513*9880d681SAndroid Build Coastguard Worker (* top ::= definition | external | expression | ';' *) 514*9880d681SAndroid Build Coastguard Worker let rec main_loop stream = 515*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 516*9880d681SAndroid Build Coastguard Worker | None -> () 517*9880d681SAndroid Build Coastguard Worker 518*9880d681SAndroid Build Coastguard Worker (* ignore top-level semicolons. *) 519*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd ';') -> 520*9880d681SAndroid Build Coastguard Worker Stream.junk stream; 521*9880d681SAndroid Build Coastguard Worker main_loop stream 522*9880d681SAndroid Build Coastguard Worker 523*9880d681SAndroid Build Coastguard Worker | Some token -> 524*9880d681SAndroid Build Coastguard Worker begin 525*9880d681SAndroid Build Coastguard Worker try match token with 526*9880d681SAndroid Build Coastguard Worker | Token.Def -> 527*9880d681SAndroid Build Coastguard Worker ignore(Parser.parse_definition stream); 528*9880d681SAndroid Build Coastguard Worker print_endline "parsed a function definition."; 529*9880d681SAndroid Build Coastguard Worker | Token.Extern -> 530*9880d681SAndroid Build Coastguard Worker ignore(Parser.parse_extern stream); 531*9880d681SAndroid Build Coastguard Worker print_endline "parsed an extern."; 532*9880d681SAndroid Build Coastguard Worker | _ -> 533*9880d681SAndroid Build Coastguard Worker (* Evaluate a top-level expression into an anonymous function. *) 534*9880d681SAndroid Build Coastguard Worker ignore(Parser.parse_toplevel stream); 535*9880d681SAndroid Build Coastguard Worker print_endline "parsed a top-level expr"; 536*9880d681SAndroid Build Coastguard Worker with Stream.Error s -> 537*9880d681SAndroid Build Coastguard Worker (* Skip token for error recovery. *) 538*9880d681SAndroid Build Coastguard Worker Stream.junk stream; 539*9880d681SAndroid Build Coastguard Worker print_endline s; 540*9880d681SAndroid Build Coastguard Worker end; 541*9880d681SAndroid Build Coastguard Worker print_string "ready> "; flush stdout; 542*9880d681SAndroid Build Coastguard Worker main_loop stream 543*9880d681SAndroid Build Coastguard Worker 544*9880d681SAndroid Build Coastguard WorkerThe most interesting part of this is that we ignore top-level 545*9880d681SAndroid Build Coastguard Workersemicolons. Why is this, you ask? The basic reason is that if you type 546*9880d681SAndroid Build Coastguard Worker"4 + 5" at the command line, the parser doesn't know whether that is the 547*9880d681SAndroid Build Coastguard Workerend of what you will type or not. For example, on the next line you 548*9880d681SAndroid Build Coastguard Workercould type "def foo..." in which case 4+5 is the end of a top-level 549*9880d681SAndroid Build Coastguard Workerexpression. Alternatively you could type "\* 6", which would continue 550*9880d681SAndroid Build Coastguard Workerthe expression. Having top-level semicolons allows you to type "4+5;", 551*9880d681SAndroid Build Coastguard Workerand the parser will know you are done. 552*9880d681SAndroid Build Coastguard Worker 553*9880d681SAndroid Build Coastguard WorkerConclusions 554*9880d681SAndroid Build Coastguard Worker=========== 555*9880d681SAndroid Build Coastguard Worker 556*9880d681SAndroid Build Coastguard WorkerWith just under 300 lines of commented code (240 lines of non-comment, 557*9880d681SAndroid Build Coastguard Workernon-blank code), we fully defined our minimal language, including a 558*9880d681SAndroid Build Coastguard Workerlexer, parser, and AST builder. With this done, the executable will 559*9880d681SAndroid Build Coastguard Workervalidate Kaleidoscope code and tell us if it is grammatically invalid. 560*9880d681SAndroid Build Coastguard WorkerFor example, here is a sample interaction: 561*9880d681SAndroid Build Coastguard Worker 562*9880d681SAndroid Build Coastguard Worker.. code-block:: bash 563*9880d681SAndroid Build Coastguard Worker 564*9880d681SAndroid Build Coastguard Worker $ ./toy.byte 565*9880d681SAndroid Build Coastguard Worker ready> def foo(x y) x+foo(y, 4.0); 566*9880d681SAndroid Build Coastguard Worker Parsed a function definition. 567*9880d681SAndroid Build Coastguard Worker ready> def foo(x y) x+y y; 568*9880d681SAndroid Build Coastguard Worker Parsed a function definition. 569*9880d681SAndroid Build Coastguard Worker Parsed a top-level expr 570*9880d681SAndroid Build Coastguard Worker ready> def foo(x y) x+y ); 571*9880d681SAndroid Build Coastguard Worker Parsed a function definition. 572*9880d681SAndroid Build Coastguard Worker Error: unknown token when expecting an expression 573*9880d681SAndroid Build Coastguard Worker ready> extern sin(a); 574*9880d681SAndroid Build Coastguard Worker ready> Parsed an extern 575*9880d681SAndroid Build Coastguard Worker ready> ^D 576*9880d681SAndroid Build Coastguard Worker $ 577*9880d681SAndroid Build Coastguard Worker 578*9880d681SAndroid Build Coastguard WorkerThere is a lot of room for extension here. You can define new AST nodes, 579*9880d681SAndroid Build Coastguard Workerextend the language in many ways, etc. In the `next 580*9880d681SAndroid Build Coastguard Workerinstallment <OCamlLangImpl3.html>`_, we will describe how to generate 581*9880d681SAndroid Build Coastguard WorkerLLVM Intermediate Representation (IR) from the AST. 582*9880d681SAndroid Build Coastguard Worker 583*9880d681SAndroid Build Coastguard WorkerFull Code Listing 584*9880d681SAndroid Build Coastguard Worker================= 585*9880d681SAndroid Build Coastguard Worker 586*9880d681SAndroid Build Coastguard WorkerHere is the complete code listing for this and the previous chapter. 587*9880d681SAndroid Build Coastguard WorkerNote that it is fully self-contained: you don't need LLVM or any 588*9880d681SAndroid Build Coastguard Workerexternal libraries at all for this. (Besides the ocaml standard 589*9880d681SAndroid Build Coastguard Workerlibraries, of course.) To build this, just compile with: 590*9880d681SAndroid Build Coastguard Worker 591*9880d681SAndroid Build Coastguard Worker.. code-block:: bash 592*9880d681SAndroid Build Coastguard Worker 593*9880d681SAndroid Build Coastguard Worker # Compile 594*9880d681SAndroid Build Coastguard Worker ocamlbuild toy.byte 595*9880d681SAndroid Build Coastguard Worker # Run 596*9880d681SAndroid Build Coastguard Worker ./toy.byte 597*9880d681SAndroid Build Coastguard Worker 598*9880d681SAndroid Build Coastguard WorkerHere is the code: 599*9880d681SAndroid Build Coastguard Worker 600*9880d681SAndroid Build Coastguard Worker\_tags: 601*9880d681SAndroid Build Coastguard Worker :: 602*9880d681SAndroid Build Coastguard Worker 603*9880d681SAndroid Build Coastguard Worker <{lexer,parser}.ml>: use_camlp4, pp(camlp4of) 604*9880d681SAndroid Build Coastguard Worker 605*9880d681SAndroid Build Coastguard Workertoken.ml: 606*9880d681SAndroid Build Coastguard Worker .. code-block:: ocaml 607*9880d681SAndroid Build Coastguard Worker 608*9880d681SAndroid Build Coastguard Worker (*===----------------------------------------------------------------------=== 609*9880d681SAndroid Build Coastguard Worker * Lexer Tokens 610*9880d681SAndroid Build Coastguard Worker *===----------------------------------------------------------------------===*) 611*9880d681SAndroid Build Coastguard Worker 612*9880d681SAndroid Build Coastguard Worker (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of 613*9880d681SAndroid Build Coastguard Worker * these others for known things. *) 614*9880d681SAndroid Build Coastguard Worker type token = 615*9880d681SAndroid Build Coastguard Worker (* commands *) 616*9880d681SAndroid Build Coastguard Worker | Def | Extern 617*9880d681SAndroid Build Coastguard Worker 618*9880d681SAndroid Build Coastguard Worker (* primary *) 619*9880d681SAndroid Build Coastguard Worker | Ident of string | Number of float 620*9880d681SAndroid Build Coastguard Worker 621*9880d681SAndroid Build Coastguard Worker (* unknown *) 622*9880d681SAndroid Build Coastguard Worker | Kwd of char 623*9880d681SAndroid Build Coastguard Worker 624*9880d681SAndroid Build Coastguard Workerlexer.ml: 625*9880d681SAndroid Build Coastguard Worker .. code-block:: ocaml 626*9880d681SAndroid Build Coastguard Worker 627*9880d681SAndroid Build Coastguard Worker (*===----------------------------------------------------------------------=== 628*9880d681SAndroid Build Coastguard Worker * Lexer 629*9880d681SAndroid Build Coastguard Worker *===----------------------------------------------------------------------===*) 630*9880d681SAndroid Build Coastguard Worker 631*9880d681SAndroid Build Coastguard Worker let rec lex = parser 632*9880d681SAndroid Build Coastguard Worker (* Skip any whitespace. *) 633*9880d681SAndroid Build Coastguard Worker | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream 634*9880d681SAndroid Build Coastguard Worker 635*9880d681SAndroid Build Coastguard Worker (* identifier: [a-zA-Z][a-zA-Z0-9] *) 636*9880d681SAndroid Build Coastguard Worker | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] -> 637*9880d681SAndroid Build Coastguard Worker let buffer = Buffer.create 1 in 638*9880d681SAndroid Build Coastguard Worker Buffer.add_char buffer c; 639*9880d681SAndroid Build Coastguard Worker lex_ident buffer stream 640*9880d681SAndroid Build Coastguard Worker 641*9880d681SAndroid Build Coastguard Worker (* number: [0-9.]+ *) 642*9880d681SAndroid Build Coastguard Worker | [< ' ('0' .. '9' as c); stream >] -> 643*9880d681SAndroid Build Coastguard Worker let buffer = Buffer.create 1 in 644*9880d681SAndroid Build Coastguard Worker Buffer.add_char buffer c; 645*9880d681SAndroid Build Coastguard Worker lex_number buffer stream 646*9880d681SAndroid Build Coastguard Worker 647*9880d681SAndroid Build Coastguard Worker (* Comment until end of line. *) 648*9880d681SAndroid Build Coastguard Worker | [< ' ('#'); stream >] -> 649*9880d681SAndroid Build Coastguard Worker lex_comment stream 650*9880d681SAndroid Build Coastguard Worker 651*9880d681SAndroid Build Coastguard Worker (* Otherwise, just return the character as its ascii value. *) 652*9880d681SAndroid Build Coastguard Worker | [< 'c; stream >] -> 653*9880d681SAndroid Build Coastguard Worker [< 'Token.Kwd c; lex stream >] 654*9880d681SAndroid Build Coastguard Worker 655*9880d681SAndroid Build Coastguard Worker (* end of stream. *) 656*9880d681SAndroid Build Coastguard Worker | [< >] -> [< >] 657*9880d681SAndroid Build Coastguard Worker 658*9880d681SAndroid Build Coastguard Worker and lex_number buffer = parser 659*9880d681SAndroid Build Coastguard Worker | [< ' ('0' .. '9' | '.' as c); stream >] -> 660*9880d681SAndroid Build Coastguard Worker Buffer.add_char buffer c; 661*9880d681SAndroid Build Coastguard Worker lex_number buffer stream 662*9880d681SAndroid Build Coastguard Worker | [< stream=lex >] -> 663*9880d681SAndroid Build Coastguard Worker [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >] 664*9880d681SAndroid Build Coastguard Worker 665*9880d681SAndroid Build Coastguard Worker and lex_ident buffer = parser 666*9880d681SAndroid Build Coastguard Worker | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] -> 667*9880d681SAndroid Build Coastguard Worker Buffer.add_char buffer c; 668*9880d681SAndroid Build Coastguard Worker lex_ident buffer stream 669*9880d681SAndroid Build Coastguard Worker | [< stream=lex >] -> 670*9880d681SAndroid Build Coastguard Worker match Buffer.contents buffer with 671*9880d681SAndroid Build Coastguard Worker | "def" -> [< 'Token.Def; stream >] 672*9880d681SAndroid Build Coastguard Worker | "extern" -> [< 'Token.Extern; stream >] 673*9880d681SAndroid Build Coastguard Worker | id -> [< 'Token.Ident id; stream >] 674*9880d681SAndroid Build Coastguard Worker 675*9880d681SAndroid Build Coastguard Worker and lex_comment = parser 676*9880d681SAndroid Build Coastguard Worker | [< ' ('\n'); stream=lex >] -> stream 677*9880d681SAndroid Build Coastguard Worker | [< 'c; e=lex_comment >] -> e 678*9880d681SAndroid Build Coastguard Worker | [< >] -> [< >] 679*9880d681SAndroid Build Coastguard Worker 680*9880d681SAndroid Build Coastguard Workerast.ml: 681*9880d681SAndroid Build Coastguard Worker .. code-block:: ocaml 682*9880d681SAndroid Build Coastguard Worker 683*9880d681SAndroid Build Coastguard Worker (*===----------------------------------------------------------------------=== 684*9880d681SAndroid Build Coastguard Worker * Abstract Syntax Tree (aka Parse Tree) 685*9880d681SAndroid Build Coastguard Worker *===----------------------------------------------------------------------===*) 686*9880d681SAndroid Build Coastguard Worker 687*9880d681SAndroid Build Coastguard Worker (* expr - Base type for all expression nodes. *) 688*9880d681SAndroid Build Coastguard Worker type expr = 689*9880d681SAndroid Build Coastguard Worker (* variant for numeric literals like "1.0". *) 690*9880d681SAndroid Build Coastguard Worker | Number of float 691*9880d681SAndroid Build Coastguard Worker 692*9880d681SAndroid Build Coastguard Worker (* variant for referencing a variable, like "a". *) 693*9880d681SAndroid Build Coastguard Worker | Variable of string 694*9880d681SAndroid Build Coastguard Worker 695*9880d681SAndroid Build Coastguard Worker (* variant for a binary operator. *) 696*9880d681SAndroid Build Coastguard Worker | Binary of char * expr * expr 697*9880d681SAndroid Build Coastguard Worker 698*9880d681SAndroid Build Coastguard Worker (* variant for function calls. *) 699*9880d681SAndroid Build Coastguard Worker | Call of string * expr array 700*9880d681SAndroid Build Coastguard Worker 701*9880d681SAndroid Build Coastguard Worker (* proto - This type represents the "prototype" for a function, which captures 702*9880d681SAndroid Build Coastguard Worker * its name, and its argument names (thus implicitly the number of arguments the 703*9880d681SAndroid Build Coastguard Worker * function takes). *) 704*9880d681SAndroid Build Coastguard Worker type proto = Prototype of string * string array 705*9880d681SAndroid Build Coastguard Worker 706*9880d681SAndroid Build Coastguard Worker (* func - This type represents a function definition itself. *) 707*9880d681SAndroid Build Coastguard Worker type func = Function of proto * expr 708*9880d681SAndroid Build Coastguard Worker 709*9880d681SAndroid Build Coastguard Workerparser.ml: 710*9880d681SAndroid Build Coastguard Worker .. code-block:: ocaml 711*9880d681SAndroid Build Coastguard Worker 712*9880d681SAndroid Build Coastguard Worker (*===---------------------------------------------------------------------=== 713*9880d681SAndroid Build Coastguard Worker * Parser 714*9880d681SAndroid Build Coastguard Worker *===---------------------------------------------------------------------===*) 715*9880d681SAndroid Build Coastguard Worker 716*9880d681SAndroid Build Coastguard Worker (* binop_precedence - This holds the precedence for each binary operator that is 717*9880d681SAndroid Build Coastguard Worker * defined *) 718*9880d681SAndroid Build Coastguard Worker let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10 719*9880d681SAndroid Build Coastguard Worker 720*9880d681SAndroid Build Coastguard Worker (* precedence - Get the precedence of the pending binary operator token. *) 721*9880d681SAndroid Build Coastguard Worker let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1 722*9880d681SAndroid Build Coastguard Worker 723*9880d681SAndroid Build Coastguard Worker (* primary 724*9880d681SAndroid Build Coastguard Worker * ::= identifier 725*9880d681SAndroid Build Coastguard Worker * ::= numberexpr 726*9880d681SAndroid Build Coastguard Worker * ::= parenexpr *) 727*9880d681SAndroid Build Coastguard Worker let rec parse_primary = parser 728*9880d681SAndroid Build Coastguard Worker (* numberexpr ::= number *) 729*9880d681SAndroid Build Coastguard Worker | [< 'Token.Number n >] -> Ast.Number n 730*9880d681SAndroid Build Coastguard Worker 731*9880d681SAndroid Build Coastguard Worker (* parenexpr ::= '(' expression ')' *) 732*9880d681SAndroid Build Coastguard Worker | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e 733*9880d681SAndroid Build Coastguard Worker 734*9880d681SAndroid Build Coastguard Worker (* identifierexpr 735*9880d681SAndroid Build Coastguard Worker * ::= identifier 736*9880d681SAndroid Build Coastguard Worker * ::= identifier '(' argumentexpr ')' *) 737*9880d681SAndroid Build Coastguard Worker | [< 'Token.Ident id; stream >] -> 738*9880d681SAndroid Build Coastguard Worker let rec parse_args accumulator = parser 739*9880d681SAndroid Build Coastguard Worker | [< e=parse_expr; stream >] -> 740*9880d681SAndroid Build Coastguard Worker begin parser 741*9880d681SAndroid Build Coastguard Worker | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e 742*9880d681SAndroid Build Coastguard Worker | [< >] -> e :: accumulator 743*9880d681SAndroid Build Coastguard Worker end stream 744*9880d681SAndroid Build Coastguard Worker | [< >] -> accumulator 745*9880d681SAndroid Build Coastguard Worker in 746*9880d681SAndroid Build Coastguard Worker let rec parse_ident id = parser 747*9880d681SAndroid Build Coastguard Worker (* Call. *) 748*9880d681SAndroid Build Coastguard Worker | [< 'Token.Kwd '('; 749*9880d681SAndroid Build Coastguard Worker args=parse_args []; 750*9880d681SAndroid Build Coastguard Worker 'Token.Kwd ')' ?? "expected ')'">] -> 751*9880d681SAndroid Build Coastguard Worker Ast.Call (id, Array.of_list (List.rev args)) 752*9880d681SAndroid Build Coastguard Worker 753*9880d681SAndroid Build Coastguard Worker (* Simple variable ref. *) 754*9880d681SAndroid Build Coastguard Worker | [< >] -> Ast.Variable id 755*9880d681SAndroid Build Coastguard Worker in 756*9880d681SAndroid Build Coastguard Worker parse_ident id stream 757*9880d681SAndroid Build Coastguard Worker 758*9880d681SAndroid Build Coastguard Worker | [< >] -> raise (Stream.Error "unknown token when expecting an expression.") 759*9880d681SAndroid Build Coastguard Worker 760*9880d681SAndroid Build Coastguard Worker (* binoprhs 761*9880d681SAndroid Build Coastguard Worker * ::= ('+' primary)* *) 762*9880d681SAndroid Build Coastguard Worker and parse_bin_rhs expr_prec lhs stream = 763*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 764*9880d681SAndroid Build Coastguard Worker (* If this is a binop, find its precedence. *) 765*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -> 766*9880d681SAndroid Build Coastguard Worker let token_prec = precedence c in 767*9880d681SAndroid Build Coastguard Worker 768*9880d681SAndroid Build Coastguard Worker (* If this is a binop that binds at least as tightly as the current binop, 769*9880d681SAndroid Build Coastguard Worker * consume it, otherwise we are done. *) 770*9880d681SAndroid Build Coastguard Worker if token_prec < expr_prec then lhs else begin 771*9880d681SAndroid Build Coastguard Worker (* Eat the binop. *) 772*9880d681SAndroid Build Coastguard Worker Stream.junk stream; 773*9880d681SAndroid Build Coastguard Worker 774*9880d681SAndroid Build Coastguard Worker (* Parse the primary expression after the binary operator. *) 775*9880d681SAndroid Build Coastguard Worker let rhs = parse_primary stream in 776*9880d681SAndroid Build Coastguard Worker 777*9880d681SAndroid Build Coastguard Worker (* Okay, we know this is a binop. *) 778*9880d681SAndroid Build Coastguard Worker let rhs = 779*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 780*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd c2) -> 781*9880d681SAndroid Build Coastguard Worker (* If BinOp binds less tightly with rhs than the operator after 782*9880d681SAndroid Build Coastguard Worker * rhs, let the pending operator take rhs as its lhs. *) 783*9880d681SAndroid Build Coastguard Worker let next_prec = precedence c2 in 784*9880d681SAndroid Build Coastguard Worker if token_prec < next_prec 785*9880d681SAndroid Build Coastguard Worker then parse_bin_rhs (token_prec + 1) rhs stream 786*9880d681SAndroid Build Coastguard Worker else rhs 787*9880d681SAndroid Build Coastguard Worker | _ -> rhs 788*9880d681SAndroid Build Coastguard Worker in 789*9880d681SAndroid Build Coastguard Worker 790*9880d681SAndroid Build Coastguard Worker (* Merge lhs/rhs. *) 791*9880d681SAndroid Build Coastguard Worker let lhs = Ast.Binary (c, lhs, rhs) in 792*9880d681SAndroid Build Coastguard Worker parse_bin_rhs expr_prec lhs stream 793*9880d681SAndroid Build Coastguard Worker end 794*9880d681SAndroid Build Coastguard Worker | _ -> lhs 795*9880d681SAndroid Build Coastguard Worker 796*9880d681SAndroid Build Coastguard Worker (* expression 797*9880d681SAndroid Build Coastguard Worker * ::= primary binoprhs *) 798*9880d681SAndroid Build Coastguard Worker and parse_expr = parser 799*9880d681SAndroid Build Coastguard Worker | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream 800*9880d681SAndroid Build Coastguard Worker 801*9880d681SAndroid Build Coastguard Worker (* prototype 802*9880d681SAndroid Build Coastguard Worker * ::= id '(' id* ')' *) 803*9880d681SAndroid Build Coastguard Worker let parse_prototype = 804*9880d681SAndroid Build Coastguard Worker let rec parse_args accumulator = parser 805*9880d681SAndroid Build Coastguard Worker | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e 806*9880d681SAndroid Build Coastguard Worker | [< >] -> accumulator 807*9880d681SAndroid Build Coastguard Worker in 808*9880d681SAndroid Build Coastguard Worker 809*9880d681SAndroid Build Coastguard Worker parser 810*9880d681SAndroid Build Coastguard Worker | [< 'Token.Ident id; 811*9880d681SAndroid Build Coastguard Worker 'Token.Kwd '(' ?? "expected '(' in prototype"; 812*9880d681SAndroid Build Coastguard Worker args=parse_args []; 813*9880d681SAndroid Build Coastguard Worker 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> 814*9880d681SAndroid Build Coastguard Worker (* success. *) 815*9880d681SAndroid Build Coastguard Worker Ast.Prototype (id, Array.of_list (List.rev args)) 816*9880d681SAndroid Build Coastguard Worker 817*9880d681SAndroid Build Coastguard Worker | [< >] -> 818*9880d681SAndroid Build Coastguard Worker raise (Stream.Error "expected function name in prototype") 819*9880d681SAndroid Build Coastguard Worker 820*9880d681SAndroid Build Coastguard Worker (* definition ::= 'def' prototype expression *) 821*9880d681SAndroid Build Coastguard Worker let parse_definition = parser 822*9880d681SAndroid Build Coastguard Worker | [< 'Token.Def; p=parse_prototype; e=parse_expr >] -> 823*9880d681SAndroid Build Coastguard Worker Ast.Function (p, e) 824*9880d681SAndroid Build Coastguard Worker 825*9880d681SAndroid Build Coastguard Worker (* toplevelexpr ::= expression *) 826*9880d681SAndroid Build Coastguard Worker let parse_toplevel = parser 827*9880d681SAndroid Build Coastguard Worker | [< e=parse_expr >] -> 828*9880d681SAndroid Build Coastguard Worker (* Make an anonymous proto. *) 829*9880d681SAndroid Build Coastguard Worker Ast.Function (Ast.Prototype ("", [||]), e) 830*9880d681SAndroid Build Coastguard Worker 831*9880d681SAndroid Build Coastguard Worker (* external ::= 'extern' prototype *) 832*9880d681SAndroid Build Coastguard Worker let parse_extern = parser 833*9880d681SAndroid Build Coastguard Worker | [< 'Token.Extern; e=parse_prototype >] -> e 834*9880d681SAndroid Build Coastguard Worker 835*9880d681SAndroid Build Coastguard Workertoplevel.ml: 836*9880d681SAndroid Build Coastguard Worker .. code-block:: ocaml 837*9880d681SAndroid Build Coastguard Worker 838*9880d681SAndroid Build Coastguard Worker (*===----------------------------------------------------------------------=== 839*9880d681SAndroid Build Coastguard Worker * Top-Level parsing and JIT Driver 840*9880d681SAndroid Build Coastguard Worker *===----------------------------------------------------------------------===*) 841*9880d681SAndroid Build Coastguard Worker 842*9880d681SAndroid Build Coastguard Worker (* top ::= definition | external | expression | ';' *) 843*9880d681SAndroid Build Coastguard Worker let rec main_loop stream = 844*9880d681SAndroid Build Coastguard Worker match Stream.peek stream with 845*9880d681SAndroid Build Coastguard Worker | None -> () 846*9880d681SAndroid Build Coastguard Worker 847*9880d681SAndroid Build Coastguard Worker (* ignore top-level semicolons. *) 848*9880d681SAndroid Build Coastguard Worker | Some (Token.Kwd ';') -> 849*9880d681SAndroid Build Coastguard Worker Stream.junk stream; 850*9880d681SAndroid Build Coastguard Worker main_loop stream 851*9880d681SAndroid Build Coastguard Worker 852*9880d681SAndroid Build Coastguard Worker | Some token -> 853*9880d681SAndroid Build Coastguard Worker begin 854*9880d681SAndroid Build Coastguard Worker try match token with 855*9880d681SAndroid Build Coastguard Worker | Token.Def -> 856*9880d681SAndroid Build Coastguard Worker ignore(Parser.parse_definition stream); 857*9880d681SAndroid Build Coastguard Worker print_endline "parsed a function definition."; 858*9880d681SAndroid Build Coastguard Worker | Token.Extern -> 859*9880d681SAndroid Build Coastguard Worker ignore(Parser.parse_extern stream); 860*9880d681SAndroid Build Coastguard Worker print_endline "parsed an extern."; 861*9880d681SAndroid Build Coastguard Worker | _ -> 862*9880d681SAndroid Build Coastguard Worker (* Evaluate a top-level expression into an anonymous function. *) 863*9880d681SAndroid Build Coastguard Worker ignore(Parser.parse_toplevel stream); 864*9880d681SAndroid Build Coastguard Worker print_endline "parsed a top-level expr"; 865*9880d681SAndroid Build Coastguard Worker with Stream.Error s -> 866*9880d681SAndroid Build Coastguard Worker (* Skip token for error recovery. *) 867*9880d681SAndroid Build Coastguard Worker Stream.junk stream; 868*9880d681SAndroid Build Coastguard Worker print_endline s; 869*9880d681SAndroid Build Coastguard Worker end; 870*9880d681SAndroid Build Coastguard Worker print_string "ready> "; flush stdout; 871*9880d681SAndroid Build Coastguard Worker main_loop stream 872*9880d681SAndroid Build Coastguard Worker 873*9880d681SAndroid Build Coastguard Workertoy.ml: 874*9880d681SAndroid Build Coastguard Worker .. code-block:: ocaml 875*9880d681SAndroid Build Coastguard Worker 876*9880d681SAndroid Build Coastguard Worker (*===----------------------------------------------------------------------=== 877*9880d681SAndroid Build Coastguard Worker * Main driver code. 878*9880d681SAndroid Build Coastguard Worker *===----------------------------------------------------------------------===*) 879*9880d681SAndroid Build Coastguard Worker 880*9880d681SAndroid Build Coastguard Worker let main () = 881*9880d681SAndroid Build Coastguard Worker (* Install standard binary operators. 882*9880d681SAndroid Build Coastguard Worker * 1 is the lowest precedence. *) 883*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '<' 10; 884*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '+' 20; 885*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '-' 20; 886*9880d681SAndroid Build Coastguard Worker Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *) 887*9880d681SAndroid Build Coastguard Worker 888*9880d681SAndroid Build Coastguard Worker (* Prime the first token. *) 889*9880d681SAndroid Build Coastguard Worker print_string "ready> "; flush stdout; 890*9880d681SAndroid Build Coastguard Worker let stream = Lexer.lex (Stream.of_channel stdin) in 891*9880d681SAndroid Build Coastguard Worker 892*9880d681SAndroid Build Coastguard Worker (* Run the main "interpreter loop" now. *) 893*9880d681SAndroid Build Coastguard Worker Toplevel.main_loop stream; 894*9880d681SAndroid Build Coastguard Worker ;; 895*9880d681SAndroid Build Coastguard Worker 896*9880d681SAndroid Build Coastguard Worker main () 897*9880d681SAndroid Build Coastguard Worker 898*9880d681SAndroid Build Coastguard Worker`Next: Implementing Code Generation to LLVM IR <OCamlLangImpl3.html>`_ 899*9880d681SAndroid Build Coastguard Worker 900