xref: /aosp_15_r20/external/llvm/docs/tutorial/OCamlLangImpl2.rst (revision 9880d6810fe72a1726cb53787c6711e909410d58)
1*9880d681SAndroid Build Coastguard Worker===========================================
2*9880d681SAndroid Build Coastguard WorkerKaleidoscope: Implementing a Parser and AST
3*9880d681SAndroid Build Coastguard Worker===========================================
4*9880d681SAndroid Build Coastguard Worker
5*9880d681SAndroid Build Coastguard Worker.. contents::
6*9880d681SAndroid Build Coastguard Worker   :local:
7*9880d681SAndroid Build Coastguard Worker
8*9880d681SAndroid Build Coastguard WorkerChapter 2 Introduction
9*9880d681SAndroid Build Coastguard Worker======================
10*9880d681SAndroid Build Coastguard Worker
11*9880d681SAndroid Build Coastguard WorkerWelcome to Chapter 2 of the "`Implementing a language with LLVM in
12*9880d681SAndroid Build Coastguard WorkerObjective Caml <index.html>`_" tutorial. This chapter shows you how to
13*9880d681SAndroid Build Coastguard Workeruse the lexer, built in `Chapter 1 <OCamlLangImpl1.html>`_, to build a
14*9880d681SAndroid Build Coastguard Workerfull `parser <http://en.wikipedia.org/wiki/Parsing>`_ for our
15*9880d681SAndroid Build Coastguard WorkerKaleidoscope language. Once we have a parser, we'll define and build an
16*9880d681SAndroid Build Coastguard Worker`Abstract Syntax
17*9880d681SAndroid Build Coastguard WorkerTree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST).
18*9880d681SAndroid Build Coastguard Worker
19*9880d681SAndroid Build Coastguard WorkerThe parser we will build uses a combination of `Recursive Descent
20*9880d681SAndroid Build Coastguard WorkerParsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and
21*9880d681SAndroid Build Coastguard Worker`Operator-Precedence
22*9880d681SAndroid Build Coastguard WorkerParsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to
23*9880d681SAndroid Build Coastguard Workerparse the Kaleidoscope language (the latter for binary expressions and
24*9880d681SAndroid Build Coastguard Workerthe former for everything else). Before we get to parsing though, lets
25*9880d681SAndroid Build Coastguard Workertalk about the output of the parser: the Abstract Syntax Tree.
26*9880d681SAndroid Build Coastguard Worker
27*9880d681SAndroid Build Coastguard WorkerThe Abstract Syntax Tree (AST)
28*9880d681SAndroid Build Coastguard Worker==============================
29*9880d681SAndroid Build Coastguard Worker
30*9880d681SAndroid Build Coastguard WorkerThe AST for a program captures its behavior in such a way that it is
31*9880d681SAndroid Build Coastguard Workereasy for later stages of the compiler (e.g. code generation) to
32*9880d681SAndroid Build Coastguard Workerinterpret. We basically want one object for each construct in the
33*9880d681SAndroid Build Coastguard Workerlanguage, and the AST should closely model the language. In
34*9880d681SAndroid Build Coastguard WorkerKaleidoscope, we have expressions, a prototype, and a function object.
35*9880d681SAndroid Build Coastguard WorkerWe'll start with expressions first:
36*9880d681SAndroid Build Coastguard Worker
37*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
38*9880d681SAndroid Build Coastguard Worker
39*9880d681SAndroid Build Coastguard Worker    (* expr - Base type for all expression nodes. *)
40*9880d681SAndroid Build Coastguard Worker    type expr =
41*9880d681SAndroid Build Coastguard Worker      (* variant for numeric literals like "1.0". *)
42*9880d681SAndroid Build Coastguard Worker      | Number of float
43*9880d681SAndroid Build Coastguard Worker
44*9880d681SAndroid Build Coastguard WorkerThe code above shows the definition of the base ExprAST class and one
45*9880d681SAndroid Build Coastguard Workersubclass which we use for numeric literals. The important thing to note
46*9880d681SAndroid Build Coastguard Workerabout this code is that the Number variant captures the numeric value of
47*9880d681SAndroid Build Coastguard Workerthe literal as an instance variable. This allows later phases of the
48*9880d681SAndroid Build Coastguard Workercompiler to know what the stored numeric value is.
49*9880d681SAndroid Build Coastguard Worker
50*9880d681SAndroid Build Coastguard WorkerRight now we only create the AST, so there are no useful functions on
51*9880d681SAndroid Build Coastguard Workerthem. It would be very easy to add a function to pretty print the code,
52*9880d681SAndroid Build Coastguard Workerfor example. Here are the other expression AST node definitions that
53*9880d681SAndroid Build Coastguard Workerwe'll use in the basic form of the Kaleidoscope language:
54*9880d681SAndroid Build Coastguard Worker
55*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
56*9880d681SAndroid Build Coastguard Worker
57*9880d681SAndroid Build Coastguard Worker      (* variant for referencing a variable, like "a". *)
58*9880d681SAndroid Build Coastguard Worker      | Variable of string
59*9880d681SAndroid Build Coastguard Worker
60*9880d681SAndroid Build Coastguard Worker      (* variant for a binary operator. *)
61*9880d681SAndroid Build Coastguard Worker      | Binary of char * expr * expr
62*9880d681SAndroid Build Coastguard Worker
63*9880d681SAndroid Build Coastguard Worker      (* variant for function calls. *)
64*9880d681SAndroid Build Coastguard Worker      | Call of string * expr array
65*9880d681SAndroid Build Coastguard Worker
66*9880d681SAndroid Build Coastguard WorkerThis is all (intentionally) rather straight-forward: variables capture
67*9880d681SAndroid Build Coastguard Workerthe variable name, binary operators capture their opcode (e.g. '+'), and
68*9880d681SAndroid Build Coastguard Workercalls capture a function name as well as a list of any argument
69*9880d681SAndroid Build Coastguard Workerexpressions. One thing that is nice about our AST is that it captures
70*9880d681SAndroid Build Coastguard Workerthe language features without talking about the syntax of the language.
71*9880d681SAndroid Build Coastguard WorkerNote that there is no discussion about precedence of binary operators,
72*9880d681SAndroid Build Coastguard Workerlexical structure, etc.
73*9880d681SAndroid Build Coastguard Worker
74*9880d681SAndroid Build Coastguard WorkerFor our basic language, these are all of the expression nodes we'll
75*9880d681SAndroid Build Coastguard Workerdefine. Because it doesn't have conditional control flow, it isn't
76*9880d681SAndroid Build Coastguard WorkerTuring-complete; we'll fix that in a later installment. The two things
77*9880d681SAndroid Build Coastguard Workerwe need next are a way to talk about the interface to a function, and a
78*9880d681SAndroid Build Coastguard Workerway to talk about functions themselves:
79*9880d681SAndroid Build Coastguard Worker
80*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
81*9880d681SAndroid Build Coastguard Worker
82*9880d681SAndroid Build Coastguard Worker    (* proto - This type represents the "prototype" for a function, which captures
83*9880d681SAndroid Build Coastguard Worker     * its name, and its argument names (thus implicitly the number of arguments the
84*9880d681SAndroid Build Coastguard Worker     * function takes). *)
85*9880d681SAndroid Build Coastguard Worker    type proto = Prototype of string * string array
86*9880d681SAndroid Build Coastguard Worker
87*9880d681SAndroid Build Coastguard Worker    (* func - This type represents a function definition itself. *)
88*9880d681SAndroid Build Coastguard Worker    type func = Function of proto * expr
89*9880d681SAndroid Build Coastguard Worker
90*9880d681SAndroid Build Coastguard WorkerIn Kaleidoscope, functions are typed with just a count of their
91*9880d681SAndroid Build Coastguard Workerarguments. Since all values are double precision floating point, the
92*9880d681SAndroid Build Coastguard Workertype of each argument doesn't need to be stored anywhere. In a more
93*9880d681SAndroid Build Coastguard Workeraggressive and realistic language, the "expr" variants would probably
94*9880d681SAndroid Build Coastguard Workerhave a type field.
95*9880d681SAndroid Build Coastguard Worker
96*9880d681SAndroid Build Coastguard WorkerWith this scaffolding, we can now talk about parsing expressions and
97*9880d681SAndroid Build Coastguard Workerfunction bodies in Kaleidoscope.
98*9880d681SAndroid Build Coastguard Worker
99*9880d681SAndroid Build Coastguard WorkerParser Basics
100*9880d681SAndroid Build Coastguard Worker=============
101*9880d681SAndroid Build Coastguard Worker
102*9880d681SAndroid Build Coastguard WorkerNow that we have an AST to build, we need to define the parser code to
103*9880d681SAndroid Build Coastguard Workerbuild it. The idea here is that we want to parse something like "x+y"
104*9880d681SAndroid Build Coastguard Worker(which is returned as three tokens by the lexer) into an AST that could
105*9880d681SAndroid Build Coastguard Workerbe generated with calls like this:
106*9880d681SAndroid Build Coastguard Worker
107*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
108*9880d681SAndroid Build Coastguard Worker
109*9880d681SAndroid Build Coastguard Worker      let x = Variable "x" in
110*9880d681SAndroid Build Coastguard Worker      let y = Variable "y" in
111*9880d681SAndroid Build Coastguard Worker      let result = Binary ('+', x, y) in
112*9880d681SAndroid Build Coastguard Worker      ...
113*9880d681SAndroid Build Coastguard Worker
114*9880d681SAndroid Build Coastguard WorkerThe error handling routines make use of the builtin ``Stream.Failure``
115*9880d681SAndroid Build Coastguard Workerand ``Stream.Error``s. ``Stream.Failure`` is raised when the parser is
116*9880d681SAndroid Build Coastguard Workerunable to find any matching token in the first position of a pattern.
117*9880d681SAndroid Build Coastguard Worker``Stream.Error`` is raised when the first token matches, but the rest do
118*9880d681SAndroid Build Coastguard Workernot. The error recovery in our parser will not be the best and is not
119*9880d681SAndroid Build Coastguard Workerparticular user-friendly, but it will be enough for our tutorial. These
120*9880d681SAndroid Build Coastguard Workerexceptions make it easier to handle errors in routines that have various
121*9880d681SAndroid Build Coastguard Workerreturn types.
122*9880d681SAndroid Build Coastguard Worker
123*9880d681SAndroid Build Coastguard WorkerWith these basic types and exceptions, we can implement the first piece
124*9880d681SAndroid Build Coastguard Workerof our grammar: numeric literals.
125*9880d681SAndroid Build Coastguard Worker
126*9880d681SAndroid Build Coastguard WorkerBasic Expression Parsing
127*9880d681SAndroid Build Coastguard Worker========================
128*9880d681SAndroid Build Coastguard Worker
129*9880d681SAndroid Build Coastguard WorkerWe start with numeric literals, because they are the simplest to
130*9880d681SAndroid Build Coastguard Workerprocess. For each production in our grammar, we'll define a function
131*9880d681SAndroid Build Coastguard Workerwhich parses that production. We call this class of expressions
132*9880d681SAndroid Build Coastguard Worker"primary" expressions, for reasons that will become more clear `later in
133*9880d681SAndroid Build Coastguard Workerthe tutorial <OCamlLangImpl6.html#user-defined-unary-operators>`_. In order to parse an
134*9880d681SAndroid Build Coastguard Workerarbitrary primary expression, we need to determine what sort of
135*9880d681SAndroid Build Coastguard Workerexpression it is. For numeric literals, we have:
136*9880d681SAndroid Build Coastguard Worker
137*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
138*9880d681SAndroid Build Coastguard Worker
139*9880d681SAndroid Build Coastguard Worker    (* primary
140*9880d681SAndroid Build Coastguard Worker     *   ::= identifier
141*9880d681SAndroid Build Coastguard Worker     *   ::= numberexpr
142*9880d681SAndroid Build Coastguard Worker     *   ::= parenexpr *)
143*9880d681SAndroid Build Coastguard Worker    parse_primary = parser
144*9880d681SAndroid Build Coastguard Worker      (* numberexpr ::= number *)
145*9880d681SAndroid Build Coastguard Worker      | [< 'Token.Number n >] -> Ast.Number n
146*9880d681SAndroid Build Coastguard Worker
147*9880d681SAndroid Build Coastguard WorkerThis routine is very simple: it expects to be called when the current
148*9880d681SAndroid Build Coastguard Workertoken is a ``Token.Number`` token. It takes the current number value,
149*9880d681SAndroid Build Coastguard Workercreates a ``Ast.Number`` node, advances the lexer to the next token, and
150*9880d681SAndroid Build Coastguard Workerfinally returns.
151*9880d681SAndroid Build Coastguard Worker
152*9880d681SAndroid Build Coastguard WorkerThere are some interesting aspects to this. The most important one is
153*9880d681SAndroid Build Coastguard Workerthat this routine eats all of the tokens that correspond to the
154*9880d681SAndroid Build Coastguard Workerproduction and returns the lexer buffer with the next token (which is
155*9880d681SAndroid Build Coastguard Workernot part of the grammar production) ready to go. This is a fairly
156*9880d681SAndroid Build Coastguard Workerstandard way to go for recursive descent parsers. For a better example,
157*9880d681SAndroid Build Coastguard Workerthe parenthesis operator is defined like this:
158*9880d681SAndroid Build Coastguard Worker
159*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
160*9880d681SAndroid Build Coastguard Worker
161*9880d681SAndroid Build Coastguard Worker      (* parenexpr ::= '(' expression ')' *)
162*9880d681SAndroid Build Coastguard Worker      | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
163*9880d681SAndroid Build Coastguard Worker
164*9880d681SAndroid Build Coastguard WorkerThis function illustrates a number of interesting things about the
165*9880d681SAndroid Build Coastguard Workerparser:
166*9880d681SAndroid Build Coastguard Worker
167*9880d681SAndroid Build Coastguard Worker1) It shows how we use the ``Stream.Error`` exception. When called, this
168*9880d681SAndroid Build Coastguard Workerfunction expects that the current token is a '(' token, but after
169*9880d681SAndroid Build Coastguard Workerparsing the subexpression, it is possible that there is no ')' waiting.
170*9880d681SAndroid Build Coastguard WorkerFor example, if the user types in "(4 x" instead of "(4)", the parser
171*9880d681SAndroid Build Coastguard Workershould emit an error. Because errors can occur, the parser needs a way
172*9880d681SAndroid Build Coastguard Workerto indicate that they happened. In our parser, we use the camlp4
173*9880d681SAndroid Build Coastguard Workershortcut syntax ``token ?? "parse error"``, where if the token before
174*9880d681SAndroid Build Coastguard Workerthe ``??`` does not match, then ``Stream.Error "parse error"`` will be
175*9880d681SAndroid Build Coastguard Workerraised.
176*9880d681SAndroid Build Coastguard Worker
177*9880d681SAndroid Build Coastguard Worker2) Another interesting aspect of this function is that it uses recursion
178*9880d681SAndroid Build Coastguard Workerby calling ``Parser.parse_primary`` (we will soon see that
179*9880d681SAndroid Build Coastguard Worker``Parser.parse_primary`` can call ``Parser.parse_primary``). This is
180*9880d681SAndroid Build Coastguard Workerpowerful because it allows us to handle recursive grammars, and keeps
181*9880d681SAndroid Build Coastguard Workereach production very simple. Note that parentheses do not cause
182*9880d681SAndroid Build Coastguard Workerconstruction of AST nodes themselves. While we could do it this way, the
183*9880d681SAndroid Build Coastguard Workermost important role of parentheses are to guide the parser and provide
184*9880d681SAndroid Build Coastguard Workergrouping. Once the parser constructs the AST, parentheses are not
185*9880d681SAndroid Build Coastguard Workerneeded.
186*9880d681SAndroid Build Coastguard Worker
187*9880d681SAndroid Build Coastguard WorkerThe next simple production is for handling variable references and
188*9880d681SAndroid Build Coastguard Workerfunction calls:
189*9880d681SAndroid Build Coastguard Worker
190*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
191*9880d681SAndroid Build Coastguard Worker
192*9880d681SAndroid Build Coastguard Worker      (* identifierexpr
193*9880d681SAndroid Build Coastguard Worker       *   ::= identifier
194*9880d681SAndroid Build Coastguard Worker       *   ::= identifier '(' argumentexpr ')' *)
195*9880d681SAndroid Build Coastguard Worker      | [< 'Token.Ident id; stream >] ->
196*9880d681SAndroid Build Coastguard Worker          let rec parse_args accumulator = parser
197*9880d681SAndroid Build Coastguard Worker            | [< e=parse_expr; stream >] ->
198*9880d681SAndroid Build Coastguard Worker                begin parser
199*9880d681SAndroid Build Coastguard Worker                  | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
200*9880d681SAndroid Build Coastguard Worker                  | [< >] -> e :: accumulator
201*9880d681SAndroid Build Coastguard Worker                end stream
202*9880d681SAndroid Build Coastguard Worker            | [< >] -> accumulator
203*9880d681SAndroid Build Coastguard Worker          in
204*9880d681SAndroid Build Coastguard Worker          let rec parse_ident id = parser
205*9880d681SAndroid Build Coastguard Worker            (* Call. *)
206*9880d681SAndroid Build Coastguard Worker            | [< 'Token.Kwd '(';
207*9880d681SAndroid Build Coastguard Worker                 args=parse_args [];
208*9880d681SAndroid Build Coastguard Worker                 'Token.Kwd ')' ?? "expected ')'">] ->
209*9880d681SAndroid Build Coastguard Worker                Ast.Call (id, Array.of_list (List.rev args))
210*9880d681SAndroid Build Coastguard Worker
211*9880d681SAndroid Build Coastguard Worker            (* Simple variable ref. *)
212*9880d681SAndroid Build Coastguard Worker            | [< >] -> Ast.Variable id
213*9880d681SAndroid Build Coastguard Worker          in
214*9880d681SAndroid Build Coastguard Worker          parse_ident id stream
215*9880d681SAndroid Build Coastguard Worker
216*9880d681SAndroid Build Coastguard WorkerThis routine follows the same style as the other routines. (It expects
217*9880d681SAndroid Build Coastguard Workerto be called if the current token is a ``Token.Ident`` token). It also
218*9880d681SAndroid Build Coastguard Workerhas recursion and error handling. One interesting aspect of this is that
219*9880d681SAndroid Build Coastguard Workerit uses *look-ahead* to determine if the current identifier is a stand
220*9880d681SAndroid Build Coastguard Workeralone variable reference or if it is a function call expression. It
221*9880d681SAndroid Build Coastguard Workerhandles this by checking to see if the token after the identifier is a
222*9880d681SAndroid Build Coastguard Worker'(' token, constructing either a ``Ast.Variable`` or ``Ast.Call`` node
223*9880d681SAndroid Build Coastguard Workeras appropriate.
224*9880d681SAndroid Build Coastguard Worker
225*9880d681SAndroid Build Coastguard WorkerWe finish up by raising an exception if we received a token we didn't
226*9880d681SAndroid Build Coastguard Workerexpect:
227*9880d681SAndroid Build Coastguard Worker
228*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
229*9880d681SAndroid Build Coastguard Worker
230*9880d681SAndroid Build Coastguard Worker      | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
231*9880d681SAndroid Build Coastguard Worker
232*9880d681SAndroid Build Coastguard WorkerNow that basic expressions are handled, we need to handle binary
233*9880d681SAndroid Build Coastguard Workerexpressions. They are a bit more complex.
234*9880d681SAndroid Build Coastguard Worker
235*9880d681SAndroid Build Coastguard WorkerBinary Expression Parsing
236*9880d681SAndroid Build Coastguard Worker=========================
237*9880d681SAndroid Build Coastguard Worker
238*9880d681SAndroid Build Coastguard WorkerBinary expressions are significantly harder to parse because they are
239*9880d681SAndroid Build Coastguard Workeroften ambiguous. For example, when given the string "x+y\*z", the parser
240*9880d681SAndroid Build Coastguard Workercan choose to parse it as either "(x+y)\*z" or "x+(y\*z)". With common
241*9880d681SAndroid Build Coastguard Workerdefinitions from mathematics, we expect the later parse, because "\*"
242*9880d681SAndroid Build Coastguard Worker(multiplication) has higher *precedence* than "+" (addition).
243*9880d681SAndroid Build Coastguard Worker
244*9880d681SAndroid Build Coastguard WorkerThere are many ways to handle this, but an elegant and efficient way is
245*9880d681SAndroid Build Coastguard Workerto use `Operator-Precedence
246*9880d681SAndroid Build Coastguard WorkerParsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_.
247*9880d681SAndroid Build Coastguard WorkerThis parsing technique uses the precedence of binary operators to guide
248*9880d681SAndroid Build Coastguard Workerrecursion. To start with, we need a table of precedences:
249*9880d681SAndroid Build Coastguard Worker
250*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
251*9880d681SAndroid Build Coastguard Worker
252*9880d681SAndroid Build Coastguard Worker    (* binop_precedence - This holds the precedence for each binary operator that is
253*9880d681SAndroid Build Coastguard Worker     * defined *)
254*9880d681SAndroid Build Coastguard Worker    let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
255*9880d681SAndroid Build Coastguard Worker
256*9880d681SAndroid Build Coastguard Worker    (* precedence - Get the precedence of the pending binary operator token. *)
257*9880d681SAndroid Build Coastguard Worker    let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
258*9880d681SAndroid Build Coastguard Worker
259*9880d681SAndroid Build Coastguard Worker    ...
260*9880d681SAndroid Build Coastguard Worker
261*9880d681SAndroid Build Coastguard Worker    let main () =
262*9880d681SAndroid Build Coastguard Worker      (* Install standard binary operators.
263*9880d681SAndroid Build Coastguard Worker       * 1 is the lowest precedence. *)
264*9880d681SAndroid Build Coastguard Worker      Hashtbl.add Parser.binop_precedence '<' 10;
265*9880d681SAndroid Build Coastguard Worker      Hashtbl.add Parser.binop_precedence '+' 20;
266*9880d681SAndroid Build Coastguard Worker      Hashtbl.add Parser.binop_precedence '-' 20;
267*9880d681SAndroid Build Coastguard Worker      Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
268*9880d681SAndroid Build Coastguard Worker      ...
269*9880d681SAndroid Build Coastguard Worker
270*9880d681SAndroid Build Coastguard WorkerFor the basic form of Kaleidoscope, we will only support 4 binary
271*9880d681SAndroid Build Coastguard Workeroperators (this can obviously be extended by you, our brave and intrepid
272*9880d681SAndroid Build Coastguard Workerreader). The ``Parser.precedence`` function returns the precedence for
273*9880d681SAndroid Build Coastguard Workerthe current token, or -1 if the token is not a binary operator. Having a
274*9880d681SAndroid Build Coastguard Worker``Hashtbl.t`` makes it easy to add new operators and makes it clear that
275*9880d681SAndroid Build Coastguard Workerthe algorithm doesn't depend on the specific operators involved, but it
276*9880d681SAndroid Build Coastguard Workerwould be easy enough to eliminate the ``Hashtbl.t`` and do the
277*9880d681SAndroid Build Coastguard Workercomparisons in the ``Parser.precedence`` function. (Or just use a
278*9880d681SAndroid Build Coastguard Workerfixed-size array).
279*9880d681SAndroid Build Coastguard Worker
280*9880d681SAndroid Build Coastguard WorkerWith the helper above defined, we can now start parsing binary
281*9880d681SAndroid Build Coastguard Workerexpressions. The basic idea of operator precedence parsing is to break
282*9880d681SAndroid Build Coastguard Workerdown an expression with potentially ambiguous binary operators into
283*9880d681SAndroid Build Coastguard Workerpieces. Consider, for example, the expression "a+b+(c+d)\*e\*f+g".
284*9880d681SAndroid Build Coastguard WorkerOperator precedence parsing considers this as a stream of primary
285*9880d681SAndroid Build Coastguard Workerexpressions separated by binary operators. As such, it will first parse
286*9880d681SAndroid Build Coastguard Workerthe leading primary expression "a", then it will see the pairs [+, b]
287*9880d681SAndroid Build Coastguard Worker[+, (c+d)] [\*, e] [\*, f] and [+, g]. Note that because parentheses are
288*9880d681SAndroid Build Coastguard Workerprimary expressions, the binary expression parser doesn't need to worry
289*9880d681SAndroid Build Coastguard Workerabout nested subexpressions like (c+d) at all.
290*9880d681SAndroid Build Coastguard Worker
291*9880d681SAndroid Build Coastguard WorkerTo start, an expression is a primary expression potentially followed by
292*9880d681SAndroid Build Coastguard Workera sequence of [binop,primaryexpr] pairs:
293*9880d681SAndroid Build Coastguard Worker
294*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
295*9880d681SAndroid Build Coastguard Worker
296*9880d681SAndroid Build Coastguard Worker    (* expression
297*9880d681SAndroid Build Coastguard Worker     *   ::= primary binoprhs *)
298*9880d681SAndroid Build Coastguard Worker    and parse_expr = parser
299*9880d681SAndroid Build Coastguard Worker      | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
300*9880d681SAndroid Build Coastguard Worker
301*9880d681SAndroid Build Coastguard Worker``Parser.parse_bin_rhs`` is the function that parses the sequence of
302*9880d681SAndroid Build Coastguard Workerpairs for us. It takes a precedence and a pointer to an expression for
303*9880d681SAndroid Build Coastguard Workerthe part that has been parsed so far. Note that "x" is a perfectly valid
304*9880d681SAndroid Build Coastguard Workerexpression: As such, "binoprhs" is allowed to be empty, in which case it
305*9880d681SAndroid Build Coastguard Workerreturns the expression that is passed into it. In our example above, the
306*9880d681SAndroid Build Coastguard Workercode passes the expression for "a" into ``Parser.parse_bin_rhs`` and the
307*9880d681SAndroid Build Coastguard Workercurrent token is "+".
308*9880d681SAndroid Build Coastguard Worker
309*9880d681SAndroid Build Coastguard WorkerThe precedence value passed into ``Parser.parse_bin_rhs`` indicates the
310*9880d681SAndroid Build Coastguard Worker*minimal operator precedence* that the function is allowed to eat. For
311*9880d681SAndroid Build Coastguard Workerexample, if the current pair stream is [+, x] and
312*9880d681SAndroid Build Coastguard Worker``Parser.parse_bin_rhs`` is passed in a precedence of 40, it will not
313*9880d681SAndroid Build Coastguard Workerconsume any tokens (because the precedence of '+' is only 20). With this
314*9880d681SAndroid Build Coastguard Workerin mind, ``Parser.parse_bin_rhs`` starts with:
315*9880d681SAndroid Build Coastguard Worker
316*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
317*9880d681SAndroid Build Coastguard Worker
318*9880d681SAndroid Build Coastguard Worker    (* binoprhs
319*9880d681SAndroid Build Coastguard Worker     *   ::= ('+' primary)* *)
320*9880d681SAndroid Build Coastguard Worker    and parse_bin_rhs expr_prec lhs stream =
321*9880d681SAndroid Build Coastguard Worker      match Stream.peek stream with
322*9880d681SAndroid Build Coastguard Worker      (* If this is a binop, find its precedence. *)
323*9880d681SAndroid Build Coastguard Worker      | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
324*9880d681SAndroid Build Coastguard Worker          let token_prec = precedence c in
325*9880d681SAndroid Build Coastguard Worker
326*9880d681SAndroid Build Coastguard Worker          (* If this is a binop that binds at least as tightly as the current binop,
327*9880d681SAndroid Build Coastguard Worker           * consume it, otherwise we are done. *)
328*9880d681SAndroid Build Coastguard Worker          if token_prec < expr_prec then lhs else begin
329*9880d681SAndroid Build Coastguard Worker
330*9880d681SAndroid Build Coastguard WorkerThis code gets the precedence of the current token and checks to see if
331*9880d681SAndroid Build Coastguard Workerif is too low. Because we defined invalid tokens to have a precedence of
332*9880d681SAndroid Build Coastguard Worker-1, this check implicitly knows that the pair-stream ends when the token
333*9880d681SAndroid Build Coastguard Workerstream runs out of binary operators. If this check succeeds, we know
334*9880d681SAndroid Build Coastguard Workerthat the token is a binary operator and that it will be included in this
335*9880d681SAndroid Build Coastguard Workerexpression:
336*9880d681SAndroid Build Coastguard Worker
337*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
338*9880d681SAndroid Build Coastguard Worker
339*9880d681SAndroid Build Coastguard Worker            (* Eat the binop. *)
340*9880d681SAndroid Build Coastguard Worker            Stream.junk stream;
341*9880d681SAndroid Build Coastguard Worker
342*9880d681SAndroid Build Coastguard Worker            (* Parse the primary expression after the binary operator *)
343*9880d681SAndroid Build Coastguard Worker            let rhs = parse_primary stream in
344*9880d681SAndroid Build Coastguard Worker
345*9880d681SAndroid Build Coastguard Worker            (* Okay, we know this is a binop. *)
346*9880d681SAndroid Build Coastguard Worker            let rhs =
347*9880d681SAndroid Build Coastguard Worker              match Stream.peek stream with
348*9880d681SAndroid Build Coastguard Worker              | Some (Token.Kwd c2) ->
349*9880d681SAndroid Build Coastguard Worker
350*9880d681SAndroid Build Coastguard WorkerAs such, this code eats (and remembers) the binary operator and then
351*9880d681SAndroid Build Coastguard Workerparses the primary expression that follows. This builds up the whole
352*9880d681SAndroid Build Coastguard Workerpair, the first of which is [+, b] for the running example.
353*9880d681SAndroid Build Coastguard Worker
354*9880d681SAndroid Build Coastguard WorkerNow that we parsed the left-hand side of an expression and one pair of
355*9880d681SAndroid Build Coastguard Workerthe RHS sequence, we have to decide which way the expression associates.
356*9880d681SAndroid Build Coastguard WorkerIn particular, we could have "(a+b) binop unparsed" or "a + (b binop
357*9880d681SAndroid Build Coastguard Workerunparsed)". To determine this, we look ahead at "binop" to determine its
358*9880d681SAndroid Build Coastguard Workerprecedence and compare it to BinOp's precedence (which is '+' in this
359*9880d681SAndroid Build Coastguard Workercase):
360*9880d681SAndroid Build Coastguard Worker
361*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
362*9880d681SAndroid Build Coastguard Worker
363*9880d681SAndroid Build Coastguard Worker                  (* If BinOp binds less tightly with rhs than the operator after
364*9880d681SAndroid Build Coastguard Worker                   * rhs, let the pending operator take rhs as its lhs. *)
365*9880d681SAndroid Build Coastguard Worker                  let next_prec = precedence c2 in
366*9880d681SAndroid Build Coastguard Worker                  if token_prec < next_prec
367*9880d681SAndroid Build Coastguard Worker
368*9880d681SAndroid Build Coastguard WorkerIf the precedence of the binop to the right of "RHS" is lower or equal
369*9880d681SAndroid Build Coastguard Workerto the precedence of our current operator, then we know that the
370*9880d681SAndroid Build Coastguard Workerparentheses associate as "(a+b) binop ...". In our example, the current
371*9880d681SAndroid Build Coastguard Workeroperator is "+" and the next operator is "+", we know that they have the
372*9880d681SAndroid Build Coastguard Workersame precedence. In this case we'll create the AST node for "a+b", and
373*9880d681SAndroid Build Coastguard Workerthen continue parsing:
374*9880d681SAndroid Build Coastguard Worker
375*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
376*9880d681SAndroid Build Coastguard Worker
377*9880d681SAndroid Build Coastguard Worker              ... if body omitted ...
378*9880d681SAndroid Build Coastguard Worker            in
379*9880d681SAndroid Build Coastguard Worker
380*9880d681SAndroid Build Coastguard Worker            (* Merge lhs/rhs. *)
381*9880d681SAndroid Build Coastguard Worker            let lhs = Ast.Binary (c, lhs, rhs) in
382*9880d681SAndroid Build Coastguard Worker            parse_bin_rhs expr_prec lhs stream
383*9880d681SAndroid Build Coastguard Worker          end
384*9880d681SAndroid Build Coastguard Worker
385*9880d681SAndroid Build Coastguard WorkerIn our example above, this will turn "a+b+" into "(a+b)" and execute the
386*9880d681SAndroid Build Coastguard Workernext iteration of the loop, with "+" as the current token. The code
387*9880d681SAndroid Build Coastguard Workerabove will eat, remember, and parse "(c+d)" as the primary expression,
388*9880d681SAndroid Build Coastguard Workerwhich makes the current pair equal to [+, (c+d)]. It will then evaluate
389*9880d681SAndroid Build Coastguard Workerthe 'if' conditional above with "\*" as the binop to the right of the
390*9880d681SAndroid Build Coastguard Workerprimary. In this case, the precedence of "\*" is higher than the
391*9880d681SAndroid Build Coastguard Workerprecedence of "+" so the if condition will be entered.
392*9880d681SAndroid Build Coastguard Worker
393*9880d681SAndroid Build Coastguard WorkerThe critical question left here is "how can the if condition parse the
394*9880d681SAndroid Build Coastguard Workerright hand side in full"? In particular, to build the AST correctly for
395*9880d681SAndroid Build Coastguard Workerour example, it needs to get all of "(c+d)\*e\*f" as the RHS expression
396*9880d681SAndroid Build Coastguard Workervariable. The code to do this is surprisingly simple (code from the
397*9880d681SAndroid Build Coastguard Workerabove two blocks duplicated for context):
398*9880d681SAndroid Build Coastguard Worker
399*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
400*9880d681SAndroid Build Coastguard Worker
401*9880d681SAndroid Build Coastguard Worker              match Stream.peek stream with
402*9880d681SAndroid Build Coastguard Worker              | Some (Token.Kwd c2) ->
403*9880d681SAndroid Build Coastguard Worker                  (* If BinOp binds less tightly with rhs than the operator after
404*9880d681SAndroid Build Coastguard Worker                   * rhs, let the pending operator take rhs as its lhs. *)
405*9880d681SAndroid Build Coastguard Worker                  if token_prec < precedence c2
406*9880d681SAndroid Build Coastguard Worker                  then parse_bin_rhs (token_prec + 1) rhs stream
407*9880d681SAndroid Build Coastguard Worker                  else rhs
408*9880d681SAndroid Build Coastguard Worker              | _ -> rhs
409*9880d681SAndroid Build Coastguard Worker            in
410*9880d681SAndroid Build Coastguard Worker
411*9880d681SAndroid Build Coastguard Worker            (* Merge lhs/rhs. *)
412*9880d681SAndroid Build Coastguard Worker            let lhs = Ast.Binary (c, lhs, rhs) in
413*9880d681SAndroid Build Coastguard Worker            parse_bin_rhs expr_prec lhs stream
414*9880d681SAndroid Build Coastguard Worker          end
415*9880d681SAndroid Build Coastguard Worker
416*9880d681SAndroid Build Coastguard WorkerAt this point, we know that the binary operator to the RHS of our
417*9880d681SAndroid Build Coastguard Workerprimary has higher precedence than the binop we are currently parsing.
418*9880d681SAndroid Build Coastguard WorkerAs such, we know that any sequence of pairs whose operators are all
419*9880d681SAndroid Build Coastguard Workerhigher precedence than "+" should be parsed together and returned as
420*9880d681SAndroid Build Coastguard Worker"RHS". To do this, we recursively invoke the ``Parser.parse_bin_rhs``
421*9880d681SAndroid Build Coastguard Workerfunction specifying "token\_prec+1" as the minimum precedence required
422*9880d681SAndroid Build Coastguard Workerfor it to continue. In our example above, this will cause it to return
423*9880d681SAndroid Build Coastguard Workerthe AST node for "(c+d)\*e\*f" as RHS, which is then set as the RHS of
424*9880d681SAndroid Build Coastguard Workerthe '+' expression.
425*9880d681SAndroid Build Coastguard Worker
426*9880d681SAndroid Build Coastguard WorkerFinally, on the next iteration of the while loop, the "+g" piece is
427*9880d681SAndroid Build Coastguard Workerparsed and added to the AST. With this little bit of code (14
428*9880d681SAndroid Build Coastguard Workernon-trivial lines), we correctly handle fully general binary expression
429*9880d681SAndroid Build Coastguard Workerparsing in a very elegant way. This was a whirlwind tour of this code,
430*9880d681SAndroid Build Coastguard Workerand it is somewhat subtle. I recommend running through it with a few
431*9880d681SAndroid Build Coastguard Workertough examples to see how it works.
432*9880d681SAndroid Build Coastguard Worker
433*9880d681SAndroid Build Coastguard WorkerThis wraps up handling of expressions. At this point, we can point the
434*9880d681SAndroid Build Coastguard Workerparser at an arbitrary token stream and build an expression from it,
435*9880d681SAndroid Build Coastguard Workerstopping at the first token that is not part of the expression. Next up
436*9880d681SAndroid Build Coastguard Workerwe need to handle function definitions, etc.
437*9880d681SAndroid Build Coastguard Worker
438*9880d681SAndroid Build Coastguard WorkerParsing the Rest
439*9880d681SAndroid Build Coastguard Worker================
440*9880d681SAndroid Build Coastguard Worker
441*9880d681SAndroid Build Coastguard WorkerThe next thing missing is handling of function prototypes. In
442*9880d681SAndroid Build Coastguard WorkerKaleidoscope, these are used both for 'extern' function declarations as
443*9880d681SAndroid Build Coastguard Workerwell as function body definitions. The code to do this is
444*9880d681SAndroid Build Coastguard Workerstraight-forward and not very interesting (once you've survived
445*9880d681SAndroid Build Coastguard Workerexpressions):
446*9880d681SAndroid Build Coastguard Worker
447*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
448*9880d681SAndroid Build Coastguard Worker
449*9880d681SAndroid Build Coastguard Worker    (* prototype
450*9880d681SAndroid Build Coastguard Worker     *   ::= id '(' id* ')' *)
451*9880d681SAndroid Build Coastguard Worker    let parse_prototype =
452*9880d681SAndroid Build Coastguard Worker      let rec parse_args accumulator = parser
453*9880d681SAndroid Build Coastguard Worker        | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
454*9880d681SAndroid Build Coastguard Worker        | [< >] -> accumulator
455*9880d681SAndroid Build Coastguard Worker      in
456*9880d681SAndroid Build Coastguard Worker
457*9880d681SAndroid Build Coastguard Worker      parser
458*9880d681SAndroid Build Coastguard Worker      | [< 'Token.Ident id;
459*9880d681SAndroid Build Coastguard Worker           'Token.Kwd '(' ?? "expected '(' in prototype";
460*9880d681SAndroid Build Coastguard Worker           args=parse_args [];
461*9880d681SAndroid Build Coastguard Worker           'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
462*9880d681SAndroid Build Coastguard Worker          (* success. *)
463*9880d681SAndroid Build Coastguard Worker          Ast.Prototype (id, Array.of_list (List.rev args))
464*9880d681SAndroid Build Coastguard Worker
465*9880d681SAndroid Build Coastguard Worker      | [< >] ->
466*9880d681SAndroid Build Coastguard Worker          raise (Stream.Error "expected function name in prototype")
467*9880d681SAndroid Build Coastguard Worker
468*9880d681SAndroid Build Coastguard WorkerGiven this, a function definition is very simple, just a prototype plus
469*9880d681SAndroid Build Coastguard Workeran expression to implement the body:
470*9880d681SAndroid Build Coastguard Worker
471*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
472*9880d681SAndroid Build Coastguard Worker
473*9880d681SAndroid Build Coastguard Worker    (* definition ::= 'def' prototype expression *)
474*9880d681SAndroid Build Coastguard Worker    let parse_definition = parser
475*9880d681SAndroid Build Coastguard Worker      | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
476*9880d681SAndroid Build Coastguard Worker          Ast.Function (p, e)
477*9880d681SAndroid Build Coastguard Worker
478*9880d681SAndroid Build Coastguard WorkerIn addition, we support 'extern' to declare functions like 'sin' and
479*9880d681SAndroid Build Coastguard Worker'cos' as well as to support forward declaration of user functions. These
480*9880d681SAndroid Build Coastguard Worker'extern's are just prototypes with no body:
481*9880d681SAndroid Build Coastguard Worker
482*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
483*9880d681SAndroid Build Coastguard Worker
484*9880d681SAndroid Build Coastguard Worker    (*  external ::= 'extern' prototype *)
485*9880d681SAndroid Build Coastguard Worker    let parse_extern = parser
486*9880d681SAndroid Build Coastguard Worker      | [< 'Token.Extern; e=parse_prototype >] -> e
487*9880d681SAndroid Build Coastguard Worker
488*9880d681SAndroid Build Coastguard WorkerFinally, we'll also let the user type in arbitrary top-level expressions
489*9880d681SAndroid Build Coastguard Workerand evaluate them on the fly. We will handle this by defining anonymous
490*9880d681SAndroid Build Coastguard Workernullary (zero argument) functions for them:
491*9880d681SAndroid Build Coastguard Worker
492*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
493*9880d681SAndroid Build Coastguard Worker
494*9880d681SAndroid Build Coastguard Worker    (* toplevelexpr ::= expression *)
495*9880d681SAndroid Build Coastguard Worker    let parse_toplevel = parser
496*9880d681SAndroid Build Coastguard Worker      | [< e=parse_expr >] ->
497*9880d681SAndroid Build Coastguard Worker          (* Make an anonymous proto. *)
498*9880d681SAndroid Build Coastguard Worker          Ast.Function (Ast.Prototype ("", [||]), e)
499*9880d681SAndroid Build Coastguard Worker
500*9880d681SAndroid Build Coastguard WorkerNow that we have all the pieces, let's build a little driver that will
501*9880d681SAndroid Build Coastguard Workerlet us actually *execute* this code we've built!
502*9880d681SAndroid Build Coastguard Worker
503*9880d681SAndroid Build Coastguard WorkerThe Driver
504*9880d681SAndroid Build Coastguard Worker==========
505*9880d681SAndroid Build Coastguard Worker
506*9880d681SAndroid Build Coastguard WorkerThe driver for this simply invokes all of the parsing pieces with a
507*9880d681SAndroid Build Coastguard Workertop-level dispatch loop. There isn't much interesting here, so I'll just
508*9880d681SAndroid Build Coastguard Workerinclude the top-level loop. See `below <#full-code-listing>`_ for full code in the
509*9880d681SAndroid Build Coastguard Worker"Top-Level Parsing" section.
510*9880d681SAndroid Build Coastguard Worker
511*9880d681SAndroid Build Coastguard Worker.. code-block:: ocaml
512*9880d681SAndroid Build Coastguard Worker
513*9880d681SAndroid Build Coastguard Worker    (* top ::= definition | external | expression | ';' *)
514*9880d681SAndroid Build Coastguard Worker    let rec main_loop stream =
515*9880d681SAndroid Build Coastguard Worker      match Stream.peek stream with
516*9880d681SAndroid Build Coastguard Worker      | None -> ()
517*9880d681SAndroid Build Coastguard Worker
518*9880d681SAndroid Build Coastguard Worker      (* ignore top-level semicolons. *)
519*9880d681SAndroid Build Coastguard Worker      | Some (Token.Kwd ';') ->
520*9880d681SAndroid Build Coastguard Worker          Stream.junk stream;
521*9880d681SAndroid Build Coastguard Worker          main_loop stream
522*9880d681SAndroid Build Coastguard Worker
523*9880d681SAndroid Build Coastguard Worker      | Some token ->
524*9880d681SAndroid Build Coastguard Worker          begin
525*9880d681SAndroid Build Coastguard Worker            try match token with
526*9880d681SAndroid Build Coastguard Worker            | Token.Def ->
527*9880d681SAndroid Build Coastguard Worker                ignore(Parser.parse_definition stream);
528*9880d681SAndroid Build Coastguard Worker                print_endline "parsed a function definition.";
529*9880d681SAndroid Build Coastguard Worker            | Token.Extern ->
530*9880d681SAndroid Build Coastguard Worker                ignore(Parser.parse_extern stream);
531*9880d681SAndroid Build Coastguard Worker                print_endline "parsed an extern.";
532*9880d681SAndroid Build Coastguard Worker            | _ ->
533*9880d681SAndroid Build Coastguard Worker                (* Evaluate a top-level expression into an anonymous function. *)
534*9880d681SAndroid Build Coastguard Worker                ignore(Parser.parse_toplevel stream);
535*9880d681SAndroid Build Coastguard Worker                print_endline "parsed a top-level expr";
536*9880d681SAndroid Build Coastguard Worker            with Stream.Error s ->
537*9880d681SAndroid Build Coastguard Worker              (* Skip token for error recovery. *)
538*9880d681SAndroid Build Coastguard Worker              Stream.junk stream;
539*9880d681SAndroid Build Coastguard Worker              print_endline s;
540*9880d681SAndroid Build Coastguard Worker          end;
541*9880d681SAndroid Build Coastguard Worker          print_string "ready> "; flush stdout;
542*9880d681SAndroid Build Coastguard Worker          main_loop stream
543*9880d681SAndroid Build Coastguard Worker
544*9880d681SAndroid Build Coastguard WorkerThe most interesting part of this is that we ignore top-level
545*9880d681SAndroid Build Coastguard Workersemicolons. Why is this, you ask? The basic reason is that if you type
546*9880d681SAndroid Build Coastguard Worker"4 + 5" at the command line, the parser doesn't know whether that is the
547*9880d681SAndroid Build Coastguard Workerend of what you will type or not. For example, on the next line you
548*9880d681SAndroid Build Coastguard Workercould type "def foo..." in which case 4+5 is the end of a top-level
549*9880d681SAndroid Build Coastguard Workerexpression. Alternatively you could type "\* 6", which would continue
550*9880d681SAndroid Build Coastguard Workerthe expression. Having top-level semicolons allows you to type "4+5;",
551*9880d681SAndroid Build Coastguard Workerand the parser will know you are done.
552*9880d681SAndroid Build Coastguard Worker
553*9880d681SAndroid Build Coastguard WorkerConclusions
554*9880d681SAndroid Build Coastguard Worker===========
555*9880d681SAndroid Build Coastguard Worker
556*9880d681SAndroid Build Coastguard WorkerWith just under 300 lines of commented code (240 lines of non-comment,
557*9880d681SAndroid Build Coastguard Workernon-blank code), we fully defined our minimal language, including a
558*9880d681SAndroid Build Coastguard Workerlexer, parser, and AST builder. With this done, the executable will
559*9880d681SAndroid Build Coastguard Workervalidate Kaleidoscope code and tell us if it is grammatically invalid.
560*9880d681SAndroid Build Coastguard WorkerFor example, here is a sample interaction:
561*9880d681SAndroid Build Coastguard Worker
562*9880d681SAndroid Build Coastguard Worker.. code-block:: bash
563*9880d681SAndroid Build Coastguard Worker
564*9880d681SAndroid Build Coastguard Worker    $ ./toy.byte
565*9880d681SAndroid Build Coastguard Worker    ready> def foo(x y) x+foo(y, 4.0);
566*9880d681SAndroid Build Coastguard Worker    Parsed a function definition.
567*9880d681SAndroid Build Coastguard Worker    ready> def foo(x y) x+y y;
568*9880d681SAndroid Build Coastguard Worker    Parsed a function definition.
569*9880d681SAndroid Build Coastguard Worker    Parsed a top-level expr
570*9880d681SAndroid Build Coastguard Worker    ready> def foo(x y) x+y );
571*9880d681SAndroid Build Coastguard Worker    Parsed a function definition.
572*9880d681SAndroid Build Coastguard Worker    Error: unknown token when expecting an expression
573*9880d681SAndroid Build Coastguard Worker    ready> extern sin(a);
574*9880d681SAndroid Build Coastguard Worker    ready> Parsed an extern
575*9880d681SAndroid Build Coastguard Worker    ready> ^D
576*9880d681SAndroid Build Coastguard Worker    $
577*9880d681SAndroid Build Coastguard Worker
578*9880d681SAndroid Build Coastguard WorkerThere is a lot of room for extension here. You can define new AST nodes,
579*9880d681SAndroid Build Coastguard Workerextend the language in many ways, etc. In the `next
580*9880d681SAndroid Build Coastguard Workerinstallment <OCamlLangImpl3.html>`_, we will describe how to generate
581*9880d681SAndroid Build Coastguard WorkerLLVM Intermediate Representation (IR) from the AST.
582*9880d681SAndroid Build Coastguard Worker
583*9880d681SAndroid Build Coastguard WorkerFull Code Listing
584*9880d681SAndroid Build Coastguard Worker=================
585*9880d681SAndroid Build Coastguard Worker
586*9880d681SAndroid Build Coastguard WorkerHere is the complete code listing for this and the previous chapter.
587*9880d681SAndroid Build Coastguard WorkerNote that it is fully self-contained: you don't need LLVM or any
588*9880d681SAndroid Build Coastguard Workerexternal libraries at all for this. (Besides the ocaml standard
589*9880d681SAndroid Build Coastguard Workerlibraries, of course.) To build this, just compile with:
590*9880d681SAndroid Build Coastguard Worker
591*9880d681SAndroid Build Coastguard Worker.. code-block:: bash
592*9880d681SAndroid Build Coastguard Worker
593*9880d681SAndroid Build Coastguard Worker    # Compile
594*9880d681SAndroid Build Coastguard Worker    ocamlbuild toy.byte
595*9880d681SAndroid Build Coastguard Worker    # Run
596*9880d681SAndroid Build Coastguard Worker    ./toy.byte
597*9880d681SAndroid Build Coastguard Worker
598*9880d681SAndroid Build Coastguard WorkerHere is the code:
599*9880d681SAndroid Build Coastguard Worker
600*9880d681SAndroid Build Coastguard Worker\_tags:
601*9880d681SAndroid Build Coastguard Worker    ::
602*9880d681SAndroid Build Coastguard Worker
603*9880d681SAndroid Build Coastguard Worker        <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
604*9880d681SAndroid Build Coastguard Worker
605*9880d681SAndroid Build Coastguard Workertoken.ml:
606*9880d681SAndroid Build Coastguard Worker    .. code-block:: ocaml
607*9880d681SAndroid Build Coastguard Worker
608*9880d681SAndroid Build Coastguard Worker        (*===----------------------------------------------------------------------===
609*9880d681SAndroid Build Coastguard Worker         * Lexer Tokens
610*9880d681SAndroid Build Coastguard Worker         *===----------------------------------------------------------------------===*)
611*9880d681SAndroid Build Coastguard Worker
612*9880d681SAndroid Build Coastguard Worker        (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
613*9880d681SAndroid Build Coastguard Worker         * these others for known things. *)
614*9880d681SAndroid Build Coastguard Worker        type token =
615*9880d681SAndroid Build Coastguard Worker          (* commands *)
616*9880d681SAndroid Build Coastguard Worker          | Def | Extern
617*9880d681SAndroid Build Coastguard Worker
618*9880d681SAndroid Build Coastguard Worker          (* primary *)
619*9880d681SAndroid Build Coastguard Worker          | Ident of string | Number of float
620*9880d681SAndroid Build Coastguard Worker
621*9880d681SAndroid Build Coastguard Worker          (* unknown *)
622*9880d681SAndroid Build Coastguard Worker          | Kwd of char
623*9880d681SAndroid Build Coastguard Worker
624*9880d681SAndroid Build Coastguard Workerlexer.ml:
625*9880d681SAndroid Build Coastguard Worker    .. code-block:: ocaml
626*9880d681SAndroid Build Coastguard Worker
627*9880d681SAndroid Build Coastguard Worker        (*===----------------------------------------------------------------------===
628*9880d681SAndroid Build Coastguard Worker         * Lexer
629*9880d681SAndroid Build Coastguard Worker         *===----------------------------------------------------------------------===*)
630*9880d681SAndroid Build Coastguard Worker
631*9880d681SAndroid Build Coastguard Worker        let rec lex = parser
632*9880d681SAndroid Build Coastguard Worker          (* Skip any whitespace. *)
633*9880d681SAndroid Build Coastguard Worker          | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
634*9880d681SAndroid Build Coastguard Worker
635*9880d681SAndroid Build Coastguard Worker          (* identifier: [a-zA-Z][a-zA-Z0-9] *)
636*9880d681SAndroid Build Coastguard Worker          | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
637*9880d681SAndroid Build Coastguard Worker              let buffer = Buffer.create 1 in
638*9880d681SAndroid Build Coastguard Worker              Buffer.add_char buffer c;
639*9880d681SAndroid Build Coastguard Worker              lex_ident buffer stream
640*9880d681SAndroid Build Coastguard Worker
641*9880d681SAndroid Build Coastguard Worker          (* number: [0-9.]+ *)
642*9880d681SAndroid Build Coastguard Worker          | [< ' ('0' .. '9' as c); stream >] ->
643*9880d681SAndroid Build Coastguard Worker              let buffer = Buffer.create 1 in
644*9880d681SAndroid Build Coastguard Worker              Buffer.add_char buffer c;
645*9880d681SAndroid Build Coastguard Worker              lex_number buffer stream
646*9880d681SAndroid Build Coastguard Worker
647*9880d681SAndroid Build Coastguard Worker          (* Comment until end of line. *)
648*9880d681SAndroid Build Coastguard Worker          | [< ' ('#'); stream >] ->
649*9880d681SAndroid Build Coastguard Worker              lex_comment stream
650*9880d681SAndroid Build Coastguard Worker
651*9880d681SAndroid Build Coastguard Worker          (* Otherwise, just return the character as its ascii value. *)
652*9880d681SAndroid Build Coastguard Worker          | [< 'c; stream >] ->
653*9880d681SAndroid Build Coastguard Worker              [< 'Token.Kwd c; lex stream >]
654*9880d681SAndroid Build Coastguard Worker
655*9880d681SAndroid Build Coastguard Worker          (* end of stream. *)
656*9880d681SAndroid Build Coastguard Worker          | [< >] -> [< >]
657*9880d681SAndroid Build Coastguard Worker
658*9880d681SAndroid Build Coastguard Worker        and lex_number buffer = parser
659*9880d681SAndroid Build Coastguard Worker          | [< ' ('0' .. '9' | '.' as c); stream >] ->
660*9880d681SAndroid Build Coastguard Worker              Buffer.add_char buffer c;
661*9880d681SAndroid Build Coastguard Worker              lex_number buffer stream
662*9880d681SAndroid Build Coastguard Worker          | [< stream=lex >] ->
663*9880d681SAndroid Build Coastguard Worker              [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
664*9880d681SAndroid Build Coastguard Worker
665*9880d681SAndroid Build Coastguard Worker        and lex_ident buffer = parser
666*9880d681SAndroid Build Coastguard Worker          | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
667*9880d681SAndroid Build Coastguard Worker              Buffer.add_char buffer c;
668*9880d681SAndroid Build Coastguard Worker              lex_ident buffer stream
669*9880d681SAndroid Build Coastguard Worker          | [< stream=lex >] ->
670*9880d681SAndroid Build Coastguard Worker              match Buffer.contents buffer with
671*9880d681SAndroid Build Coastguard Worker              | "def" -> [< 'Token.Def; stream >]
672*9880d681SAndroid Build Coastguard Worker              | "extern" -> [< 'Token.Extern; stream >]
673*9880d681SAndroid Build Coastguard Worker              | id -> [< 'Token.Ident id; stream >]
674*9880d681SAndroid Build Coastguard Worker
675*9880d681SAndroid Build Coastguard Worker        and lex_comment = parser
676*9880d681SAndroid Build Coastguard Worker          | [< ' ('\n'); stream=lex >] -> stream
677*9880d681SAndroid Build Coastguard Worker          | [< 'c; e=lex_comment >] -> e
678*9880d681SAndroid Build Coastguard Worker          | [< >] -> [< >]
679*9880d681SAndroid Build Coastguard Worker
680*9880d681SAndroid Build Coastguard Workerast.ml:
681*9880d681SAndroid Build Coastguard Worker    .. code-block:: ocaml
682*9880d681SAndroid Build Coastguard Worker
683*9880d681SAndroid Build Coastguard Worker        (*===----------------------------------------------------------------------===
684*9880d681SAndroid Build Coastguard Worker         * Abstract Syntax Tree (aka Parse Tree)
685*9880d681SAndroid Build Coastguard Worker         *===----------------------------------------------------------------------===*)
686*9880d681SAndroid Build Coastguard Worker
687*9880d681SAndroid Build Coastguard Worker        (* expr - Base type for all expression nodes. *)
688*9880d681SAndroid Build Coastguard Worker        type expr =
689*9880d681SAndroid Build Coastguard Worker          (* variant for numeric literals like "1.0". *)
690*9880d681SAndroid Build Coastguard Worker          | Number of float
691*9880d681SAndroid Build Coastguard Worker
692*9880d681SAndroid Build Coastguard Worker          (* variant for referencing a variable, like "a". *)
693*9880d681SAndroid Build Coastguard Worker          | Variable of string
694*9880d681SAndroid Build Coastguard Worker
695*9880d681SAndroid Build Coastguard Worker          (* variant for a binary operator. *)
696*9880d681SAndroid Build Coastguard Worker          | Binary of char * expr * expr
697*9880d681SAndroid Build Coastguard Worker
698*9880d681SAndroid Build Coastguard Worker          (* variant for function calls. *)
699*9880d681SAndroid Build Coastguard Worker          | Call of string * expr array
700*9880d681SAndroid Build Coastguard Worker
701*9880d681SAndroid Build Coastguard Worker        (* proto - This type represents the "prototype" for a function, which captures
702*9880d681SAndroid Build Coastguard Worker         * its name, and its argument names (thus implicitly the number of arguments the
703*9880d681SAndroid Build Coastguard Worker         * function takes). *)
704*9880d681SAndroid Build Coastguard Worker        type proto = Prototype of string * string array
705*9880d681SAndroid Build Coastguard Worker
706*9880d681SAndroid Build Coastguard Worker        (* func - This type represents a function definition itself. *)
707*9880d681SAndroid Build Coastguard Worker        type func = Function of proto * expr
708*9880d681SAndroid Build Coastguard Worker
709*9880d681SAndroid Build Coastguard Workerparser.ml:
710*9880d681SAndroid Build Coastguard Worker    .. code-block:: ocaml
711*9880d681SAndroid Build Coastguard Worker
712*9880d681SAndroid Build Coastguard Worker        (*===---------------------------------------------------------------------===
713*9880d681SAndroid Build Coastguard Worker         * Parser
714*9880d681SAndroid Build Coastguard Worker         *===---------------------------------------------------------------------===*)
715*9880d681SAndroid Build Coastguard Worker
716*9880d681SAndroid Build Coastguard Worker        (* binop_precedence - This holds the precedence for each binary operator that is
717*9880d681SAndroid Build Coastguard Worker         * defined *)
718*9880d681SAndroid Build Coastguard Worker        let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
719*9880d681SAndroid Build Coastguard Worker
720*9880d681SAndroid Build Coastguard Worker        (* precedence - Get the precedence of the pending binary operator token. *)
721*9880d681SAndroid Build Coastguard Worker        let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
722*9880d681SAndroid Build Coastguard Worker
723*9880d681SAndroid Build Coastguard Worker        (* primary
724*9880d681SAndroid Build Coastguard Worker         *   ::= identifier
725*9880d681SAndroid Build Coastguard Worker         *   ::= numberexpr
726*9880d681SAndroid Build Coastguard Worker         *   ::= parenexpr *)
727*9880d681SAndroid Build Coastguard Worker        let rec parse_primary = parser
728*9880d681SAndroid Build Coastguard Worker          (* numberexpr ::= number *)
729*9880d681SAndroid Build Coastguard Worker          | [< 'Token.Number n >] -> Ast.Number n
730*9880d681SAndroid Build Coastguard Worker
731*9880d681SAndroid Build Coastguard Worker          (* parenexpr ::= '(' expression ')' *)
732*9880d681SAndroid Build Coastguard Worker          | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
733*9880d681SAndroid Build Coastguard Worker
734*9880d681SAndroid Build Coastguard Worker          (* identifierexpr
735*9880d681SAndroid Build Coastguard Worker           *   ::= identifier
736*9880d681SAndroid Build Coastguard Worker           *   ::= identifier '(' argumentexpr ')' *)
737*9880d681SAndroid Build Coastguard Worker          | [< 'Token.Ident id; stream >] ->
738*9880d681SAndroid Build Coastguard Worker              let rec parse_args accumulator = parser
739*9880d681SAndroid Build Coastguard Worker                | [< e=parse_expr; stream >] ->
740*9880d681SAndroid Build Coastguard Worker                    begin parser
741*9880d681SAndroid Build Coastguard Worker                      | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
742*9880d681SAndroid Build Coastguard Worker                      | [< >] -> e :: accumulator
743*9880d681SAndroid Build Coastguard Worker                    end stream
744*9880d681SAndroid Build Coastguard Worker                | [< >] -> accumulator
745*9880d681SAndroid Build Coastguard Worker              in
746*9880d681SAndroid Build Coastguard Worker              let rec parse_ident id = parser
747*9880d681SAndroid Build Coastguard Worker                (* Call. *)
748*9880d681SAndroid Build Coastguard Worker                | [< 'Token.Kwd '(';
749*9880d681SAndroid Build Coastguard Worker                     args=parse_args [];
750*9880d681SAndroid Build Coastguard Worker                     'Token.Kwd ')' ?? "expected ')'">] ->
751*9880d681SAndroid Build Coastguard Worker                    Ast.Call (id, Array.of_list (List.rev args))
752*9880d681SAndroid Build Coastguard Worker
753*9880d681SAndroid Build Coastguard Worker                (* Simple variable ref. *)
754*9880d681SAndroid Build Coastguard Worker                | [< >] -> Ast.Variable id
755*9880d681SAndroid Build Coastguard Worker              in
756*9880d681SAndroid Build Coastguard Worker              parse_ident id stream
757*9880d681SAndroid Build Coastguard Worker
758*9880d681SAndroid Build Coastguard Worker          | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
759*9880d681SAndroid Build Coastguard Worker
760*9880d681SAndroid Build Coastguard Worker        (* binoprhs
761*9880d681SAndroid Build Coastguard Worker         *   ::= ('+' primary)* *)
762*9880d681SAndroid Build Coastguard Worker        and parse_bin_rhs expr_prec lhs stream =
763*9880d681SAndroid Build Coastguard Worker          match Stream.peek stream with
764*9880d681SAndroid Build Coastguard Worker          (* If this is a binop, find its precedence. *)
765*9880d681SAndroid Build Coastguard Worker          | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
766*9880d681SAndroid Build Coastguard Worker              let token_prec = precedence c in
767*9880d681SAndroid Build Coastguard Worker
768*9880d681SAndroid Build Coastguard Worker              (* If this is a binop that binds at least as tightly as the current binop,
769*9880d681SAndroid Build Coastguard Worker               * consume it, otherwise we are done. *)
770*9880d681SAndroid Build Coastguard Worker              if token_prec < expr_prec then lhs else begin
771*9880d681SAndroid Build Coastguard Worker                (* Eat the binop. *)
772*9880d681SAndroid Build Coastguard Worker                Stream.junk stream;
773*9880d681SAndroid Build Coastguard Worker
774*9880d681SAndroid Build Coastguard Worker                (* Parse the primary expression after the binary operator. *)
775*9880d681SAndroid Build Coastguard Worker                let rhs = parse_primary stream in
776*9880d681SAndroid Build Coastguard Worker
777*9880d681SAndroid Build Coastguard Worker                (* Okay, we know this is a binop. *)
778*9880d681SAndroid Build Coastguard Worker                let rhs =
779*9880d681SAndroid Build Coastguard Worker                  match Stream.peek stream with
780*9880d681SAndroid Build Coastguard Worker                  | Some (Token.Kwd c2) ->
781*9880d681SAndroid Build Coastguard Worker                      (* If BinOp binds less tightly with rhs than the operator after
782*9880d681SAndroid Build Coastguard Worker                       * rhs, let the pending operator take rhs as its lhs. *)
783*9880d681SAndroid Build Coastguard Worker                      let next_prec = precedence c2 in
784*9880d681SAndroid Build Coastguard Worker                      if token_prec < next_prec
785*9880d681SAndroid Build Coastguard Worker                      then parse_bin_rhs (token_prec + 1) rhs stream
786*9880d681SAndroid Build Coastguard Worker                      else rhs
787*9880d681SAndroid Build Coastguard Worker                  | _ -> rhs
788*9880d681SAndroid Build Coastguard Worker                in
789*9880d681SAndroid Build Coastguard Worker
790*9880d681SAndroid Build Coastguard Worker                (* Merge lhs/rhs. *)
791*9880d681SAndroid Build Coastguard Worker                let lhs = Ast.Binary (c, lhs, rhs) in
792*9880d681SAndroid Build Coastguard Worker                parse_bin_rhs expr_prec lhs stream
793*9880d681SAndroid Build Coastguard Worker              end
794*9880d681SAndroid Build Coastguard Worker          | _ -> lhs
795*9880d681SAndroid Build Coastguard Worker
796*9880d681SAndroid Build Coastguard Worker        (* expression
797*9880d681SAndroid Build Coastguard Worker         *   ::= primary binoprhs *)
798*9880d681SAndroid Build Coastguard Worker        and parse_expr = parser
799*9880d681SAndroid Build Coastguard Worker          | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
800*9880d681SAndroid Build Coastguard Worker
801*9880d681SAndroid Build Coastguard Worker        (* prototype
802*9880d681SAndroid Build Coastguard Worker         *   ::= id '(' id* ')' *)
803*9880d681SAndroid Build Coastguard Worker        let parse_prototype =
804*9880d681SAndroid Build Coastguard Worker          let rec parse_args accumulator = parser
805*9880d681SAndroid Build Coastguard Worker            | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
806*9880d681SAndroid Build Coastguard Worker            | [< >] -> accumulator
807*9880d681SAndroid Build Coastguard Worker          in
808*9880d681SAndroid Build Coastguard Worker
809*9880d681SAndroid Build Coastguard Worker          parser
810*9880d681SAndroid Build Coastguard Worker          | [< 'Token.Ident id;
811*9880d681SAndroid Build Coastguard Worker               'Token.Kwd '(' ?? "expected '(' in prototype";
812*9880d681SAndroid Build Coastguard Worker               args=parse_args [];
813*9880d681SAndroid Build Coastguard Worker               'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
814*9880d681SAndroid Build Coastguard Worker              (* success. *)
815*9880d681SAndroid Build Coastguard Worker              Ast.Prototype (id, Array.of_list (List.rev args))
816*9880d681SAndroid Build Coastguard Worker
817*9880d681SAndroid Build Coastguard Worker          | [< >] ->
818*9880d681SAndroid Build Coastguard Worker              raise (Stream.Error "expected function name in prototype")
819*9880d681SAndroid Build Coastguard Worker
820*9880d681SAndroid Build Coastguard Worker        (* definition ::= 'def' prototype expression *)
821*9880d681SAndroid Build Coastguard Worker        let parse_definition = parser
822*9880d681SAndroid Build Coastguard Worker          | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
823*9880d681SAndroid Build Coastguard Worker              Ast.Function (p, e)
824*9880d681SAndroid Build Coastguard Worker
825*9880d681SAndroid Build Coastguard Worker        (* toplevelexpr ::= expression *)
826*9880d681SAndroid Build Coastguard Worker        let parse_toplevel = parser
827*9880d681SAndroid Build Coastguard Worker          | [< e=parse_expr >] ->
828*9880d681SAndroid Build Coastguard Worker              (* Make an anonymous proto. *)
829*9880d681SAndroid Build Coastguard Worker              Ast.Function (Ast.Prototype ("", [||]), e)
830*9880d681SAndroid Build Coastguard Worker
831*9880d681SAndroid Build Coastguard Worker        (*  external ::= 'extern' prototype *)
832*9880d681SAndroid Build Coastguard Worker        let parse_extern = parser
833*9880d681SAndroid Build Coastguard Worker          | [< 'Token.Extern; e=parse_prototype >] -> e
834*9880d681SAndroid Build Coastguard Worker
835*9880d681SAndroid Build Coastguard Workertoplevel.ml:
836*9880d681SAndroid Build Coastguard Worker    .. code-block:: ocaml
837*9880d681SAndroid Build Coastguard Worker
838*9880d681SAndroid Build Coastguard Worker        (*===----------------------------------------------------------------------===
839*9880d681SAndroid Build Coastguard Worker         * Top-Level parsing and JIT Driver
840*9880d681SAndroid Build Coastguard Worker         *===----------------------------------------------------------------------===*)
841*9880d681SAndroid Build Coastguard Worker
842*9880d681SAndroid Build Coastguard Worker        (* top ::= definition | external | expression | ';' *)
843*9880d681SAndroid Build Coastguard Worker        let rec main_loop stream =
844*9880d681SAndroid Build Coastguard Worker          match Stream.peek stream with
845*9880d681SAndroid Build Coastguard Worker          | None -> ()
846*9880d681SAndroid Build Coastguard Worker
847*9880d681SAndroid Build Coastguard Worker          (* ignore top-level semicolons. *)
848*9880d681SAndroid Build Coastguard Worker          | Some (Token.Kwd ';') ->
849*9880d681SAndroid Build Coastguard Worker              Stream.junk stream;
850*9880d681SAndroid Build Coastguard Worker              main_loop stream
851*9880d681SAndroid Build Coastguard Worker
852*9880d681SAndroid Build Coastguard Worker          | Some token ->
853*9880d681SAndroid Build Coastguard Worker              begin
854*9880d681SAndroid Build Coastguard Worker                try match token with
855*9880d681SAndroid Build Coastguard Worker                | Token.Def ->
856*9880d681SAndroid Build Coastguard Worker                    ignore(Parser.parse_definition stream);
857*9880d681SAndroid Build Coastguard Worker                    print_endline "parsed a function definition.";
858*9880d681SAndroid Build Coastguard Worker                | Token.Extern ->
859*9880d681SAndroid Build Coastguard Worker                    ignore(Parser.parse_extern stream);
860*9880d681SAndroid Build Coastguard Worker                    print_endline "parsed an extern.";
861*9880d681SAndroid Build Coastguard Worker                | _ ->
862*9880d681SAndroid Build Coastguard Worker                    (* Evaluate a top-level expression into an anonymous function. *)
863*9880d681SAndroid Build Coastguard Worker                    ignore(Parser.parse_toplevel stream);
864*9880d681SAndroid Build Coastguard Worker                    print_endline "parsed a top-level expr";
865*9880d681SAndroid Build Coastguard Worker                with Stream.Error s ->
866*9880d681SAndroid Build Coastguard Worker                  (* Skip token for error recovery. *)
867*9880d681SAndroid Build Coastguard Worker                  Stream.junk stream;
868*9880d681SAndroid Build Coastguard Worker                  print_endline s;
869*9880d681SAndroid Build Coastguard Worker              end;
870*9880d681SAndroid Build Coastguard Worker              print_string "ready> "; flush stdout;
871*9880d681SAndroid Build Coastguard Worker              main_loop stream
872*9880d681SAndroid Build Coastguard Worker
873*9880d681SAndroid Build Coastguard Workertoy.ml:
874*9880d681SAndroid Build Coastguard Worker    .. code-block:: ocaml
875*9880d681SAndroid Build Coastguard Worker
876*9880d681SAndroid Build Coastguard Worker        (*===----------------------------------------------------------------------===
877*9880d681SAndroid Build Coastguard Worker         * Main driver code.
878*9880d681SAndroid Build Coastguard Worker         *===----------------------------------------------------------------------===*)
879*9880d681SAndroid Build Coastguard Worker
880*9880d681SAndroid Build Coastguard Worker        let main () =
881*9880d681SAndroid Build Coastguard Worker          (* Install standard binary operators.
882*9880d681SAndroid Build Coastguard Worker           * 1 is the lowest precedence. *)
883*9880d681SAndroid Build Coastguard Worker          Hashtbl.add Parser.binop_precedence '<' 10;
884*9880d681SAndroid Build Coastguard Worker          Hashtbl.add Parser.binop_precedence '+' 20;
885*9880d681SAndroid Build Coastguard Worker          Hashtbl.add Parser.binop_precedence '-' 20;
886*9880d681SAndroid Build Coastguard Worker          Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
887*9880d681SAndroid Build Coastguard Worker
888*9880d681SAndroid Build Coastguard Worker          (* Prime the first token. *)
889*9880d681SAndroid Build Coastguard Worker          print_string "ready> "; flush stdout;
890*9880d681SAndroid Build Coastguard Worker          let stream = Lexer.lex (Stream.of_channel stdin) in
891*9880d681SAndroid Build Coastguard Worker
892*9880d681SAndroid Build Coastguard Worker          (* Run the main "interpreter loop" now. *)
893*9880d681SAndroid Build Coastguard Worker          Toplevel.main_loop stream;
894*9880d681SAndroid Build Coastguard Worker        ;;
895*9880d681SAndroid Build Coastguard Worker
896*9880d681SAndroid Build Coastguard Worker        main ()
897*9880d681SAndroid Build Coastguard Worker
898*9880d681SAndroid Build Coastguard Worker`Next: Implementing Code Generation to LLVM IR <OCamlLangImpl3.html>`_
899*9880d681SAndroid Build Coastguard Worker
900