xref: /aosp_15_r20/external/llvm/docs/tutorial/LangImpl07.rst (revision 9880d6810fe72a1726cb53787c6711e909410d58)
1*9880d681SAndroid Build Coastguard Worker=======================================================
2*9880d681SAndroid Build Coastguard WorkerKaleidoscope: Extending the Language: Mutable Variables
3*9880d681SAndroid Build Coastguard Worker=======================================================
4*9880d681SAndroid Build Coastguard Worker
5*9880d681SAndroid Build Coastguard Worker.. contents::
6*9880d681SAndroid Build Coastguard Worker   :local:
7*9880d681SAndroid Build Coastguard Worker
8*9880d681SAndroid Build Coastguard WorkerChapter 7 Introduction
9*9880d681SAndroid Build Coastguard Worker======================
10*9880d681SAndroid Build Coastguard Worker
11*9880d681SAndroid Build Coastguard WorkerWelcome to Chapter 7 of the "`Implementing a language with
12*9880d681SAndroid Build Coastguard WorkerLLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
13*9880d681SAndroid Build Coastguard Workervery respectable, albeit simple, `functional programming
14*9880d681SAndroid Build Coastguard Workerlanguage <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
15*9880d681SAndroid Build Coastguard Workerjourney, we learned some parsing techniques, how to build and represent
16*9880d681SAndroid Build Coastguard Workeran AST, how to build LLVM IR, and how to optimize the resultant code as
17*9880d681SAndroid Build Coastguard Workerwell as JIT compile it.
18*9880d681SAndroid Build Coastguard Worker
19*9880d681SAndroid Build Coastguard WorkerWhile Kaleidoscope is interesting as a functional language, the fact
20*9880d681SAndroid Build Coastguard Workerthat it is functional makes it "too easy" to generate LLVM IR for it. In
21*9880d681SAndroid Build Coastguard Workerparticular, a functional language makes it very easy to build LLVM IR
22*9880d681SAndroid Build Coastguard Workerdirectly in `SSA
23*9880d681SAndroid Build Coastguard Workerform <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
24*9880d681SAndroid Build Coastguard WorkerSince LLVM requires that the input code be in SSA form, this is a very
25*9880d681SAndroid Build Coastguard Workernice property and it is often unclear to newcomers how to generate code
26*9880d681SAndroid Build Coastguard Workerfor an imperative language with mutable variables.
27*9880d681SAndroid Build Coastguard Worker
28*9880d681SAndroid Build Coastguard WorkerThe short (and happy) summary of this chapter is that there is no need
29*9880d681SAndroid Build Coastguard Workerfor your front-end to build SSA form: LLVM provides highly tuned and
30*9880d681SAndroid Build Coastguard Workerwell tested support for this, though the way it works is a bit
31*9880d681SAndroid Build Coastguard Workerunexpected for some.
32*9880d681SAndroid Build Coastguard Worker
33*9880d681SAndroid Build Coastguard WorkerWhy is this a hard problem?
34*9880d681SAndroid Build Coastguard Worker===========================
35*9880d681SAndroid Build Coastguard Worker
36*9880d681SAndroid Build Coastguard WorkerTo understand why mutable variables cause complexities in SSA
37*9880d681SAndroid Build Coastguard Workerconstruction, consider this extremely simple C example:
38*9880d681SAndroid Build Coastguard Worker
39*9880d681SAndroid Build Coastguard Worker.. code-block:: c
40*9880d681SAndroid Build Coastguard Worker
41*9880d681SAndroid Build Coastguard Worker    int G, H;
42*9880d681SAndroid Build Coastguard Worker    int test(_Bool Condition) {
43*9880d681SAndroid Build Coastguard Worker      int X;
44*9880d681SAndroid Build Coastguard Worker      if (Condition)
45*9880d681SAndroid Build Coastguard Worker        X = G;
46*9880d681SAndroid Build Coastguard Worker      else
47*9880d681SAndroid Build Coastguard Worker        X = H;
48*9880d681SAndroid Build Coastguard Worker      return X;
49*9880d681SAndroid Build Coastguard Worker    }
50*9880d681SAndroid Build Coastguard Worker
51*9880d681SAndroid Build Coastguard WorkerIn this case, we have the variable "X", whose value depends on the path
52*9880d681SAndroid Build Coastguard Workerexecuted in the program. Because there are two different possible values
53*9880d681SAndroid Build Coastguard Workerfor X before the return instruction, a PHI node is inserted to merge the
54*9880d681SAndroid Build Coastguard Workertwo values. The LLVM IR that we want for this example looks like this:
55*9880d681SAndroid Build Coastguard Worker
56*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm
57*9880d681SAndroid Build Coastguard Worker
58*9880d681SAndroid Build Coastguard Worker    @G = weak global i32 0   ; type of @G is i32*
59*9880d681SAndroid Build Coastguard Worker    @H = weak global i32 0   ; type of @H is i32*
60*9880d681SAndroid Build Coastguard Worker
61*9880d681SAndroid Build Coastguard Worker    define i32 @test(i1 %Condition) {
62*9880d681SAndroid Build Coastguard Worker    entry:
63*9880d681SAndroid Build Coastguard Worker      br i1 %Condition, label %cond_true, label %cond_false
64*9880d681SAndroid Build Coastguard Worker
65*9880d681SAndroid Build Coastguard Worker    cond_true:
66*9880d681SAndroid Build Coastguard Worker      %X.0 = load i32* @G
67*9880d681SAndroid Build Coastguard Worker      br label %cond_next
68*9880d681SAndroid Build Coastguard Worker
69*9880d681SAndroid Build Coastguard Worker    cond_false:
70*9880d681SAndroid Build Coastguard Worker      %X.1 = load i32* @H
71*9880d681SAndroid Build Coastguard Worker      br label %cond_next
72*9880d681SAndroid Build Coastguard Worker
73*9880d681SAndroid Build Coastguard Worker    cond_next:
74*9880d681SAndroid Build Coastguard Worker      %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
75*9880d681SAndroid Build Coastguard Worker      ret i32 %X.2
76*9880d681SAndroid Build Coastguard Worker    }
77*9880d681SAndroid Build Coastguard Worker
78*9880d681SAndroid Build Coastguard WorkerIn this example, the loads from the G and H global variables are
79*9880d681SAndroid Build Coastguard Workerexplicit in the LLVM IR, and they live in the then/else branches of the
80*9880d681SAndroid Build Coastguard Workerif statement (cond\_true/cond\_false). In order to merge the incoming
81*9880d681SAndroid Build Coastguard Workervalues, the X.2 phi node in the cond\_next block selects the right value
82*9880d681SAndroid Build Coastguard Workerto use based on where control flow is coming from: if control flow comes
83*9880d681SAndroid Build Coastguard Workerfrom the cond\_false block, X.2 gets the value of X.1. Alternatively, if
84*9880d681SAndroid Build Coastguard Workercontrol flow comes from cond\_true, it gets the value of X.0. The intent
85*9880d681SAndroid Build Coastguard Workerof this chapter is not to explain the details of SSA form. For more
86*9880d681SAndroid Build Coastguard Workerinformation, see one of the many `online
87*9880d681SAndroid Build Coastguard Workerreferences <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
88*9880d681SAndroid Build Coastguard Worker
89*9880d681SAndroid Build Coastguard WorkerThe question for this article is "who places the phi nodes when lowering
90*9880d681SAndroid Build Coastguard Workerassignments to mutable variables?". The issue here is that LLVM
91*9880d681SAndroid Build Coastguard Worker*requires* that its IR be in SSA form: there is no "non-ssa" mode for
92*9880d681SAndroid Build Coastguard Workerit. However, SSA construction requires non-trivial algorithms and data
93*9880d681SAndroid Build Coastguard Workerstructures, so it is inconvenient and wasteful for every front-end to
94*9880d681SAndroid Build Coastguard Workerhave to reproduce this logic.
95*9880d681SAndroid Build Coastguard Worker
96*9880d681SAndroid Build Coastguard WorkerMemory in LLVM
97*9880d681SAndroid Build Coastguard Worker==============
98*9880d681SAndroid Build Coastguard Worker
99*9880d681SAndroid Build Coastguard WorkerThe 'trick' here is that while LLVM does require all register values to
100*9880d681SAndroid Build Coastguard Workerbe in SSA form, it does not require (or permit) memory objects to be in
101*9880d681SAndroid Build Coastguard WorkerSSA form. In the example above, note that the loads from G and H are
102*9880d681SAndroid Build Coastguard Workerdirect accesses to G and H: they are not renamed or versioned. This
103*9880d681SAndroid Build Coastguard Workerdiffers from some other compiler systems, which do try to version memory
104*9880d681SAndroid Build Coastguard Workerobjects. In LLVM, instead of encoding dataflow analysis of memory into
105*9880d681SAndroid Build Coastguard Workerthe LLVM IR, it is handled with `Analysis
106*9880d681SAndroid Build Coastguard WorkerPasses <../WritingAnLLVMPass.html>`_ which are computed on demand.
107*9880d681SAndroid Build Coastguard Worker
108*9880d681SAndroid Build Coastguard WorkerWith this in mind, the high-level idea is that we want to make a stack
109*9880d681SAndroid Build Coastguard Workervariable (which lives in memory, because it is on the stack) for each
110*9880d681SAndroid Build Coastguard Workermutable object in a function. To take advantage of this trick, we need
111*9880d681SAndroid Build Coastguard Workerto talk about how LLVM represents stack variables.
112*9880d681SAndroid Build Coastguard Worker
113*9880d681SAndroid Build Coastguard WorkerIn LLVM, all memory accesses are explicit with load/store instructions,
114*9880d681SAndroid Build Coastguard Workerand it is carefully designed not to have (or need) an "address-of"
115*9880d681SAndroid Build Coastguard Workeroperator. Notice how the type of the @G/@H global variables is actually
116*9880d681SAndroid Build Coastguard Worker"i32\*" even though the variable is defined as "i32". What this means is
117*9880d681SAndroid Build Coastguard Workerthat @G defines *space* for an i32 in the global data area, but its
118*9880d681SAndroid Build Coastguard Worker*name* actually refers to the address for that space. Stack variables
119*9880d681SAndroid Build Coastguard Workerwork the same way, except that instead of being declared with global
120*9880d681SAndroid Build Coastguard Workervariable definitions, they are declared with the `LLVM alloca
121*9880d681SAndroid Build Coastguard Workerinstruction <../LangRef.html#alloca-instruction>`_:
122*9880d681SAndroid Build Coastguard Worker
123*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm
124*9880d681SAndroid Build Coastguard Worker
125*9880d681SAndroid Build Coastguard Worker    define i32 @example() {
126*9880d681SAndroid Build Coastguard Worker    entry:
127*9880d681SAndroid Build Coastguard Worker      %X = alloca i32           ; type of %X is i32*.
128*9880d681SAndroid Build Coastguard Worker      ...
129*9880d681SAndroid Build Coastguard Worker      %tmp = load i32* %X       ; load the stack value %X from the stack.
130*9880d681SAndroid Build Coastguard Worker      %tmp2 = add i32 %tmp, 1   ; increment it
131*9880d681SAndroid Build Coastguard Worker      store i32 %tmp2, i32* %X  ; store it back
132*9880d681SAndroid Build Coastguard Worker      ...
133*9880d681SAndroid Build Coastguard Worker
134*9880d681SAndroid Build Coastguard WorkerThis code shows an example of how you can declare and manipulate a stack
135*9880d681SAndroid Build Coastguard Workervariable in the LLVM IR. Stack memory allocated with the alloca
136*9880d681SAndroid Build Coastguard Workerinstruction is fully general: you can pass the address of the stack slot
137*9880d681SAndroid Build Coastguard Workerto functions, you can store it in other variables, etc. In our example
138*9880d681SAndroid Build Coastguard Workerabove, we could rewrite the example to use the alloca technique to avoid
139*9880d681SAndroid Build Coastguard Workerusing a PHI node:
140*9880d681SAndroid Build Coastguard Worker
141*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm
142*9880d681SAndroid Build Coastguard Worker
143*9880d681SAndroid Build Coastguard Worker    @G = weak global i32 0   ; type of @G is i32*
144*9880d681SAndroid Build Coastguard Worker    @H = weak global i32 0   ; type of @H is i32*
145*9880d681SAndroid Build Coastguard Worker
146*9880d681SAndroid Build Coastguard Worker    define i32 @test(i1 %Condition) {
147*9880d681SAndroid Build Coastguard Worker    entry:
148*9880d681SAndroid Build Coastguard Worker      %X = alloca i32           ; type of %X is i32*.
149*9880d681SAndroid Build Coastguard Worker      br i1 %Condition, label %cond_true, label %cond_false
150*9880d681SAndroid Build Coastguard Worker
151*9880d681SAndroid Build Coastguard Worker    cond_true:
152*9880d681SAndroid Build Coastguard Worker      %X.0 = load i32* @G
153*9880d681SAndroid Build Coastguard Worker      store i32 %X.0, i32* %X   ; Update X
154*9880d681SAndroid Build Coastguard Worker      br label %cond_next
155*9880d681SAndroid Build Coastguard Worker
156*9880d681SAndroid Build Coastguard Worker    cond_false:
157*9880d681SAndroid Build Coastguard Worker      %X.1 = load i32* @H
158*9880d681SAndroid Build Coastguard Worker      store i32 %X.1, i32* %X   ; Update X
159*9880d681SAndroid Build Coastguard Worker      br label %cond_next
160*9880d681SAndroid Build Coastguard Worker
161*9880d681SAndroid Build Coastguard Worker    cond_next:
162*9880d681SAndroid Build Coastguard Worker      %X.2 = load i32* %X       ; Read X
163*9880d681SAndroid Build Coastguard Worker      ret i32 %X.2
164*9880d681SAndroid Build Coastguard Worker    }
165*9880d681SAndroid Build Coastguard Worker
166*9880d681SAndroid Build Coastguard WorkerWith this, we have discovered a way to handle arbitrary mutable
167*9880d681SAndroid Build Coastguard Workervariables without the need to create Phi nodes at all:
168*9880d681SAndroid Build Coastguard Worker
169*9880d681SAndroid Build Coastguard Worker#. Each mutable variable becomes a stack allocation.
170*9880d681SAndroid Build Coastguard Worker#. Each read of the variable becomes a load from the stack.
171*9880d681SAndroid Build Coastguard Worker#. Each update of the variable becomes a store to the stack.
172*9880d681SAndroid Build Coastguard Worker#. Taking the address of a variable just uses the stack address
173*9880d681SAndroid Build Coastguard Worker   directly.
174*9880d681SAndroid Build Coastguard Worker
175*9880d681SAndroid Build Coastguard WorkerWhile this solution has solved our immediate problem, it introduced
176*9880d681SAndroid Build Coastguard Workeranother one: we have now apparently introduced a lot of stack traffic
177*9880d681SAndroid Build Coastguard Workerfor very simple and common operations, a major performance problem.
178*9880d681SAndroid Build Coastguard WorkerFortunately for us, the LLVM optimizer has a highly-tuned optimization
179*9880d681SAndroid Build Coastguard Workerpass named "mem2reg" that handles this case, promoting allocas like this
180*9880d681SAndroid Build Coastguard Workerinto SSA registers, inserting Phi nodes as appropriate. If you run this
181*9880d681SAndroid Build Coastguard Workerexample through the pass, for example, you'll get:
182*9880d681SAndroid Build Coastguard Worker
183*9880d681SAndroid Build Coastguard Worker.. code-block:: bash
184*9880d681SAndroid Build Coastguard Worker
185*9880d681SAndroid Build Coastguard Worker    $ llvm-as < example.ll | opt -mem2reg | llvm-dis
186*9880d681SAndroid Build Coastguard Worker    @G = weak global i32 0
187*9880d681SAndroid Build Coastguard Worker    @H = weak global i32 0
188*9880d681SAndroid Build Coastguard Worker
189*9880d681SAndroid Build Coastguard Worker    define i32 @test(i1 %Condition) {
190*9880d681SAndroid Build Coastguard Worker    entry:
191*9880d681SAndroid Build Coastguard Worker      br i1 %Condition, label %cond_true, label %cond_false
192*9880d681SAndroid Build Coastguard Worker
193*9880d681SAndroid Build Coastguard Worker    cond_true:
194*9880d681SAndroid Build Coastguard Worker      %X.0 = load i32* @G
195*9880d681SAndroid Build Coastguard Worker      br label %cond_next
196*9880d681SAndroid Build Coastguard Worker
197*9880d681SAndroid Build Coastguard Worker    cond_false:
198*9880d681SAndroid Build Coastguard Worker      %X.1 = load i32* @H
199*9880d681SAndroid Build Coastguard Worker      br label %cond_next
200*9880d681SAndroid Build Coastguard Worker
201*9880d681SAndroid Build Coastguard Worker    cond_next:
202*9880d681SAndroid Build Coastguard Worker      %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
203*9880d681SAndroid Build Coastguard Worker      ret i32 %X.01
204*9880d681SAndroid Build Coastguard Worker    }
205*9880d681SAndroid Build Coastguard Worker
206*9880d681SAndroid Build Coastguard WorkerThe mem2reg pass implements the standard "iterated dominance frontier"
207*9880d681SAndroid Build Coastguard Workeralgorithm for constructing SSA form and has a number of optimizations
208*9880d681SAndroid Build Coastguard Workerthat speed up (very common) degenerate cases. The mem2reg optimization
209*9880d681SAndroid Build Coastguard Workerpass is the answer to dealing with mutable variables, and we highly
210*9880d681SAndroid Build Coastguard Workerrecommend that you depend on it. Note that mem2reg only works on
211*9880d681SAndroid Build Coastguard Workervariables in certain circumstances:
212*9880d681SAndroid Build Coastguard Worker
213*9880d681SAndroid Build Coastguard Worker#. mem2reg is alloca-driven: it looks for allocas and if it can handle
214*9880d681SAndroid Build Coastguard Worker   them, it promotes them. It does not apply to global variables or heap
215*9880d681SAndroid Build Coastguard Worker   allocations.
216*9880d681SAndroid Build Coastguard Worker#. mem2reg only looks for alloca instructions in the entry block of the
217*9880d681SAndroid Build Coastguard Worker   function. Being in the entry block guarantees that the alloca is only
218*9880d681SAndroid Build Coastguard Worker   executed once, which makes analysis simpler.
219*9880d681SAndroid Build Coastguard Worker#. mem2reg only promotes allocas whose uses are direct loads and stores.
220*9880d681SAndroid Build Coastguard Worker   If the address of the stack object is passed to a function, or if any
221*9880d681SAndroid Build Coastguard Worker   funny pointer arithmetic is involved, the alloca will not be
222*9880d681SAndroid Build Coastguard Worker   promoted.
223*9880d681SAndroid Build Coastguard Worker#. mem2reg only works on allocas of `first
224*9880d681SAndroid Build Coastguard Worker   class <../LangRef.html#first-class-types>`_ values (such as pointers,
225*9880d681SAndroid Build Coastguard Worker   scalars and vectors), and only if the array size of the allocation is
226*9880d681SAndroid Build Coastguard Worker   1 (or missing in the .ll file). mem2reg is not capable of promoting
227*9880d681SAndroid Build Coastguard Worker   structs or arrays to registers. Note that the "sroa" pass is
228*9880d681SAndroid Build Coastguard Worker   more powerful and can promote structs, "unions", and arrays in many
229*9880d681SAndroid Build Coastguard Worker   cases.
230*9880d681SAndroid Build Coastguard Worker
231*9880d681SAndroid Build Coastguard WorkerAll of these properties are easy to satisfy for most imperative
232*9880d681SAndroid Build Coastguard Workerlanguages, and we'll illustrate it below with Kaleidoscope. The final
233*9880d681SAndroid Build Coastguard Workerquestion you may be asking is: should I bother with this nonsense for my
234*9880d681SAndroid Build Coastguard Workerfront-end? Wouldn't it be better if I just did SSA construction
235*9880d681SAndroid Build Coastguard Workerdirectly, avoiding use of the mem2reg optimization pass? In short, we
236*9880d681SAndroid Build Coastguard Workerstrongly recommend that you use this technique for building SSA form,
237*9880d681SAndroid Build Coastguard Workerunless there is an extremely good reason not to. Using this technique
238*9880d681SAndroid Build Coastguard Workeris:
239*9880d681SAndroid Build Coastguard Worker
240*9880d681SAndroid Build Coastguard Worker-  Proven and well tested: clang uses this technique
241*9880d681SAndroid Build Coastguard Worker   for local mutable variables. As such, the most common clients of LLVM
242*9880d681SAndroid Build Coastguard Worker   are using this to handle a bulk of their variables. You can be sure
243*9880d681SAndroid Build Coastguard Worker   that bugs are found fast and fixed early.
244*9880d681SAndroid Build Coastguard Worker-  Extremely Fast: mem2reg has a number of special cases that make it
245*9880d681SAndroid Build Coastguard Worker   fast in common cases as well as fully general. For example, it has
246*9880d681SAndroid Build Coastguard Worker   fast-paths for variables that are only used in a single block,
247*9880d681SAndroid Build Coastguard Worker   variables that only have one assignment point, good heuristics to
248*9880d681SAndroid Build Coastguard Worker   avoid insertion of unneeded phi nodes, etc.
249*9880d681SAndroid Build Coastguard Worker-  Needed for debug info generation: `Debug information in
250*9880d681SAndroid Build Coastguard Worker   LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
251*9880d681SAndroid Build Coastguard Worker   the variable exposed so that debug info can be attached to it. This
252*9880d681SAndroid Build Coastguard Worker   technique dovetails very naturally with this style of debug info.
253*9880d681SAndroid Build Coastguard Worker
254*9880d681SAndroid Build Coastguard WorkerIf nothing else, this makes it much easier to get your front-end up and
255*9880d681SAndroid Build Coastguard Workerrunning, and is very simple to implement. Let's extend Kaleidoscope with
256*9880d681SAndroid Build Coastguard Workermutable variables now!
257*9880d681SAndroid Build Coastguard Worker
258*9880d681SAndroid Build Coastguard WorkerMutable Variables in Kaleidoscope
259*9880d681SAndroid Build Coastguard Worker=================================
260*9880d681SAndroid Build Coastguard Worker
261*9880d681SAndroid Build Coastguard WorkerNow that we know the sort of problem we want to tackle, let's see what
262*9880d681SAndroid Build Coastguard Workerthis looks like in the context of our little Kaleidoscope language.
263*9880d681SAndroid Build Coastguard WorkerWe're going to add two features:
264*9880d681SAndroid Build Coastguard Worker
265*9880d681SAndroid Build Coastguard Worker#. The ability to mutate variables with the '=' operator.
266*9880d681SAndroid Build Coastguard Worker#. The ability to define new variables.
267*9880d681SAndroid Build Coastguard Worker
268*9880d681SAndroid Build Coastguard WorkerWhile the first item is really what this is about, we only have
269*9880d681SAndroid Build Coastguard Workervariables for incoming arguments as well as for induction variables, and
270*9880d681SAndroid Build Coastguard Workerredefining those only goes so far :). Also, the ability to define new
271*9880d681SAndroid Build Coastguard Workervariables is a useful thing regardless of whether you will be mutating
272*9880d681SAndroid Build Coastguard Workerthem. Here's a motivating example that shows how we could use these:
273*9880d681SAndroid Build Coastguard Worker
274*9880d681SAndroid Build Coastguard Worker::
275*9880d681SAndroid Build Coastguard Worker
276*9880d681SAndroid Build Coastguard Worker    # Define ':' for sequencing: as a low-precedence operator that ignores operands
277*9880d681SAndroid Build Coastguard Worker    # and just returns the RHS.
278*9880d681SAndroid Build Coastguard Worker    def binary : 1 (x y) y;
279*9880d681SAndroid Build Coastguard Worker
280*9880d681SAndroid Build Coastguard Worker    # Recursive fib, we could do this before.
281*9880d681SAndroid Build Coastguard Worker    def fib(x)
282*9880d681SAndroid Build Coastguard Worker      if (x < 3) then
283*9880d681SAndroid Build Coastguard Worker        1
284*9880d681SAndroid Build Coastguard Worker      else
285*9880d681SAndroid Build Coastguard Worker        fib(x-1)+fib(x-2);
286*9880d681SAndroid Build Coastguard Worker
287*9880d681SAndroid Build Coastguard Worker    # Iterative fib.
288*9880d681SAndroid Build Coastguard Worker    def fibi(x)
289*9880d681SAndroid Build Coastguard Worker      var a = 1, b = 1, c in
290*9880d681SAndroid Build Coastguard Worker      (for i = 3, i < x in
291*9880d681SAndroid Build Coastguard Worker         c = a + b :
292*9880d681SAndroid Build Coastguard Worker         a = b :
293*9880d681SAndroid Build Coastguard Worker         b = c) :
294*9880d681SAndroid Build Coastguard Worker      b;
295*9880d681SAndroid Build Coastguard Worker
296*9880d681SAndroid Build Coastguard Worker    # Call it.
297*9880d681SAndroid Build Coastguard Worker    fibi(10);
298*9880d681SAndroid Build Coastguard Worker
299*9880d681SAndroid Build Coastguard WorkerIn order to mutate variables, we have to change our existing variables
300*9880d681SAndroid Build Coastguard Workerto use the "alloca trick". Once we have that, we'll add our new
301*9880d681SAndroid Build Coastguard Workeroperator, then extend Kaleidoscope to support new variable definitions.
302*9880d681SAndroid Build Coastguard Worker
303*9880d681SAndroid Build Coastguard WorkerAdjusting Existing Variables for Mutation
304*9880d681SAndroid Build Coastguard Worker=========================================
305*9880d681SAndroid Build Coastguard Worker
306*9880d681SAndroid Build Coastguard WorkerThe symbol table in Kaleidoscope is managed at code generation time by
307*9880d681SAndroid Build Coastguard Workerthe '``NamedValues``' map. This map currently keeps track of the LLVM
308*9880d681SAndroid Build Coastguard Worker"Value\*" that holds the double value for the named variable. In order
309*9880d681SAndroid Build Coastguard Workerto support mutation, we need to change this slightly, so that
310*9880d681SAndroid Build Coastguard Worker``NamedValues`` holds the *memory location* of the variable in question.
311*9880d681SAndroid Build Coastguard WorkerNote that this change is a refactoring: it changes the structure of the
312*9880d681SAndroid Build Coastguard Workercode, but does not (by itself) change the behavior of the compiler. All
313*9880d681SAndroid Build Coastguard Workerof these changes are isolated in the Kaleidoscope code generator.
314*9880d681SAndroid Build Coastguard Worker
315*9880d681SAndroid Build Coastguard WorkerAt this point in Kaleidoscope's development, it only supports variables
316*9880d681SAndroid Build Coastguard Workerfor two things: incoming arguments to functions and the induction
317*9880d681SAndroid Build Coastguard Workervariable of 'for' loops. For consistency, we'll allow mutation of these
318*9880d681SAndroid Build Coastguard Workervariables in addition to other user-defined variables. This means that
319*9880d681SAndroid Build Coastguard Workerthese will both need memory locations.
320*9880d681SAndroid Build Coastguard Worker
321*9880d681SAndroid Build Coastguard WorkerTo start our transformation of Kaleidoscope, we'll change the
322*9880d681SAndroid Build Coastguard WorkerNamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
323*9880d681SAndroid Build Coastguard Workerwe do this, the C++ compiler will tell us what parts of the code we need
324*9880d681SAndroid Build Coastguard Workerto update:
325*9880d681SAndroid Build Coastguard Worker
326*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
327*9880d681SAndroid Build Coastguard Worker
328*9880d681SAndroid Build Coastguard Worker    static std::map<std::string, AllocaInst*> NamedValues;
329*9880d681SAndroid Build Coastguard Worker
330*9880d681SAndroid Build Coastguard WorkerAlso, since we will need to create these alloca's, we'll use a helper
331*9880d681SAndroid Build Coastguard Workerfunction that ensures that the allocas are created in the entry block of
332*9880d681SAndroid Build Coastguard Workerthe function:
333*9880d681SAndroid Build Coastguard Worker
334*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
335*9880d681SAndroid Build Coastguard Worker
336*9880d681SAndroid Build Coastguard Worker    /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
337*9880d681SAndroid Build Coastguard Worker    /// the function.  This is used for mutable variables etc.
338*9880d681SAndroid Build Coastguard Worker    static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
339*9880d681SAndroid Build Coastguard Worker                                              const std::string &VarName) {
340*9880d681SAndroid Build Coastguard Worker      IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
341*9880d681SAndroid Build Coastguard Worker                     TheFunction->getEntryBlock().begin());
342*9880d681SAndroid Build Coastguard Worker      return TmpB.CreateAlloca(Type::getDoubleTy(LLVMContext), 0,
343*9880d681SAndroid Build Coastguard Worker                               VarName.c_str());
344*9880d681SAndroid Build Coastguard Worker    }
345*9880d681SAndroid Build Coastguard Worker
346*9880d681SAndroid Build Coastguard WorkerThis funny looking code creates an IRBuilder object that is pointing at
347*9880d681SAndroid Build Coastguard Workerthe first instruction (.begin()) of the entry block. It then creates an
348*9880d681SAndroid Build Coastguard Workeralloca with the expected name and returns it. Because all values in
349*9880d681SAndroid Build Coastguard WorkerKaleidoscope are doubles, there is no need to pass in a type to use.
350*9880d681SAndroid Build Coastguard Worker
351*9880d681SAndroid Build Coastguard WorkerWith this in place, the first functionality change we want to make is to
352*9880d681SAndroid Build Coastguard Workervariable references. In our new scheme, variables live on the stack, so
353*9880d681SAndroid Build Coastguard Workercode generating a reference to them actually needs to produce a load
354*9880d681SAndroid Build Coastguard Workerfrom the stack slot:
355*9880d681SAndroid Build Coastguard Worker
356*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
357*9880d681SAndroid Build Coastguard Worker
358*9880d681SAndroid Build Coastguard Worker    Value *VariableExprAST::codegen() {
359*9880d681SAndroid Build Coastguard Worker      // Look this variable up in the function.
360*9880d681SAndroid Build Coastguard Worker      Value *V = NamedValues[Name];
361*9880d681SAndroid Build Coastguard Worker      if (!V)
362*9880d681SAndroid Build Coastguard Worker        return LogErrorV("Unknown variable name");
363*9880d681SAndroid Build Coastguard Worker
364*9880d681SAndroid Build Coastguard Worker      // Load the value.
365*9880d681SAndroid Build Coastguard Worker      return Builder.CreateLoad(V, Name.c_str());
366*9880d681SAndroid Build Coastguard Worker    }
367*9880d681SAndroid Build Coastguard Worker
368*9880d681SAndroid Build Coastguard WorkerAs you can see, this is pretty straightforward. Now we need to update
369*9880d681SAndroid Build Coastguard Workerthe things that define the variables to set up the alloca. We'll start
370*9880d681SAndroid Build Coastguard Workerwith ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
371*9880d681SAndroid Build Coastguard Workerthe unabridged code):
372*9880d681SAndroid Build Coastguard Worker
373*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
374*9880d681SAndroid Build Coastguard Worker
375*9880d681SAndroid Build Coastguard Worker      Function *TheFunction = Builder.GetInsertBlock()->getParent();
376*9880d681SAndroid Build Coastguard Worker
377*9880d681SAndroid Build Coastguard Worker      // Create an alloca for the variable in the entry block.
378*9880d681SAndroid Build Coastguard Worker      AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
379*9880d681SAndroid Build Coastguard Worker
380*9880d681SAndroid Build Coastguard Worker        // Emit the start code first, without 'variable' in scope.
381*9880d681SAndroid Build Coastguard Worker      Value *StartVal = Start->codegen();
382*9880d681SAndroid Build Coastguard Worker      if (!StartVal)
383*9880d681SAndroid Build Coastguard Worker        return nullptr;
384*9880d681SAndroid Build Coastguard Worker
385*9880d681SAndroid Build Coastguard Worker      // Store the value into the alloca.
386*9880d681SAndroid Build Coastguard Worker      Builder.CreateStore(StartVal, Alloca);
387*9880d681SAndroid Build Coastguard Worker      ...
388*9880d681SAndroid Build Coastguard Worker
389*9880d681SAndroid Build Coastguard Worker      // Compute the end condition.
390*9880d681SAndroid Build Coastguard Worker      Value *EndCond = End->codegen();
391*9880d681SAndroid Build Coastguard Worker      if (!EndCond)
392*9880d681SAndroid Build Coastguard Worker        return nullptr;
393*9880d681SAndroid Build Coastguard Worker
394*9880d681SAndroid Build Coastguard Worker      // Reload, increment, and restore the alloca.  This handles the case where
395*9880d681SAndroid Build Coastguard Worker      // the body of the loop mutates the variable.
396*9880d681SAndroid Build Coastguard Worker      Value *CurVar = Builder.CreateLoad(Alloca);
397*9880d681SAndroid Build Coastguard Worker      Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
398*9880d681SAndroid Build Coastguard Worker      Builder.CreateStore(NextVar, Alloca);
399*9880d681SAndroid Build Coastguard Worker      ...
400*9880d681SAndroid Build Coastguard Worker
401*9880d681SAndroid Build Coastguard WorkerThis code is virtually identical to the code `before we allowed mutable
402*9880d681SAndroid Build Coastguard Workervariables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we
403*9880d681SAndroid Build Coastguard Workerno longer have to construct a PHI node, and we use load/store to access
404*9880d681SAndroid Build Coastguard Workerthe variable as needed.
405*9880d681SAndroid Build Coastguard Worker
406*9880d681SAndroid Build Coastguard WorkerTo support mutable argument variables, we need to also make allocas for
407*9880d681SAndroid Build Coastguard Workerthem. The code for this is also pretty simple:
408*9880d681SAndroid Build Coastguard Worker
409*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
410*9880d681SAndroid Build Coastguard Worker
411*9880d681SAndroid Build Coastguard Worker    /// CreateArgumentAllocas - Create an alloca for each argument and register the
412*9880d681SAndroid Build Coastguard Worker    /// argument in the symbol table so that references to it will succeed.
413*9880d681SAndroid Build Coastguard Worker    void PrototypeAST::CreateArgumentAllocas(Function *F) {
414*9880d681SAndroid Build Coastguard Worker      Function::arg_iterator AI = F->arg_begin();
415*9880d681SAndroid Build Coastguard Worker      for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
416*9880d681SAndroid Build Coastguard Worker        // Create an alloca for this variable.
417*9880d681SAndroid Build Coastguard Worker        AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
418*9880d681SAndroid Build Coastguard Worker
419*9880d681SAndroid Build Coastguard Worker        // Store the initial value into the alloca.
420*9880d681SAndroid Build Coastguard Worker        Builder.CreateStore(AI, Alloca);
421*9880d681SAndroid Build Coastguard Worker
422*9880d681SAndroid Build Coastguard Worker        // Add arguments to variable symbol table.
423*9880d681SAndroid Build Coastguard Worker        NamedValues[Args[Idx]] = Alloca;
424*9880d681SAndroid Build Coastguard Worker      }
425*9880d681SAndroid Build Coastguard Worker    }
426*9880d681SAndroid Build Coastguard Worker
427*9880d681SAndroid Build Coastguard WorkerFor each argument, we make an alloca, store the input value to the
428*9880d681SAndroid Build Coastguard Workerfunction into the alloca, and register the alloca as the memory location
429*9880d681SAndroid Build Coastguard Workerfor the argument. This method gets invoked by ``FunctionAST::codegen()``
430*9880d681SAndroid Build Coastguard Workerright after it sets up the entry block for the function.
431*9880d681SAndroid Build Coastguard Worker
432*9880d681SAndroid Build Coastguard WorkerThe final missing piece is adding the mem2reg pass, which allows us to
433*9880d681SAndroid Build Coastguard Workerget good codegen once again:
434*9880d681SAndroid Build Coastguard Worker
435*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
436*9880d681SAndroid Build Coastguard Worker
437*9880d681SAndroid Build Coastguard Worker        // Set up the optimizer pipeline.  Start with registering info about how the
438*9880d681SAndroid Build Coastguard Worker        // target lays out data structures.
439*9880d681SAndroid Build Coastguard Worker        OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
440*9880d681SAndroid Build Coastguard Worker        // Promote allocas to registers.
441*9880d681SAndroid Build Coastguard Worker        OurFPM.add(createPromoteMemoryToRegisterPass());
442*9880d681SAndroid Build Coastguard Worker        // Do simple "peephole" optimizations and bit-twiddling optzns.
443*9880d681SAndroid Build Coastguard Worker        OurFPM.add(createInstructionCombiningPass());
444*9880d681SAndroid Build Coastguard Worker        // Reassociate expressions.
445*9880d681SAndroid Build Coastguard Worker        OurFPM.add(createReassociatePass());
446*9880d681SAndroid Build Coastguard Worker
447*9880d681SAndroid Build Coastguard WorkerIt is interesting to see what the code looks like before and after the
448*9880d681SAndroid Build Coastguard Workermem2reg optimization runs. For example, this is the before/after code
449*9880d681SAndroid Build Coastguard Workerfor our recursive fib function. Before the optimization:
450*9880d681SAndroid Build Coastguard Worker
451*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm
452*9880d681SAndroid Build Coastguard Worker
453*9880d681SAndroid Build Coastguard Worker    define double @fib(double %x) {
454*9880d681SAndroid Build Coastguard Worker    entry:
455*9880d681SAndroid Build Coastguard Worker      %x1 = alloca double
456*9880d681SAndroid Build Coastguard Worker      store double %x, double* %x1
457*9880d681SAndroid Build Coastguard Worker      %x2 = load double* %x1
458*9880d681SAndroid Build Coastguard Worker      %cmptmp = fcmp ult double %x2, 3.000000e+00
459*9880d681SAndroid Build Coastguard Worker      %booltmp = uitofp i1 %cmptmp to double
460*9880d681SAndroid Build Coastguard Worker      %ifcond = fcmp one double %booltmp, 0.000000e+00
461*9880d681SAndroid Build Coastguard Worker      br i1 %ifcond, label %then, label %else
462*9880d681SAndroid Build Coastguard Worker
463*9880d681SAndroid Build Coastguard Worker    then:       ; preds = %entry
464*9880d681SAndroid Build Coastguard Worker      br label %ifcont
465*9880d681SAndroid Build Coastguard Worker
466*9880d681SAndroid Build Coastguard Worker    else:       ; preds = %entry
467*9880d681SAndroid Build Coastguard Worker      %x3 = load double* %x1
468*9880d681SAndroid Build Coastguard Worker      %subtmp = fsub double %x3, 1.000000e+00
469*9880d681SAndroid Build Coastguard Worker      %calltmp = call double @fib(double %subtmp)
470*9880d681SAndroid Build Coastguard Worker      %x4 = load double* %x1
471*9880d681SAndroid Build Coastguard Worker      %subtmp5 = fsub double %x4, 2.000000e+00
472*9880d681SAndroid Build Coastguard Worker      %calltmp6 = call double @fib(double %subtmp5)
473*9880d681SAndroid Build Coastguard Worker      %addtmp = fadd double %calltmp, %calltmp6
474*9880d681SAndroid Build Coastguard Worker      br label %ifcont
475*9880d681SAndroid Build Coastguard Worker
476*9880d681SAndroid Build Coastguard Worker    ifcont:     ; preds = %else, %then
477*9880d681SAndroid Build Coastguard Worker      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
478*9880d681SAndroid Build Coastguard Worker      ret double %iftmp
479*9880d681SAndroid Build Coastguard Worker    }
480*9880d681SAndroid Build Coastguard Worker
481*9880d681SAndroid Build Coastguard WorkerHere there is only one variable (x, the input argument) but you can
482*9880d681SAndroid Build Coastguard Workerstill see the extremely simple-minded code generation strategy we are
483*9880d681SAndroid Build Coastguard Workerusing. In the entry block, an alloca is created, and the initial input
484*9880d681SAndroid Build Coastguard Workervalue is stored into it. Each reference to the variable does a reload
485*9880d681SAndroid Build Coastguard Workerfrom the stack. Also, note that we didn't modify the if/then/else
486*9880d681SAndroid Build Coastguard Workerexpression, so it still inserts a PHI node. While we could make an
487*9880d681SAndroid Build Coastguard Workeralloca for it, it is actually easier to create a PHI node for it, so we
488*9880d681SAndroid Build Coastguard Workerstill just make the PHI.
489*9880d681SAndroid Build Coastguard Worker
490*9880d681SAndroid Build Coastguard WorkerHere is the code after the mem2reg pass runs:
491*9880d681SAndroid Build Coastguard Worker
492*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm
493*9880d681SAndroid Build Coastguard Worker
494*9880d681SAndroid Build Coastguard Worker    define double @fib(double %x) {
495*9880d681SAndroid Build Coastguard Worker    entry:
496*9880d681SAndroid Build Coastguard Worker      %cmptmp = fcmp ult double %x, 3.000000e+00
497*9880d681SAndroid Build Coastguard Worker      %booltmp = uitofp i1 %cmptmp to double
498*9880d681SAndroid Build Coastguard Worker      %ifcond = fcmp one double %booltmp, 0.000000e+00
499*9880d681SAndroid Build Coastguard Worker      br i1 %ifcond, label %then, label %else
500*9880d681SAndroid Build Coastguard Worker
501*9880d681SAndroid Build Coastguard Worker    then:
502*9880d681SAndroid Build Coastguard Worker      br label %ifcont
503*9880d681SAndroid Build Coastguard Worker
504*9880d681SAndroid Build Coastguard Worker    else:
505*9880d681SAndroid Build Coastguard Worker      %subtmp = fsub double %x, 1.000000e+00
506*9880d681SAndroid Build Coastguard Worker      %calltmp = call double @fib(double %subtmp)
507*9880d681SAndroid Build Coastguard Worker      %subtmp5 = fsub double %x, 2.000000e+00
508*9880d681SAndroid Build Coastguard Worker      %calltmp6 = call double @fib(double %subtmp5)
509*9880d681SAndroid Build Coastguard Worker      %addtmp = fadd double %calltmp, %calltmp6
510*9880d681SAndroid Build Coastguard Worker      br label %ifcont
511*9880d681SAndroid Build Coastguard Worker
512*9880d681SAndroid Build Coastguard Worker    ifcont:     ; preds = %else, %then
513*9880d681SAndroid Build Coastguard Worker      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
514*9880d681SAndroid Build Coastguard Worker      ret double %iftmp
515*9880d681SAndroid Build Coastguard Worker    }
516*9880d681SAndroid Build Coastguard Worker
517*9880d681SAndroid Build Coastguard WorkerThis is a trivial case for mem2reg, since there are no redefinitions of
518*9880d681SAndroid Build Coastguard Workerthe variable. The point of showing this is to calm your tension about
519*9880d681SAndroid Build Coastguard Workerinserting such blatent inefficiencies :).
520*9880d681SAndroid Build Coastguard Worker
521*9880d681SAndroid Build Coastguard WorkerAfter the rest of the optimizers run, we get:
522*9880d681SAndroid Build Coastguard Worker
523*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm
524*9880d681SAndroid Build Coastguard Worker
525*9880d681SAndroid Build Coastguard Worker    define double @fib(double %x) {
526*9880d681SAndroid Build Coastguard Worker    entry:
527*9880d681SAndroid Build Coastguard Worker      %cmptmp = fcmp ult double %x, 3.000000e+00
528*9880d681SAndroid Build Coastguard Worker      %booltmp = uitofp i1 %cmptmp to double
529*9880d681SAndroid Build Coastguard Worker      %ifcond = fcmp ueq double %booltmp, 0.000000e+00
530*9880d681SAndroid Build Coastguard Worker      br i1 %ifcond, label %else, label %ifcont
531*9880d681SAndroid Build Coastguard Worker
532*9880d681SAndroid Build Coastguard Worker    else:
533*9880d681SAndroid Build Coastguard Worker      %subtmp = fsub double %x, 1.000000e+00
534*9880d681SAndroid Build Coastguard Worker      %calltmp = call double @fib(double %subtmp)
535*9880d681SAndroid Build Coastguard Worker      %subtmp5 = fsub double %x, 2.000000e+00
536*9880d681SAndroid Build Coastguard Worker      %calltmp6 = call double @fib(double %subtmp5)
537*9880d681SAndroid Build Coastguard Worker      %addtmp = fadd double %calltmp, %calltmp6
538*9880d681SAndroid Build Coastguard Worker      ret double %addtmp
539*9880d681SAndroid Build Coastguard Worker
540*9880d681SAndroid Build Coastguard Worker    ifcont:
541*9880d681SAndroid Build Coastguard Worker      ret double 1.000000e+00
542*9880d681SAndroid Build Coastguard Worker    }
543*9880d681SAndroid Build Coastguard Worker
544*9880d681SAndroid Build Coastguard WorkerHere we see that the simplifycfg pass decided to clone the return
545*9880d681SAndroid Build Coastguard Workerinstruction into the end of the 'else' block. This allowed it to
546*9880d681SAndroid Build Coastguard Workereliminate some branches and the PHI node.
547*9880d681SAndroid Build Coastguard Worker
548*9880d681SAndroid Build Coastguard WorkerNow that all symbol table references are updated to use stack variables,
549*9880d681SAndroid Build Coastguard Workerwe'll add the assignment operator.
550*9880d681SAndroid Build Coastguard Worker
551*9880d681SAndroid Build Coastguard WorkerNew Assignment Operator
552*9880d681SAndroid Build Coastguard Worker=======================
553*9880d681SAndroid Build Coastguard Worker
554*9880d681SAndroid Build Coastguard WorkerWith our current framework, adding a new assignment operator is really
555*9880d681SAndroid Build Coastguard Workersimple. We will parse it just like any other binary operator, but handle
556*9880d681SAndroid Build Coastguard Workerit internally (instead of allowing the user to define it). The first
557*9880d681SAndroid Build Coastguard Workerstep is to set a precedence:
558*9880d681SAndroid Build Coastguard Worker
559*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
560*9880d681SAndroid Build Coastguard Worker
561*9880d681SAndroid Build Coastguard Worker     int main() {
562*9880d681SAndroid Build Coastguard Worker       // Install standard binary operators.
563*9880d681SAndroid Build Coastguard Worker       // 1 is lowest precedence.
564*9880d681SAndroid Build Coastguard Worker       BinopPrecedence['='] = 2;
565*9880d681SAndroid Build Coastguard Worker       BinopPrecedence['<'] = 10;
566*9880d681SAndroid Build Coastguard Worker       BinopPrecedence['+'] = 20;
567*9880d681SAndroid Build Coastguard Worker       BinopPrecedence['-'] = 20;
568*9880d681SAndroid Build Coastguard Worker
569*9880d681SAndroid Build Coastguard WorkerNow that the parser knows the precedence of the binary operator, it
570*9880d681SAndroid Build Coastguard Workertakes care of all the parsing and AST generation. We just need to
571*9880d681SAndroid Build Coastguard Workerimplement codegen for the assignment operator. This looks like:
572*9880d681SAndroid Build Coastguard Worker
573*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
574*9880d681SAndroid Build Coastguard Worker
575*9880d681SAndroid Build Coastguard Worker    Value *BinaryExprAST::codegen() {
576*9880d681SAndroid Build Coastguard Worker      // Special case '=' because we don't want to emit the LHS as an expression.
577*9880d681SAndroid Build Coastguard Worker      if (Op == '=') {
578*9880d681SAndroid Build Coastguard Worker        // Assignment requires the LHS to be an identifier.
579*9880d681SAndroid Build Coastguard Worker        VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
580*9880d681SAndroid Build Coastguard Worker        if (!LHSE)
581*9880d681SAndroid Build Coastguard Worker          return LogErrorV("destination of '=' must be a variable");
582*9880d681SAndroid Build Coastguard Worker
583*9880d681SAndroid Build Coastguard WorkerUnlike the rest of the binary operators, our assignment operator doesn't
584*9880d681SAndroid Build Coastguard Workerfollow the "emit LHS, emit RHS, do computation" model. As such, it is
585*9880d681SAndroid Build Coastguard Workerhandled as a special case before the other binary operators are handled.
586*9880d681SAndroid Build Coastguard WorkerThe other strange thing is that it requires the LHS to be a variable. It
587*9880d681SAndroid Build Coastguard Workeris invalid to have "(x+1) = expr" - only things like "x = expr" are
588*9880d681SAndroid Build Coastguard Workerallowed.
589*9880d681SAndroid Build Coastguard Worker
590*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
591*9880d681SAndroid Build Coastguard Worker
592*9880d681SAndroid Build Coastguard Worker        // Codegen the RHS.
593*9880d681SAndroid Build Coastguard Worker        Value *Val = RHS->codegen();
594*9880d681SAndroid Build Coastguard Worker        if (!Val)
595*9880d681SAndroid Build Coastguard Worker          return nullptr;
596*9880d681SAndroid Build Coastguard Worker
597*9880d681SAndroid Build Coastguard Worker        // Look up the name.
598*9880d681SAndroid Build Coastguard Worker        Value *Variable = NamedValues[LHSE->getName()];
599*9880d681SAndroid Build Coastguard Worker        if (!Variable)
600*9880d681SAndroid Build Coastguard Worker          return LogErrorV("Unknown variable name");
601*9880d681SAndroid Build Coastguard Worker
602*9880d681SAndroid Build Coastguard Worker        Builder.CreateStore(Val, Variable);
603*9880d681SAndroid Build Coastguard Worker        return Val;
604*9880d681SAndroid Build Coastguard Worker      }
605*9880d681SAndroid Build Coastguard Worker      ...
606*9880d681SAndroid Build Coastguard Worker
607*9880d681SAndroid Build Coastguard WorkerOnce we have the variable, codegen'ing the assignment is
608*9880d681SAndroid Build Coastguard Workerstraightforward: we emit the RHS of the assignment, create a store, and
609*9880d681SAndroid Build Coastguard Workerreturn the computed value. Returning a value allows for chained
610*9880d681SAndroid Build Coastguard Workerassignments like "X = (Y = Z)".
611*9880d681SAndroid Build Coastguard Worker
612*9880d681SAndroid Build Coastguard WorkerNow that we have an assignment operator, we can mutate loop variables
613*9880d681SAndroid Build Coastguard Workerand arguments. For example, we can now run code like this:
614*9880d681SAndroid Build Coastguard Worker
615*9880d681SAndroid Build Coastguard Worker::
616*9880d681SAndroid Build Coastguard Worker
617*9880d681SAndroid Build Coastguard Worker    # Function to print a double.
618*9880d681SAndroid Build Coastguard Worker    extern printd(x);
619*9880d681SAndroid Build Coastguard Worker
620*9880d681SAndroid Build Coastguard Worker    # Define ':' for sequencing: as a low-precedence operator that ignores operands
621*9880d681SAndroid Build Coastguard Worker    # and just returns the RHS.
622*9880d681SAndroid Build Coastguard Worker    def binary : 1 (x y) y;
623*9880d681SAndroid Build Coastguard Worker
624*9880d681SAndroid Build Coastguard Worker    def test(x)
625*9880d681SAndroid Build Coastguard Worker      printd(x) :
626*9880d681SAndroid Build Coastguard Worker      x = 4 :
627*9880d681SAndroid Build Coastguard Worker      printd(x);
628*9880d681SAndroid Build Coastguard Worker
629*9880d681SAndroid Build Coastguard Worker    test(123);
630*9880d681SAndroid Build Coastguard Worker
631*9880d681SAndroid Build Coastguard WorkerWhen run, this example prints "123" and then "4", showing that we did
632*9880d681SAndroid Build Coastguard Workeractually mutate the value! Okay, we have now officially implemented our
633*9880d681SAndroid Build Coastguard Workergoal: getting this to work requires SSA construction in the general
634*9880d681SAndroid Build Coastguard Workercase. However, to be really useful, we want the ability to define our
635*9880d681SAndroid Build Coastguard Workerown local variables, let's add this next!
636*9880d681SAndroid Build Coastguard Worker
637*9880d681SAndroid Build Coastguard WorkerUser-defined Local Variables
638*9880d681SAndroid Build Coastguard Worker============================
639*9880d681SAndroid Build Coastguard Worker
640*9880d681SAndroid Build Coastguard WorkerAdding var/in is just like any other extension we made to
641*9880d681SAndroid Build Coastguard WorkerKaleidoscope: we extend the lexer, the parser, the AST and the code
642*9880d681SAndroid Build Coastguard Workergenerator. The first step for adding our new 'var/in' construct is to
643*9880d681SAndroid Build Coastguard Workerextend the lexer. As before, this is pretty trivial, the code looks like
644*9880d681SAndroid Build Coastguard Workerthis:
645*9880d681SAndroid Build Coastguard Worker
646*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
647*9880d681SAndroid Build Coastguard Worker
648*9880d681SAndroid Build Coastguard Worker    enum Token {
649*9880d681SAndroid Build Coastguard Worker      ...
650*9880d681SAndroid Build Coastguard Worker      // var definition
651*9880d681SAndroid Build Coastguard Worker      tok_var = -13
652*9880d681SAndroid Build Coastguard Worker    ...
653*9880d681SAndroid Build Coastguard Worker    }
654*9880d681SAndroid Build Coastguard Worker    ...
655*9880d681SAndroid Build Coastguard Worker    static int gettok() {
656*9880d681SAndroid Build Coastguard Worker    ...
657*9880d681SAndroid Build Coastguard Worker        if (IdentifierStr == "in")
658*9880d681SAndroid Build Coastguard Worker          return tok_in;
659*9880d681SAndroid Build Coastguard Worker        if (IdentifierStr == "binary")
660*9880d681SAndroid Build Coastguard Worker          return tok_binary;
661*9880d681SAndroid Build Coastguard Worker        if (IdentifierStr == "unary")
662*9880d681SAndroid Build Coastguard Worker          return tok_unary;
663*9880d681SAndroid Build Coastguard Worker        if (IdentifierStr == "var")
664*9880d681SAndroid Build Coastguard Worker          return tok_var;
665*9880d681SAndroid Build Coastguard Worker        return tok_identifier;
666*9880d681SAndroid Build Coastguard Worker    ...
667*9880d681SAndroid Build Coastguard Worker
668*9880d681SAndroid Build Coastguard WorkerThe next step is to define the AST node that we will construct. For
669*9880d681SAndroid Build Coastguard Workervar/in, it looks like this:
670*9880d681SAndroid Build Coastguard Worker
671*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
672*9880d681SAndroid Build Coastguard Worker
673*9880d681SAndroid Build Coastguard Worker    /// VarExprAST - Expression class for var/in
674*9880d681SAndroid Build Coastguard Worker    class VarExprAST : public ExprAST {
675*9880d681SAndroid Build Coastguard Worker      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
676*9880d681SAndroid Build Coastguard Worker      std::unique_ptr<ExprAST> Body;
677*9880d681SAndroid Build Coastguard Worker
678*9880d681SAndroid Build Coastguard Worker    public:
679*9880d681SAndroid Build Coastguard Worker      VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
680*9880d681SAndroid Build Coastguard Worker                 std::unique_ptr<ExprAST> body)
681*9880d681SAndroid Build Coastguard Worker      : VarNames(std::move(VarNames)), Body(std::move(Body)) {}
682*9880d681SAndroid Build Coastguard Worker
683*9880d681SAndroid Build Coastguard Worker      virtual Value *codegen();
684*9880d681SAndroid Build Coastguard Worker    };
685*9880d681SAndroid Build Coastguard Worker
686*9880d681SAndroid Build Coastguard Workervar/in allows a list of names to be defined all at once, and each name
687*9880d681SAndroid Build Coastguard Workercan optionally have an initializer value. As such, we capture this
688*9880d681SAndroid Build Coastguard Workerinformation in the VarNames vector. Also, var/in has a body, this body
689*9880d681SAndroid Build Coastguard Workeris allowed to access the variables defined by the var/in.
690*9880d681SAndroid Build Coastguard Worker
691*9880d681SAndroid Build Coastguard WorkerWith this in place, we can define the parser pieces. The first thing we
692*9880d681SAndroid Build Coastguard Workerdo is add it as a primary expression:
693*9880d681SAndroid Build Coastguard Worker
694*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
695*9880d681SAndroid Build Coastguard Worker
696*9880d681SAndroid Build Coastguard Worker    /// primary
697*9880d681SAndroid Build Coastguard Worker    ///   ::= identifierexpr
698*9880d681SAndroid Build Coastguard Worker    ///   ::= numberexpr
699*9880d681SAndroid Build Coastguard Worker    ///   ::= parenexpr
700*9880d681SAndroid Build Coastguard Worker    ///   ::= ifexpr
701*9880d681SAndroid Build Coastguard Worker    ///   ::= forexpr
702*9880d681SAndroid Build Coastguard Worker    ///   ::= varexpr
703*9880d681SAndroid Build Coastguard Worker    static std::unique_ptr<ExprAST> ParsePrimary() {
704*9880d681SAndroid Build Coastguard Worker      switch (CurTok) {
705*9880d681SAndroid Build Coastguard Worker      default:
706*9880d681SAndroid Build Coastguard Worker        return LogError("unknown token when expecting an expression");
707*9880d681SAndroid Build Coastguard Worker      case tok_identifier:
708*9880d681SAndroid Build Coastguard Worker        return ParseIdentifierExpr();
709*9880d681SAndroid Build Coastguard Worker      case tok_number:
710*9880d681SAndroid Build Coastguard Worker        return ParseNumberExpr();
711*9880d681SAndroid Build Coastguard Worker      case '(':
712*9880d681SAndroid Build Coastguard Worker        return ParseParenExpr();
713*9880d681SAndroid Build Coastguard Worker      case tok_if:
714*9880d681SAndroid Build Coastguard Worker        return ParseIfExpr();
715*9880d681SAndroid Build Coastguard Worker      case tok_for:
716*9880d681SAndroid Build Coastguard Worker        return ParseForExpr();
717*9880d681SAndroid Build Coastguard Worker      case tok_var:
718*9880d681SAndroid Build Coastguard Worker        return ParseVarExpr();
719*9880d681SAndroid Build Coastguard Worker      }
720*9880d681SAndroid Build Coastguard Worker    }
721*9880d681SAndroid Build Coastguard Worker
722*9880d681SAndroid Build Coastguard WorkerNext we define ParseVarExpr:
723*9880d681SAndroid Build Coastguard Worker
724*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
725*9880d681SAndroid Build Coastguard Worker
726*9880d681SAndroid Build Coastguard Worker    /// varexpr ::= 'var' identifier ('=' expression)?
727*9880d681SAndroid Build Coastguard Worker    //                    (',' identifier ('=' expression)?)* 'in' expression
728*9880d681SAndroid Build Coastguard Worker    static std::unique_ptr<ExprAST> ParseVarExpr() {
729*9880d681SAndroid Build Coastguard Worker      getNextToken();  // eat the var.
730*9880d681SAndroid Build Coastguard Worker
731*9880d681SAndroid Build Coastguard Worker      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
732*9880d681SAndroid Build Coastguard Worker
733*9880d681SAndroid Build Coastguard Worker      // At least one variable name is required.
734*9880d681SAndroid Build Coastguard Worker      if (CurTok != tok_identifier)
735*9880d681SAndroid Build Coastguard Worker        return LogError("expected identifier after var");
736*9880d681SAndroid Build Coastguard Worker
737*9880d681SAndroid Build Coastguard WorkerThe first part of this code parses the list of identifier/expr pairs
738*9880d681SAndroid Build Coastguard Workerinto the local ``VarNames`` vector.
739*9880d681SAndroid Build Coastguard Worker
740*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
741*9880d681SAndroid Build Coastguard Worker
742*9880d681SAndroid Build Coastguard Worker      while (1) {
743*9880d681SAndroid Build Coastguard Worker        std::string Name = IdentifierStr;
744*9880d681SAndroid Build Coastguard Worker        getNextToken();  // eat identifier.
745*9880d681SAndroid Build Coastguard Worker
746*9880d681SAndroid Build Coastguard Worker        // Read the optional initializer.
747*9880d681SAndroid Build Coastguard Worker        std::unique_ptr<ExprAST> Init;
748*9880d681SAndroid Build Coastguard Worker        if (CurTok == '=') {
749*9880d681SAndroid Build Coastguard Worker          getNextToken(); // eat the '='.
750*9880d681SAndroid Build Coastguard Worker
751*9880d681SAndroid Build Coastguard Worker          Init = ParseExpression();
752*9880d681SAndroid Build Coastguard Worker          if (!Init) return nullptr;
753*9880d681SAndroid Build Coastguard Worker        }
754*9880d681SAndroid Build Coastguard Worker
755*9880d681SAndroid Build Coastguard Worker        VarNames.push_back(std::make_pair(Name, std::move(Init)));
756*9880d681SAndroid Build Coastguard Worker
757*9880d681SAndroid Build Coastguard Worker        // End of var list, exit loop.
758*9880d681SAndroid Build Coastguard Worker        if (CurTok != ',') break;
759*9880d681SAndroid Build Coastguard Worker        getNextToken(); // eat the ','.
760*9880d681SAndroid Build Coastguard Worker
761*9880d681SAndroid Build Coastguard Worker        if (CurTok != tok_identifier)
762*9880d681SAndroid Build Coastguard Worker          return LogError("expected identifier list after var");
763*9880d681SAndroid Build Coastguard Worker      }
764*9880d681SAndroid Build Coastguard Worker
765*9880d681SAndroid Build Coastguard WorkerOnce all the variables are parsed, we then parse the body and create the
766*9880d681SAndroid Build Coastguard WorkerAST node:
767*9880d681SAndroid Build Coastguard Worker
768*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
769*9880d681SAndroid Build Coastguard Worker
770*9880d681SAndroid Build Coastguard Worker      // At this point, we have to have 'in'.
771*9880d681SAndroid Build Coastguard Worker      if (CurTok != tok_in)
772*9880d681SAndroid Build Coastguard Worker        return LogError("expected 'in' keyword after 'var'");
773*9880d681SAndroid Build Coastguard Worker      getNextToken();  // eat 'in'.
774*9880d681SAndroid Build Coastguard Worker
775*9880d681SAndroid Build Coastguard Worker      auto Body = ParseExpression();
776*9880d681SAndroid Build Coastguard Worker      if (!Body)
777*9880d681SAndroid Build Coastguard Worker        return nullptr;
778*9880d681SAndroid Build Coastguard Worker
779*9880d681SAndroid Build Coastguard Worker      return llvm::make_unique<VarExprAST>(std::move(VarNames),
780*9880d681SAndroid Build Coastguard Worker                                           std::move(Body));
781*9880d681SAndroid Build Coastguard Worker    }
782*9880d681SAndroid Build Coastguard Worker
783*9880d681SAndroid Build Coastguard WorkerNow that we can parse and represent the code, we need to support
784*9880d681SAndroid Build Coastguard Workeremission of LLVM IR for it. This code starts out with:
785*9880d681SAndroid Build Coastguard Worker
786*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
787*9880d681SAndroid Build Coastguard Worker
788*9880d681SAndroid Build Coastguard Worker    Value *VarExprAST::codegen() {
789*9880d681SAndroid Build Coastguard Worker      std::vector<AllocaInst *> OldBindings;
790*9880d681SAndroid Build Coastguard Worker
791*9880d681SAndroid Build Coastguard Worker      Function *TheFunction = Builder.GetInsertBlock()->getParent();
792*9880d681SAndroid Build Coastguard Worker
793*9880d681SAndroid Build Coastguard Worker      // Register all variables and emit their initializer.
794*9880d681SAndroid Build Coastguard Worker      for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
795*9880d681SAndroid Build Coastguard Worker        const std::string &VarName = VarNames[i].first;
796*9880d681SAndroid Build Coastguard Worker        ExprAST *Init = VarNames[i].second.get();
797*9880d681SAndroid Build Coastguard Worker
798*9880d681SAndroid Build Coastguard WorkerBasically it loops over all the variables, installing them one at a
799*9880d681SAndroid Build Coastguard Workertime. For each variable we put into the symbol table, we remember the
800*9880d681SAndroid Build Coastguard Workerprevious value that we replace in OldBindings.
801*9880d681SAndroid Build Coastguard Worker
802*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
803*9880d681SAndroid Build Coastguard Worker
804*9880d681SAndroid Build Coastguard Worker        // Emit the initializer before adding the variable to scope, this prevents
805*9880d681SAndroid Build Coastguard Worker        // the initializer from referencing the variable itself, and permits stuff
806*9880d681SAndroid Build Coastguard Worker        // like this:
807*9880d681SAndroid Build Coastguard Worker        //  var a = 1 in
808*9880d681SAndroid Build Coastguard Worker        //    var a = a in ...   # refers to outer 'a'.
809*9880d681SAndroid Build Coastguard Worker        Value *InitVal;
810*9880d681SAndroid Build Coastguard Worker        if (Init) {
811*9880d681SAndroid Build Coastguard Worker          InitVal = Init->codegen();
812*9880d681SAndroid Build Coastguard Worker          if (!InitVal)
813*9880d681SAndroid Build Coastguard Worker            return nullptr;
814*9880d681SAndroid Build Coastguard Worker        } else { // If not specified, use 0.0.
815*9880d681SAndroid Build Coastguard Worker          InitVal = ConstantFP::get(LLVMContext, APFloat(0.0));
816*9880d681SAndroid Build Coastguard Worker        }
817*9880d681SAndroid Build Coastguard Worker
818*9880d681SAndroid Build Coastguard Worker        AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
819*9880d681SAndroid Build Coastguard Worker        Builder.CreateStore(InitVal, Alloca);
820*9880d681SAndroid Build Coastguard Worker
821*9880d681SAndroid Build Coastguard Worker        // Remember the old variable binding so that we can restore the binding when
822*9880d681SAndroid Build Coastguard Worker        // we unrecurse.
823*9880d681SAndroid Build Coastguard Worker        OldBindings.push_back(NamedValues[VarName]);
824*9880d681SAndroid Build Coastguard Worker
825*9880d681SAndroid Build Coastguard Worker        // Remember this binding.
826*9880d681SAndroid Build Coastguard Worker        NamedValues[VarName] = Alloca;
827*9880d681SAndroid Build Coastguard Worker      }
828*9880d681SAndroid Build Coastguard Worker
829*9880d681SAndroid Build Coastguard WorkerThere are more comments here than code. The basic idea is that we emit
830*9880d681SAndroid Build Coastguard Workerthe initializer, create the alloca, then update the symbol table to
831*9880d681SAndroid Build Coastguard Workerpoint to it. Once all the variables are installed in the symbol table,
832*9880d681SAndroid Build Coastguard Workerwe evaluate the body of the var/in expression:
833*9880d681SAndroid Build Coastguard Worker
834*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
835*9880d681SAndroid Build Coastguard Worker
836*9880d681SAndroid Build Coastguard Worker      // Codegen the body, now that all vars are in scope.
837*9880d681SAndroid Build Coastguard Worker      Value *BodyVal = Body->codegen();
838*9880d681SAndroid Build Coastguard Worker      if (!BodyVal)
839*9880d681SAndroid Build Coastguard Worker        return nullptr;
840*9880d681SAndroid Build Coastguard Worker
841*9880d681SAndroid Build Coastguard WorkerFinally, before returning, we restore the previous variable bindings:
842*9880d681SAndroid Build Coastguard Worker
843*9880d681SAndroid Build Coastguard Worker.. code-block:: c++
844*9880d681SAndroid Build Coastguard Worker
845*9880d681SAndroid Build Coastguard Worker      // Pop all our variables from scope.
846*9880d681SAndroid Build Coastguard Worker      for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
847*9880d681SAndroid Build Coastguard Worker        NamedValues[VarNames[i].first] = OldBindings[i];
848*9880d681SAndroid Build Coastguard Worker
849*9880d681SAndroid Build Coastguard Worker      // Return the body computation.
850*9880d681SAndroid Build Coastguard Worker      return BodyVal;
851*9880d681SAndroid Build Coastguard Worker    }
852*9880d681SAndroid Build Coastguard Worker
853*9880d681SAndroid Build Coastguard WorkerThe end result of all of this is that we get properly scoped variable
854*9880d681SAndroid Build Coastguard Workerdefinitions, and we even (trivially) allow mutation of them :).
855*9880d681SAndroid Build Coastguard Worker
856*9880d681SAndroid Build Coastguard WorkerWith this, we completed what we set out to do. Our nice iterative fib
857*9880d681SAndroid Build Coastguard Workerexample from the intro compiles and runs just fine. The mem2reg pass
858*9880d681SAndroid Build Coastguard Workeroptimizes all of our stack variables into SSA registers, inserting PHI
859*9880d681SAndroid Build Coastguard Workernodes where needed, and our front-end remains simple: no "iterated
860*9880d681SAndroid Build Coastguard Workerdominance frontier" computation anywhere in sight.
861*9880d681SAndroid Build Coastguard Worker
862*9880d681SAndroid Build Coastguard WorkerFull Code Listing
863*9880d681SAndroid Build Coastguard Worker=================
864*9880d681SAndroid Build Coastguard Worker
865*9880d681SAndroid Build Coastguard WorkerHere is the complete code listing for our running example, enhanced with
866*9880d681SAndroid Build Coastguard Workermutable variables and var/in support. To build this example, use:
867*9880d681SAndroid Build Coastguard Worker
868*9880d681SAndroid Build Coastguard Worker.. code-block:: bash
869*9880d681SAndroid Build Coastguard Worker
870*9880d681SAndroid Build Coastguard Worker    # Compile
871*9880d681SAndroid Build Coastguard Worker    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
872*9880d681SAndroid Build Coastguard Worker    # Run
873*9880d681SAndroid Build Coastguard Worker    ./toy
874*9880d681SAndroid Build Coastguard Worker
875*9880d681SAndroid Build Coastguard WorkerHere is the code:
876*9880d681SAndroid Build Coastguard Worker
877*9880d681SAndroid Build Coastguard Worker.. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp
878*9880d681SAndroid Build Coastguard Worker   :language: c++
879*9880d681SAndroid Build Coastguard Worker
880*9880d681SAndroid Build Coastguard Worker`Next: Compiling to Object Code <LangImpl08.html>`_
881*9880d681SAndroid Build Coastguard Worker
882