1*9880d681SAndroid Build Coastguard Worker======================================================= 2*9880d681SAndroid Build Coastguard WorkerKaleidoscope: Extending the Language: Mutable Variables 3*9880d681SAndroid Build Coastguard Worker======================================================= 4*9880d681SAndroid Build Coastguard Worker 5*9880d681SAndroid Build Coastguard Worker.. contents:: 6*9880d681SAndroid Build Coastguard Worker :local: 7*9880d681SAndroid Build Coastguard Worker 8*9880d681SAndroid Build Coastguard WorkerChapter 7 Introduction 9*9880d681SAndroid Build Coastguard Worker====================== 10*9880d681SAndroid Build Coastguard Worker 11*9880d681SAndroid Build Coastguard WorkerWelcome to Chapter 7 of the "`Implementing a language with 12*9880d681SAndroid Build Coastguard WorkerLLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a 13*9880d681SAndroid Build Coastguard Workervery respectable, albeit simple, `functional programming 14*9880d681SAndroid Build Coastguard Workerlanguage <http://en.wikipedia.org/wiki/Functional_programming>`_. In our 15*9880d681SAndroid Build Coastguard Workerjourney, we learned some parsing techniques, how to build and represent 16*9880d681SAndroid Build Coastguard Workeran AST, how to build LLVM IR, and how to optimize the resultant code as 17*9880d681SAndroid Build Coastguard Workerwell as JIT compile it. 18*9880d681SAndroid Build Coastguard Worker 19*9880d681SAndroid Build Coastguard WorkerWhile Kaleidoscope is interesting as a functional language, the fact 20*9880d681SAndroid Build Coastguard Workerthat it is functional makes it "too easy" to generate LLVM IR for it. In 21*9880d681SAndroid Build Coastguard Workerparticular, a functional language makes it very easy to build LLVM IR 22*9880d681SAndroid Build Coastguard Workerdirectly in `SSA 23*9880d681SAndroid Build Coastguard Workerform <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. 24*9880d681SAndroid Build Coastguard WorkerSince LLVM requires that the input code be in SSA form, this is a very 25*9880d681SAndroid Build Coastguard Workernice property and it is often unclear to newcomers how to generate code 26*9880d681SAndroid Build Coastguard Workerfor an imperative language with mutable variables. 27*9880d681SAndroid Build Coastguard Worker 28*9880d681SAndroid Build Coastguard WorkerThe short (and happy) summary of this chapter is that there is no need 29*9880d681SAndroid Build Coastguard Workerfor your front-end to build SSA form: LLVM provides highly tuned and 30*9880d681SAndroid Build Coastguard Workerwell tested support for this, though the way it works is a bit 31*9880d681SAndroid Build Coastguard Workerunexpected for some. 32*9880d681SAndroid Build Coastguard Worker 33*9880d681SAndroid Build Coastguard WorkerWhy is this a hard problem? 34*9880d681SAndroid Build Coastguard Worker=========================== 35*9880d681SAndroid Build Coastguard Worker 36*9880d681SAndroid Build Coastguard WorkerTo understand why mutable variables cause complexities in SSA 37*9880d681SAndroid Build Coastguard Workerconstruction, consider this extremely simple C example: 38*9880d681SAndroid Build Coastguard Worker 39*9880d681SAndroid Build Coastguard Worker.. code-block:: c 40*9880d681SAndroid Build Coastguard Worker 41*9880d681SAndroid Build Coastguard Worker int G, H; 42*9880d681SAndroid Build Coastguard Worker int test(_Bool Condition) { 43*9880d681SAndroid Build Coastguard Worker int X; 44*9880d681SAndroid Build Coastguard Worker if (Condition) 45*9880d681SAndroid Build Coastguard Worker X = G; 46*9880d681SAndroid Build Coastguard Worker else 47*9880d681SAndroid Build Coastguard Worker X = H; 48*9880d681SAndroid Build Coastguard Worker return X; 49*9880d681SAndroid Build Coastguard Worker } 50*9880d681SAndroid Build Coastguard Worker 51*9880d681SAndroid Build Coastguard WorkerIn this case, we have the variable "X", whose value depends on the path 52*9880d681SAndroid Build Coastguard Workerexecuted in the program. Because there are two different possible values 53*9880d681SAndroid Build Coastguard Workerfor X before the return instruction, a PHI node is inserted to merge the 54*9880d681SAndroid Build Coastguard Workertwo values. The LLVM IR that we want for this example looks like this: 55*9880d681SAndroid Build Coastguard Worker 56*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm 57*9880d681SAndroid Build Coastguard Worker 58*9880d681SAndroid Build Coastguard Worker @G = weak global i32 0 ; type of @G is i32* 59*9880d681SAndroid Build Coastguard Worker @H = weak global i32 0 ; type of @H is i32* 60*9880d681SAndroid Build Coastguard Worker 61*9880d681SAndroid Build Coastguard Worker define i32 @test(i1 %Condition) { 62*9880d681SAndroid Build Coastguard Worker entry: 63*9880d681SAndroid Build Coastguard Worker br i1 %Condition, label %cond_true, label %cond_false 64*9880d681SAndroid Build Coastguard Worker 65*9880d681SAndroid Build Coastguard Worker cond_true: 66*9880d681SAndroid Build Coastguard Worker %X.0 = load i32* @G 67*9880d681SAndroid Build Coastguard Worker br label %cond_next 68*9880d681SAndroid Build Coastguard Worker 69*9880d681SAndroid Build Coastguard Worker cond_false: 70*9880d681SAndroid Build Coastguard Worker %X.1 = load i32* @H 71*9880d681SAndroid Build Coastguard Worker br label %cond_next 72*9880d681SAndroid Build Coastguard Worker 73*9880d681SAndroid Build Coastguard Worker cond_next: 74*9880d681SAndroid Build Coastguard Worker %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] 75*9880d681SAndroid Build Coastguard Worker ret i32 %X.2 76*9880d681SAndroid Build Coastguard Worker } 77*9880d681SAndroid Build Coastguard Worker 78*9880d681SAndroid Build Coastguard WorkerIn this example, the loads from the G and H global variables are 79*9880d681SAndroid Build Coastguard Workerexplicit in the LLVM IR, and they live in the then/else branches of the 80*9880d681SAndroid Build Coastguard Workerif statement (cond\_true/cond\_false). In order to merge the incoming 81*9880d681SAndroid Build Coastguard Workervalues, the X.2 phi node in the cond\_next block selects the right value 82*9880d681SAndroid Build Coastguard Workerto use based on where control flow is coming from: if control flow comes 83*9880d681SAndroid Build Coastguard Workerfrom the cond\_false block, X.2 gets the value of X.1. Alternatively, if 84*9880d681SAndroid Build Coastguard Workercontrol flow comes from cond\_true, it gets the value of X.0. The intent 85*9880d681SAndroid Build Coastguard Workerof this chapter is not to explain the details of SSA form. For more 86*9880d681SAndroid Build Coastguard Workerinformation, see one of the many `online 87*9880d681SAndroid Build Coastguard Workerreferences <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. 88*9880d681SAndroid Build Coastguard Worker 89*9880d681SAndroid Build Coastguard WorkerThe question for this article is "who places the phi nodes when lowering 90*9880d681SAndroid Build Coastguard Workerassignments to mutable variables?". The issue here is that LLVM 91*9880d681SAndroid Build Coastguard Worker*requires* that its IR be in SSA form: there is no "non-ssa" mode for 92*9880d681SAndroid Build Coastguard Workerit. However, SSA construction requires non-trivial algorithms and data 93*9880d681SAndroid Build Coastguard Workerstructures, so it is inconvenient and wasteful for every front-end to 94*9880d681SAndroid Build Coastguard Workerhave to reproduce this logic. 95*9880d681SAndroid Build Coastguard Worker 96*9880d681SAndroid Build Coastguard WorkerMemory in LLVM 97*9880d681SAndroid Build Coastguard Worker============== 98*9880d681SAndroid Build Coastguard Worker 99*9880d681SAndroid Build Coastguard WorkerThe 'trick' here is that while LLVM does require all register values to 100*9880d681SAndroid Build Coastguard Workerbe in SSA form, it does not require (or permit) memory objects to be in 101*9880d681SAndroid Build Coastguard WorkerSSA form. In the example above, note that the loads from G and H are 102*9880d681SAndroid Build Coastguard Workerdirect accesses to G and H: they are not renamed or versioned. This 103*9880d681SAndroid Build Coastguard Workerdiffers from some other compiler systems, which do try to version memory 104*9880d681SAndroid Build Coastguard Workerobjects. In LLVM, instead of encoding dataflow analysis of memory into 105*9880d681SAndroid Build Coastguard Workerthe LLVM IR, it is handled with `Analysis 106*9880d681SAndroid Build Coastguard WorkerPasses <../WritingAnLLVMPass.html>`_ which are computed on demand. 107*9880d681SAndroid Build Coastguard Worker 108*9880d681SAndroid Build Coastguard WorkerWith this in mind, the high-level idea is that we want to make a stack 109*9880d681SAndroid Build Coastguard Workervariable (which lives in memory, because it is on the stack) for each 110*9880d681SAndroid Build Coastguard Workermutable object in a function. To take advantage of this trick, we need 111*9880d681SAndroid Build Coastguard Workerto talk about how LLVM represents stack variables. 112*9880d681SAndroid Build Coastguard Worker 113*9880d681SAndroid Build Coastguard WorkerIn LLVM, all memory accesses are explicit with load/store instructions, 114*9880d681SAndroid Build Coastguard Workerand it is carefully designed not to have (or need) an "address-of" 115*9880d681SAndroid Build Coastguard Workeroperator. Notice how the type of the @G/@H global variables is actually 116*9880d681SAndroid Build Coastguard Worker"i32\*" even though the variable is defined as "i32". What this means is 117*9880d681SAndroid Build Coastguard Workerthat @G defines *space* for an i32 in the global data area, but its 118*9880d681SAndroid Build Coastguard Worker*name* actually refers to the address for that space. Stack variables 119*9880d681SAndroid Build Coastguard Workerwork the same way, except that instead of being declared with global 120*9880d681SAndroid Build Coastguard Workervariable definitions, they are declared with the `LLVM alloca 121*9880d681SAndroid Build Coastguard Workerinstruction <../LangRef.html#alloca-instruction>`_: 122*9880d681SAndroid Build Coastguard Worker 123*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm 124*9880d681SAndroid Build Coastguard Worker 125*9880d681SAndroid Build Coastguard Worker define i32 @example() { 126*9880d681SAndroid Build Coastguard Worker entry: 127*9880d681SAndroid Build Coastguard Worker %X = alloca i32 ; type of %X is i32*. 128*9880d681SAndroid Build Coastguard Worker ... 129*9880d681SAndroid Build Coastguard Worker %tmp = load i32* %X ; load the stack value %X from the stack. 130*9880d681SAndroid Build Coastguard Worker %tmp2 = add i32 %tmp, 1 ; increment it 131*9880d681SAndroid Build Coastguard Worker store i32 %tmp2, i32* %X ; store it back 132*9880d681SAndroid Build Coastguard Worker ... 133*9880d681SAndroid Build Coastguard Worker 134*9880d681SAndroid Build Coastguard WorkerThis code shows an example of how you can declare and manipulate a stack 135*9880d681SAndroid Build Coastguard Workervariable in the LLVM IR. Stack memory allocated with the alloca 136*9880d681SAndroid Build Coastguard Workerinstruction is fully general: you can pass the address of the stack slot 137*9880d681SAndroid Build Coastguard Workerto functions, you can store it in other variables, etc. In our example 138*9880d681SAndroid Build Coastguard Workerabove, we could rewrite the example to use the alloca technique to avoid 139*9880d681SAndroid Build Coastguard Workerusing a PHI node: 140*9880d681SAndroid Build Coastguard Worker 141*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm 142*9880d681SAndroid Build Coastguard Worker 143*9880d681SAndroid Build Coastguard Worker @G = weak global i32 0 ; type of @G is i32* 144*9880d681SAndroid Build Coastguard Worker @H = weak global i32 0 ; type of @H is i32* 145*9880d681SAndroid Build Coastguard Worker 146*9880d681SAndroid Build Coastguard Worker define i32 @test(i1 %Condition) { 147*9880d681SAndroid Build Coastguard Worker entry: 148*9880d681SAndroid Build Coastguard Worker %X = alloca i32 ; type of %X is i32*. 149*9880d681SAndroid Build Coastguard Worker br i1 %Condition, label %cond_true, label %cond_false 150*9880d681SAndroid Build Coastguard Worker 151*9880d681SAndroid Build Coastguard Worker cond_true: 152*9880d681SAndroid Build Coastguard Worker %X.0 = load i32* @G 153*9880d681SAndroid Build Coastguard Worker store i32 %X.0, i32* %X ; Update X 154*9880d681SAndroid Build Coastguard Worker br label %cond_next 155*9880d681SAndroid Build Coastguard Worker 156*9880d681SAndroid Build Coastguard Worker cond_false: 157*9880d681SAndroid Build Coastguard Worker %X.1 = load i32* @H 158*9880d681SAndroid Build Coastguard Worker store i32 %X.1, i32* %X ; Update X 159*9880d681SAndroid Build Coastguard Worker br label %cond_next 160*9880d681SAndroid Build Coastguard Worker 161*9880d681SAndroid Build Coastguard Worker cond_next: 162*9880d681SAndroid Build Coastguard Worker %X.2 = load i32* %X ; Read X 163*9880d681SAndroid Build Coastguard Worker ret i32 %X.2 164*9880d681SAndroid Build Coastguard Worker } 165*9880d681SAndroid Build Coastguard Worker 166*9880d681SAndroid Build Coastguard WorkerWith this, we have discovered a way to handle arbitrary mutable 167*9880d681SAndroid Build Coastguard Workervariables without the need to create Phi nodes at all: 168*9880d681SAndroid Build Coastguard Worker 169*9880d681SAndroid Build Coastguard Worker#. Each mutable variable becomes a stack allocation. 170*9880d681SAndroid Build Coastguard Worker#. Each read of the variable becomes a load from the stack. 171*9880d681SAndroid Build Coastguard Worker#. Each update of the variable becomes a store to the stack. 172*9880d681SAndroid Build Coastguard Worker#. Taking the address of a variable just uses the stack address 173*9880d681SAndroid Build Coastguard Worker directly. 174*9880d681SAndroid Build Coastguard Worker 175*9880d681SAndroid Build Coastguard WorkerWhile this solution has solved our immediate problem, it introduced 176*9880d681SAndroid Build Coastguard Workeranother one: we have now apparently introduced a lot of stack traffic 177*9880d681SAndroid Build Coastguard Workerfor very simple and common operations, a major performance problem. 178*9880d681SAndroid Build Coastguard WorkerFortunately for us, the LLVM optimizer has a highly-tuned optimization 179*9880d681SAndroid Build Coastguard Workerpass named "mem2reg" that handles this case, promoting allocas like this 180*9880d681SAndroid Build Coastguard Workerinto SSA registers, inserting Phi nodes as appropriate. If you run this 181*9880d681SAndroid Build Coastguard Workerexample through the pass, for example, you'll get: 182*9880d681SAndroid Build Coastguard Worker 183*9880d681SAndroid Build Coastguard Worker.. code-block:: bash 184*9880d681SAndroid Build Coastguard Worker 185*9880d681SAndroid Build Coastguard Worker $ llvm-as < example.ll | opt -mem2reg | llvm-dis 186*9880d681SAndroid Build Coastguard Worker @G = weak global i32 0 187*9880d681SAndroid Build Coastguard Worker @H = weak global i32 0 188*9880d681SAndroid Build Coastguard Worker 189*9880d681SAndroid Build Coastguard Worker define i32 @test(i1 %Condition) { 190*9880d681SAndroid Build Coastguard Worker entry: 191*9880d681SAndroid Build Coastguard Worker br i1 %Condition, label %cond_true, label %cond_false 192*9880d681SAndroid Build Coastguard Worker 193*9880d681SAndroid Build Coastguard Worker cond_true: 194*9880d681SAndroid Build Coastguard Worker %X.0 = load i32* @G 195*9880d681SAndroid Build Coastguard Worker br label %cond_next 196*9880d681SAndroid Build Coastguard Worker 197*9880d681SAndroid Build Coastguard Worker cond_false: 198*9880d681SAndroid Build Coastguard Worker %X.1 = load i32* @H 199*9880d681SAndroid Build Coastguard Worker br label %cond_next 200*9880d681SAndroid Build Coastguard Worker 201*9880d681SAndroid Build Coastguard Worker cond_next: 202*9880d681SAndroid Build Coastguard Worker %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] 203*9880d681SAndroid Build Coastguard Worker ret i32 %X.01 204*9880d681SAndroid Build Coastguard Worker } 205*9880d681SAndroid Build Coastguard Worker 206*9880d681SAndroid Build Coastguard WorkerThe mem2reg pass implements the standard "iterated dominance frontier" 207*9880d681SAndroid Build Coastguard Workeralgorithm for constructing SSA form and has a number of optimizations 208*9880d681SAndroid Build Coastguard Workerthat speed up (very common) degenerate cases. The mem2reg optimization 209*9880d681SAndroid Build Coastguard Workerpass is the answer to dealing with mutable variables, and we highly 210*9880d681SAndroid Build Coastguard Workerrecommend that you depend on it. Note that mem2reg only works on 211*9880d681SAndroid Build Coastguard Workervariables in certain circumstances: 212*9880d681SAndroid Build Coastguard Worker 213*9880d681SAndroid Build Coastguard Worker#. mem2reg is alloca-driven: it looks for allocas and if it can handle 214*9880d681SAndroid Build Coastguard Worker them, it promotes them. It does not apply to global variables or heap 215*9880d681SAndroid Build Coastguard Worker allocations. 216*9880d681SAndroid Build Coastguard Worker#. mem2reg only looks for alloca instructions in the entry block of the 217*9880d681SAndroid Build Coastguard Worker function. Being in the entry block guarantees that the alloca is only 218*9880d681SAndroid Build Coastguard Worker executed once, which makes analysis simpler. 219*9880d681SAndroid Build Coastguard Worker#. mem2reg only promotes allocas whose uses are direct loads and stores. 220*9880d681SAndroid Build Coastguard Worker If the address of the stack object is passed to a function, or if any 221*9880d681SAndroid Build Coastguard Worker funny pointer arithmetic is involved, the alloca will not be 222*9880d681SAndroid Build Coastguard Worker promoted. 223*9880d681SAndroid Build Coastguard Worker#. mem2reg only works on allocas of `first 224*9880d681SAndroid Build Coastguard Worker class <../LangRef.html#first-class-types>`_ values (such as pointers, 225*9880d681SAndroid Build Coastguard Worker scalars and vectors), and only if the array size of the allocation is 226*9880d681SAndroid Build Coastguard Worker 1 (or missing in the .ll file). mem2reg is not capable of promoting 227*9880d681SAndroid Build Coastguard Worker structs or arrays to registers. Note that the "sroa" pass is 228*9880d681SAndroid Build Coastguard Worker more powerful and can promote structs, "unions", and arrays in many 229*9880d681SAndroid Build Coastguard Worker cases. 230*9880d681SAndroid Build Coastguard Worker 231*9880d681SAndroid Build Coastguard WorkerAll of these properties are easy to satisfy for most imperative 232*9880d681SAndroid Build Coastguard Workerlanguages, and we'll illustrate it below with Kaleidoscope. The final 233*9880d681SAndroid Build Coastguard Workerquestion you may be asking is: should I bother with this nonsense for my 234*9880d681SAndroid Build Coastguard Workerfront-end? Wouldn't it be better if I just did SSA construction 235*9880d681SAndroid Build Coastguard Workerdirectly, avoiding use of the mem2reg optimization pass? In short, we 236*9880d681SAndroid Build Coastguard Workerstrongly recommend that you use this technique for building SSA form, 237*9880d681SAndroid Build Coastguard Workerunless there is an extremely good reason not to. Using this technique 238*9880d681SAndroid Build Coastguard Workeris: 239*9880d681SAndroid Build Coastguard Worker 240*9880d681SAndroid Build Coastguard Worker- Proven and well tested: clang uses this technique 241*9880d681SAndroid Build Coastguard Worker for local mutable variables. As such, the most common clients of LLVM 242*9880d681SAndroid Build Coastguard Worker are using this to handle a bulk of their variables. You can be sure 243*9880d681SAndroid Build Coastguard Worker that bugs are found fast and fixed early. 244*9880d681SAndroid Build Coastguard Worker- Extremely Fast: mem2reg has a number of special cases that make it 245*9880d681SAndroid Build Coastguard Worker fast in common cases as well as fully general. For example, it has 246*9880d681SAndroid Build Coastguard Worker fast-paths for variables that are only used in a single block, 247*9880d681SAndroid Build Coastguard Worker variables that only have one assignment point, good heuristics to 248*9880d681SAndroid Build Coastguard Worker avoid insertion of unneeded phi nodes, etc. 249*9880d681SAndroid Build Coastguard Worker- Needed for debug info generation: `Debug information in 250*9880d681SAndroid Build Coastguard Worker LLVM <../SourceLevelDebugging.html>`_ relies on having the address of 251*9880d681SAndroid Build Coastguard Worker the variable exposed so that debug info can be attached to it. This 252*9880d681SAndroid Build Coastguard Worker technique dovetails very naturally with this style of debug info. 253*9880d681SAndroid Build Coastguard Worker 254*9880d681SAndroid Build Coastguard WorkerIf nothing else, this makes it much easier to get your front-end up and 255*9880d681SAndroid Build Coastguard Workerrunning, and is very simple to implement. Let's extend Kaleidoscope with 256*9880d681SAndroid Build Coastguard Workermutable variables now! 257*9880d681SAndroid Build Coastguard Worker 258*9880d681SAndroid Build Coastguard WorkerMutable Variables in Kaleidoscope 259*9880d681SAndroid Build Coastguard Worker================================= 260*9880d681SAndroid Build Coastguard Worker 261*9880d681SAndroid Build Coastguard WorkerNow that we know the sort of problem we want to tackle, let's see what 262*9880d681SAndroid Build Coastguard Workerthis looks like in the context of our little Kaleidoscope language. 263*9880d681SAndroid Build Coastguard WorkerWe're going to add two features: 264*9880d681SAndroid Build Coastguard Worker 265*9880d681SAndroid Build Coastguard Worker#. The ability to mutate variables with the '=' operator. 266*9880d681SAndroid Build Coastguard Worker#. The ability to define new variables. 267*9880d681SAndroid Build Coastguard Worker 268*9880d681SAndroid Build Coastguard WorkerWhile the first item is really what this is about, we only have 269*9880d681SAndroid Build Coastguard Workervariables for incoming arguments as well as for induction variables, and 270*9880d681SAndroid Build Coastguard Workerredefining those only goes so far :). Also, the ability to define new 271*9880d681SAndroid Build Coastguard Workervariables is a useful thing regardless of whether you will be mutating 272*9880d681SAndroid Build Coastguard Workerthem. Here's a motivating example that shows how we could use these: 273*9880d681SAndroid Build Coastguard Worker 274*9880d681SAndroid Build Coastguard Worker:: 275*9880d681SAndroid Build Coastguard Worker 276*9880d681SAndroid Build Coastguard Worker # Define ':' for sequencing: as a low-precedence operator that ignores operands 277*9880d681SAndroid Build Coastguard Worker # and just returns the RHS. 278*9880d681SAndroid Build Coastguard Worker def binary : 1 (x y) y; 279*9880d681SAndroid Build Coastguard Worker 280*9880d681SAndroid Build Coastguard Worker # Recursive fib, we could do this before. 281*9880d681SAndroid Build Coastguard Worker def fib(x) 282*9880d681SAndroid Build Coastguard Worker if (x < 3) then 283*9880d681SAndroid Build Coastguard Worker 1 284*9880d681SAndroid Build Coastguard Worker else 285*9880d681SAndroid Build Coastguard Worker fib(x-1)+fib(x-2); 286*9880d681SAndroid Build Coastguard Worker 287*9880d681SAndroid Build Coastguard Worker # Iterative fib. 288*9880d681SAndroid Build Coastguard Worker def fibi(x) 289*9880d681SAndroid Build Coastguard Worker var a = 1, b = 1, c in 290*9880d681SAndroid Build Coastguard Worker (for i = 3, i < x in 291*9880d681SAndroid Build Coastguard Worker c = a + b : 292*9880d681SAndroid Build Coastguard Worker a = b : 293*9880d681SAndroid Build Coastguard Worker b = c) : 294*9880d681SAndroid Build Coastguard Worker b; 295*9880d681SAndroid Build Coastguard Worker 296*9880d681SAndroid Build Coastguard Worker # Call it. 297*9880d681SAndroid Build Coastguard Worker fibi(10); 298*9880d681SAndroid Build Coastguard Worker 299*9880d681SAndroid Build Coastguard WorkerIn order to mutate variables, we have to change our existing variables 300*9880d681SAndroid Build Coastguard Workerto use the "alloca trick". Once we have that, we'll add our new 301*9880d681SAndroid Build Coastguard Workeroperator, then extend Kaleidoscope to support new variable definitions. 302*9880d681SAndroid Build Coastguard Worker 303*9880d681SAndroid Build Coastguard WorkerAdjusting Existing Variables for Mutation 304*9880d681SAndroid Build Coastguard Worker========================================= 305*9880d681SAndroid Build Coastguard Worker 306*9880d681SAndroid Build Coastguard WorkerThe symbol table in Kaleidoscope is managed at code generation time by 307*9880d681SAndroid Build Coastguard Workerthe '``NamedValues``' map. This map currently keeps track of the LLVM 308*9880d681SAndroid Build Coastguard Worker"Value\*" that holds the double value for the named variable. In order 309*9880d681SAndroid Build Coastguard Workerto support mutation, we need to change this slightly, so that 310*9880d681SAndroid Build Coastguard Worker``NamedValues`` holds the *memory location* of the variable in question. 311*9880d681SAndroid Build Coastguard WorkerNote that this change is a refactoring: it changes the structure of the 312*9880d681SAndroid Build Coastguard Workercode, but does not (by itself) change the behavior of the compiler. All 313*9880d681SAndroid Build Coastguard Workerof these changes are isolated in the Kaleidoscope code generator. 314*9880d681SAndroid Build Coastguard Worker 315*9880d681SAndroid Build Coastguard WorkerAt this point in Kaleidoscope's development, it only supports variables 316*9880d681SAndroid Build Coastguard Workerfor two things: incoming arguments to functions and the induction 317*9880d681SAndroid Build Coastguard Workervariable of 'for' loops. For consistency, we'll allow mutation of these 318*9880d681SAndroid Build Coastguard Workervariables in addition to other user-defined variables. This means that 319*9880d681SAndroid Build Coastguard Workerthese will both need memory locations. 320*9880d681SAndroid Build Coastguard Worker 321*9880d681SAndroid Build Coastguard WorkerTo start our transformation of Kaleidoscope, we'll change the 322*9880d681SAndroid Build Coastguard WorkerNamedValues map so that it maps to AllocaInst\* instead of Value\*. Once 323*9880d681SAndroid Build Coastguard Workerwe do this, the C++ compiler will tell us what parts of the code we need 324*9880d681SAndroid Build Coastguard Workerto update: 325*9880d681SAndroid Build Coastguard Worker 326*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 327*9880d681SAndroid Build Coastguard Worker 328*9880d681SAndroid Build Coastguard Worker static std::map<std::string, AllocaInst*> NamedValues; 329*9880d681SAndroid Build Coastguard Worker 330*9880d681SAndroid Build Coastguard WorkerAlso, since we will need to create these alloca's, we'll use a helper 331*9880d681SAndroid Build Coastguard Workerfunction that ensures that the allocas are created in the entry block of 332*9880d681SAndroid Build Coastguard Workerthe function: 333*9880d681SAndroid Build Coastguard Worker 334*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 335*9880d681SAndroid Build Coastguard Worker 336*9880d681SAndroid Build Coastguard Worker /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of 337*9880d681SAndroid Build Coastguard Worker /// the function. This is used for mutable variables etc. 338*9880d681SAndroid Build Coastguard Worker static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction, 339*9880d681SAndroid Build Coastguard Worker const std::string &VarName) { 340*9880d681SAndroid Build Coastguard Worker IRBuilder<> TmpB(&TheFunction->getEntryBlock(), 341*9880d681SAndroid Build Coastguard Worker TheFunction->getEntryBlock().begin()); 342*9880d681SAndroid Build Coastguard Worker return TmpB.CreateAlloca(Type::getDoubleTy(LLVMContext), 0, 343*9880d681SAndroid Build Coastguard Worker VarName.c_str()); 344*9880d681SAndroid Build Coastguard Worker } 345*9880d681SAndroid Build Coastguard Worker 346*9880d681SAndroid Build Coastguard WorkerThis funny looking code creates an IRBuilder object that is pointing at 347*9880d681SAndroid Build Coastguard Workerthe first instruction (.begin()) of the entry block. It then creates an 348*9880d681SAndroid Build Coastguard Workeralloca with the expected name and returns it. Because all values in 349*9880d681SAndroid Build Coastguard WorkerKaleidoscope are doubles, there is no need to pass in a type to use. 350*9880d681SAndroid Build Coastguard Worker 351*9880d681SAndroid Build Coastguard WorkerWith this in place, the first functionality change we want to make is to 352*9880d681SAndroid Build Coastguard Workervariable references. In our new scheme, variables live on the stack, so 353*9880d681SAndroid Build Coastguard Workercode generating a reference to them actually needs to produce a load 354*9880d681SAndroid Build Coastguard Workerfrom the stack slot: 355*9880d681SAndroid Build Coastguard Worker 356*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 357*9880d681SAndroid Build Coastguard Worker 358*9880d681SAndroid Build Coastguard Worker Value *VariableExprAST::codegen() { 359*9880d681SAndroid Build Coastguard Worker // Look this variable up in the function. 360*9880d681SAndroid Build Coastguard Worker Value *V = NamedValues[Name]; 361*9880d681SAndroid Build Coastguard Worker if (!V) 362*9880d681SAndroid Build Coastguard Worker return LogErrorV("Unknown variable name"); 363*9880d681SAndroid Build Coastguard Worker 364*9880d681SAndroid Build Coastguard Worker // Load the value. 365*9880d681SAndroid Build Coastguard Worker return Builder.CreateLoad(V, Name.c_str()); 366*9880d681SAndroid Build Coastguard Worker } 367*9880d681SAndroid Build Coastguard Worker 368*9880d681SAndroid Build Coastguard WorkerAs you can see, this is pretty straightforward. Now we need to update 369*9880d681SAndroid Build Coastguard Workerthe things that define the variables to set up the alloca. We'll start 370*9880d681SAndroid Build Coastguard Workerwith ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for 371*9880d681SAndroid Build Coastguard Workerthe unabridged code): 372*9880d681SAndroid Build Coastguard Worker 373*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 374*9880d681SAndroid Build Coastguard Worker 375*9880d681SAndroid Build Coastguard Worker Function *TheFunction = Builder.GetInsertBlock()->getParent(); 376*9880d681SAndroid Build Coastguard Worker 377*9880d681SAndroid Build Coastguard Worker // Create an alloca for the variable in the entry block. 378*9880d681SAndroid Build Coastguard Worker AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName); 379*9880d681SAndroid Build Coastguard Worker 380*9880d681SAndroid Build Coastguard Worker // Emit the start code first, without 'variable' in scope. 381*9880d681SAndroid Build Coastguard Worker Value *StartVal = Start->codegen(); 382*9880d681SAndroid Build Coastguard Worker if (!StartVal) 383*9880d681SAndroid Build Coastguard Worker return nullptr; 384*9880d681SAndroid Build Coastguard Worker 385*9880d681SAndroid Build Coastguard Worker // Store the value into the alloca. 386*9880d681SAndroid Build Coastguard Worker Builder.CreateStore(StartVal, Alloca); 387*9880d681SAndroid Build Coastguard Worker ... 388*9880d681SAndroid Build Coastguard Worker 389*9880d681SAndroid Build Coastguard Worker // Compute the end condition. 390*9880d681SAndroid Build Coastguard Worker Value *EndCond = End->codegen(); 391*9880d681SAndroid Build Coastguard Worker if (!EndCond) 392*9880d681SAndroid Build Coastguard Worker return nullptr; 393*9880d681SAndroid Build Coastguard Worker 394*9880d681SAndroid Build Coastguard Worker // Reload, increment, and restore the alloca. This handles the case where 395*9880d681SAndroid Build Coastguard Worker // the body of the loop mutates the variable. 396*9880d681SAndroid Build Coastguard Worker Value *CurVar = Builder.CreateLoad(Alloca); 397*9880d681SAndroid Build Coastguard Worker Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar"); 398*9880d681SAndroid Build Coastguard Worker Builder.CreateStore(NextVar, Alloca); 399*9880d681SAndroid Build Coastguard Worker ... 400*9880d681SAndroid Build Coastguard Worker 401*9880d681SAndroid Build Coastguard WorkerThis code is virtually identical to the code `before we allowed mutable 402*9880d681SAndroid Build Coastguard Workervariables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we 403*9880d681SAndroid Build Coastguard Workerno longer have to construct a PHI node, and we use load/store to access 404*9880d681SAndroid Build Coastguard Workerthe variable as needed. 405*9880d681SAndroid Build Coastguard Worker 406*9880d681SAndroid Build Coastguard WorkerTo support mutable argument variables, we need to also make allocas for 407*9880d681SAndroid Build Coastguard Workerthem. The code for this is also pretty simple: 408*9880d681SAndroid Build Coastguard Worker 409*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 410*9880d681SAndroid Build Coastguard Worker 411*9880d681SAndroid Build Coastguard Worker /// CreateArgumentAllocas - Create an alloca for each argument and register the 412*9880d681SAndroid Build Coastguard Worker /// argument in the symbol table so that references to it will succeed. 413*9880d681SAndroid Build Coastguard Worker void PrototypeAST::CreateArgumentAllocas(Function *F) { 414*9880d681SAndroid Build Coastguard Worker Function::arg_iterator AI = F->arg_begin(); 415*9880d681SAndroid Build Coastguard Worker for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) { 416*9880d681SAndroid Build Coastguard Worker // Create an alloca for this variable. 417*9880d681SAndroid Build Coastguard Worker AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]); 418*9880d681SAndroid Build Coastguard Worker 419*9880d681SAndroid Build Coastguard Worker // Store the initial value into the alloca. 420*9880d681SAndroid Build Coastguard Worker Builder.CreateStore(AI, Alloca); 421*9880d681SAndroid Build Coastguard Worker 422*9880d681SAndroid Build Coastguard Worker // Add arguments to variable symbol table. 423*9880d681SAndroid Build Coastguard Worker NamedValues[Args[Idx]] = Alloca; 424*9880d681SAndroid Build Coastguard Worker } 425*9880d681SAndroid Build Coastguard Worker } 426*9880d681SAndroid Build Coastguard Worker 427*9880d681SAndroid Build Coastguard WorkerFor each argument, we make an alloca, store the input value to the 428*9880d681SAndroid Build Coastguard Workerfunction into the alloca, and register the alloca as the memory location 429*9880d681SAndroid Build Coastguard Workerfor the argument. This method gets invoked by ``FunctionAST::codegen()`` 430*9880d681SAndroid Build Coastguard Workerright after it sets up the entry block for the function. 431*9880d681SAndroid Build Coastguard Worker 432*9880d681SAndroid Build Coastguard WorkerThe final missing piece is adding the mem2reg pass, which allows us to 433*9880d681SAndroid Build Coastguard Workerget good codegen once again: 434*9880d681SAndroid Build Coastguard Worker 435*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 436*9880d681SAndroid Build Coastguard Worker 437*9880d681SAndroid Build Coastguard Worker // Set up the optimizer pipeline. Start with registering info about how the 438*9880d681SAndroid Build Coastguard Worker // target lays out data structures. 439*9880d681SAndroid Build Coastguard Worker OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout())); 440*9880d681SAndroid Build Coastguard Worker // Promote allocas to registers. 441*9880d681SAndroid Build Coastguard Worker OurFPM.add(createPromoteMemoryToRegisterPass()); 442*9880d681SAndroid Build Coastguard Worker // Do simple "peephole" optimizations and bit-twiddling optzns. 443*9880d681SAndroid Build Coastguard Worker OurFPM.add(createInstructionCombiningPass()); 444*9880d681SAndroid Build Coastguard Worker // Reassociate expressions. 445*9880d681SAndroid Build Coastguard Worker OurFPM.add(createReassociatePass()); 446*9880d681SAndroid Build Coastguard Worker 447*9880d681SAndroid Build Coastguard WorkerIt is interesting to see what the code looks like before and after the 448*9880d681SAndroid Build Coastguard Workermem2reg optimization runs. For example, this is the before/after code 449*9880d681SAndroid Build Coastguard Workerfor our recursive fib function. Before the optimization: 450*9880d681SAndroid Build Coastguard Worker 451*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm 452*9880d681SAndroid Build Coastguard Worker 453*9880d681SAndroid Build Coastguard Worker define double @fib(double %x) { 454*9880d681SAndroid Build Coastguard Worker entry: 455*9880d681SAndroid Build Coastguard Worker %x1 = alloca double 456*9880d681SAndroid Build Coastguard Worker store double %x, double* %x1 457*9880d681SAndroid Build Coastguard Worker %x2 = load double* %x1 458*9880d681SAndroid Build Coastguard Worker %cmptmp = fcmp ult double %x2, 3.000000e+00 459*9880d681SAndroid Build Coastguard Worker %booltmp = uitofp i1 %cmptmp to double 460*9880d681SAndroid Build Coastguard Worker %ifcond = fcmp one double %booltmp, 0.000000e+00 461*9880d681SAndroid Build Coastguard Worker br i1 %ifcond, label %then, label %else 462*9880d681SAndroid Build Coastguard Worker 463*9880d681SAndroid Build Coastguard Worker then: ; preds = %entry 464*9880d681SAndroid Build Coastguard Worker br label %ifcont 465*9880d681SAndroid Build Coastguard Worker 466*9880d681SAndroid Build Coastguard Worker else: ; preds = %entry 467*9880d681SAndroid Build Coastguard Worker %x3 = load double* %x1 468*9880d681SAndroid Build Coastguard Worker %subtmp = fsub double %x3, 1.000000e+00 469*9880d681SAndroid Build Coastguard Worker %calltmp = call double @fib(double %subtmp) 470*9880d681SAndroid Build Coastguard Worker %x4 = load double* %x1 471*9880d681SAndroid Build Coastguard Worker %subtmp5 = fsub double %x4, 2.000000e+00 472*9880d681SAndroid Build Coastguard Worker %calltmp6 = call double @fib(double %subtmp5) 473*9880d681SAndroid Build Coastguard Worker %addtmp = fadd double %calltmp, %calltmp6 474*9880d681SAndroid Build Coastguard Worker br label %ifcont 475*9880d681SAndroid Build Coastguard Worker 476*9880d681SAndroid Build Coastguard Worker ifcont: ; preds = %else, %then 477*9880d681SAndroid Build Coastguard Worker %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ] 478*9880d681SAndroid Build Coastguard Worker ret double %iftmp 479*9880d681SAndroid Build Coastguard Worker } 480*9880d681SAndroid Build Coastguard Worker 481*9880d681SAndroid Build Coastguard WorkerHere there is only one variable (x, the input argument) but you can 482*9880d681SAndroid Build Coastguard Workerstill see the extremely simple-minded code generation strategy we are 483*9880d681SAndroid Build Coastguard Workerusing. In the entry block, an alloca is created, and the initial input 484*9880d681SAndroid Build Coastguard Workervalue is stored into it. Each reference to the variable does a reload 485*9880d681SAndroid Build Coastguard Workerfrom the stack. Also, note that we didn't modify the if/then/else 486*9880d681SAndroid Build Coastguard Workerexpression, so it still inserts a PHI node. While we could make an 487*9880d681SAndroid Build Coastguard Workeralloca for it, it is actually easier to create a PHI node for it, so we 488*9880d681SAndroid Build Coastguard Workerstill just make the PHI. 489*9880d681SAndroid Build Coastguard Worker 490*9880d681SAndroid Build Coastguard WorkerHere is the code after the mem2reg pass runs: 491*9880d681SAndroid Build Coastguard Worker 492*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm 493*9880d681SAndroid Build Coastguard Worker 494*9880d681SAndroid Build Coastguard Worker define double @fib(double %x) { 495*9880d681SAndroid Build Coastguard Worker entry: 496*9880d681SAndroid Build Coastguard Worker %cmptmp = fcmp ult double %x, 3.000000e+00 497*9880d681SAndroid Build Coastguard Worker %booltmp = uitofp i1 %cmptmp to double 498*9880d681SAndroid Build Coastguard Worker %ifcond = fcmp one double %booltmp, 0.000000e+00 499*9880d681SAndroid Build Coastguard Worker br i1 %ifcond, label %then, label %else 500*9880d681SAndroid Build Coastguard Worker 501*9880d681SAndroid Build Coastguard Worker then: 502*9880d681SAndroid Build Coastguard Worker br label %ifcont 503*9880d681SAndroid Build Coastguard Worker 504*9880d681SAndroid Build Coastguard Worker else: 505*9880d681SAndroid Build Coastguard Worker %subtmp = fsub double %x, 1.000000e+00 506*9880d681SAndroid Build Coastguard Worker %calltmp = call double @fib(double %subtmp) 507*9880d681SAndroid Build Coastguard Worker %subtmp5 = fsub double %x, 2.000000e+00 508*9880d681SAndroid Build Coastguard Worker %calltmp6 = call double @fib(double %subtmp5) 509*9880d681SAndroid Build Coastguard Worker %addtmp = fadd double %calltmp, %calltmp6 510*9880d681SAndroid Build Coastguard Worker br label %ifcont 511*9880d681SAndroid Build Coastguard Worker 512*9880d681SAndroid Build Coastguard Worker ifcont: ; preds = %else, %then 513*9880d681SAndroid Build Coastguard Worker %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ] 514*9880d681SAndroid Build Coastguard Worker ret double %iftmp 515*9880d681SAndroid Build Coastguard Worker } 516*9880d681SAndroid Build Coastguard Worker 517*9880d681SAndroid Build Coastguard WorkerThis is a trivial case for mem2reg, since there are no redefinitions of 518*9880d681SAndroid Build Coastguard Workerthe variable. The point of showing this is to calm your tension about 519*9880d681SAndroid Build Coastguard Workerinserting such blatent inefficiencies :). 520*9880d681SAndroid Build Coastguard Worker 521*9880d681SAndroid Build Coastguard WorkerAfter the rest of the optimizers run, we get: 522*9880d681SAndroid Build Coastguard Worker 523*9880d681SAndroid Build Coastguard Worker.. code-block:: llvm 524*9880d681SAndroid Build Coastguard Worker 525*9880d681SAndroid Build Coastguard Worker define double @fib(double %x) { 526*9880d681SAndroid Build Coastguard Worker entry: 527*9880d681SAndroid Build Coastguard Worker %cmptmp = fcmp ult double %x, 3.000000e+00 528*9880d681SAndroid Build Coastguard Worker %booltmp = uitofp i1 %cmptmp to double 529*9880d681SAndroid Build Coastguard Worker %ifcond = fcmp ueq double %booltmp, 0.000000e+00 530*9880d681SAndroid Build Coastguard Worker br i1 %ifcond, label %else, label %ifcont 531*9880d681SAndroid Build Coastguard Worker 532*9880d681SAndroid Build Coastguard Worker else: 533*9880d681SAndroid Build Coastguard Worker %subtmp = fsub double %x, 1.000000e+00 534*9880d681SAndroid Build Coastguard Worker %calltmp = call double @fib(double %subtmp) 535*9880d681SAndroid Build Coastguard Worker %subtmp5 = fsub double %x, 2.000000e+00 536*9880d681SAndroid Build Coastguard Worker %calltmp6 = call double @fib(double %subtmp5) 537*9880d681SAndroid Build Coastguard Worker %addtmp = fadd double %calltmp, %calltmp6 538*9880d681SAndroid Build Coastguard Worker ret double %addtmp 539*9880d681SAndroid Build Coastguard Worker 540*9880d681SAndroid Build Coastguard Worker ifcont: 541*9880d681SAndroid Build Coastguard Worker ret double 1.000000e+00 542*9880d681SAndroid Build Coastguard Worker } 543*9880d681SAndroid Build Coastguard Worker 544*9880d681SAndroid Build Coastguard WorkerHere we see that the simplifycfg pass decided to clone the return 545*9880d681SAndroid Build Coastguard Workerinstruction into the end of the 'else' block. This allowed it to 546*9880d681SAndroid Build Coastguard Workereliminate some branches and the PHI node. 547*9880d681SAndroid Build Coastguard Worker 548*9880d681SAndroid Build Coastguard WorkerNow that all symbol table references are updated to use stack variables, 549*9880d681SAndroid Build Coastguard Workerwe'll add the assignment operator. 550*9880d681SAndroid Build Coastguard Worker 551*9880d681SAndroid Build Coastguard WorkerNew Assignment Operator 552*9880d681SAndroid Build Coastguard Worker======================= 553*9880d681SAndroid Build Coastguard Worker 554*9880d681SAndroid Build Coastguard WorkerWith our current framework, adding a new assignment operator is really 555*9880d681SAndroid Build Coastguard Workersimple. We will parse it just like any other binary operator, but handle 556*9880d681SAndroid Build Coastguard Workerit internally (instead of allowing the user to define it). The first 557*9880d681SAndroid Build Coastguard Workerstep is to set a precedence: 558*9880d681SAndroid Build Coastguard Worker 559*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 560*9880d681SAndroid Build Coastguard Worker 561*9880d681SAndroid Build Coastguard Worker int main() { 562*9880d681SAndroid Build Coastguard Worker // Install standard binary operators. 563*9880d681SAndroid Build Coastguard Worker // 1 is lowest precedence. 564*9880d681SAndroid Build Coastguard Worker BinopPrecedence['='] = 2; 565*9880d681SAndroid Build Coastguard Worker BinopPrecedence['<'] = 10; 566*9880d681SAndroid Build Coastguard Worker BinopPrecedence['+'] = 20; 567*9880d681SAndroid Build Coastguard Worker BinopPrecedence['-'] = 20; 568*9880d681SAndroid Build Coastguard Worker 569*9880d681SAndroid Build Coastguard WorkerNow that the parser knows the precedence of the binary operator, it 570*9880d681SAndroid Build Coastguard Workertakes care of all the parsing and AST generation. We just need to 571*9880d681SAndroid Build Coastguard Workerimplement codegen for the assignment operator. This looks like: 572*9880d681SAndroid Build Coastguard Worker 573*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 574*9880d681SAndroid Build Coastguard Worker 575*9880d681SAndroid Build Coastguard Worker Value *BinaryExprAST::codegen() { 576*9880d681SAndroid Build Coastguard Worker // Special case '=' because we don't want to emit the LHS as an expression. 577*9880d681SAndroid Build Coastguard Worker if (Op == '=') { 578*9880d681SAndroid Build Coastguard Worker // Assignment requires the LHS to be an identifier. 579*9880d681SAndroid Build Coastguard Worker VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get()); 580*9880d681SAndroid Build Coastguard Worker if (!LHSE) 581*9880d681SAndroid Build Coastguard Worker return LogErrorV("destination of '=' must be a variable"); 582*9880d681SAndroid Build Coastguard Worker 583*9880d681SAndroid Build Coastguard WorkerUnlike the rest of the binary operators, our assignment operator doesn't 584*9880d681SAndroid Build Coastguard Workerfollow the "emit LHS, emit RHS, do computation" model. As such, it is 585*9880d681SAndroid Build Coastguard Workerhandled as a special case before the other binary operators are handled. 586*9880d681SAndroid Build Coastguard WorkerThe other strange thing is that it requires the LHS to be a variable. It 587*9880d681SAndroid Build Coastguard Workeris invalid to have "(x+1) = expr" - only things like "x = expr" are 588*9880d681SAndroid Build Coastguard Workerallowed. 589*9880d681SAndroid Build Coastguard Worker 590*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 591*9880d681SAndroid Build Coastguard Worker 592*9880d681SAndroid Build Coastguard Worker // Codegen the RHS. 593*9880d681SAndroid Build Coastguard Worker Value *Val = RHS->codegen(); 594*9880d681SAndroid Build Coastguard Worker if (!Val) 595*9880d681SAndroid Build Coastguard Worker return nullptr; 596*9880d681SAndroid Build Coastguard Worker 597*9880d681SAndroid Build Coastguard Worker // Look up the name. 598*9880d681SAndroid Build Coastguard Worker Value *Variable = NamedValues[LHSE->getName()]; 599*9880d681SAndroid Build Coastguard Worker if (!Variable) 600*9880d681SAndroid Build Coastguard Worker return LogErrorV("Unknown variable name"); 601*9880d681SAndroid Build Coastguard Worker 602*9880d681SAndroid Build Coastguard Worker Builder.CreateStore(Val, Variable); 603*9880d681SAndroid Build Coastguard Worker return Val; 604*9880d681SAndroid Build Coastguard Worker } 605*9880d681SAndroid Build Coastguard Worker ... 606*9880d681SAndroid Build Coastguard Worker 607*9880d681SAndroid Build Coastguard WorkerOnce we have the variable, codegen'ing the assignment is 608*9880d681SAndroid Build Coastguard Workerstraightforward: we emit the RHS of the assignment, create a store, and 609*9880d681SAndroid Build Coastguard Workerreturn the computed value. Returning a value allows for chained 610*9880d681SAndroid Build Coastguard Workerassignments like "X = (Y = Z)". 611*9880d681SAndroid Build Coastguard Worker 612*9880d681SAndroid Build Coastguard WorkerNow that we have an assignment operator, we can mutate loop variables 613*9880d681SAndroid Build Coastguard Workerand arguments. For example, we can now run code like this: 614*9880d681SAndroid Build Coastguard Worker 615*9880d681SAndroid Build Coastguard Worker:: 616*9880d681SAndroid Build Coastguard Worker 617*9880d681SAndroid Build Coastguard Worker # Function to print a double. 618*9880d681SAndroid Build Coastguard Worker extern printd(x); 619*9880d681SAndroid Build Coastguard Worker 620*9880d681SAndroid Build Coastguard Worker # Define ':' for sequencing: as a low-precedence operator that ignores operands 621*9880d681SAndroid Build Coastguard Worker # and just returns the RHS. 622*9880d681SAndroid Build Coastguard Worker def binary : 1 (x y) y; 623*9880d681SAndroid Build Coastguard Worker 624*9880d681SAndroid Build Coastguard Worker def test(x) 625*9880d681SAndroid Build Coastguard Worker printd(x) : 626*9880d681SAndroid Build Coastguard Worker x = 4 : 627*9880d681SAndroid Build Coastguard Worker printd(x); 628*9880d681SAndroid Build Coastguard Worker 629*9880d681SAndroid Build Coastguard Worker test(123); 630*9880d681SAndroid Build Coastguard Worker 631*9880d681SAndroid Build Coastguard WorkerWhen run, this example prints "123" and then "4", showing that we did 632*9880d681SAndroid Build Coastguard Workeractually mutate the value! Okay, we have now officially implemented our 633*9880d681SAndroid Build Coastguard Workergoal: getting this to work requires SSA construction in the general 634*9880d681SAndroid Build Coastguard Workercase. However, to be really useful, we want the ability to define our 635*9880d681SAndroid Build Coastguard Workerown local variables, let's add this next! 636*9880d681SAndroid Build Coastguard Worker 637*9880d681SAndroid Build Coastguard WorkerUser-defined Local Variables 638*9880d681SAndroid Build Coastguard Worker============================ 639*9880d681SAndroid Build Coastguard Worker 640*9880d681SAndroid Build Coastguard WorkerAdding var/in is just like any other extension we made to 641*9880d681SAndroid Build Coastguard WorkerKaleidoscope: we extend the lexer, the parser, the AST and the code 642*9880d681SAndroid Build Coastguard Workergenerator. The first step for adding our new 'var/in' construct is to 643*9880d681SAndroid Build Coastguard Workerextend the lexer. As before, this is pretty trivial, the code looks like 644*9880d681SAndroid Build Coastguard Workerthis: 645*9880d681SAndroid Build Coastguard Worker 646*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 647*9880d681SAndroid Build Coastguard Worker 648*9880d681SAndroid Build Coastguard Worker enum Token { 649*9880d681SAndroid Build Coastguard Worker ... 650*9880d681SAndroid Build Coastguard Worker // var definition 651*9880d681SAndroid Build Coastguard Worker tok_var = -13 652*9880d681SAndroid Build Coastguard Worker ... 653*9880d681SAndroid Build Coastguard Worker } 654*9880d681SAndroid Build Coastguard Worker ... 655*9880d681SAndroid Build Coastguard Worker static int gettok() { 656*9880d681SAndroid Build Coastguard Worker ... 657*9880d681SAndroid Build Coastguard Worker if (IdentifierStr == "in") 658*9880d681SAndroid Build Coastguard Worker return tok_in; 659*9880d681SAndroid Build Coastguard Worker if (IdentifierStr == "binary") 660*9880d681SAndroid Build Coastguard Worker return tok_binary; 661*9880d681SAndroid Build Coastguard Worker if (IdentifierStr == "unary") 662*9880d681SAndroid Build Coastguard Worker return tok_unary; 663*9880d681SAndroid Build Coastguard Worker if (IdentifierStr == "var") 664*9880d681SAndroid Build Coastguard Worker return tok_var; 665*9880d681SAndroid Build Coastguard Worker return tok_identifier; 666*9880d681SAndroid Build Coastguard Worker ... 667*9880d681SAndroid Build Coastguard Worker 668*9880d681SAndroid Build Coastguard WorkerThe next step is to define the AST node that we will construct. For 669*9880d681SAndroid Build Coastguard Workervar/in, it looks like this: 670*9880d681SAndroid Build Coastguard Worker 671*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 672*9880d681SAndroid Build Coastguard Worker 673*9880d681SAndroid Build Coastguard Worker /// VarExprAST - Expression class for var/in 674*9880d681SAndroid Build Coastguard Worker class VarExprAST : public ExprAST { 675*9880d681SAndroid Build Coastguard Worker std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames; 676*9880d681SAndroid Build Coastguard Worker std::unique_ptr<ExprAST> Body; 677*9880d681SAndroid Build Coastguard Worker 678*9880d681SAndroid Build Coastguard Worker public: 679*9880d681SAndroid Build Coastguard Worker VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames, 680*9880d681SAndroid Build Coastguard Worker std::unique_ptr<ExprAST> body) 681*9880d681SAndroid Build Coastguard Worker : VarNames(std::move(VarNames)), Body(std::move(Body)) {} 682*9880d681SAndroid Build Coastguard Worker 683*9880d681SAndroid Build Coastguard Worker virtual Value *codegen(); 684*9880d681SAndroid Build Coastguard Worker }; 685*9880d681SAndroid Build Coastguard Worker 686*9880d681SAndroid Build Coastguard Workervar/in allows a list of names to be defined all at once, and each name 687*9880d681SAndroid Build Coastguard Workercan optionally have an initializer value. As such, we capture this 688*9880d681SAndroid Build Coastguard Workerinformation in the VarNames vector. Also, var/in has a body, this body 689*9880d681SAndroid Build Coastguard Workeris allowed to access the variables defined by the var/in. 690*9880d681SAndroid Build Coastguard Worker 691*9880d681SAndroid Build Coastguard WorkerWith this in place, we can define the parser pieces. The first thing we 692*9880d681SAndroid Build Coastguard Workerdo is add it as a primary expression: 693*9880d681SAndroid Build Coastguard Worker 694*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 695*9880d681SAndroid Build Coastguard Worker 696*9880d681SAndroid Build Coastguard Worker /// primary 697*9880d681SAndroid Build Coastguard Worker /// ::= identifierexpr 698*9880d681SAndroid Build Coastguard Worker /// ::= numberexpr 699*9880d681SAndroid Build Coastguard Worker /// ::= parenexpr 700*9880d681SAndroid Build Coastguard Worker /// ::= ifexpr 701*9880d681SAndroid Build Coastguard Worker /// ::= forexpr 702*9880d681SAndroid Build Coastguard Worker /// ::= varexpr 703*9880d681SAndroid Build Coastguard Worker static std::unique_ptr<ExprAST> ParsePrimary() { 704*9880d681SAndroid Build Coastguard Worker switch (CurTok) { 705*9880d681SAndroid Build Coastguard Worker default: 706*9880d681SAndroid Build Coastguard Worker return LogError("unknown token when expecting an expression"); 707*9880d681SAndroid Build Coastguard Worker case tok_identifier: 708*9880d681SAndroid Build Coastguard Worker return ParseIdentifierExpr(); 709*9880d681SAndroid Build Coastguard Worker case tok_number: 710*9880d681SAndroid Build Coastguard Worker return ParseNumberExpr(); 711*9880d681SAndroid Build Coastguard Worker case '(': 712*9880d681SAndroid Build Coastguard Worker return ParseParenExpr(); 713*9880d681SAndroid Build Coastguard Worker case tok_if: 714*9880d681SAndroid Build Coastguard Worker return ParseIfExpr(); 715*9880d681SAndroid Build Coastguard Worker case tok_for: 716*9880d681SAndroid Build Coastguard Worker return ParseForExpr(); 717*9880d681SAndroid Build Coastguard Worker case tok_var: 718*9880d681SAndroid Build Coastguard Worker return ParseVarExpr(); 719*9880d681SAndroid Build Coastguard Worker } 720*9880d681SAndroid Build Coastguard Worker } 721*9880d681SAndroid Build Coastguard Worker 722*9880d681SAndroid Build Coastguard WorkerNext we define ParseVarExpr: 723*9880d681SAndroid Build Coastguard Worker 724*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 725*9880d681SAndroid Build Coastguard Worker 726*9880d681SAndroid Build Coastguard Worker /// varexpr ::= 'var' identifier ('=' expression)? 727*9880d681SAndroid Build Coastguard Worker // (',' identifier ('=' expression)?)* 'in' expression 728*9880d681SAndroid Build Coastguard Worker static std::unique_ptr<ExprAST> ParseVarExpr() { 729*9880d681SAndroid Build Coastguard Worker getNextToken(); // eat the var. 730*9880d681SAndroid Build Coastguard Worker 731*9880d681SAndroid Build Coastguard Worker std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames; 732*9880d681SAndroid Build Coastguard Worker 733*9880d681SAndroid Build Coastguard Worker // At least one variable name is required. 734*9880d681SAndroid Build Coastguard Worker if (CurTok != tok_identifier) 735*9880d681SAndroid Build Coastguard Worker return LogError("expected identifier after var"); 736*9880d681SAndroid Build Coastguard Worker 737*9880d681SAndroid Build Coastguard WorkerThe first part of this code parses the list of identifier/expr pairs 738*9880d681SAndroid Build Coastguard Workerinto the local ``VarNames`` vector. 739*9880d681SAndroid Build Coastguard Worker 740*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 741*9880d681SAndroid Build Coastguard Worker 742*9880d681SAndroid Build Coastguard Worker while (1) { 743*9880d681SAndroid Build Coastguard Worker std::string Name = IdentifierStr; 744*9880d681SAndroid Build Coastguard Worker getNextToken(); // eat identifier. 745*9880d681SAndroid Build Coastguard Worker 746*9880d681SAndroid Build Coastguard Worker // Read the optional initializer. 747*9880d681SAndroid Build Coastguard Worker std::unique_ptr<ExprAST> Init; 748*9880d681SAndroid Build Coastguard Worker if (CurTok == '=') { 749*9880d681SAndroid Build Coastguard Worker getNextToken(); // eat the '='. 750*9880d681SAndroid Build Coastguard Worker 751*9880d681SAndroid Build Coastguard Worker Init = ParseExpression(); 752*9880d681SAndroid Build Coastguard Worker if (!Init) return nullptr; 753*9880d681SAndroid Build Coastguard Worker } 754*9880d681SAndroid Build Coastguard Worker 755*9880d681SAndroid Build Coastguard Worker VarNames.push_back(std::make_pair(Name, std::move(Init))); 756*9880d681SAndroid Build Coastguard Worker 757*9880d681SAndroid Build Coastguard Worker // End of var list, exit loop. 758*9880d681SAndroid Build Coastguard Worker if (CurTok != ',') break; 759*9880d681SAndroid Build Coastguard Worker getNextToken(); // eat the ','. 760*9880d681SAndroid Build Coastguard Worker 761*9880d681SAndroid Build Coastguard Worker if (CurTok != tok_identifier) 762*9880d681SAndroid Build Coastguard Worker return LogError("expected identifier list after var"); 763*9880d681SAndroid Build Coastguard Worker } 764*9880d681SAndroid Build Coastguard Worker 765*9880d681SAndroid Build Coastguard WorkerOnce all the variables are parsed, we then parse the body and create the 766*9880d681SAndroid Build Coastguard WorkerAST node: 767*9880d681SAndroid Build Coastguard Worker 768*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 769*9880d681SAndroid Build Coastguard Worker 770*9880d681SAndroid Build Coastguard Worker // At this point, we have to have 'in'. 771*9880d681SAndroid Build Coastguard Worker if (CurTok != tok_in) 772*9880d681SAndroid Build Coastguard Worker return LogError("expected 'in' keyword after 'var'"); 773*9880d681SAndroid Build Coastguard Worker getNextToken(); // eat 'in'. 774*9880d681SAndroid Build Coastguard Worker 775*9880d681SAndroid Build Coastguard Worker auto Body = ParseExpression(); 776*9880d681SAndroid Build Coastguard Worker if (!Body) 777*9880d681SAndroid Build Coastguard Worker return nullptr; 778*9880d681SAndroid Build Coastguard Worker 779*9880d681SAndroid Build Coastguard Worker return llvm::make_unique<VarExprAST>(std::move(VarNames), 780*9880d681SAndroid Build Coastguard Worker std::move(Body)); 781*9880d681SAndroid Build Coastguard Worker } 782*9880d681SAndroid Build Coastguard Worker 783*9880d681SAndroid Build Coastguard WorkerNow that we can parse and represent the code, we need to support 784*9880d681SAndroid Build Coastguard Workeremission of LLVM IR for it. This code starts out with: 785*9880d681SAndroid Build Coastguard Worker 786*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 787*9880d681SAndroid Build Coastguard Worker 788*9880d681SAndroid Build Coastguard Worker Value *VarExprAST::codegen() { 789*9880d681SAndroid Build Coastguard Worker std::vector<AllocaInst *> OldBindings; 790*9880d681SAndroid Build Coastguard Worker 791*9880d681SAndroid Build Coastguard Worker Function *TheFunction = Builder.GetInsertBlock()->getParent(); 792*9880d681SAndroid Build Coastguard Worker 793*9880d681SAndroid Build Coastguard Worker // Register all variables and emit their initializer. 794*9880d681SAndroid Build Coastguard Worker for (unsigned i = 0, e = VarNames.size(); i != e; ++i) { 795*9880d681SAndroid Build Coastguard Worker const std::string &VarName = VarNames[i].first; 796*9880d681SAndroid Build Coastguard Worker ExprAST *Init = VarNames[i].second.get(); 797*9880d681SAndroid Build Coastguard Worker 798*9880d681SAndroid Build Coastguard WorkerBasically it loops over all the variables, installing them one at a 799*9880d681SAndroid Build Coastguard Workertime. For each variable we put into the symbol table, we remember the 800*9880d681SAndroid Build Coastguard Workerprevious value that we replace in OldBindings. 801*9880d681SAndroid Build Coastguard Worker 802*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 803*9880d681SAndroid Build Coastguard Worker 804*9880d681SAndroid Build Coastguard Worker // Emit the initializer before adding the variable to scope, this prevents 805*9880d681SAndroid Build Coastguard Worker // the initializer from referencing the variable itself, and permits stuff 806*9880d681SAndroid Build Coastguard Worker // like this: 807*9880d681SAndroid Build Coastguard Worker // var a = 1 in 808*9880d681SAndroid Build Coastguard Worker // var a = a in ... # refers to outer 'a'. 809*9880d681SAndroid Build Coastguard Worker Value *InitVal; 810*9880d681SAndroid Build Coastguard Worker if (Init) { 811*9880d681SAndroid Build Coastguard Worker InitVal = Init->codegen(); 812*9880d681SAndroid Build Coastguard Worker if (!InitVal) 813*9880d681SAndroid Build Coastguard Worker return nullptr; 814*9880d681SAndroid Build Coastguard Worker } else { // If not specified, use 0.0. 815*9880d681SAndroid Build Coastguard Worker InitVal = ConstantFP::get(LLVMContext, APFloat(0.0)); 816*9880d681SAndroid Build Coastguard Worker } 817*9880d681SAndroid Build Coastguard Worker 818*9880d681SAndroid Build Coastguard Worker AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName); 819*9880d681SAndroid Build Coastguard Worker Builder.CreateStore(InitVal, Alloca); 820*9880d681SAndroid Build Coastguard Worker 821*9880d681SAndroid Build Coastguard Worker // Remember the old variable binding so that we can restore the binding when 822*9880d681SAndroid Build Coastguard Worker // we unrecurse. 823*9880d681SAndroid Build Coastguard Worker OldBindings.push_back(NamedValues[VarName]); 824*9880d681SAndroid Build Coastguard Worker 825*9880d681SAndroid Build Coastguard Worker // Remember this binding. 826*9880d681SAndroid Build Coastguard Worker NamedValues[VarName] = Alloca; 827*9880d681SAndroid Build Coastguard Worker } 828*9880d681SAndroid Build Coastguard Worker 829*9880d681SAndroid Build Coastguard WorkerThere are more comments here than code. The basic idea is that we emit 830*9880d681SAndroid Build Coastguard Workerthe initializer, create the alloca, then update the symbol table to 831*9880d681SAndroid Build Coastguard Workerpoint to it. Once all the variables are installed in the symbol table, 832*9880d681SAndroid Build Coastguard Workerwe evaluate the body of the var/in expression: 833*9880d681SAndroid Build Coastguard Worker 834*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 835*9880d681SAndroid Build Coastguard Worker 836*9880d681SAndroid Build Coastguard Worker // Codegen the body, now that all vars are in scope. 837*9880d681SAndroid Build Coastguard Worker Value *BodyVal = Body->codegen(); 838*9880d681SAndroid Build Coastguard Worker if (!BodyVal) 839*9880d681SAndroid Build Coastguard Worker return nullptr; 840*9880d681SAndroid Build Coastguard Worker 841*9880d681SAndroid Build Coastguard WorkerFinally, before returning, we restore the previous variable bindings: 842*9880d681SAndroid Build Coastguard Worker 843*9880d681SAndroid Build Coastguard Worker.. code-block:: c++ 844*9880d681SAndroid Build Coastguard Worker 845*9880d681SAndroid Build Coastguard Worker // Pop all our variables from scope. 846*9880d681SAndroid Build Coastguard Worker for (unsigned i = 0, e = VarNames.size(); i != e; ++i) 847*9880d681SAndroid Build Coastguard Worker NamedValues[VarNames[i].first] = OldBindings[i]; 848*9880d681SAndroid Build Coastguard Worker 849*9880d681SAndroid Build Coastguard Worker // Return the body computation. 850*9880d681SAndroid Build Coastguard Worker return BodyVal; 851*9880d681SAndroid Build Coastguard Worker } 852*9880d681SAndroid Build Coastguard Worker 853*9880d681SAndroid Build Coastguard WorkerThe end result of all of this is that we get properly scoped variable 854*9880d681SAndroid Build Coastguard Workerdefinitions, and we even (trivially) allow mutation of them :). 855*9880d681SAndroid Build Coastguard Worker 856*9880d681SAndroid Build Coastguard WorkerWith this, we completed what we set out to do. Our nice iterative fib 857*9880d681SAndroid Build Coastguard Workerexample from the intro compiles and runs just fine. The mem2reg pass 858*9880d681SAndroid Build Coastguard Workeroptimizes all of our stack variables into SSA registers, inserting PHI 859*9880d681SAndroid Build Coastguard Workernodes where needed, and our front-end remains simple: no "iterated 860*9880d681SAndroid Build Coastguard Workerdominance frontier" computation anywhere in sight. 861*9880d681SAndroid Build Coastguard Worker 862*9880d681SAndroid Build Coastguard WorkerFull Code Listing 863*9880d681SAndroid Build Coastguard Worker================= 864*9880d681SAndroid Build Coastguard Worker 865*9880d681SAndroid Build Coastguard WorkerHere is the complete code listing for our running example, enhanced with 866*9880d681SAndroid Build Coastguard Workermutable variables and var/in support. To build this example, use: 867*9880d681SAndroid Build Coastguard Worker 868*9880d681SAndroid Build Coastguard Worker.. code-block:: bash 869*9880d681SAndroid Build Coastguard Worker 870*9880d681SAndroid Build Coastguard Worker # Compile 871*9880d681SAndroid Build Coastguard Worker clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy 872*9880d681SAndroid Build Coastguard Worker # Run 873*9880d681SAndroid Build Coastguard Worker ./toy 874*9880d681SAndroid Build Coastguard Worker 875*9880d681SAndroid Build Coastguard WorkerHere is the code: 876*9880d681SAndroid Build Coastguard Worker 877*9880d681SAndroid Build Coastguard Worker.. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp 878*9880d681SAndroid Build Coastguard Worker :language: c++ 879*9880d681SAndroid Build Coastguard Worker 880*9880d681SAndroid Build Coastguard Worker`Next: Compiling to Object Code <LangImpl08.html>`_ 881*9880d681SAndroid Build Coastguard Worker 882