========================================
Kaleidoscope: Code generation to LLVM IR
========================================

.. contents::
   :local:

Chapter 3 Introduction
======================

Welcome to Chapter 3 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. This chapter shows you how to transform
the `Abstract Syntax Tree <LangImpl2.html>`_, built in Chapter 2, into
LLVM IR. This will teach you a little bit about how LLVM does things, as
well as demonstrate how easy it is to use. It's much more work to build
a lexer and parser than it is to generate LLVM IR code. :)

**Please note**: the code in this chapter and later requires LLVM 3.7 or
later. LLVM 3.6 and before will not work with it. Also note that you
need to use a version of this tutorial that matches your LLVM release:
If you are using an official LLVM release, use the version of the
documentation included with your release or on the `llvm.org releases
page <http://llvm.org/releases/>`_.

Code Generation Setup
=====================

In order to generate LLVM IR, we want some simple setup to get started.
First we define virtual code generation (codegen) methods in each AST
class:

.. code-block:: c++

    /// ExprAST - Base class for all expression nodes.
    class ExprAST {
    public:
      virtual ~ExprAST() {}
      virtual Value *codegen() = 0;
    };

    /// NumberExprAST - Expression class for numeric literals like "1.0".
    class NumberExprAST : public ExprAST {
      double Val;

    public:
      NumberExprAST(double Val) : Val(Val) {}
      virtual Value *codegen();
    };
    ...

The codegen() method says to emit IR for that AST node along with all
the things it depends on, and they all return an LLVM Value object.
"Value" is the class used to represent a "`Static Single Assignment
(SSA) <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
register" or "SSA value" in LLVM. The most distinctive aspect of SSA values
is that their value is computed as the related instruction executes, and
it does not get a new value until (and if) the instruction re-executes.
In other words, there is no way to "change" an SSA value. For more
information, please read up on `Static Single
Assignment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
- the concepts are really quite natural once you grok them.

Note that instead of adding virtual methods to the ExprAST class
hierarchy, it could also make sense to use a `visitor
pattern <http://en.wikipedia.org/wiki/Visitor_pattern>`_ or some other
way to model this. Again, this tutorial won't dwell on good software
engineering practices: for our purposes, adding a virtual method is
simplest.
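
As a rough illustration of the visitor alternative, here is a minimal,
self-contained sketch. It deliberately does not use LLVM: the ``ExprVisitor``
interface, the ``accept`` method, the accessor methods and the
``PrinterVisitor`` class are hypothetical names invented for this example, and
the visitor renders a string rather than emitting IR.

.. code-block:: c++

    #include <memory>
    #include <sstream>
    #include <string>
    #include <utility>

    class NumberExprAST;
    class BinaryExprAST;

    // Hypothetical visitor interface: one overload per concrete node type.
    class ExprVisitor {
    public:
      virtual ~ExprVisitor() {}
      virtual void visit(const NumberExprAST &E) = 0;
      virtual void visit(const BinaryExprAST &E) = 0;
    };

    class ExprAST {
    public:
      virtual ~ExprAST() {}
      // Each node only knows how to dispatch itself to a visitor.
      virtual void accept(ExprVisitor &V) const = 0;
    };

    class NumberExprAST : public ExprAST {
      double Val;

    public:
      NumberExprAST(double Val) : Val(Val) {}
      double getVal() const { return Val; }
      void accept(ExprVisitor &V) const override { V.visit(*this); }
    };

    class BinaryExprAST : public ExprAST {
      char Op;
      std::unique_ptr<ExprAST> LHS, RHS;

    public:
      BinaryExprAST(char Op, std::unique_ptr<ExprAST> LHS,
                    std::unique_ptr<ExprAST> RHS)
          : Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
      char getOp() const { return Op; }
      const ExprAST &getLHS() const { return *LHS; }
      const ExprAST &getRHS() const { return *RHS; }
      void accept(ExprVisitor &V) const override { V.visit(*this); }
    };

    // One possible visitor: renders the tree as text. A code generator would
    // be another ExprVisitor subclass that lives outside the AST classes.
    class PrinterVisitor : public ExprVisitor {
      std::ostringstream OS;

    public:
      void visit(const NumberExprAST &E) override { OS << E.getVal(); }
      void visit(const BinaryExprAST &E) override {
        OS << '(';
        E.getLHS().accept(*this);
        OS << ' ' << E.getOp() << ' ';
        E.getRHS().accept(*this);
        OS << ')';
      }
      std::string str() const { return OS.str(); }
    };

    // Helper for this example: print any expression tree.
    std::string printExpr(const ExprAST &E) {
      PrinterVisitor P;
      E.accept(P);
      return P.str();
    }

The trade-off is the usual one: the AST classes stay free of codegen details,
at the cost of more boilerplate than a single virtual ``codegen()`` method.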

The second thing we want is a "LogError" method like we used for the
parser, which will be used to report errors found during code generation
(for example, use of an undeclared parameter):

.. code-block:: c++

    static LLVMContext TheContext;
    static IRBuilder<> Builder(TheContext);
    static std::unique_ptr<Module> TheModule;
    static std::map<std::string, Value *> NamedValues;

    Value *LogErrorV(const char *Str) {
      LogError(Str);
      return nullptr;
    }

The static variables will be used during code generation. ``TheContext``
is an opaque object that owns a lot of core LLVM data structures, such as
the type and constant value tables. We don't need to understand it in
detail; we just need a single instance to pass into APIs that require it.

The ``Builder`` object is a helper object that makes it easy to generate
LLVM instructions.
Instances of the
`IRBuilder <http://llvm.org/doxygen/IRBuilder_8h-source.html>`_
class template keep track of the current place to insert instructions
and have methods to create new instructions.

``TheModule`` is an LLVM construct that contains functions and global
variables. In many ways, it is the top-level structure that the LLVM IR
uses to contain code. It will own the memory for all of the IR that we
generate, which is why the codegen() method returns a raw Value\*,
rather than a unique_ptr<Value>.

The ``NamedValues`` map keeps track of which values are defined in the
current scope and what their LLVM representation is. (In other words, it
is a symbol table for the code). In this form of Kaleidoscope, the only
things that can be referenced are function parameters. As such, function
parameters will be in this map when generating code for their function
body.

With these basics in place, we can start talking about how to generate
code for each expression. Note that this assumes that the ``Builder``
has been set up to generate code *into* something.
For now, we'll assume that this has already been done, and we'll just use
it to emit code.

Expression Code Generation
==========================

Generating LLVM code for expression nodes is very straightforward: fewer
than 45 lines of commented code for all four of our expression nodes.
First we'll do numeric literals:

.. code-block:: c++

    Value *NumberExprAST::codegen() {
      return ConstantFP::get(TheContext, APFloat(Val));
    }

In the LLVM IR, numeric constants are represented with the
``ConstantFP`` class, which holds the numeric value in an ``APFloat``
internally (``APFloat`` has the capability of holding floating point
constants of Arbitrary Precision). This code basically just creates
and returns a ``ConstantFP``. Note that in the LLVM IR, constants
are all uniqued together and shared. For this reason, the API uses the
"foo::get(...)" idiom instead of "new foo(..)" or "foo::Create(..)".

.. code-block:: c++

    Value *VariableExprAST::codegen() {
      // Look this variable up in the function.
      Value *V = NamedValues[Name];
      if (!V)
        LogErrorV("Unknown variable name");
      return V;
    }

References to variables are also quite simple using LLVM. In the simple
version of Kaleidoscope, we assume that the variable has already been
emitted somewhere and its value is available. In practice, the only
values that can be in the ``NamedValues`` map are function arguments.
This code simply checks to see that the specified name is in the map (if
not, an unknown variable is being referenced) and returns the value for
it. In future chapters, we'll add support for `loop induction
variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and for `local
variables <LangImpl7.html#user-defined-local-variables>`_.

.. code-block:: c++

    Value *BinaryExprAST::codegen() {
      Value *L = LHS->codegen();
      Value *R = RHS->codegen();
      if (!L || !R)
        return nullptr;

      switch (Op) {
      case '+':
        return Builder.CreateFAdd(L, R, "addtmp");
      case '-':
        return Builder.CreateFSub(L, R, "subtmp");
      case '*':
        return Builder.CreateFMul(L, R, "multmp");
      case '<':
        L = Builder.CreateFCmpULT(L, R, "cmptmp");
        // Convert bool 0/1 to double 0.0 or 1.0
        return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
                                    "booltmp");
      default:
        return LogErrorV("invalid binary operator");
      }
    }

Binary operators start to get more interesting.
The basic idea here is
that we recursively emit code for the left-hand side of the expression,
then the right-hand side, then we compute the result of the binary
expression. In this code, we do a simple switch on the opcode to create
the right LLVM instruction.

In the example above, the LLVM builder class is starting to show its
value. IRBuilder knows where to insert the newly created instruction;
all you have to do is specify what instruction to create (e.g. with
``CreateFAdd``), which operands to use (``L`` and ``R`` here) and
optionally provide a name for the generated instruction.

One nice thing about LLVM is that the name is just a hint. For instance,
if the code above emits multiple "addtmp" variables, LLVM will
automatically provide each one with an increasing, unique numeric
suffix. Local value names for instructions are purely optional, but they
make it much easier to read the IR dumps.
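
To make the "name is just a hint" behaviour concrete, here is a toy model of
name uniquing in plain C++. It is not LLVM's implementation and uses no LLVM
APIs: the ``NameTable`` class and its ``makeUnique`` method are hypothetical,
and the only assumption carried over from the text is that a clashing hint
receives a numeric suffix that keeps increasing.

.. code-block:: c++

    #include <set>
    #include <string>

    // Toy symbol table (hypothetical): hands out unique names from hints.
    class NameTable {
      std::set<std::string> Used; // every name handed out so far
      unsigned LastUnique = 0;    // last numeric suffix that was tried

    public:
      std::string makeUnique(const std::string &Hint) {
        // First use of a hint: keep it unchanged.
        if (Used.insert(Hint).second)
          return Hint;
        // Otherwise append increasing suffixes until one is free.
        for (;;) {
          std::string Candidate = Hint + std::to_string(++LastUnique);
          if (Used.insert(Candidate).second)
            return Candidate;
        }
      }
    };

With this toy, requesting ``addtmp`` three times yields ``addtmp``,
``addtmp1`` and ``addtmp2``. The exact suffixes real LLVM picks may differ;
the point is only that callers never have to invent unique names themselves.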

`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
rules: for example, the Left and Right operands of an `add
instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
result type of the add must match the operand types. Because all values
in Kaleidoscope are doubles, this makes for very simple code for add,
sub and mul.

On the other hand, LLVM specifies that the `fcmp
instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
one bit integer). The problem with this is that Kaleidoscope wants the
value to be a 0.0 or 1.0 value. In order to get these semantics, we
combine the fcmp instruction with a `uitofp
instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
input integer into a floating point value by treating the input as an
unsigned value. In contrast, if we used the `sitofp
instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
would return 0.0 and -1.0, depending on the input value.

.. code-block:: c++

    Value *CallExprAST::codegen() {
      // Look up the name in the global module table.
      Function *CalleeF = TheModule->getFunction(Callee);
      if (!CalleeF)
        return LogErrorV("Unknown function referenced");

      // If argument mismatch error.
      if (CalleeF->arg_size() != Args.size())
        return LogErrorV("Incorrect # arguments passed");

      std::vector<Value *> ArgsV;
      for (unsigned i = 0, e = Args.size(); i != e; ++i) {
        ArgsV.push_back(Args[i]->codegen());
        if (!ArgsV.back())
          return nullptr;
      }

      return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
    }

Code generation for function calls is quite straightforward with LLVM. The code
above initially does a function name lookup in the LLVM Module's symbol table.
Recall that the LLVM Module is the container that holds the functions we are
JIT'ing.
By giving each function the same name as what the user specifies, we
can use the LLVM symbol table to resolve function names for us.

Once we have the function to call, we recursively codegen each argument
that is to be passed in, and create an LLVM `call
instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
calling conventions by default, allowing these calls to also call into
standard library functions like "sin" and "cos", with no additional
effort.

This wraps up our handling of the four basic expressions that we have so
far in Kaleidoscope. Feel free to go in and add some more. For example,
by browsing the `LLVM language reference <../LangRef.html>`_ you'll find
several other interesting instructions that are really easy to plug into
our basic framework.

Function Code Generation
========================

Code generation for prototypes and functions must handle a number of
details, which makes their code less beautiful than expression code
generation, but allows us to illustrate some important points.
First,
let's talk about code generation for prototypes: they are used both for
function bodies and external function declarations. The code starts
with:

.. code-block:: c++

    Function *PrototypeAST::codegen() {
      // Make the function type:  double(double,double) etc.
      std::vector<Type*> Doubles(Args.size(),
                                 Type::getDoubleTy(TheContext));
      FunctionType *FT =
        FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);

      Function *F =
        Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());

This code packs a lot of power into a few lines. Note first that this
function returns a "Function\*" instead of a "Value\*". Because a
"prototype" really talks about the external interface for a function
(not the value computed by an expression), it makes sense for it to
return the LLVM Function it corresponds to when codegen'd.

The call to ``FunctionType::get`` creates the ``FunctionType`` that
should be used for a given Prototype. Since all function arguments in
Kaleidoscope are of type double, the first line creates a vector of "N"
LLVM double types. It then uses the ``FunctionType::get`` method to
create a function type that takes "N" doubles as arguments, returns one
double as a result, and that is not vararg (the false parameter
indicates this). Note that Types in LLVM are uniqued just like Constants
are, so you don't "new" a type, you "get" it.

The final line above actually creates the IR Function corresponding to
the Prototype. This indicates the type, linkage and name to use, as
well as which module to insert into. "`external
linkage <../LangRef.html#linkage>`_" means that the function may be
defined outside the current module and/or that it is callable by
functions outside the module. The Name passed in is the name the user
specified: since "``TheModule``" is specified, this name is registered
in "``TheModule``"'s symbol table.

.. code-block:: c++

      // Set names for all arguments.
      unsigned Idx = 0;
      for (auto &Arg : F->args())
        Arg.setName(Args[Idx++]);

      return F;
    }

Finally, we set the name of each of the function's arguments according to the
names given in the Prototype. This step isn't strictly necessary, but keeping
the names consistent makes the IR more readable, and allows subsequent code to
refer directly to the arguments for their names, rather than having to look
them up in the Prototype AST.

At this point we have a function prototype with no body. This is how LLVM IR
represents function declarations. For extern statements in Kaleidoscope, this
is as far as we need to go. For function definitions however, we need to
codegen and attach a function body.

.. code-block:: c++

    Function *FunctionAST::codegen() {
      // First, check for an existing function from a previous 'extern' declaration.
      Function *TheFunction = TheModule->getFunction(Proto->getName());

      if (!TheFunction)
        TheFunction = Proto->codegen();

      if (!TheFunction)
        return nullptr;

      if (!TheFunction->empty())
        return (Function*)LogErrorV("Function cannot be redefined.");


For function definitions, we start by searching TheModule's symbol table for an
existing version of this function, in case one has already been created using an
'extern' statement. If Module::getFunction returns null then no previous version
exists, so we'll codegen one from the Prototype. In either case, we want to
assert that the function is empty (i.e. has no body yet) before we start.

.. code-block:: c++

      // Create a new basic block to start insertion into.
      BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
      Builder.SetInsertPoint(BB);

      // Record the function arguments in the NamedValues map.
      NamedValues.clear();
      for (auto &Arg : TheFunction->args())
        NamedValues[Arg.getName()] = &Arg;

Now we get to the point where the ``Builder`` is set up. The first line
creates a new `basic block <http://en.wikipedia.org/wiki/Basic_block>`_
(named "entry"), which is inserted into ``TheFunction``. The second line
then tells the builder that new instructions should be inserted into the
end of the new basic block. Basic blocks in LLVM are an important part
of functions that define the `Control Flow
Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
don't have any control flow, our functions will only contain one block
at this point. We'll fix this in `Chapter 5 <LangImpl5.html>`_ :).

Next we add the function arguments to the NamedValues map (after first clearing
it out) so that they're accessible to ``VariableExprAST`` nodes.

.. code-block:: c++

      if (Value *RetVal = Body->codegen()) {
        // Finish off the function.
        Builder.CreateRet(RetVal);

        // Validate the generated code, checking for consistency.
        verifyFunction(*TheFunction);

        return TheFunction;
      }

Once the insertion point has been set up and the NamedValues map populated,
we call the ``codegen()`` method for the root expression of the function. If no
error happens, this emits code to compute the expression into the entry block
and returns the value that was computed. Assuming no error, we then create an
LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the function.
Once the function is built, we call ``verifyFunction``, which is
provided by LLVM. This function does a variety of consistency checks on
the generated code, to determine if our compiler is doing everything
right. Using this is important: it can catch a lot of bugs. Once the
function is finished and validated, we return it.

.. code-block:: c++

      // Error reading body, remove function.
396*9880d681SAndroid Build Coastguard Worker TheFunction->eraseFromParent(); 397*9880d681SAndroid Build Coastguard Worker return nullptr; 398*9880d681SAndroid Build Coastguard Worker } 399*9880d681SAndroid Build Coastguard Worker 400*9880d681SAndroid Build Coastguard WorkerThe only piece left here is handling of the error case. For simplicity, 401*9880d681SAndroid Build Coastguard Workerwe handle this by merely deleting the function we produced with the 402*9880d681SAndroid Build Coastguard Worker``eraseFromParent`` method. This allows the user to redefine a function 403*9880d681SAndroid Build Coastguard Workerthat they incorrectly typed in before: if we didn't delete it, it would 404*9880d681SAndroid Build Coastguard Workerlive in the symbol table, with a body, preventing future redefinition. 405*9880d681SAndroid Build Coastguard Worker 406*9880d681SAndroid Build Coastguard WorkerThis code does have a bug, though: If the ``FunctionAST::codegen()`` method 407*9880d681SAndroid Build Coastguard Workerfinds an existing IR Function, it does not validate its signature against the 408*9880d681SAndroid Build Coastguard Workerdefinition's own prototype. This means that an earlier 'extern' declaration will 409*9880d681SAndroid Build Coastguard Workertake precedence over the function definition's signature, which can cause 410*9880d681SAndroid Build Coastguard Workercodegen to fail, for instance if the function arguments are named differently. 411*9880d681SAndroid Build Coastguard WorkerThere are a number of ways to fix this bug, see what you can come up with! Here 412*9880d681SAndroid Build Coastguard Workeris a testcase: 413*9880d681SAndroid Build Coastguard Worker 414*9880d681SAndroid Build Coastguard Worker:: 415*9880d681SAndroid Build Coastguard Worker 416*9880d681SAndroid Build Coastguard Worker extern foo(a); # ok, defines foo. 417*9880d681SAndroid Build Coastguard Worker def foo(b) b; # Error: Unknown variable name. (decl using 'a' takes precedence). 
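
One possible direction for a fix, sketched here with plain standard-library
types rather than LLVM classes: before generating a body, compare the existing
declaration with the definition's prototype, reject an arity mismatch, and
otherwise adopt the definition's argument names. ``DeclaredFunction``,
``Prototype``, ``reconcileWithDefinition`` and ``replayTestcase`` are
hypothetical names invented for this illustration; they are not part of the
tutorial code or of LLVM.

.. code-block:: c++

    #include <cassert>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for an IR function that already exists in the
    // module, for example one created by an earlier 'extern'.
    struct DeclaredFunction {
      std::string Name;
      std::vector<std::string> ArgNames;
      bool HasBody;
    };

    // Hypothetical stand-in for the prototype attached to a 'def'.
    struct Prototype {
      std::string Name;
      std::vector<std::string> ArgNames;
    };

    // Returns false if the definition has to be rejected: the function already
    // has a body, or the argument counts differ. Otherwise the declaration
    // takes over the definition's argument names, so that the body can look
    // them up.
    bool reconcileWithDefinition(DeclaredFunction &F, const Prototype &P) {
      if (F.HasBody)
        return false; // redefinition
      if (F.ArgNames.size() != P.ArgNames.size())
        return false; // signature mismatch
      F.ArgNames = P.ArgNames;
      return true;
    }

    // Replays the testcase above and returns the argument name that the body
    // of foo would see after reconciliation.
    std::string replayTestcase() {
      DeclaredFunction Foo{"foo", {"a"}, false}; // extern foo(a);
      Prototype Def{"foo", {"b"}};               // def foo(b) b;
      bool Ok = reconcileWithDefinition(Foo, Def);
      assert(Ok && "same arity, so the definition is accepted");
      return Foo.ArgNames[0];
    }

In this model the testcase no longer fails, because the body of ``foo`` is
looked up against ``b`` instead of ``a``. Whether renaming or some other
strategy fits best in the real code generator is left open, as the text says.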

Driver Changes and Closing Thoughts
===================================

For now, code generation to LLVM doesn't really get us much, except that
we can look at the pretty IR calls. The sample code inserts calls to
codegen into the "``HandleDefinition``", "``HandleExtern``" etc.
functions, and then dumps out the LLVM IR. This gives a nice way to look
at the LLVM IR for simple functions. For example:

::

    ready> 4+5;
    Read top-level expression:
    define double @0() {
    entry:
      ret double 9.000000e+00
    }

Note how the parser turns the top-level expression into an anonymous
function for us. This will be handy when we add `JIT
support <LangImpl4.html#adding-a-jit-compiler>`_ in the next chapter. Also note that the
code is very literally transcribed: no optimizations are being performed
except simple constant folding done by IRBuilder. We will `add
optimizations <LangImpl4.html#trivial-constant-folding>`_ explicitly in the next
chapter.

::

    ready> def foo(a b) a*a + 2*a*b + b*b;
    Read function definition:
    define double @foo(double %a, double %b) {
    entry:
      %multmp = fmul double %a, %a
      %multmp1 = fmul double 2.000000e+00, %a
      %multmp2 = fmul double %multmp1, %b
      %addtmp = fadd double %multmp, %multmp2
      %multmp3 = fmul double %b, %b
      %addtmp4 = fadd double %addtmp, %multmp3
      ret double %addtmp4
    }

This shows some simple arithmetic. Notice the striking similarity to the
LLVM builder calls that we use to create the instructions.
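
To make that similarity concrete, here is a toy model that uses no LLVM at all.
It walks a small expression tree for ``a*a + 2*a*b + b*b`` in post-order and
prints one line per operator, in the same order in which the builder calls
would be made. ``Expr``, ``Emitter``, ``leaf``, ``bin`` and ``emitFooBody``
are hypothetical names made up for this sketch, and the renaming scheme in
``freshName`` is an assumption chosen only to reproduce the temporary names
in the listing above.

.. code-block:: c++

    #include <cassert>
    #include <memory>
    #include <set>
    #include <sstream>
    #include <string>
    #include <utility>

    // Toy expression tree: a leaf holds operand text, an inner node '+' or '*'.
    struct Expr {
      char Op;                        // '+' or '*'; 0 marks a leaf
      std::string Leaf;               // e.g. "%a" or "2.000000e+00"
      std::shared_ptr<Expr> LHS, RHS; // null for leaves
    };
    using ExprPtr = std::shared_ptr<Expr>;

    ExprPtr leaf(const std::string &Text) {
      return std::make_shared<Expr>(Expr{0, Text, nullptr, nullptr});
    }
    ExprPtr bin(char Op, ExprPtr L, ExprPtr R) {
      return std::make_shared<Expr>(Expr{Op, "", std::move(L), std::move(R)});
    }

    struct Emitter {
      std::ostringstream Out;
      std::set<std::string> Used;
      unsigned LastUnique = 0; // shared counter, bumped on every name clash

      // Assumed scheme: keep the base name if it is free, otherwise append
      // the next value of the shared counter.
      std::string freshName(const std::string &Base) {
        if (Used.insert(Base).second)
          return Base;
        std::string Name;
        do {
          Name = Base + std::to_string(++LastUnique);
        } while (!Used.insert(Name).second);
        return Name;
      }

      // Post-order walk: emit both operands first, then the operator itself.
      std::string emit(const Expr &E) {
        if (E.Op == 0)
          return E.Leaf;
        assert(E.LHS && E.RHS && "binary node needs two operands");
        std::string L = emit(*E.LHS);
        std::string R = emit(*E.RHS);
        bool IsMul = E.Op == '*';
        std::string Name = freshName(IsMul ? "multmp" : "addtmp");
        Out << "  %" << Name << " = " << (IsMul ? "fmul" : "fadd")
            << " double " << L << ", " << R << "\n";
        return "%" + Name;
      }
    };

    // Builds ((a*a) + ((2*a)*b)) + (b*b) and returns the emitted text.
    std::string emitFooBody() {
      ExprPtr A = leaf("%a"), B = leaf("%b"), Two = leaf("2.000000e+00");
      ExprPtr Sum =
          bin('+', bin('+', bin('*', A, A), bin('*', bin('*', Two, A), B)),
              bin('*', B, B));
      Emitter E;
      E.emit(*Sum);
      return E.Out.str();
    }

Calling ``emitFooBody()`` yields the six arithmetic lines of ``@foo`` shown
above, without the ``define`` header and the ``ret``. Each line corresponds to
one visit of a binary node, which is why the IR reads like a transcript of the
builder calls.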

::

    ready> def bar(a) foo(a, 4.0) + bar(31337);
    Read function definition:
    define double @bar(double %a) {
    entry:
      %calltmp = call double @foo(double %a, double 4.000000e+00)
      %calltmp1 = call double @bar(double 3.133700e+04)
      %addtmp = fadd double %calltmp, %calltmp1
      ret double %addtmp
    }

This shows some function calls. Note that this function will take a long
time to execute if you call it. In the future we'll add conditional
control flow to actually make recursion useful :).

::

    ready> extern cos(x);
    Read extern:
    declare double @cos(double)

    ready> cos(1.234);
    Read top-level expression:
    define double @1() {
    entry:
      %calltmp = call double @cos(double 1.234000e+00)
      ret double %calltmp
    }

This shows an extern for the libm "cos" function, and a call to it.

.. TODO:: Abandon Pygments' horrible `llvm` lexer. It just totally gives up
   on highlighting this due to the first line.

::

    ready> ^D
    ; ModuleID = 'my cool jit'

    define double @0() {
    entry:
      %addtmp = fadd double 4.000000e+00, 5.000000e+00
      ret double %addtmp
    }

    define double @foo(double %a, double %b) {
    entry:
      %multmp = fmul double %a, %a
      %multmp1 = fmul double 2.000000e+00, %a
      %multmp2 = fmul double %multmp1, %b
      %addtmp = fadd double %multmp, %multmp2
      %multmp3 = fmul double %b, %b
      %addtmp4 = fadd double %addtmp, %multmp3
      ret double %addtmp4
    }

    define double @bar(double %a) {
    entry:
      %calltmp = call double @foo(double %a, double 4.000000e+00)
      %calltmp1 = call double @bar(double 3.133700e+04)
      %addtmp = fadd double %calltmp, %calltmp1
      ret double %addtmp
    }

    declare double @cos(double)

    define double @1() {
    entry:
      %calltmp = call double @cos(double 1.234000e+00)
      ret double %calltmp
    }

When you quit the current demo, it dumps out the IR for the entire
module generated. Here you can see the big picture with all the
functions referencing each other.

This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
we'll describe how to `add JIT codegen and optimizer
support <LangImpl4.html>`_ to this so we can actually start running
code!

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
the LLVM code generator. Because this uses the LLVM libraries, we need
to link them in. To do this, we use the
`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
our makefile/command line about which options to use:

.. code-block:: bash

    # Compile
    clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/Chapter3/toy.cpp
   :language: c++

`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_