========================================
Kaleidoscope: Code generation to LLVM IR
========================================

.. contents::
   :local:

Chapter 3 Introduction
======================

Welcome to Chapter 3 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. This chapter shows you how to transform
the `Abstract Syntax Tree <LangImpl2.html>`_, built in Chapter 2, into
LLVM IR. This will teach you a little bit about how LLVM does things, as
well as demonstrate how easy it is to use. It's much more work to build
a lexer and parser than it is to generate LLVM IR code. :)

**Please note**: the code in this chapter and later requires LLVM 3.7 or
later. LLVM 3.6 and before will not work with it. Also note that you
need to use a version of this tutorial that matches your LLVM release:
if you are using an official LLVM release, use the version of the
documentation included with your release or on the `llvm.org releases
page <http://llvm.org/releases/>`_.

Code Generation Setup
=====================

In order to generate LLVM IR, we want some simple setup to get started.
First we define virtual code generation (codegen) methods in each AST
class:

.. code-block:: c++

    /// ExprAST - Base class for all expression nodes.
    class ExprAST {
    public:
      virtual ~ExprAST() {}
      virtual Value *codegen() = 0;
    };

    /// NumberExprAST - Expression class for numeric literals like "1.0".
    class NumberExprAST : public ExprAST {
      double Val;

    public:
      NumberExprAST(double Val) : Val(Val) {}
      virtual Value *codegen();
    };
    ...

The codegen() method says to emit IR for that AST node along with all
the things it depends on, and they all return an LLVM Value object.
"Value" is the class used to represent a "`Static Single Assignment
(SSA) <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
register" or "SSA value" in LLVM. The most distinctive aspect of SSA values
is that their value is computed as the related instruction executes, and
it does not get a new value until (and if) the instruction re-executes.
In other words, there is no way to "change" an SSA value. For more
information, please read up on `Static Single
Assignment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_;
the concepts are really quite natural once you grok them.

Note that instead of adding virtual methods to the ExprAST class
hierarchy, it could also make sense to use a `visitor
pattern <http://en.wikipedia.org/wiki/Visitor_pattern>`_ or some other
way to model this. Again, this tutorial won't dwell on good software
engineering practices: for our purposes, adding a virtual method is
simplest.
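
As a rough illustration of the visitor alternative, the sketch below uses
simplified stand-in classes. ``Expr``, ``Number``, ``Add``, ``ExprVisitor``
and ``Evaluator`` are hypothetical names, not part of the tutorial code or of
LLVM. It evaluates the tree instead of emitting IR, only to keep the example
self-contained; a code generator would simply be another visitor:

.. code-block:: c++

    #include <cassert>
    #include <memory>
    #include <utility>

    struct Number;
    struct Add;

    // One visit() overload per node type.
    struct ExprVisitor {
      virtual ~ExprVisitor() = default;
      virtual void visit(const Number &N) = 0;
      virtual void visit(const Add &A) = 0;
    };

    struct Expr {
      virtual ~Expr() = default;
      virtual void accept(ExprVisitor &V) const = 0;
    };

    struct Number : Expr {
      double Val;
      explicit Number(double Val) : Val(Val) {}
      void accept(ExprVisitor &V) const override { V.visit(*this); }
    };

    struct Add : Expr {
      std::unique_ptr<Expr> LHS, RHS;
      Add(std::unique_ptr<Expr> L, std::unique_ptr<Expr> R)
          : LHS(std::move(L)), RHS(std::move(R)) {
        assert(LHS && RHS && "Add needs two operands");
      }
      void accept(ExprVisitor &V) const override { V.visit(*this); }
    };

    // The traversal logic lives outside the AST classes.
    struct Evaluator : ExprVisitor {
      double Result = 0;
      void visit(const Number &N) override { Result = N.Val; }
      void visit(const Add &A) override {
        A.LHS->accept(*this);
        double L = Result;
        A.RHS->accept(*this);
        Result = L + Result;
      }
    };

    double evaluate(const Expr &E) {
      Evaluator V;
      E.accept(V);
      return V.Result;
    }

The trade-off is the usual one: a visitor keeps the AST classes free of
backend details, at the cost of more boilerplate than a single virtual method.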

The second thing we want is a "LogError" method like we used for the
parser, which will be used to report errors found during code generation
(for example, use of an undeclared parameter):

.. code-block:: c++

    static LLVMContext TheContext;
    static IRBuilder<> Builder(TheContext);
    static std::unique_ptr<Module> TheModule;
    static std::map<std::string, Value *> NamedValues;

    Value *LogErrorV(const char *Str) {
      LogError(Str);
      return nullptr;
    }

The static variables will be used during code generation. ``TheContext``
is an opaque object that owns a lot of core LLVM data structures, such as
the type and constant value tables. We don't need to understand it in
detail; we just need a single instance to pass into APIs that require it.

The ``Builder`` object is a helper object that makes it easy to generate
LLVM instructions. Instances of the
`IRBuilder <http://llvm.org/doxygen/IRBuilder_8h-source.html>`_
class template keep track of the current place to insert instructions
and have methods to create new instructions.

``TheModule`` is an LLVM construct that contains functions and global
variables. In many ways, it is the top-level structure that the LLVM IR
uses to contain code. It will own the memory for all of the IR that we
generate, which is why the codegen() method returns a raw Value\*,
rather than a unique_ptr<Value>.

The ``NamedValues`` map keeps track of which values are defined in the
current scope and what their LLVM representation is. (In other words, it
is a symbol table for the code.) In this form of Kaleidoscope, the only
things that can be referenced are function parameters. As such, function
parameters will be in this map when generating code for their function
body.

With these basics in place, we can start talking about how to generate
code for each expression. Note that this assumes that the ``Builder``
has been set up to generate code *into* something. For now, we'll assume
that this has already been done, and we'll just use it to emit code.

Expression Code Generation
==========================

Generating LLVM code for expression nodes is very straightforward: fewer
than 45 lines of commented code for all four of our expression nodes.
First we'll do numeric literals:

.. code-block:: c++

    Value *NumberExprAST::codegen() {
      return ConstantFP::get(TheContext, APFloat(Val));
    }

In the LLVM IR, numeric constants are represented with the
``ConstantFP`` class, which holds the numeric value in an ``APFloat``
internally (``APFloat`` has the capability of holding floating point
constants of Arbitrary Precision). This code basically just creates
and returns a ``ConstantFP``. Note that in the LLVM IR constants
are all uniqued together and shared. For this reason, the API uses the
"foo::get(...)" idiom instead of "new foo(..)" or "foo::Create(..)".
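
To make the uniquing idea concrete, here is a minimal sketch of a ``get``
factory that hands out one shared object per distinct value. ``FPConstant``
is a hypothetical stand-in, not LLVM's ``ConstantFP``, and the pool here is a
function-local static, whereas LLVM keeps its constant tables in the context
object:

.. code-block:: c++

    #include <cassert>
    #include <map>
    #include <memory>

    // Hypothetical uniqued constant; only get() may construct one.
    class FPConstant {
      double Val;
      explicit FPConstant(double Val) : Val(Val) {}

    public:
      double value() const { return Val; }

      // Return the single shared object for Val, creating it on first request.
      static const FPConstant *get(double Val) {
        assert(Val == Val && "NaN keys are not supported in this sketch");
        static std::map<double, std::unique_ptr<FPConstant>> Pool;
        std::unique_ptr<FPConstant> &Slot = Pool[Val];
        if (!Slot)
          Slot.reset(new FPConstant(Val));
        return Slot.get();
      }
    };

Because equal values always map to the same object, callers can compare
constants by pointer, and nobody has to decide who deletes them.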

.. code-block:: c++

    Value *VariableExprAST::codegen() {
      // Look this variable up in the function.
      Value *V = NamedValues[Name];
      if (!V)
        LogErrorV("Unknown variable name");
      return V;
    }

References to variables are also quite simple using LLVM. In the simple
version of Kaleidoscope, we assume that the variable has already been
emitted somewhere and its value is available. In practice, the only
values that can be in the ``NamedValues`` map are function arguments.
This code simply checks to see that the specified name is in the map (if
not, an unknown variable is being referenced) and returns the value for
it. In future chapters, we'll add support for `loop induction
variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and
for `local variables <LangImpl7.html#user-defined-local-variables>`_.
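
One detail worth knowing: ``NamedValues[Name]`` works for the null check
because ``std::map::operator[]`` value-initializes a missing entry to a null
pointer, but it also inserts that entry into the map. The sketch below shows
a lookup that leaves the table untouched. ``Value`` here is an empty
placeholder struct, and ``lookup`` and ``bind`` are hypothetical helpers, not
tutorial or LLVM functions:

.. code-block:: c++

    #include <cassert>
    #include <map>
    #include <string>

    struct Value {};  // placeholder for llvm::Value in this sketch

    using SymbolTable = std::map<std::string, Value *>;

    // Record a binding; a null value would be indistinguishable from "unknown".
    void bind(SymbolTable &Table, const std::string &Name, Value *V) {
      assert(V && "cannot bind a name to a null value");
      Table[Name] = V;
    }

    // Look up Name without modifying the table; returns nullptr when absent.
    Value *lookup(const SymbolTable &Table, const std::string &Name) {
      auto It = Table.find(Name);
      return It == Table.end() ? nullptr : It->second;
    }

For the tutorial's purposes the difference is harmless, since the map is
cleared before each function body is generated.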

.. code-block:: c++

    Value *BinaryExprAST::codegen() {
      Value *L = LHS->codegen();
      Value *R = RHS->codegen();
      if (!L || !R)
        return nullptr;

      switch (Op) {
      case '+':
        return Builder.CreateFAdd(L, R, "addtmp");
      case '-':
        return Builder.CreateFSub(L, R, "subtmp");
      case '*':
        return Builder.CreateFMul(L, R, "multmp");
      case '<':
        L = Builder.CreateFCmpULT(L, R, "cmptmp");
        // Convert bool 0/1 to double 0.0 or 1.0
        return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
                                    "booltmp");
      default:
        return LogErrorV("invalid binary operator");
      }
    }

Binary operators start to get more interesting. The basic idea here is
that we recursively emit code for the left-hand side of the expression,
then the right-hand side, then we compute the result of the binary
expression. In this code, we do a simple switch on the opcode to create
the right LLVM instruction.
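
The recursion pattern (operands first, then the operation, returning a handle
to the result) can be sketched without LLVM at all. The example below emits
textual three-address lines instead of IR. ``Node``, ``leaf``, ``binary`` and
``emit`` are hypothetical names used only for this illustration:

.. code-block:: c++

    #include <cassert>
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>

    // Mini-AST: a leaf holds operand text, an interior node holds an operator.
    struct Node {
      char Op = '\0';    // '\0' marks a leaf
      std::string Leaf;  // operand text when Op == '\0'
      std::unique_ptr<Node> LHS, RHS;
    };

    std::unique_ptr<Node> leaf(std::string Text) {
      std::unique_ptr<Node> N(new Node());
      N->Leaf = std::move(Text);
      return N;
    }

    std::unique_ptr<Node> binary(char Op, std::unique_ptr<Node> L,
                                 std::unique_ptr<Node> R) {
      assert(Op != '\0' && L && R && "binary node needs an operator and operands");
      std::unique_ptr<Node> N(new Node());
      N->Op = Op;
      N->LHS = std::move(L);
      N->RHS = std::move(R);
      return N;
    }

    // Emit both operands, then the operation; return the name of the result.
    std::string emit(const Node &N, std::vector<std::string> &Out) {
      if (N.Op == '\0')
        return N.Leaf;
      std::string L = emit(*N.LHS, Out);
      std::string R = emit(*N.RHS, Out);
      std::string Tmp = "t" + std::to_string(Out.size());
      Out.push_back(Tmp + " = " + L + " " + N.Op + " " + R);
      return Tmp;
    }

For ``(a + b) * c`` this appends ``t0 = a + b`` and then ``t1 = t0 * c``, and
returns ``t1``. Each temporary is assigned exactly once, which mirrors the SSA
property described earlier.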

In the example above, the LLVM builder class is starting to show its
value. IRBuilder knows where to insert the newly created instruction;
all you have to do is specify what instruction to create (e.g. with
``CreateFAdd``), which operands to use (``L`` and ``R`` here) and
optionally provide a name for the generated instruction.

One nice thing about LLVM is that the name is just a hint. For instance,
if the code above emits multiple "addtmp" variables, LLVM will
automatically provide each one with an increasing, unique numeric
suffix. Local value names for instructions are purely optional, but they
make it much easier to read the IR dumps.
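
The "name as a hint" behaviour can be approximated with a small table that
appends a counter when a name is already taken. This is only a sketch of the
idea: ``NameTable`` and ``makeUnique`` are hypothetical, and LLVM's actual
renaming scheme may differ in its details:

.. code-block:: c++

    #include <cassert>
    #include <map>
    #include <set>
    #include <string>

    class NameTable {
      std::set<std::string> Used;                  // every name handed out so far
      std::map<std::string, unsigned> NextSuffix;  // per-hint counter

    public:
      // Return Hint itself if it is free, otherwise Hint plus a numeric suffix.
      std::string makeUnique(const std::string &Hint) {
        assert(!Hint.empty() && "this sketch expects a non-empty hint");
        if (Used.insert(Hint).second)
          return Hint;
        unsigned &N = NextSuffix[Hint];
        std::string Candidate;
        do {
          Candidate = Hint + std::to_string(++N);
        } while (!Used.insert(Candidate).second);
        return Candidate;
      }
    };

Requesting ``addtmp`` three times yields ``addtmp``, ``addtmp1`` and
``addtmp2`` in this sketch.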

`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
rules: for example, the Left and Right operands of an `add
instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
result type of the add must match the operand types. Because all values
in Kaleidoscope are doubles, this makes for very simple code for add,
sub and mul.

On the other hand, LLVM specifies that the `fcmp
instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
one bit integer). The problem with this is that Kaleidoscope wants the
value to be a 0.0 or 1.0 value. In order to get these semantics, we
combine the fcmp instruction with a `uitofp
instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
input integer into a floating point value by treating the input as an
unsigned value. In contrast, if we used the `sitofp
instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
would return 0.0 and -1.0, depending on the input value.
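
The reason a signed conversion gives -1.0 is that a one-bit two's-complement
integer has only a sign bit, so the bit pattern ``1`` means -1. The sketch
below models both readings for small bit widths in plain C++. The helper names
are hypothetical and this is not how LLVM implements the instructions; it only
reproduces the arithmetic:

.. code-block:: c++

    #include <cassert>
    #include <cstdint>

    // Low Width bits of Bits read as an unsigned integer (like uitofp).
    double unsignedBitsToDouble(std::uint64_t Bits, unsigned Width) {
      assert(Width >= 1 && Width <= 32 && "sketch handles widths 1..32");
      std::uint64_t Mask = (std::uint64_t(1) << Width) - 1;
      return static_cast<double>(Bits & Mask);
    }

    // Same bits read as two's complement (like sitofp): when the top bit is
    // set, the value is the unsigned reading minus 2^Width.
    double signedBitsToDouble(std::uint64_t Bits, unsigned Width) {
      assert(Width >= 1 && Width <= 32 && "sketch handles widths 1..32");
      std::uint64_t Mask = (std::uint64_t(1) << Width) - 1;
      std::uint64_t Low = Bits & Mask;
      std::uint64_t SignBit = std::uint64_t(1) << (Width - 1);
      double Unsigned = static_cast<double>(Low);
      if (Low & SignBit)
        return Unsigned - 2.0 * static_cast<double>(SignBit);
      return Unsigned;
    }

With ``Width`` equal to 1, a true comparison result reads as 1.0 when treated
as unsigned and as -1.0 when treated as signed.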

.. code-block:: c++

    Value *CallExprAST::codegen() {
      // Look up the name in the global module table.
      Function *CalleeF = TheModule->getFunction(Callee);
      if (!CalleeF)
        return LogErrorV("Unknown function referenced");

      // If argument mismatch error.
      if (CalleeF->arg_size() != Args.size())
        return LogErrorV("Incorrect # arguments passed");

      std::vector<Value *> ArgsV;
      for (unsigned i = 0, e = Args.size(); i != e; ++i) {
        ArgsV.push_back(Args[i]->codegen());
        if (!ArgsV.back())
          return nullptr;
      }

      return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
    }

Code generation for function calls is quite straightforward with LLVM. The code
above initially does a function name lookup in the LLVM Module's symbol table.
Recall that the LLVM Module is the container that holds the functions we are
JIT'ing. By giving each function the same name as what the user specifies, we
can use the LLVM symbol table to resolve function names for us.

Once we have the function to call, we recursively codegen each argument
that is to be passed in, and create an LLVM `call
instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
calling conventions by default, allowing these calls to also call into
standard library functions like "sin" and "cos", with no additional
effort.

This wraps up our handling of the four basic expressions that we have so
far in Kaleidoscope. Feel free to go in and add some more. For example,
by browsing the `LLVM language reference <../LangRef.html>`_ you'll find
several other interesting instructions that are really easy to plug into
our basic framework.

Function Code Generation
========================

Code generation for prototypes and functions must handle a number of
details, which makes their code less beautiful than expression code
generation, but allows us to illustrate some important points. First,
let's talk about code generation for prototypes: they are used both for
function bodies and external function declarations. The code starts
with:

.. code-block:: c++

    Function *PrototypeAST::codegen() {
      // Make the function type:  double(double,double) etc.
      std::vector<Type*> Doubles(Args.size(),
                                 Type::getDoubleTy(TheContext));
      FunctionType *FT =
        FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);

      Function *F =
        Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());

This code packs a lot of power into a few lines. Note first that this
function returns a "Function\*" instead of a "Value\*". Because a
"prototype" really talks about the external interface for a function
(not the value computed by an expression), it makes sense for it to
return the LLVM Function it corresponds to when codegen'd.

The call to ``FunctionType::get`` creates the ``FunctionType`` that
should be used for a given Prototype. Since all function arguments in
Kaleidoscope are of type double, the first line creates a vector of "N"
LLVM double types. It then uses the ``FunctionType::get`` method to
create a function type that takes "N" doubles as arguments, returns one
double as a result, and that is not vararg (the false parameter
indicates this). Note that Types in LLVM are uniqued just like Constants
are, so you don't "new" a type, you "get" it.

The final line above actually creates the IR Function corresponding to
the Prototype. This indicates the type, linkage and name to use, as
well as which module to insert into. "`external
linkage <../LangRef.html#linkage>`_" means that the function may be
defined outside the current module and/or that it is callable by
functions outside the module. The Name passed in is the name the user
specified: since ``TheModule`` is specified, this name is registered
in ``TheModule``'s symbol table.

.. code-block:: c++

  // Set names for all arguments.
  unsigned Idx = 0;
  for (auto &Arg : F->args())
    Arg.setName(Args[Idx++]);

  return F;

Finally, we set the name of each of the function's arguments according to the
names given in the Prototype. This step isn't strictly necessary, but keeping
the names consistent makes the IR more readable, and allows subsequent code to
refer directly to the arguments for their names, rather than having to look
them up in the Prototype AST.

At this point we have a function prototype with no body. This is how LLVM IR
represents function declarations. For extern statements in Kaleidoscope, this
is as far as we need to go. For function definitions however, we need to
codegen and attach a function body.

.. code-block:: c++

  Function *FunctionAST::codegen() {
    // First, check for an existing function from a previous 'extern' declaration.
    Function *TheFunction = TheModule->getFunction(Proto->getName());

    if (!TheFunction)
      TheFunction = Proto->codegen();

    if (!TheFunction)
      return nullptr;

    if (!TheFunction->empty())
      return (Function*)LogErrorV("Function cannot be redefined.");

For function definitions, we start by searching TheModule's symbol table for an
existing version of this function, in case one has already been created using an
'extern' statement. If Module::getFunction returns null then no previous version
exists, so we'll codegen one from the Prototype. In either case, we want to
assert that the function is empty (i.e. has no body yet) before we start.
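
The declare-then-define logic can be sketched independently of LLVM. In the
example below, ``FuncEntry`` and ``ModuleTable`` are hypothetical stand-ins
for ``Function`` and ``Module``; a body is modelled as a simple flag, and
argument-count mismatches between a declaration and a later definition are
deliberately not checked:

.. code-block:: c++

    #include <cassert>
    #include <map>
    #include <string>

    // Stand-in for a function in a module: a declaration has no body yet.
    struct FuncEntry {
      unsigned NumArgs = 0;
      bool HasBody = false;
    };

    class ModuleTable {
      std::map<std::string, FuncEntry> Funcs;

    public:
      // 'extern': return the existing entry, or record a new declaration.
      FuncEntry *declare(const std::string &Name, unsigned NumArgs) {
        auto It = Funcs.find(Name);
        if (It == Funcs.end()) {
          FuncEntry E;
          E.NumArgs = NumArgs;
          It = Funcs.emplace(Name, E).first;
        }
        return &It->second;
      }

      // 'def': reuse an earlier declaration if there is one, and refuse to
      // attach a second body. Returns nullptr on redefinition.
      FuncEntry *define(const std::string &Name, unsigned NumArgs) {
        FuncEntry *F = declare(Name, NumArgs);
        assert(F && "declare() always returns an entry");
        if (F->HasBody)
          return nullptr;
        F->HasBody = true;
        return F;
      }
    };

Defining a function after an ``extern`` reuses the same entry, while a second
definition of the same name is rejected.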

.. code-block:: c++

  // Create a new basic block to start insertion into.
  BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
  Builder.SetInsertPoint(BB);

  // Record the function arguments in the NamedValues map.
  NamedValues.clear();
  for (auto &Arg : TheFunction->args())
    NamedValues[Arg.getName()] = &Arg;

Now we get to the point where the ``Builder`` is set up. The first line
creates a new `basic block <http://en.wikipedia.org/wiki/Basic_block>`_
(named "entry"), which is inserted into ``TheFunction``. The second line
then tells the builder that new instructions should be inserted into the
end of the new basic block. Basic blocks in LLVM are an important part
of functions that define the `Control Flow
Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
don't have any control flow, our functions will only contain one block
at this point. We'll fix this in `Chapter 5 <LangImpl5.html>`_ :).

Next we add the function arguments to the NamedValues map (after first clearing
it out) so that they're accessible to ``VariableExprAST`` nodes.

.. code-block:: c++

      if (Value *RetVal = Body->codegen()) {
        // Finish off the function.
        Builder.CreateRet(RetVal);

        // Validate the generated code, checking for consistency.
        verifyFunction(*TheFunction);

        return TheFunction;
      }

Once the insertion point has been set up and the NamedValues map populated,
we call the ``codegen()`` method for the root expression of the function. If no
error happens, this emits code to compute the expression into the entry block
and returns the value that was computed. Assuming no error, we then create an
LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the function.
Once the function is built, we call ``verifyFunction``, which is
provided by LLVM. This function does a variety of consistency checks on
the generated code, to determine if our compiler is doing everything
right. Using this is important: it can catch a lot of bugs. Once the
function is finished and validated, we return it.

.. code-block:: c++

      // Error reading body, remove function.
      TheFunction->eraseFromParent();
      return nullptr;
    }

The only piece left here is handling of the error case. For simplicity,
we handle this by merely deleting the function we produced with the
``eraseFromParent`` method. This allows the user to redefine a function
that they incorrectly typed in before: if we didn't delete it, it would
live in the symbol table, with a body, preventing future redefinition.
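
The effect of erasing on failure can be modelled without LLVM. The sketch
below is only an illustration under stated assumptions: ``FakeFunction``,
``FakeModule`` and ``defineFunction`` are hypothetical names, and the rule that
a function which already has a body cannot be defined again is assumed here to
mirror the "preventing future redefinition" behaviour described above.

.. code-block:: c++

    #include <map>
    #include <string>

    // Hypothetical stand-in for an IR function: we only track whether it has a body.
    struct FakeFunction {
      bool HasBody = false;
    };

    // Hypothetical stand-in for the module's symbol table.
    static std::map<std::string, FakeFunction> FakeModule;

    // Try to define Name. BodyOk says whether generating the body succeeded.
    // Returns true only when the definition was completed.
    bool defineFunction(const std::string &Name, bool BodyOk) {
      FakeFunction &F = FakeModule[Name]; // Reuse an existing entry or create one.
      if (F.HasBody)
        return false; // Already defined: redefinition is rejected.

      F.HasBody = true; // The entry block now exists, so the function has a body.
      if (BodyOk)
        return true;

      // Error reading body: remove the half-built function so the name is free.
      FakeModule.erase(Name);
      return false;
    }

A failed ``defineFunction("foo", false)`` leaves no entry behind, so a later
``defineFunction("foo", true)`` succeeds. If the erase step were dropped, the
failed attempt would leave a function with a body in the table and the retry
would be rejected.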

This code does have a bug, though: if the ``FunctionAST::codegen()`` method
finds an existing IR Function, it does not validate its signature against the
definition's own prototype. This means that an earlier 'extern' declaration will
take precedence over the function definition's signature, which can cause
codegen to fail, for instance if the function arguments are named differently.
There are a number of ways to fix this bug; see what you can come up with! Here
is a testcase:

::

    extern foo(a);     # ok, declares foo.
    def foo(b) b;      # Error: Unknown variable name. (decl using 'a' takes precedence).
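
As one of several possible approaches, the definition could reconcile its own
prototype with the earlier declaration before the body is generated. The
sketch below models that idea on plain data rather than on LLVM types:
``Proto``, ``DeclaredFunction`` and ``reconcileSignature`` are hypothetical
names, and this is not the fix used by the tutorial's sample code.

.. code-block:: c++

    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for a parsed prototype (name plus argument names).
    struct Proto {
      std::string Name;
      std::vector<std::string> Args;
    };

    // Hypothetical stand-in for a function that was already declared.
    struct DeclaredFunction {
      std::string Name;
      std::vector<std::string> ArgNames;
    };

    // When a definition reuses an earlier declaration, reject an arity mismatch
    // and otherwise rename the declared arguments to the names that the
    // definition's body will look up.
    bool reconcileSignature(DeclaredFunction &F, const Proto &P) {
      if (F.ArgNames.size() != P.Args.size())
        return false; // Same name, different argument count: report an error.
      for (std::size_t I = 0; I != P.Args.size(); ++I)
        F.ArgNames[I] = P.Args[I];
      return true;
    }

Applied to the testcase above, the declaration of ``foo`` with argument ``a``
would be renamed to use ``b``, so the body's reference to ``b`` resolves. In
real LLVM code the renaming step would plausibly be done by calling
``setName`` on each argument, but that mapping is an assumption of this sketch.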

Driver Changes and Closing Thoughts
===================================

For now, code generation to LLVM doesn't really get us much, except that
we can look at the pretty IR calls. The sample code inserts calls to
codegen into the "``HandleDefinition``", "``HandleExtern``" etc.
functions, and then dumps out the LLVM IR. This gives a nice way to look
at the LLVM IR for simple functions. For example:

::

    ready> 4+5;
    Read top-level expression:
    define double @0() {
    entry:
      ret double 9.000000e+00
    }

Note how the parser turns top-level expressions into anonymous
functions for us. This will be handy when we add `JIT
support <LangImpl04.html#adding-a-jit-compiler>`_ in the next chapter. Also note that the
code is very literally transcribed: no optimizations are being performed
except simple constant folding done by IRBuilder. We will `add
optimizations <LangImpl04.html#trivial-constant-folding>`_ explicitly in the next
chapter.
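
The folding behaviour can be pictured with a small model of a builder that
checks its operands before emitting anything. This is a sketch under stated
assumptions, not IRBuilder's implementation: ``Node`` and ``FoldingBuilder``
are hypothetical types, and only a floating-point add is modelled.

.. code-block:: c++

    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for an IR value: a constant or an instruction result.
    struct Node {
      bool IsConstant;
      double Constant;  // Meaningful only when IsConstant is true.
      std::string Name; // Meaningful only when IsConstant is false.
    };

    class FoldingBuilder {
      std::vector<std::string> Instructions;

      // Render an operand as text for the recorded instruction.
      static std::string operand(const Node &N) {
        return N.IsConstant ? std::to_string(N.Constant) : N.Name;
      }

    public:
      Node constant(double V) { return Node{true, V, ""}; }
      Node argument(const std::string &Name) {
        return Node{false, 0.0, "%" + Name};
      }

      // Fold when both operands are constants; otherwise record an instruction.
      Node createFAdd(const Node &L, const Node &R) {
        if (L.IsConstant && R.IsConstant)
          return constant(L.Constant + R.Constant);
        std::string Result = "%addtmp" + std::to_string(Instructions.size());
        Instructions.push_back(Result + " = fadd " + operand(L) + ", " +
                               operand(R));
        return Node{false, 0.0, Result};
      }

      std::size_t instructionCount() const { return Instructions.size(); }
    };

Adding two constants through this builder yields a constant and records no
instruction, which is why ``4+5`` above appears as a single ``ret`` of the
folded value. Adding an argument to a constant cannot be folded, so one
instruction is recorded.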

::

    ready> def foo(a b) a*a + 2*a*b + b*b;
    Read function definition:
    define double @foo(double %a, double %b) {
    entry:
      %multmp = fmul double %a, %a
      %multmp1 = fmul double 2.000000e+00, %a
      %multmp2 = fmul double %multmp1, %b
      %addtmp = fadd double %multmp, %multmp2
      %multmp3 = fmul double %b, %b
      %addtmp4 = fadd double %addtmp, %multmp3
      ret double %addtmp4
    }

This shows some simple arithmetic. Notice the striking similarity to the
LLVM builder calls that we use to create the instructions.

::

    ready> def bar(a) foo(a, 4.0) + bar(31337);
    Read function definition:
    define double @bar(double %a) {
    entry:
      %calltmp = call double @foo(double %a, double 4.000000e+00)
      %calltmp1 = call double @bar(double 3.133700e+04)
      %addtmp = fadd double %calltmp, %calltmp1
      ret double %addtmp
    }

This shows some function calls. Note that this function will take a long
time to execute if you call it. In the future we'll add conditional
control flow to actually make recursion useful :).

::

    ready> extern cos(x);
    Read extern:
    declare double @cos(double)

    ready> cos(1.234);
    Read top-level expression:
    define double @1() {
    entry:
      %calltmp = call double @cos(double 1.234000e+00)
      ret double %calltmp
    }

This shows an extern for the libm "cos" function, and a call to it.

.. TODO:: Abandon Pygments' horrible `llvm` lexer. It just totally gives up
   on highlighting this due to the first line.

::

    ready> ^D
    ; ModuleID = 'my cool jit'

    define double @0() {
    entry:
      ret double 9.000000e+00
    }

    define double @foo(double %a, double %b) {
    entry:
      %multmp = fmul double %a, %a
      %multmp1 = fmul double 2.000000e+00, %a
      %multmp2 = fmul double %multmp1, %b
      %addtmp = fadd double %multmp, %multmp2
      %multmp3 = fmul double %b, %b
      %addtmp4 = fadd double %addtmp, %multmp3
      ret double %addtmp4
    }

    define double @bar(double %a) {
    entry:
      %calltmp = call double @foo(double %a, double 4.000000e+00)
      %calltmp1 = call double @bar(double 3.133700e+04)
      %addtmp = fadd double %calltmp, %calltmp1
      ret double %addtmp
    }

    declare double @cos(double)

    define double @1() {
    entry:
      %calltmp = call double @cos(double 1.234000e+00)
      ret double %calltmp
    }

When you quit the current demo, it dumps out the IR for the entire
module generated. Here you can see the big picture with all the
functions referencing each other.

This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
we'll describe how to `add JIT codegen and optimizer
support <LangImpl04.html>`_ to this so we can actually start running
code!

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
the LLVM code generator. Because this uses the LLVM libraries, we need
to link them in. To do this, we use the
`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
our makefile/command line about which options to use:

.. code-block:: bash

    # Compile
    clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/Chapter3/toy.cpp
   :language: c++

`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_