=====================================
Performance Tips for Frontend Authors
=====================================

.. contents::
   :local:
   :depth: 2

Abstract
========

The intended audience of this document is developers of language frontends
targeting LLVM IR. This document is home to a collection of tips on how to
generate IR that optimizes well.

IR Best Practices
=================

As with any optimizer, LLVM has its strengths and weaknesses.  In some cases,
surprisingly small changes in the source IR can have a large effect on the
generated code.

Beyond the specific items on the list below, it's worth noting that the most
mature frontend for LLVM is Clang.  As a result, the further your IR gets from
what Clang might emit, the less likely it is to be effectively optimized.  It
can often be useful to write a quick C program with the semantics you're trying
to model and see what decisions Clang's IRGen makes about what IR to emit.
Studying Clang's CodeGen directory can also be a good source of ideas.  Note
that Clang and LLVM are explicitly version locked, so you'll need to make sure
you're using a Clang built from the same svn revision or release as the LLVM
library you're using.  As always, it's *strongly* recommended that you track
tip of tree development, particularly during bring up of a new project.

The Basics
^^^^^^^^^^^

#. Make sure that your Modules contain both a data layout specification and
   target triple. Without these pieces, none of the target specific
   optimizations will be enabled.  This can have a major effect on the
   generated code quality.

#. For each function or global emitted, use the most private linkage type
   possible (private, internal or linkonce_odr preferably).  Doing so will
   make LLVM's inter-procedural optimizations much more effective.

#. Avoid high in-degree basic blocks (e.g. basic blocks with dozens or hundreds
   of predecessors).  Among other issues, the register allocator is known to
   perform badly when confronted with such structures.  The only exception to
   this guidance is that a unified return block with high in-degree is fine.

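A minimal sketch of the first two items follows.  The data layout string and
triple are the values commonly seen for x86-64 Linux and are assumptions here,
as are the function names; a real frontend should take both strings from the
target it is actually compiling for rather than hard-coding them.

.. code-block:: llvm

   ; Assumed values for x86-64 Linux; replace with those of your target.
   target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
   target triple = "x86_64-unknown-linux-gnu"

   ; Hypothetical helper that is only called from within this module, so it
   ; gets internal linkage and can be freely inlined or removed.
   define internal i32 @helper(i32 %x) {
   entry:
     %r = add i32 %x, 1
     ret i32 %r
   }

   ; Hypothetical entry point that other modules need to see, so it keeps
   ; the default (external) linkage.
   define i32 @entry_point(i32 %x) {
   entry:
     %r = call i32 @helper(i32 %x)
     ret i32 %r
   }
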
Use of allocas
^^^^^^^^^^^^^^

An alloca instruction can be used to represent a function scoped stack slot,
but can also represent dynamic frame expansion.  When representing function
scoped variables or locations, placing alloca instructions at the beginning of
the entry block should be preferred.  In particular, place them before any
call instructions.  Call instructions might get inlined and replaced with
multiple basic blocks.  The end result is that a following alloca instruction
would no longer be in the entry basic block afterward.

The SROA (Scalar Replacement Of Aggregates) and Mem2Reg passes only attempt
to eliminate alloca instructions that are in the entry basic block.  Given
that SSA is the canonical form expected by much of the optimizer, if allocas
cannot be eliminated by Mem2Reg or SROA, the optimizer is likely to be less
effective than it could be.

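The placement described above might look like the following sketch.  The
function and value names are hypothetical, and the IR uses the typed-pointer
syntax (``i32*``) of the LLVM releases contemporary with this document.

.. code-block:: llvm

   declare void @may_be_inlined()

   define i32 @f(i32 %n) {
   entry:
     ; Every function scoped slot is allocated first, ahead of any call, so
     ; the allocas remain in the entry block even if the call is inlined.
     %count = alloca i32
     %tmp = alloca i32
     store i32 %n, i32* %count
     call void @may_be_inlined()
     %v = load i32, i32* %count
     store i32 %v, i32* %tmp
     %r = load i32, i32* %tmp
     ret i32 %r
   }
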
Avoid loads and stores of large aggregate type
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM currently does not optimize loads and stores of large :ref:`aggregate
types <t_aggregate>` (i.e. structs and arrays) well.  As an alternative,
consider loading individual fields from memory.

Aggregates that are smaller than the largest (performant) load or store
instruction supported by the targeted hardware are well supported.  These can
be an effective way to represent collections of small packed fields.

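As an illustration, the two functions below read the same field of a large
struct.  The type ``%big`` and both function names are made up for this sketch,
which uses the typed-pointer syntax contemporary with this document.

.. code-block:: llvm

   ; Hypothetical large aggregate: 64 words of payload plus two small fields.
   %big = type { [64 x i64], i32, i32 }

   ; Discouraged: load the whole aggregate, then pick out one field.
   define i32 @get_via_whole_load(%big* %p) {
   entry:
     %all = load %big, %big* %p
     %f = extractvalue %big %all, 1
     ret i32 %f
   }

   ; Preferred: compute the field's address and load only that field.
   define i32 @get_via_field_load(%big* %p) {
   entry:
     %addr = getelementptr inbounds %big, %big* %p, i32 0, i32 1
     %f = load i32, i32* %addr
     ret i32 %f
   }
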
Prefer zext over sext when legal
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On some architectures (X86_64 is one), sign extension can involve an extra
instruction whereas zero extension can be folded into a load.  LLVM will try to
replace a sext with a zext when it can be proven safe, but if you have
information in your source language about the range of an integer value, it can
be profitable to use a zext rather than a sext.

Alternatively, you can :ref:`specify the range of the value using metadata
<range-metadata>` and LLVM can do the sext to zext conversion for you.

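Both options can be sketched as follows.  The function names and the range
``[0, 1000)`` are assumptions chosen for illustration; the IR uses the
typed-pointer syntax contemporary with this document.

.. code-block:: llvm

   ; Option 1: the frontend knows the value is never negative, so it emits
   ; the zext itself.
   define i64 @widen_zext(i32* %p) {
   entry:
     %v = load i32, i32* %p
     %wide = zext i32 %v to i64
     ret i64 %wide
   }

   ; Option 2: emit a sext, but attach the known range to the load so the
   ; optimizer has what it needs to prove the conversion safe.
   define i64 @widen_range(i32* %p) {
   entry:
     %v = load i32, i32* %p, !range !0
     %wide = sext i32 %v to i64
     ret i64 %wide
   }

   ; Half-open range: 0 <= value < 1000.
   !0 = !{i32 0, i32 1000}
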
Zext GEP indices to machine register width
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Internally, LLVM often promotes the width of GEP indices to machine register
width.  When it does so, it will default to using sign extension (sext)
operations for safety.  If your source language provides information about
the range of the index, you may wish to manually extend indices to machine
register width using a zext instruction.

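A sketch of the manual extension, assuming a 64-bit target and a source
language that guarantees the 32-bit index is non-negative (both are
assumptions, as is the function name); typed-pointer syntax as elsewhere in
documents of this vintage.

.. code-block:: llvm

   define i32 @element_at(i32* %base, i32 %i) {
   entry:
     ; The frontend widens the index itself with zext instead of leaving
     ; the promotion (and its default sext) to LLVM.
     %idx = zext i32 %i to i64
     %addr = getelementptr inbounds i32, i32* %base, i64 %idx
     %v = load i32, i32* %addr
     ret i32 %v
   }
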
When to specify alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM will always generate correct code if you don’t specify alignment, but may
generate inefficient code.  For example, if you are targeting MIPS (or older
ARM ISAs) then the hardware does not handle unaligned loads and stores, and
so you will enter a trap-and-emulate path if you do a load or store with
lower-than-natural alignment.  To avoid this, LLVM will emit a slower
sequence of loads, shifts and masks (or load-right + load-left on MIPS) for
all cases where the load / store does not have a sufficiently high alignment
in the IR.

An explicit alignment is used to guarantee the alignment of allocas and
globals, though in most cases this is unnecessary (most targets have a
sufficiently high default alignment that they’ll be fine).  It is also used to
provide a contract to the back end saying ‘either this load/store has this
alignment, or it is undefined behavior’.  This means that the back end is free
to emit instructions that rely on that alignment (and mid-level optimizers are
free to perform transforms that require that alignment).  For x86, it doesn’t
make much difference, as almost all instructions are alignment-independent.
For MIPS, it can make a big difference.

Note that if your loads and stores are atomic, the backend will be unable to
lower an under-aligned access into a sequence of natively aligned accesses.
As a result, alignment is mandatory for atomic loads and stores.

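The three cases discussed above can be written as in the sketch below.  The
function names are hypothetical and the IR uses the typed-pointer syntax
contemporary with this document.

.. code-block:: llvm

   ; Naturally aligned access: the back end may use a single plain load.
   define i32 @load_aligned(i32* %p) {
   entry:
     %v = load i32, i32* %p, align 4
     ret i32 %v
   }

   ; Access known only to be byte aligned (e.g. a packed field).  On a
   ; strict-alignment target this is lowered to a slower multi-instruction
   ; sequence.
   define i32 @load_packed(i32* %p) {
   entry:
     %v = load i32, i32* %p, align 1
     ret i32 %v
   }

   ; Atomic access: the alignment is stated explicitly.
   define i32 @load_atomic(i32* %p) {
   entry:
     %v = load atomic i32, i32* %p seq_cst, align 4
     ret i32 %v
   }
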
Other Things to Consider
^^^^^^^^^^^^^^^^^^^^^^^^

#. Use ptrtoint/inttoptr sparingly (they interfere with pointer aliasing
   analysis); prefer GEPs.

#. Prefer globals over inttoptr of a constant address - this gives you
   dereferenceability information.  In MCJIT, use getSymbolAddress to provide
   the actual address.

#. Be wary of ordered and atomic memory operations.  They are hard to optimize
   and may not be well optimized by the current optimizer.  Depending on your
   source language, you may consider using fences instead.

#. If calling a function which is known to throw an exception (unwind), use
   an invoke with a normal destination which contains an unreachable
   instruction.  This form conveys to the optimizer that the call returns
   abnormally.  For an invoke which neither returns normally nor requires
   unwind code in the current function, you can use a noreturn call
   instruction if desired.  This is generally not required because the
   optimizer will convert an invoke with an unreachable unwind destination to
   a call instruction.

#. Use profile metadata to indicate statically known cold paths, even if
   dynamic profiling information is not available.  This can make a large
   difference in code placement and thus the performance of tight loops.

#. When generating code for loops, try to avoid terminating the header block of
   the loop earlier than necessary.  If the terminator of the loop header
   block is a loop exiting conditional branch, the effectiveness of LICM will
   be limited for loads not in the header.  (This is due to the fact that LLVM
   may not know such a load is safe to speculatively execute and thus can't
   lift an otherwise loop invariant load unless it can prove the exiting
   condition is not taken.)  It can be profitable, in some cases, to emit such
   instructions into the header even if they are not used along a rarely
   executed path that exits the loop.  This guidance specifically does not
   apply if the condition which terminates the loop header is itself invariant,
   or can be easily discharged by inspecting the loop index variables.

#. In hot loops, consider duplicating instructions from small basic blocks
   which end in highly predictable terminators into their successor blocks.
   If a hot successor block contains instructions which can be vectorized
   with the duplicated ones, this can provide a noticeable throughput
   improvement.  Note that this is not always profitable and does involve a
   potentially large increase in code size.

#. When checking a value against a constant, emit the check using a consistent
   comparison type.  The GVN pass *will* optimize redundant equalities even if
   the type of comparison is inverted, but GVN only runs late in the pipeline.
   As a result, you may miss the opportunity to run other important
   optimizations.  Improvements to EarlyCSE to remove this issue are tracked in
   Bug 23333.

#. Avoid using arithmetic intrinsics unless you are *required* by your source
   language specification to emit a particular code sequence.  The optimizer
   is quite good at reasoning about general control flow and arithmetic; it is
   not anywhere near as strong at reasoning about the various intrinsics.  If
   profitable for code generation purposes, the optimizer will likely form the
   intrinsics itself late in the optimization pipeline.  It is *very* rarely
   profitable to emit these directly in the language frontend.  This item
   explicitly includes the use of the :ref:`overflow intrinsics <int_overflow>`.

#. Avoid using the :ref:`assume intrinsic <int_assume>` until you've
   established that a) there's no other way to express the given fact and b)
   that fact is critical for optimization purposes.  Assumes are a great
   prototyping mechanism, but they can have negative effects on both compile
   time and optimization effectiveness.  The former is fixable with enough
   effort, but the latter is fairly fundamental to their designed purpose.


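Two of the items above, sketched together: a named global standing in for a
fixed address, and branch weights marking a statically known cold path.  All
names and the particular weights are assumptions made for illustration; the IR
uses the typed-pointer syntax contemporary with this document.

.. code-block:: llvm

   ; Hypothetical location whose real address is supplied when the symbol
   ; is resolved, instead of an inttoptr of a constant.
   @device_reg = external global i32

   declare void @slow_path()

   define i32 @read_checked(i32 %x) {
   entry:
     %is_zero = icmp eq i32 %x, 0
     ; The frontend knows statically that the zero case is rare.
     br i1 %is_zero, label %cold, label %hot, !prof !0

   cold:
     call void @slow_path()
     br label %hot

   hot:
     %v = load i32, i32* @device_reg
     ret i32 %v
   }

   ; Weight of the true successor (%cold) first, then the false one (%hot).
   !0 = !{!"branch_weights", i32 1, i32 2000}
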
Describing Language Specific Properties
=======================================

When translating a source language to LLVM, finding ways to express concepts
and guarantees available in your source language which are not natively
provided by LLVM IR will greatly improve LLVM's ability to optimize your code.
As an example, C/C++'s ability to mark every signed add as "no signed wrap
(nsw)" goes a long way toward helping the optimizer reason about loop
induction variables and thus generate better code for loops.

The LLVM LangRef includes a number of mechanisms for annotating the IR with
additional semantic information.  It is *strongly* recommended that you become
highly familiar with this document.  The list below is intended to highlight a
couple of items of particular interest, but is by no means exhaustive.

Restricted Operation Semantics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. Add nsw/nuw flags as appropriate.  Reasoning about overflow is
   generally hard for an optimizer so providing these facts from the frontend
   can be very impactful.

#. Use fast-math flags on floating point operations if legal.  If you don't
   need strict IEEE floating point semantics, there are a number of additional
   optimizations that can be performed.  This can be highly impactful for
   floating point intensive computations.

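For instance, a frontend whose language rules out integer overflow and does
not require strict IEEE semantics might emit something like the sketch below.
The function names are hypothetical, and whether each flag is legal depends
entirely on your source language.

.. code-block:: llvm

   ; Signed increment that the source language guarantees cannot overflow.
   define i32 @next_signed(i32 %i) {
   entry:
     %n = add nsw i32 %i, 1
     ret i32 %n
   }

   ; Unsigned increment with the same guarantee.
   define i32 @next_unsigned(i32 %i) {
   entry:
     %n = add nuw i32 %i, 1
     ret i32 %n
   }

   ; Floating point arithmetic that may be reassociated and simplified.
   define float @dot2(float %a, float %b, float %c, float %d) {
   entry:
     %ab = fmul fast float %a, %b
     %cd = fmul fast float %c, %d
     %sum = fadd fast float %ab, %cd
     ret float %sum
   }
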
Describing Aliasing Properties
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. Add noalias/align/dereferenceable/nonnull to function arguments and return
   values as appropriate.

#. Use pointer aliasing metadata, especially tbaa metadata, to communicate
   otherwise-non-deducible pointer aliasing facts.

#. Use inbounds on GEPs.  This can help to disambiguate some aliasing queries.


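The sketch below combines the three items on a single hypothetical function.
The attribute values, the TBAA type names and the function itself are
assumptions for illustration, and the TBAA nodes follow the struct-path layout
that Clang emits; typed-pointer syntax is used, as is usual for this
document's era.

.. code-block:: llvm

   ; Both arguments are distinct, non-null, 4-byte aligned and point to at
   ; least 16 readable bytes, according to the (hypothetical) source language.
   define void @copy_second(i32* noalias nonnull align 4 dereferenceable(16) %dst,
                            i32* noalias nonnull align 4 dereferenceable(16) %src) {
   entry:
     ; inbounds: the computed addresses stay inside the underlying objects.
     %s1 = getelementptr inbounds i32, i32* %src, i64 1
     %d1 = getelementptr inbounds i32, i32* %dst, i64 1
     %v = load i32, i32* %s1, align 4, !tbaa !2
     store i32 %v, i32* %d1, align 4, !tbaa !2
     ret void
   }

   !0 = !{!"hypothetical language TBAA root"}
   ; Scalar type node: name, parent, offset.
   !1 = !{!"int", !0, i64 0}
   ; Access tag: base type, access type, offset.
   !2 = !{!1, !1, i64 0}
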
Modeling Memory Effects
^^^^^^^^^^^^^^^^^^^^^^^^

#. Mark functions as readnone/readonly/argmemonly or noreturn/nounwind when
   known.  The optimizer will try to infer these flags, but may not always be
   able to.  Manual annotations are particularly important for external
   functions that the optimizer cannot analyze.

#. Use the lifetime.start/lifetime.end and invariant.start/invariant.end
   intrinsics where possible.  Common profitable uses are for stack-like data
   structures (thus allowing dead store elimination) and for describing
   lifetimes of allocas (thus allowing smaller stack sizes).

#. Mark invariant locations using !invariant.load and TBAA's constant flags.

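A sketch of these annotations follows.  All function names are hypothetical.
The lifetime intrinsics are declared with the non-overloaded spelling used by
the LLVM releases contemporary with this document (later releases mangle the
pointer type into the intrinsic name), and the IR uses typed-pointer syntax,
so adjust both for the version you build against.

.. code-block:: llvm

   ; External functions the optimizer cannot see into, annotated by hand.
   declare i32 @pure_lookup(i32) readnone nounwind
   declare i32 @count_bytes(i8* nocapture) readonly argmemonly nounwind
   declare void @fatal_error() noreturn nounwind

   declare void @llvm.lifetime.start(i64, i8* nocapture)
   declare void @llvm.lifetime.end(i64, i8* nocapture)

   define i32 @scoped(i32* %table) {
   entry:
     %buf = alloca [16 x i8]
     %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 0
     ; The 16-byte buffer is only live between these two markers.
     call void @llvm.lifetime.start(i64 16, i8* %p)
     store i8 0, i8* %p
     %n = call i32 @count_bytes(i8* %p)
     call void @llvm.lifetime.end(i64 16, i8* %p)
     ; The frontend asserts that the loaded location never changes.
     %k = load i32, i32* %table, !invariant.load !0
     %r = add i32 %n, %k
     ret i32 %r
   }

   ; !invariant.load takes an empty metadata node.
   !0 = !{}
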
Pass Ordering
^^^^^^^^^^^^^

One of the most common mistakes made by new language frontend projects is to
use the existing -O2 or -O3 pass pipelines as is.  These pass pipelines make a
good starting point for an optimizing compiler for any language, but they have
been carefully tuned for C and C++, not your target language.  You will almost
certainly need to use a custom pass order to achieve optimal performance.  A
couple of specific suggestions:

#. For languages with numerous rarely executed guard conditions (e.g. null
   checks, type checks, range checks) consider adding an extra execution or
   two of LoopUnswitch and LICM to your pass order.  The standard pass order,
   which is tuned for C and C++ applications, may not be sufficient to remove
   all dischargeable checks from loops.

#. If your language uses range checks, consider using the IRCE pass.  It is not
   currently part of the standard pass order.

#. A useful sanity check is to run your optimized IR back through the
   -O2 pipeline again.  If you see noticeable improvement in the resulting IR,
   you likely need to adjust your pass order.


I Still Can't Find What I'm Looking For
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you didn't find what you were looking for above, consider proposing a piece
of metadata which provides the optimization hint you need.  Such extensions are
relatively common and are generally well received by the community.  You will
need to ensure that your proposal is sufficiently general so that it benefits
others if you wish to contribute it upstream.

You should also consider describing the problem you're facing on `llvm-dev
<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ and asking for advice.
It's entirely possible someone has encountered your problem before and can
give good advice.  If there are multiple interested parties, that also
increases the chances that a metadata extension would be well received by the
community as a whole.

Adding to this document
=======================

If you run across a case that you feel deserves to be covered here, please send
a patch to `llvm-commits
<http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ for review.

If you have questions on these items, please direct them to `llvm-dev
<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_.  The more relevant
context you are able to give to your question, the more likely it is to be
answered.
