xref: /aosp_15_r20/dalvik/docs/dexopt.html (revision 055d459012065f78d96b68be8421640240ddf631)
1*055d4590SKeyi Gui<html>
2*055d4590SKeyi Gui<head>
3*055d4590SKeyi Gui    <title>Dalvik Optimization and Verification</title>
4*055d4590SKeyi Gui</head>
5*055d4590SKeyi Gui
6*055d4590SKeyi Gui<body>
7*055d4590SKeyi Gui<h1>Dalvik Optimization and Verification With <i>dexopt</i></h1>
8*055d4590SKeyi Gui
9*055d4590SKeyi Gui<p>
10*055d4590SKeyi GuiThe Dalvik virtual machine was designed specifically for the Android
11*055d4590SKeyi Guimobile platform.  The target systems have little RAM, store data on slow
12*055d4590SKeyi Guiinternal flash memory, and generally have the performance characteristics
13*055d4590SKeyi Guiof decade-old desktop systems.  They also run Linux, which provides
14*055d4590SKeyi Guivirtual memory, processes and threads, and UID-based security mechanisms.
15*055d4590SKeyi Gui<p>
16*055d4590SKeyi GuiThe features and limitations caused us to focus on certain goals:
17*055d4590SKeyi Gui
18*055d4590SKeyi Gui<ul>
19*055d4590SKeyi Gui    <li>Class data, notably bytecode, must be shared between multiple
20*055d4590SKeyi Gui    processes to minimize total system memory usage.
21*055d4590SKeyi Gui    <li>The overhead in launching a new app must be minimized to keep
22*055d4590SKeyi Gui    the device responsive.
23*055d4590SKeyi Gui    <li>Storing class data in individual files results in a lot of
24*055d4590SKeyi Gui    redundancy, especially with respect to strings.  To conserve disk
25*055d4590SKeyi Gui    space we need to factor this out.
26*055d4590SKeyi Gui    <li>Parsing class data fields adds unnecessary overhead during
27*055d4590SKeyi Gui    class loading.  Accessing data values (e.g. integers and strings)
28*055d4590SKeyi Gui    directly as C types is better.
29*055d4590SKeyi Gui    <li>Bytecode verification is necessary, but slow, so we want to verify
30*055d4590SKeyi Gui    as much as possible outside app execution.
31*055d4590SKeyi Gui    <li>Bytecode optimization (quickened instructions, method pruning) is
32*055d4590SKeyi Gui    important for speed and battery life.
33*055d4590SKeyi Gui    <li>For security reasons, processes may not edit shared code.
34*055d4590SKeyi Gui</ul>
35*055d4590SKeyi Gui
36*055d4590SKeyi Gui<p>
37*055d4590SKeyi GuiThe typical VM implementation uncompresses individual classes from a
38*055d4590SKeyi Guicompressed archive and stores them on the heap.  This implies a separate
39*055d4590SKeyi Guicopy of each class in every process, and slows application startup because
40*055d4590SKeyi Guithe code must be uncompressed (or at least read off disk in many small
41*055d4590SKeyi Guipieces).  On the other hand, having the bytecode on the local heap makes
42*055d4590SKeyi Guiit easy to rewrite instructions on first use, facilitating a number of
43*055d4590SKeyi Guidifferent optimizations.
44*055d4590SKeyi Gui<p>
45*055d4590SKeyi GuiThe goals led us to make some fundamental decisions:
46*055d4590SKeyi Gui
47*055d4590SKeyi Gui<ul>
48*055d4590SKeyi Gui    <li>Multiple classes are aggregated into a single "DEX" file.
49*055d4590SKeyi Gui    <li>DEX files are mapped read-only and shared between processes.
50*055d4590SKeyi Gui    <li>Byte ordering and word alignment are adjusted to suit the local
51*055d4590SKeyi Gui    system.
52*055d4590SKeyi Gui    <li>Bytecode verification is mandatory for all classes, but we want
53*055d4590SKeyi Gui    to "pre-verify" whatever we can.
54*055d4590SKeyi Gui    <li>Optimizations that require rewriting bytecode must be done ahead
55*055d4590SKeyi Gui    of time.
56*055d4590SKeyi Gui</ul>
57*055d4590SKeyi Gui
58*055d4590SKeyi Gui<p>
59*055d4590SKeyi GuiThe consequences of these decisions are explained in the following sections.
60*055d4590SKeyi Gui
61*055d4590SKeyi Gui
62*055d4590SKeyi Gui<h2>VM Operation</h2>
63*055d4590SKeyi Gui
64*055d4590SKeyi Gui<p>
65*055d4590SKeyi GuiApplication code is delivered to the system in a <code>.jar</code>
66*055d4590SKeyi Guior <code>.apk</code> file.  These are really just <code>.zip</code>
67*055d4590SKeyi Guiarchives with some meta-data files added.  The Dalvik DEX data file
68*055d4590SKeyi Guiis always called <code>classes.dex</code>.
69*055d4590SKeyi Gui<p>
70*055d4590SKeyi GuiThe bytecode cannot be memory-mapped and executed directly from the zip
71*055d4590SKeyi Guifile, because the data is compressed and the start of the file is not
72*055d4590SKeyi Guiguaranteed to be word-aligned.  These problems could be addressed by
73*055d4590SKeyi Guistoring <code>classes.dex</code> without compression and padding out the zip
74*055d4590SKeyi Guifile, but that would increase the size of the package sent across the
75*055d4590SKeyi Guidata network.
76*055d4590SKeyi Gui<p>
77*055d4590SKeyi GuiWe need to extract <code>classes.dex</code> from the zip archive before
78*055d4590SKeyi Guiwe can use it.  While we have the file available, we might as well perform
79*055d4590SKeyi Guisome of the other actions (realignment, optimization, verification) described
80*055d4590SKeyi Guiearlier.  This raises a new question however: who is responsible for doing
81*055d4590SKeyi Guithis, and where do we keep the output?
82*055d4590SKeyi Gui
83*055d4590SKeyi Gui<h3>Preparation</h3>
84*055d4590SKeyi Gui
85*055d4590SKeyi Gui<p>
86*055d4590SKeyi GuiThere are at least three different ways to create a "prepared" DEX file,
87*055d4590SKeyi Guisometimes known as "ODEX" (for Optimized DEX):
88*055d4590SKeyi Gui<ol>
89*055d4590SKeyi Gui    <li>The VM does it "just in time".  The output goes into a special
90*055d4590SKeyi Gui    <code>dalvik-cache</code> directory.  This works on the desktop and
91*055d4590SKeyi Gui    engineering-only device builds where the permissions on the
92*055d4590SKeyi Gui    <code>dalvik-cache</code> directory are not restricted.  On production
93*055d4590SKeyi Gui    devices, this is not allowed.
94*055d4590SKeyi Gui    <li>The system installer does it when an application is first added.
95*055d4590SKeyi Gui    It has the privileges required to write to <code>dalvik-cache</code>.
96*055d4590SKeyi Gui    <li>The build system does it ahead of time.  The relevant <code>jar</code>
97*055d4590SKeyi Gui    / <code>apk</code> files are present, but the <code>classes.dex</code>
98*055d4590SKeyi Gui    is stripped out.  The optimized DEX is stored next to the original
99*055d4590SKeyi Gui    zip archive, not in <code>dalvik-cache</code>, and is part of the
100*055d4590SKeyi Gui    system image.
101*055d4590SKeyi Gui</ol>
102*055d4590SKeyi Gui<p>
103*055d4590SKeyi GuiThe <code>dalvik-cache</code> directory is more accurately
104*055d4590SKeyi Gui<code>$ANDROID_DATA/data/dalvik-cache</code>.  The files inside it have
105*055d4590SKeyi Guinames derived from the full path of the source DEX.  On the device the
106*055d4590SKeyi Guidirectory is owned by <code>system</code> / <code>system</code>
107*055d4590SKeyi Guiand has 0771 permissions, and the optimized DEX files stored there are
108*055d4590SKeyi Guiowned by <code>system</code> and the
109*055d4590SKeyi Guiapplication's group, with 0644 permissions.  DRM-locked applications will
110*055d4590SKeyi Guiuse 640 permissions to prevent other user applications from examining them.
111*055d4590SKeyi GuiThe bottom line is that you can read your own DEX file and those of most
112*055d4590SKeyi Guiother applications, but you cannot create, modify, or remove them.
113*055d4590SKeyi Gui<p>
114*055d4590SKeyi GuiPreparation of the DEX file for the "just in time" and "system installer"
115*055d4590SKeyi Guiapproaches proceeds in three steps:
116*055d4590SKeyi Gui<p>
117*055d4590SKeyi GuiFirst, the dalvik-cache file is created.  This must be done in a process
118*055d4590SKeyi Guiwith appropriate privileges, so for the "system installer" case this is
119*055d4590SKeyi Guidone within <code>installd</code>, which runs as root.
120*055d4590SKeyi Gui<p>
121*055d4590SKeyi GuiSecond, the <code>classes.dex</code> entry is extracted from the the zip
122*055d4590SKeyi Guiarchive.  A small amount of space is left at the start of the file for
123*055d4590SKeyi Guithe ODEX header.
124*055d4590SKeyi Gui<p>
125*055d4590SKeyi GuiThird, the file is memory-mapped for easy access and tweaked for use on
126*055d4590SKeyi Guithe current system.  This includes byte-swapping and structure realigning,
127*055d4590SKeyi Guibut no meaningful changes to the DEX file.  We also do some basic
128*055d4590SKeyi Guistructure checks, such as ensuring that file offsets and data indices
129*055d4590SKeyi Guifall within valid ranges.
130*055d4590SKeyi Gui<p>
131*055d4590SKeyi GuiThe build system uses a hairy process that involves starting the
132*055d4590SKeyi Guiemulator, forcing just-in-time optimization of all relevant DEX files,
133*055d4590SKeyi Guiand then extracting the results from <code>dalvik-cache</code>.  The
134*055d4590SKeyi Guireasons for doing this, rather than using a tool that runs on the desktop,
135*055d4590SKeyi Guiwill become more apparent when the optimizations are explained.
136*055d4590SKeyi Gui<p>
137*055d4590SKeyi GuiOnce the code is byte-swapped and aligned, we're ready to go.  We append
138*055d4590SKeyi Guisome pre-computed data, fill in the ODEX header at the start of the file,
139*055d4590SKeyi Guiand start executing.  (The header is filled in last, so that we don't
140*055d4590SKeyi Guitry to use a partial file.)  If we're interested in verification and
141*055d4590SKeyi Guioptimization, however, we need to insert a step after the initial prep.
142*055d4590SKeyi Gui
143*055d4590SKeyi Gui<h3>dexopt</h3>
144*055d4590SKeyi Gui
145*055d4590SKeyi Gui<p>
146*055d4590SKeyi GuiWe want to verify and optimize all of the classes in the DEX file.  The
147*055d4590SKeyi Guieasiest and safest way to do this is to load all of the classes into
148*055d4590SKeyi Guithe VM and run through them.  Anything that fails to load is simply not
149*055d4590SKeyi Guiverified or optimized.  Unfortunately, this can cause allocation of some
150*055d4590SKeyi Guiresources that are difficult to release (e.g. loading of native shared
151*055d4590SKeyi Guilibraries), so we don't want to do it in the same virtual machine that
152*055d4590SKeyi Guiwe're running applications in.
153*055d4590SKeyi Gui<p>
154*055d4590SKeyi GuiThe solution is to invoke a program called <code>dexopt</code>, which
155*055d4590SKeyi Guiis really just a back door into the VM.  It performs an abbreviated VM
156*055d4590SKeyi Guiinitialization, loads zero or more DEX files from the bootstrap class
157*055d4590SKeyi Guipath, and then sets about verifying and optimizing whatever it can from
158*055d4590SKeyi Guithe target DEX.  On completion, the process exits, freeing all resources.
159*055d4590SKeyi Gui<p>
160*055d4590SKeyi GuiIt is possible for multiple VMs to want the same DEX file at the same
161*055d4590SKeyi Guitime.  File locking is used to ensure that dexopt is only run once.
162*055d4590SKeyi Gui
163*055d4590SKeyi Gui
164*055d4590SKeyi Gui<h2>Verification</h2>
165*055d4590SKeyi Gui
166*055d4590SKeyi Gui<p>
167*055d4590SKeyi GuiThe bytecode verification process involves scanning through the instructions
168*055d4590SKeyi Guiin every method in every class in a DEX file.  The goal is to identify
169*055d4590SKeyi Guiillegal instruction sequences so that we don't have to check for them at
170*055d4590SKeyi Guirun time.  Many of the computations involved are also necessary for "exact"
171*055d4590SKeyi Guigarbage collection.  See
172*055d4590SKeyi Gui<a href="verifier.html">Dalvik Bytecode Verifier Notes</a> for more
173*055d4590SKeyi Guiinformation.
174*055d4590SKeyi Gui<p>
175*055d4590SKeyi GuiFor performance reasons, the optimizer (described in the next section)
176*055d4590SKeyi Guiassumes that the verifier has run successfully, and makes some potentially
177*055d4590SKeyi Guiunsafe assumptions.  By default, Dalvik insists upon verifying all classes,
178*055d4590SKeyi Guiand only optimizes classes that have been verified.  If you want to
179*055d4590SKeyi Guidisable the verifier, you can use command-line flags to do so.  See also
180*055d4590SKeyi Gui<a href="embedded-vm-control.html"> Controlling the Embedded VM</a>
181*055d4590SKeyi Guifor instructions on controlling these
182*055d4590SKeyi Guifeatures within the Android application framework.
183*055d4590SKeyi Gui<p>
184*055d4590SKeyi GuiReporting of verification failures is a tricky issue.  For example,
185*055d4590SKeyi Guicalling a package-scope method on a class in a different package is
186*055d4590SKeyi Guiillegal and will be caught by the verifier.  We don't necessarily want
187*055d4590SKeyi Guito report it during verification though -- we actually want to throw
188*055d4590SKeyi Guian exception when the method call is attempted.  Checking the access
189*055d4590SKeyi Guiflags on every method call is expensive though.  The
190*055d4590SKeyi Gui<a href="verifier.html">Dalvik Bytecode Verifier Notes</a> document
191*055d4590SKeyi Guiaddresses this issue.
192*055d4590SKeyi Gui<p>
193*055d4590SKeyi GuiClasses that have been verified successfully have a flag set in the ODEX.
194*055d4590SKeyi GuiThey will not be re-verified when loaded.  The Linux access permissions
195*055d4590SKeyi Guiare expected to prevent tampering; if you can get around those, installing
196*055d4590SKeyi Guifaulty bytecode is far from the easiest line of attack.  The ODEX file has
197*055d4590SKeyi Guia 32-bit checksum, but that's chiefly present as a quick check for
198*055d4590SKeyi Guicorrupted data.
199*055d4590SKeyi Gui
200*055d4590SKeyi Gui
201*055d4590SKeyi Gui<h2>Optimization</h2>
202*055d4590SKeyi Gui
203*055d4590SKeyi Gui<p>
204*055d4590SKeyi GuiVirtual machine interpreters typically perform certain optimizations the
205*055d4590SKeyi Guifirst time a piece of code is used.  Constant pool references are replaced
206*055d4590SKeyi Guiwith pointers to internal data structures, operations that always succeed
207*055d4590SKeyi Guior always work a certain way are replaced with simpler forms.  Some of
208*055d4590SKeyi Guithese require information only available at runtime, others can be inferred
209*055d4590SKeyi Guistatically when certain assumptions are made.
210*055d4590SKeyi Gui<p>
211*055d4590SKeyi GuiThe Dalvik optimizer does the following:
212*055d4590SKeyi Gui<ul>
213*055d4590SKeyi Gui    <li>For virtual method calls, replace the method index with a
214*055d4590SKeyi Gui    vtable index.
215*055d4590SKeyi Gui    <li>For instance field get/put, replace the field index with
216*055d4590SKeyi Gui    a byte offset.  Also, merge the boolean / byte / char / short
217*055d4590SKeyi Gui    variants into a single 32-bit form (less code in the interpreter
218*055d4590SKeyi Gui    means more room in the CPU I-cache).
219*055d4590SKeyi Gui    <li>Replace a handful of high-volume calls, like String.length(),
220*055d4590SKeyi Gui    with "inline" replacements.  This skips the usual method call
221*055d4590SKeyi Gui    overhead, directly switching from the interpreter to a native
222*055d4590SKeyi Gui    implementation.
223*055d4590SKeyi Gui    <li>Prune empty methods.  The simplest example is
224*055d4590SKeyi Gui    <code>Object.&lt;init&gt;</code>, which does nothing, but must be
225*055d4590SKeyi Gui    called whenever any object is allocated.  The instruction is
226*055d4590SKeyi Gui    replaced with a new version that acts as a no-op unless a debugger
227*055d4590SKeyi Gui    is attached.
228*055d4590SKeyi Gui    <li>Append pre-computed data.  For example, the VM wants to have a
229*055d4590SKeyi Gui    hash table for lookups on class name.  Instead of computing this
230*055d4590SKeyi Gui    when the DEX file is loaded, we can compute it now, saving heap
231*055d4590SKeyi Gui    space and computation time in every VM where the DEX is loaded.
232*055d4590SKeyi Gui</ul>
233*055d4590SKeyi Gui
234*055d4590SKeyi Gui<p>
235*055d4590SKeyi GuiAll of the instruction modifications involve replacing the opcode with
236*055d4590SKeyi Guione not defined by the Dalvik specification.  This allows us to freely
237*055d4590SKeyi Guimix optimized and unoptimized instructions.  The set of optimized
238*055d4590SKeyi Guiinstructions, and their exact representation, is tied closely to the VM
239*055d4590SKeyi Guiversion.
240*055d4590SKeyi Gui<p>
241*055d4590SKeyi GuiMost of the optimizations are obvious "wins".  The use of raw indices
242*055d4590SKeyi Guiand offsets not only allows us to execute more quickly, we can also
243*055d4590SKeyi Guiskip the initial symbolic resolution.  Pre-computation eats up
244*055d4590SKeyi Guidisk space, and so must be done in moderation.
245*055d4590SKeyi Gui<p>
246*055d4590SKeyi GuiThere are a couple of potential sources of trouble with these
247*055d4590SKeyi Guioptimizations.  First, vtable indices and byte offsets are subject to
248*055d4590SKeyi Guichange if the VM is updated.  Second, if a superclass is in a different
249*055d4590SKeyi GuiDEX, and that other DEX is updated, we need to ensure that our optimized
250*055d4590SKeyi Guiindices and offsets are updated as well.  A similar but more subtle
251*055d4590SKeyi Guiproblem emerges when user-defined class loaders are employed: the class
252*055d4590SKeyi Guiwe actually call may not be the one we expected to call.
253*055d4590SKeyi Gui<p>These problems are addressed with dependency lists and some limitations
254*055d4590SKeyi Guion what can be optimized.
255*055d4590SKeyi Gui
256*055d4590SKeyi Gui
257*055d4590SKeyi Gui<h2>Dependencies and Limitations</h2>
258*055d4590SKeyi Gui
259*055d4590SKeyi Gui<p>
260*055d4590SKeyi GuiThe optimized DEX file includes a list of dependencies on other DEX files,
261*055d4590SKeyi Guiplus the CRC-32 and modification date from the originating
262*055d4590SKeyi Gui<code>classes.dex</code> zip file entry.  The dependency list includes the
263*055d4590SKeyi Guifull path to the <code>dalvik-cache</code> file, and the file's SHA-1
264*055d4590SKeyi Guisignature.  The timestamps of files on the device are unreliable and
265*055d4590SKeyi Guinot used.  The dependency area also includes the VM version number.
266*055d4590SKeyi Gui<p>
267*055d4590SKeyi GuiAn optimized DEX is dependent upon all of the DEX files in the bootstrap
268*055d4590SKeyi Guiclass path.  DEX files that are part of the bootstrap class path depend
269*055d4590SKeyi Guiupon the DEX files that appeared earlier.  To ensure that nothing outside
270*055d4590SKeyi Guithe dependent DEX files is available, <code>dexopt</code> only loads the
271*055d4590SKeyi Guibootstrap classes.  References to classes in other DEX files fail, which
272*055d4590SKeyi Guicauses class loading and/or verification to fail, and classes with
273*055d4590SKeyi Guiexternal dependencies are simply not optimized.
274*055d4590SKeyi Gui<p>
275*055d4590SKeyi GuiThis means that splitting code out into many separate DEX files has a
276*055d4590SKeyi Guidisadvantage: virtual method calls and instance field lookups between
277*055d4590SKeyi Guinon-boot DEX files can't be optimized.  Because verification is pass/fail
278*055d4590SKeyi Guiwith class granularity, no method in a class that has any reliance on
279*055d4590SKeyi Guiclasses in external DEX files can be optimized.  This may be a bit
280*055d4590SKeyi Guiheavy-handed, but it's the only way to guarantee that nothing breaks
281*055d4590SKeyi Guiwhen individual pieces are updated.
282*055d4590SKeyi Gui<p>
283*055d4590SKeyi GuiAnother negative consequence: any change to a bootstrap DEX will result
284*055d4590SKeyi Guiin rejection of all optimized DEX files.  This makes it hard to keep
285*055d4590SKeyi Guisystem updates small.
286*055d4590SKeyi Gui<p>
287*055d4590SKeyi GuiDespite our caution, there is still a possibility that a class in a DEX
288*055d4590SKeyi Guifile loaded by a user-defined class loader could ask for a bootstrap class
289*055d4590SKeyi Gui(say, String) and be given a different class with the same name.  If a
290*055d4590SKeyi Guiclass in the DEX file being processed has the same name as a class in the
291*055d4590SKeyi Guibootstrap DEX files, the class will be flagged as ambiguous and references
292*055d4590SKeyi Guito it will not be resolved during verification / optimization.  The class
293*055d4590SKeyi Guilinking code in the VM does additional checks to plug another hole;
294*055d4590SKeyi Guisee the verbose description in the VM sources for details (vm/oo/Class.c).
295*055d4590SKeyi Gui<p>
296*055d4590SKeyi GuiIf one of the dependencies is updated, we need to re-verify and
297*055d4590SKeyi Guire-optimize the DEX file.  If we can do a just-in-time <code>dexopt</code>
298*055d4590SKeyi Guiinvocation, this is easy.  If we have to rely on the installer daemon, or
299*055d4590SKeyi Guithe DEX was shipped only in ODEX, then the VM has to reject the DEX.
300*055d4590SKeyi Gui<p>
301*055d4590SKeyi GuiThe output of <code>dexopt</code> is byte-swapped and struct-aligned
302*055d4590SKeyi Guifor the host, and contains indices and offsets that are highly VM-specific
303*055d4590SKeyi Gui(both version-wise and platform-wise).  For this reason it's tricky to
304*055d4590SKeyi Guiwrite a version of <code>dexopt</code> that runs on the desktop but
305*055d4590SKeyi Guigenerates output suitable for a particular device.  The safest way to
306*055d4590SKeyi Guiinvoke it is on the target device, or on an emulator for that device.
307*055d4590SKeyi Gui
308*055d4590SKeyi Gui
309*055d4590SKeyi Gui<h2>Generated DEX</h2>
310*055d4590SKeyi Gui
311*055d4590SKeyi Gui<p>
312*055d4590SKeyi GuiSome languages and frameworks rely on the ability to generate bytecode
313*055d4590SKeyi Guiand execute it.  The rather heavy <code>dexopt</code> verification and
314*055d4590SKeyi Guioptimization model doesn't work well with that.
315*055d4590SKeyi Gui<p>
316*055d4590SKeyi GuiWe intend to support this in a future release, but the exact method is
317*055d4590SKeyi Guito be determined.  We may allow individual classes to be added or whole
318*055d4590SKeyi GuiDEX files; may allow Java bytecode or Dalvik bytecode in instructions;
319*055d4590SKeyi Guimay perform the usual set of optimizations, or use a separate interpreter
320*055d4590SKeyi Guithat performs on-first-use optimizations directly on the bytecode (which
321*055d4590SKeyi Guiwon't be mapped read-only, since it's locally defined).
322*055d4590SKeyi Gui
323*055d4590SKeyi Gui<address>Copyright &copy; 2008 The Android Open Source Project</address>
324*055d4590SKeyi Gui
325*055d4590SKeyi Gui</body>
326*055d4590SKeyi Gui</html>
327