# Simpleperf

Android Studio includes a graphical front end to Simpleperf, documented in
[Inspect CPU activity with CPU Profiler](https://developer.android.com/studio/profile/cpu-profiler).
Most users will prefer to use that instead of using Simpleperf directly.

Simpleperf is a native CPU profiling tool for Android. It can be used to profile
both Android applications and native processes running on Android. It can
profile both Java and C++ code on Android. The simpleperf executable can run on Android >= L,
and Python scripts can be used on Android >= N.

Simpleperf is part of the Android Open Source Project.
The source code is [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/).
The latest document is [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/README.md).

[TOC]

## Introduction

An introduction slide deck is [here](./introduction.pdf).

Simpleperf contains two parts: the simpleperf executable and Python scripts.

The simpleperf executable works similarly to linux-tools-perf, but has some specific features for
the Android profiling environment:

1. It collects more info in profiling data. Since the common workflow is "record on the device, and
   report on the host", simpleperf not only collects samples in profiling data, but also collects
   needed symbols, device info and recording time.

2. It delivers new features for recording.
   1) When recording DWARF based call graphs, simpleperf unwinds the stack before writing a sample
      to file. This saves storage space on the device.
   2) Supports tracing both on-CPU time and off-CPU time with the --trace-offcpu option.
   3) Supports recording call graphs of JITed and interpreted Java code on Android >= P.

3. It relates closely to the Android platform.
   1) Is aware of the Android environment, like using system properties to enable profiling, and
      using run-as to profile in an application's context.
   2) Supports reading symbols and debug information from the .gnu_debugdata section, because
      system libraries are built with a .gnu_debugdata section starting from Android O.
   3) Supports profiling shared libraries embedded in apk files.
   4) It uses the standard Android stack unwinder, so its results are consistent with all other
      Android tools.

4. It builds executables and shared libraries for different usages.
   1) Builds static executables for the device. Since static executables don't rely on any library,
      simpleperf executables can be pushed to any Android device and used to record profiling data.
   2) Builds executables for different hosts: Linux, Mac and Windows. These executables can be used
      to report on hosts.
   3) Builds report shared libraries for different hosts. The report library is used by different
      Python scripts to parse profiling data.

Detailed documentation for the simpleperf executable is [here](#executable-commands-reference).

Python scripts are split into three parts according to their functions:

1. Scripts used for recording, like app_profiler.py, run_simpleperf_without_usb_connection.py.

2. Scripts used for reporting, like report.py, report_html.py, inferno.

3. Scripts used for parsing profiling data, like simpleperf_report_lib.py.

The Python scripts are tested on Python >= 3.9. Older versions may not be supported.
Detailed documentation for the Python scripts is [here](#scripts-reference).


## Tools in simpleperf

The simpleperf executables and Python scripts are located in simpleperf/ in NDK releases, and in
system/extras/simpleperf/scripts/ in AOSP. Their functions are listed below.

bin/: contains executables and shared libraries.

bin/android/${arch}/simpleperf: static simpleperf executables used on the device.

bin/${host}/${arch}/simpleperf: simpleperf executables used on the host, which only support reporting.

bin/${host}/${arch}/libsimpleperf_report.${so/dylib/dll}: report shared libraries used on the host.

*.py, inferno, purgatorio: Python scripts used for recording and reporting. Details are in [scripts_reference.md](scripts_reference.md).
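
The layout above can be turned into concrete paths. The following is a minimal sketch, not part of simpleperf: the helper names `device_simpleperf` and `host_report_lib` are hypothetical, and the host directory names `darwin` and `windows` are assumptions (only `linux` and the three file extensions appear in this document).

```python
from pathlib import PurePosixPath

# Report library extension per host OS, following the ${so/dylib/dll} pattern.
# The "darwin" and "windows" directory names are assumptions for illustration.
REPORT_LIB_EXT = {"linux": "so", "darwin": "dylib", "windows": "dll"}


def device_simpleperf(arch: str) -> PurePosixPath:
    """Relative path of the static simpleperf executable for a device arch."""
    return PurePosixPath("bin") / "android" / arch / "simpleperf"


def host_report_lib(host: str, arch: str) -> PurePosixPath:
    """Relative path of the report shared library for a host OS and arch."""
    ext = REPORT_LIB_EXT[host]
    return PurePosixPath("bin") / host / arch / f"libsimpleperf_report.{ext}"


print(device_simpleperf("arm64"))          # bin/android/arm64/simpleperf
print(host_report_lib("linux", "x86_64"))  # bin/linux/x86_64/libsimpleperf_report.so
```

These two results correspond to the `bin/android/arm64/simpleperf` and `bin/linux/x86_64/libsimpleperf_report.so` locations used when updating the script binaries at the end of this document.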


## Android application profiling

See [android_application_profiling.md](./android_application_profiling.md).


## Android platform profiling

See [android_platform_profiling.md](./android_platform_profiling.md).


## Executable commands reference

See [executable_commands_reference.md](./executable_commands_reference.md).


## Scripts reference

See [scripts_reference.md](./scripts_reference.md).

## View the profile

See [view_the_profile.md](./view_the_profile.md).

## Answers to common issues

### Support on different Android versions

On Android < N, the kernel may be too old (< 3.18) to support features like recording DWARF
based call graphs.
On Android M - O, we can only profile C++ code and fully compiled Java code.
On Android >= P, the ART interpreter supports DWARF based unwinding, so we can profile Java code.
On Android >= Q, we can use the simpleperf shipped on the device to profile released Android apps, with
  `<profileable android:shell="true" />`.


### Comparing DWARF based and stack frame based call graphs

Simpleperf supports two ways of recording call stacks with samples. One is DWARF based call graph,
the other is stack frame based call graph. Below is their comparison:

Recording DWARF based call graph:
1. Needs support of debug information in binaries.
2. Usually works well on both ARM and ARM64, for both Java code and C++ code.
3. Can only unwind 64K of stack for each sample. So it isn't always possible to unwind to the bottom.
   However, this is alleviated in simpleperf, as explained in the next section.
4. Takes more CPU time than stack frame based call graphs. So it has higher overhead, and can't
   sample at very high frequency (usually <= 4000 Hz).

Recording stack frame based call graph:
1. Needs support of stack frame registers.
2. Doesn't work well on ARM, because ARM is short of registers, and ARM and THUMB code have
   different stack frame registers. So the kernel can't unwind a user stack containing both ARM and
   THUMB code.
3. Also doesn't work well on Java code, because the ART compiler doesn't reserve stack frame
   registers, and it can't get frames for interpreted Java code.
4. Works well when profiling native programs on ARM64. One example is profiling surfaceflinger. It
   usually shows a complete flamegraph when it works well.
5. Takes much less CPU time than DWARF based call graphs. So the sample frequency can be 10000 Hz or
   higher.

So if you need to profile code on ARM or profile Java code, DWARF based call graph is better. If you
need to profile C++ code on ARM64, stack frame based call graphs may be better.
Either way, you can
first try the DWARF based call graph, which is also the default option when `-g` is used, because it
always produces reasonable results. If it doesn't work well enough, then try the stack frame based
call graph instead.


### Fix broken DWARF based call graph

A DWARF-based call graph is generated by unwinding thread stacks. When a sample is recorded, the
kernel dumps up to 64 kilobytes of stack data. By unwinding the stack based on DWARF information,
we can get a call stack.

Two reasons may cause a broken call stack:
1. The kernel can only dump up to 64 kilobytes of stack data for each sample, but a thread can have
   a much larger stack. In this case, we can't unwind to the thread start point.

2. We need binaries containing DWARF call frame information to unwind stack frames. The binary
   should have one of the following sections: .eh_frame, .debug_frame, .ARM.exidx or .gnu_debugdata.

These problems can be mitigated as follows.


For the missing stack data problem:
1. To alleviate it, simpleperf joins callchains (call stacks) after recording. If two callchains of
   a thread have an entry containing the same ip and sp address, then simpleperf tries to join them
   to make the callchains longer. So we can get more complete callchains by recording longer and
   joining more samples. This doesn't guarantee complete call graphs, but it usually works
   well.

2. Simpleperf stores samples in a buffer before unwinding them. If the buffer is low in free space,
   simpleperf may decide to truncate stack data for a sample to 1K. Hopefully, this can be recovered
   by the callchain joiner. But when a high percentage of samples are truncated, many callchains can
   be broken. We can tell if many samples are truncated in the record command output, like:

```sh
$ simpleperf record ...
simpleperf I cmd_record.cpp:809] Samples recorded: 105584 (cut 86291). Samples lost: 6501.

$ simpleperf record ...
simpleperf I cmd_record.cpp:894] Samples recorded: 7,365 (1,857 with truncated stacks).
```

   There are two ways to avoid truncating samples.
   One is increasing the buffer size, like
   `--user-buffer-size 1G`. But `--user-buffer-size` is only available in recent versions of
   simpleperf. If that option isn't available, the other way is to use `--no-cut-samples` to disable
   truncating samples.

For the missing DWARF call frame info problem:
1. Most C++ code generates binaries containing call frame info, in .eh_frame or .ARM.exidx sections.
   These sections are not stripped, and are usually enough for stack unwinding.

2. For C code and a small percentage of C++ code that the compiler is sure will not generate
   exceptions, the call frame info is generated in the .debug_frame section. The .debug_frame section
   is usually stripped with other debug sections. One way to fix it is to push unstripped binaries
   to the device, as [here](#fix-broken-callchain-stopped-at-c-functions).

3. The compiler doesn't generate unwind instructions for function prologues and epilogues, because
   they operate on stack frames and will not generate exceptions. But profiling may hit these
   instructions and fail to unwind them. This usually doesn't matter in a flame graph. But in a
   time based Stack Chart (like in Android Studio and Firefox profiler), this causes stack gaps once
   in a while.
   We can remove stack gaps via `--remove-gaps`, which is already enabled by default.


### Fix broken callchain stopped at C functions

When using DWARF based call graphs, simpleperf generates callchains during recording to save space.
The debug information needed to unwind C functions is in the .debug_frame section, which is usually
stripped in native libraries in apks. To fix this, we can push unstripped versions of the native
libraries to the device, and ask simpleperf to use them when recording.

To use simpleperf directly:

```sh
# create native_libs dir on device, and push unstripped libs in it (nested dirs are not supported).
$ adb shell mkdir /data/local/tmp/native_libs
$ adb push <unstripped_dir>/*.so /data/local/tmp/native_libs
# run simpleperf record with --symfs option.
$ adb shell simpleperf record xxx --symfs /data/local/tmp/native_libs
```

To use app_profiler.py:

```sh
$ ./app_profiler.py -lib <unstripped_dir>
```


### How to solve missing symbols in report?

The simpleperf record command collects symbols on device in perf.data. But if the native libraries
you use on device are stripped, this will result in a lot of unknown symbols in the report. A
solution is to build binary_cache on host.

```sh
# Collect binaries needed by perf.data in binary_cache/.
$ ./binary_cache_builder.py -lib NATIVE_LIB_DIR,...
```

The NATIVE_LIB_DIRs passed in the -lib option are the directories containing unstripped native
libraries on host. After running it, the native libraries containing symbol tables are collected
in binary_cache/ for use when reporting.

```sh
$ ./report.py --symfs binary_cache

# report_html.py searches binary_cache/ automatically, so you don't need to
# pass it any argument.
$ ./report_html.py
```


### Show annotated source code and disassembly

To show hot places at source code and instruction level, we need to show source code and
disassembly with event count annotation. Simpleperf supports showing annotated source code and
disassembly for C++ code and fully compiled Java code. Simpleperf supports the following ways to do it:

1. Through report_html.py:
   1) Generate perf.data and pull it to the host.
   2) Generate binary_cache, containing elf files with debug information. Use the -lib option to add
      libs with debug info. Do it with
      `binary_cache_builder.py -i perf.data -lib <dir_of_lib_with_debug_info>`.
      For Android platform, we can add debug binaries as in [Android platform profiling](android_platform_profiling.md#general-tips).
   3) Use report_html.py to generate report.html with annotated source code and disassembly,
      as described [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/scripts_reference.md#report_html_py).

2. Through pprof.
   1) Generate perf.data and binary_cache as above.
   2) Use pprof_proto_generator.py to generate a pprof proto file: `pprof_proto_generator.py`.
   3) Use pprof to report a function with annotated source code, as described [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/scripts_reference.md#pprof_proto_generator_py).

3. Through Continuous PProf UI.
   1) Generate a pprof proto file as above.
   2) Upload pprof.profile to pprof/. It can show source file path and line numbers for each symbol.
      An example is [here](https://pprof.corp.google.com/?id=f5588600d3a225737a1901cb28f3f5b1).


### Reduce lost samples and samples with truncated stack

When using `simpleperf record`, we may see lost samples or samples with truncated stack data. Before
saving samples to a file, simpleperf uses two buffers to cache samples in memory. One is a kernel
buffer, the other is a userspace buffer.
The kernel puts samples into the kernel buffer. Simpleperf
moves samples from the kernel buffer to the userspace buffer before processing them. If a buffer
overflows, we lose samples or get samples with truncated stack data. Below is an example.

```sh
$ simpleperf record -a --duration 1 -g --user-buffer-size 100k
simpleperf I cmd_record.cpp:799] Recorded for 1.00814 seconds. Start post processing.
simpleperf I cmd_record.cpp:894] Samples recorded: 79 (16 with truncated stacks).
                                 Samples lost: 2,129 (kernelspace: 18, userspace: 2,111).
simpleperf W cmd_record.cpp:911] Lost 18.5567% of samples in kernel space, consider increasing
                                 kernel buffer size(-m), or decreasing sample frequency(-f), or
                                 increasing sample period(-c).
simpleperf W cmd_record.cpp:928] Lost/Truncated 97.1233% of samples in user space, consider
                                 increasing userspace buffer size(--user-buffer-size), or
                                 decreasing sample frequency(-f), or increasing sample period(-c).
```

In the above example, we get 79 samples, 16 of which have truncated stack data. We lose 18
samples in the kernel buffer, and lose 2111 samples in the userspace buffer.
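
The two percentages in the warnings can be reproduced from the counts printed in the same output. The sketch below is inferred from this single example rather than taken from the simpleperf source, so the formulas are an assumption that happens to match these numbers; the variable names are illustrative only.

```python
# Counts copied from the example output above.
recorded = 79        # samples written to perf.data
truncated = 16       # recorded samples whose stack data was truncated
lost_kernel = 18     # samples dropped in the kernel buffer
lost_user = 2111     # samples dropped in the userspace buffer

# Assumed formula: kernel-space loss relative to recorded plus kernel-lost samples.
kernel_loss_pct = 100 * lost_kernel / (recorded + lost_kernel)

# Assumed formula: lost or truncated samples relative to every sample that
# reached the userspace buffer (recorded plus lost in userspace).
user_loss_pct = 100 * (lost_user + truncated) / (recorded + lost_user)

print(f"kernel space: {kernel_loss_pct:.4f}%")  # kernel space: 18.5567%
print(f"user space:   {user_loss_pct:.4f}%")    # user space:   97.1233%
```

Read this way, almost every sample that reached userspace in this run was either dropped or cut, which is expected with a userspace buffer as small as 100k.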

To reduce lost samples in the kernel buffer, we can increase the kernel buffer size via `-m`. To
reduce lost samples in the userspace buffer, or reduce samples with truncated stack data, we can
increase the userspace buffer size via `--user-buffer-size`.

We can also reduce the number of samples generated in a fixed time period, by reducing the sample
frequency using `-f`, reducing the number of monitored threads, or not monitoring multiple perf
events at the same time.


## Bugs and contribution

Bugs and feature requests can be submitted at https://github.com/android/ndk/issues.
Patches can be uploaded to android-review.googlesource.com as [here](https://source.android.com/setup/contribute/),
or sent to email addresses listed [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/OWNERS).

If you want to compile simpleperf C++ source code, follow the steps below:
1. Download AOSP main branch as [here](https://source.android.com/setup/build/requirements).
2. Build simpleperf.
```sh
$ . build/envsetup.sh
$ lunch aosp_arm64-trunk_staging-userdebug
$ mmma system/extras/simpleperf -j30
```

If built successfully, out/target/product/generic_arm64/system/bin/simpleperf is for ARM64, and
out/target/product/generic_arm64/system/bin/simpleperf32 is for ARM.

The source code of simpleperf python scripts is in [system/extras/simpleperf/scripts](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/scripts/).
Most scripts rely on simpleperf binaries to work. To update binaries for scripts (using linux
x86_64 host and android arm64 target as an example):
```sh
$ cp out/host/linux-x86/lib64/libsimpleperf_report.so system/extras/simpleperf/scripts/bin/linux/x86_64/libsimpleperf_report.so
$ cp out/target/product/generic_arm64/system/bin/simpleperf_ndk64 system/extras/simpleperf/scripts/bin/android/arm64/simpleperf
```

Then you can try the latest simpleperf scripts and binaries in system/extras/simpleperf/scripts.