1*6dbdd20aSAndroid Build Coastguard Worker# Buffers and dataflow 2*6dbdd20aSAndroid Build Coastguard Worker 3*6dbdd20aSAndroid Build Coastguard WorkerThis page describes the dataflow in Perfetto when recording traces. It describes 4*6dbdd20aSAndroid Build Coastguard Workerall the buffering stages, explains how to size the buffers and how to debug 5*6dbdd20aSAndroid Build Coastguard Workerdata losses. 6*6dbdd20aSAndroid Build Coastguard Worker 7*6dbdd20aSAndroid Build Coastguard Worker## Concepts 8*6dbdd20aSAndroid Build Coastguard Worker 9*6dbdd20aSAndroid Build Coastguard WorkerTracing in Perfetto is an asynchronous multiple-writer single-reader pipeline. 10*6dbdd20aSAndroid Build Coastguard WorkerIn many senses, its architecture is very similar to modern GPUs' command 11*6dbdd20aSAndroid Build Coastguard Workerbuffers. 12*6dbdd20aSAndroid Build Coastguard Worker 13*6dbdd20aSAndroid Build Coastguard WorkerThe design principles of the tracing dataflow are: 14*6dbdd20aSAndroid Build Coastguard Worker 15*6dbdd20aSAndroid Build Coastguard Worker* The tracing fastpath is based on direct writes into a shared memory buffer. 16*6dbdd20aSAndroid Build Coastguard Worker* Highly optimized for low-overhead writing. NOT optimized for low-latency 17*6dbdd20aSAndroid Build Coastguard Worker reading. 18*6dbdd20aSAndroid Build Coastguard Worker* Trace data is eventually committed in the central trace buffer by the end 19*6dbdd20aSAndroid Build Coastguard Worker of the trace or when explicit flush requests are issued via the IPC channel. 20*6dbdd20aSAndroid Build Coastguard Worker* Producers are untrusted and should not be able to see each-other's trace data, 21*6dbdd20aSAndroid Build Coastguard Worker as that would leak sensitive information. 22*6dbdd20aSAndroid Build Coastguard Worker 23*6dbdd20aSAndroid Build Coastguard WorkerIn the general case, there are two types buffers involved in a trace. When 24*6dbdd20aSAndroid Build Coastguard Workerpulling data from the Linux kernel's ftrace infrastructure, there is a third 25*6dbdd20aSAndroid Build Coastguard Workerstage of buffering (one per-CPU) involved: 26*6dbdd20aSAndroid Build Coastguard Worker 27*6dbdd20aSAndroid Build Coastguard Worker 28*6dbdd20aSAndroid Build Coastguard Worker 29*6dbdd20aSAndroid Build Coastguard Worker#### Tracing service's central buffers 30*6dbdd20aSAndroid Build Coastguard Worker 31*6dbdd20aSAndroid Build Coastguard WorkerThese buffers (yellow, in the picture above) are defined by the user in the 32*6dbdd20aSAndroid Build Coastguard Worker`buffers` section of the [trace config](config.md). In the most simple cases, 33*6dbdd20aSAndroid Build Coastguard Workerone tracing session = one buffer, regardless of the number of data sources and 34*6dbdd20aSAndroid Build Coastguard Workerproducers. 35*6dbdd20aSAndroid Build Coastguard Worker 36*6dbdd20aSAndroid Build Coastguard WorkerThis is the place where the tracing data is ultimately kept, while in memory, 37*6dbdd20aSAndroid Build Coastguard Workerwhether it comes from the kernel ftrace infrastructure, from some other data 38*6dbdd20aSAndroid Build Coastguard Workersource in `traced_probes` or from another userspace process using the 39*6dbdd20aSAndroid Build Coastguard Worker[Perfetto SDK](/docs/instrumentation/tracing-sdk.md). 40*6dbdd20aSAndroid Build Coastguard WorkerAt the end of the trace (or during, if in [streaming mode]) these buffers are 41*6dbdd20aSAndroid Build Coastguard Workerwritten into the output trace file. 42*6dbdd20aSAndroid Build Coastguard Worker 43*6dbdd20aSAndroid Build Coastguard WorkerThese buffers can contain a mixture of trace packets coming from different data 44*6dbdd20aSAndroid Build Coastguard Workersources and even different producer processes. What-goes-where is defined in the 45*6dbdd20aSAndroid Build Coastguard Worker[buffers mapping section](config.md#dynamic-buffer-mapping) of the trace config. 46*6dbdd20aSAndroid Build Coastguard WorkerBecause of this, the tracing buffers are not shared across processes, to avoid 47*6dbdd20aSAndroid Build Coastguard Workercross-talking and information leaking across producer processes. 48*6dbdd20aSAndroid Build Coastguard Worker 49*6dbdd20aSAndroid Build Coastguard Worker#### Shared memory buffers 50*6dbdd20aSAndroid Build Coastguard Worker 51*6dbdd20aSAndroid Build Coastguard WorkerEach producer process has one memory buffer shared 1:1 with the tracing service 52*6dbdd20aSAndroid Build Coastguard Worker(blue, in the picture above), regardless of the number of data sources it hosts. 53*6dbdd20aSAndroid Build Coastguard WorkerThis buffer is a temporary staging buffer and has two purposes: 54*6dbdd20aSAndroid Build Coastguard Worker 55*6dbdd20aSAndroid Build Coastguard Worker1. Zero-copy on the writer path. This buffer allows direct serialization of the 56*6dbdd20aSAndroid Build Coastguard Worker tracing data from the writer fastpath in a memory region directly readable by 57*6dbdd20aSAndroid Build Coastguard Worker the tracing service. 58*6dbdd20aSAndroid Build Coastguard Worker 59*6dbdd20aSAndroid Build Coastguard Worker2. Decoupling writes from reads of the tracing service. The tracing service has 60*6dbdd20aSAndroid Build Coastguard Worker the job of moving trace packets from the shared memory buffer (blue) into the 61*6dbdd20aSAndroid Build Coastguard Worker central buffer (yellow) as fast as it can. 62*6dbdd20aSAndroid Build Coastguard Worker The shared memory buffer hides the scheduling and response latencies of the 63*6dbdd20aSAndroid Build Coastguard Worker tracing service, allowing the producer to keep writing without losing data 64*6dbdd20aSAndroid Build Coastguard Worker when the tracing service is temporarily blocked. 65*6dbdd20aSAndroid Build Coastguard Worker 66*6dbdd20aSAndroid Build Coastguard Worker#### Ftrace buffer 67*6dbdd20aSAndroid Build Coastguard Worker 68*6dbdd20aSAndroid Build Coastguard WorkerWhen the `linux.ftrace` data source is enabled, the kernel will have its own 69*6dbdd20aSAndroid Build Coastguard Workerper-CPU buffers. These are unavoidable because the kernel cannot write directly 70*6dbdd20aSAndroid Build Coastguard Workerinto user-space buffers. The `traced_probes` process will periodically read 71*6dbdd20aSAndroid Build Coastguard Workerthose buffers, convert the data into binary protos and follow the same dataflow 72*6dbdd20aSAndroid Build Coastguard Workerof userspace tracing. These buffers need to be just large enough to hold data 73*6dbdd20aSAndroid Build Coastguard Workerbetween two ftrace read cycles (`TraceConfig.FtraceConfig.drain_period_ms`). 74*6dbdd20aSAndroid Build Coastguard Worker 75*6dbdd20aSAndroid Build Coastguard Worker## Life of a trace packet 76*6dbdd20aSAndroid Build Coastguard Worker 77*6dbdd20aSAndroid Build Coastguard WorkerHere is a summary to understand the dataflow of trace packets across buffers. 78*6dbdd20aSAndroid Build Coastguard WorkerConsider the case of a producer process hosting two data sources writing packets 79*6dbdd20aSAndroid Build Coastguard Workerat a different rates, both targeting the same central buffer. 80*6dbdd20aSAndroid Build Coastguard Worker 81*6dbdd20aSAndroid Build Coastguard Worker1. When each data source starts writing, it will grab a free page of the shared 82*6dbdd20aSAndroid Build Coastguard Worker memory buffer and directly serialize proto-encoded tracing data onto it. 83*6dbdd20aSAndroid Build Coastguard Worker 84*6dbdd20aSAndroid Build Coastguard Worker2. When a page of the shared memory buffer is filled, the producer will send an 85*6dbdd20aSAndroid Build Coastguard Worker async IPC to the service, asking it to copy the shared memory page just 86*6dbdd20aSAndroid Build Coastguard Worker written. Then, the producer will grab the next free page in the shared memory 87*6dbdd20aSAndroid Build Coastguard Worker buffer and keep writing. 88*6dbdd20aSAndroid Build Coastguard Worker 89*6dbdd20aSAndroid Build Coastguard Worker3. When the service receives the IPC, it copies the shared memory page into 90*6dbdd20aSAndroid Build Coastguard Worker the central buffer and marks the shared memory buffer page as free again. Data 91*6dbdd20aSAndroid Build Coastguard Worker sources within the producer are able to reuse that page at this point. 92*6dbdd20aSAndroid Build Coastguard Worker 93*6dbdd20aSAndroid Build Coastguard Worker4. When the tracing session ends, the service sends a `Flush` request to all 94*6dbdd20aSAndroid Build Coastguard Worker data sources. In reaction to this, data sources will commit all outstanding 95*6dbdd20aSAndroid Build Coastguard Worker shared memory pages, even if not completely full. The services copies these 96*6dbdd20aSAndroid Build Coastguard Worker pages into the service's central buffer. 97*6dbdd20aSAndroid Build Coastguard Worker 98*6dbdd20aSAndroid Build Coastguard Worker 99*6dbdd20aSAndroid Build Coastguard Worker 100*6dbdd20aSAndroid Build Coastguard Worker## Buffer sizing 101*6dbdd20aSAndroid Build Coastguard Worker 102*6dbdd20aSAndroid Build Coastguard Worker#### Central buffer sizing 103*6dbdd20aSAndroid Build Coastguard Worker 104*6dbdd20aSAndroid Build Coastguard WorkerThe math for sizing the central buffer is quite straightforward: in the default 105*6dbdd20aSAndroid Build Coastguard Workercase of tracing without `write_into_file` (when the trace file is written only 106*6dbdd20aSAndroid Build Coastguard Workerat the end of the trace), the buffer will hold as much data as it has been 107*6dbdd20aSAndroid Build Coastguard Workerwritten by the various data sources. 108*6dbdd20aSAndroid Build Coastguard Worker 109*6dbdd20aSAndroid Build Coastguard WorkerThe total length of the trace will be `(buffer size) / (aggregated write rate)`. 110*6dbdd20aSAndroid Build Coastguard WorkerIf all producers write at a combined rate of 2 MB/s, a 16 MB buffer will hold 111*6dbdd20aSAndroid Build Coastguard Worker~ 8 seconds of tracing data. 112*6dbdd20aSAndroid Build Coastguard Worker 113*6dbdd20aSAndroid Build Coastguard WorkerThe write rate is highly dependent on the data sources configured and by the 114*6dbdd20aSAndroid Build Coastguard Workeractivity of the system. 1-2 MB/s is a typical figure on Android traces with 115*6dbdd20aSAndroid Build Coastguard Workerscheduler tracing, but can go up easily by 1+ orders of magnitude if chattier 116*6dbdd20aSAndroid Build Coastguard Workerdata sources are enabled (e.g., syscall or pagefault tracing). 117*6dbdd20aSAndroid Build Coastguard Worker 118*6dbdd20aSAndroid Build Coastguard WorkerWhen using [streaming mode] the buffer needs to be able to hold enough data 119*6dbdd20aSAndroid Build Coastguard Workerbetween two `file_write_period_ms` periods (default: 5s). 120*6dbdd20aSAndroid Build Coastguard WorkerFor instance, if `file_write_period_ms = 5000` and the write data rate is 2 MB/s 121*6dbdd20aSAndroid Build Coastguard Workerthe central buffer needs to be at least 5 * 2 = 10 MB to avoid data losses. 122*6dbdd20aSAndroid Build Coastguard Worker 123*6dbdd20aSAndroid Build Coastguard Worker#### Shared memory buffer sizing 124*6dbdd20aSAndroid Build Coastguard Worker 125*6dbdd20aSAndroid Build Coastguard WorkerThe sizing of the shared memory buffer depends on: 126*6dbdd20aSAndroid Build Coastguard Worker 127*6dbdd20aSAndroid Build Coastguard Worker* The scheduling characteristics of the underlying system, i.e. for how long the 128*6dbdd20aSAndroid Build Coastguard Worker tracing service can be blocked on the scheduler queues. This is a function of 129*6dbdd20aSAndroid Build Coastguard Worker the kernel configuration and nice-ness level of the `traced` process. 130*6dbdd20aSAndroid Build Coastguard Worker* The max write rate of all data sources within a producer process. 131*6dbdd20aSAndroid Build Coastguard Worker 132*6dbdd20aSAndroid Build Coastguard WorkerSuppose that a producer produce at a max rate of 8 MB/s. If `traced` gets 133*6dbdd20aSAndroid Build Coastguard Workerblocked for 10 ms, the shared memory buffer need to be at least 8 * 0.01 = 80 KB 134*6dbdd20aSAndroid Build Coastguard Workerto avoid losses. 135*6dbdd20aSAndroid Build Coastguard Worker 136*6dbdd20aSAndroid Build Coastguard WorkerEmpirical measurements suggest that on most Android systems a shared memory 137*6dbdd20aSAndroid Build Coastguard Workerbuffer size of 128-512 KB is good enough. 138*6dbdd20aSAndroid Build Coastguard Worker 139*6dbdd20aSAndroid Build Coastguard WorkerThe default shared memory buffer size is 256 KB. When using the Perfetto Client 140*6dbdd20aSAndroid Build Coastguard WorkerLibrary, this value can be tweaked setting `TracingInitArgs.shmem_size_hint_kb`. 141*6dbdd20aSAndroid Build Coastguard Worker 142*6dbdd20aSAndroid Build Coastguard WorkerWARNING: if a data source writes very large trace packets in a single batch, 143*6dbdd20aSAndroid Build Coastguard Workereither the shared memory buffer needs to be big enough to handle that or 144*6dbdd20aSAndroid Build Coastguard Worker`BufferExhaustedPolicy.kStall` must be employed. 145*6dbdd20aSAndroid Build Coastguard Worker 146*6dbdd20aSAndroid Build Coastguard WorkerFor instance, consider a data source that emits a 2MB screenshot every 10s. 147*6dbdd20aSAndroid Build Coastguard WorkerIts (simplified) code, would look like: 148*6dbdd20aSAndroid Build Coastguard Worker```c++ 149*6dbdd20aSAndroid Build Coastguard Workerfor (;;) { 150*6dbdd20aSAndroid Build Coastguard Worker ScreenshotDataSource::Trace([](ScreenshotDataSource::TraceContext ctx) { 151*6dbdd20aSAndroid Build Coastguard Worker auto packet = ctx.NewTracePacket(); 152*6dbdd20aSAndroid Build Coastguard Worker packet.set_bitmap(Grab2MBScreenshot()); 153*6dbdd20aSAndroid Build Coastguard Worker }); 154*6dbdd20aSAndroid Build Coastguard Worker std::this_thread::sleep_for(std::chrono::seconds(10)); 155*6dbdd20aSAndroid Build Coastguard Worker} 156*6dbdd20aSAndroid Build Coastguard Worker``` 157*6dbdd20aSAndroid Build Coastguard Worker 158*6dbdd20aSAndroid Build Coastguard WorkerIts average write rate is 2MB / 10s = 200 KB/s. However, the data source will 159*6dbdd20aSAndroid Build Coastguard Workercreate bursts of 2MB back-to-back without yielding; it is limited only by the 160*6dbdd20aSAndroid Build Coastguard Workertracing serialization overhead. In practice, it will write the 2MB buffer at 161*6dbdd20aSAndroid Build Coastguard WorkerO(GB/s). If the shared memory buffer is < 2 MB, the tracing service will be 162*6dbdd20aSAndroid Build Coastguard Workerunlikely to catch up at that rate and data losses will be experienced. 163*6dbdd20aSAndroid Build Coastguard Worker 164*6dbdd20aSAndroid Build Coastguard WorkerIn a case like this these options are: 165*6dbdd20aSAndroid Build Coastguard Worker 166*6dbdd20aSAndroid Build Coastguard Worker* Increase the size of the shared memory buffer in the producer that hosts the 167*6dbdd20aSAndroid Build Coastguard Worker data source. 168*6dbdd20aSAndroid Build Coastguard Worker* Split the write into chunks spaced by some delay. 169*6dbdd20aSAndroid Build Coastguard Worker* Adopt the `BufferExhaustedPolicy::kStall` when defining the data source: 170*6dbdd20aSAndroid Build Coastguard Worker 171*6dbdd20aSAndroid Build Coastguard Worker```c++ 172*6dbdd20aSAndroid Build Coastguard Workerclass ScreenshotDataSource : public perfetto::DataSource<ScreenshotDataSource> { 173*6dbdd20aSAndroid Build Coastguard Worker public: 174*6dbdd20aSAndroid Build Coastguard Worker constexpr static BufferExhaustedPolicy kBufferExhaustedPolicy = 175*6dbdd20aSAndroid Build Coastguard Worker BufferExhaustedPolicy::kStall; 176*6dbdd20aSAndroid Build Coastguard Worker ... 177*6dbdd20aSAndroid Build Coastguard Worker}; 178*6dbdd20aSAndroid Build Coastguard Worker``` 179*6dbdd20aSAndroid Build Coastguard Worker 180*6dbdd20aSAndroid Build Coastguard Worker## Debugging data losses 181*6dbdd20aSAndroid Build Coastguard Worker 182*6dbdd20aSAndroid Build Coastguard Worker#### Ftrace kernel buffer losses 183*6dbdd20aSAndroid Build Coastguard Worker 184*6dbdd20aSAndroid Build Coastguard WorkerWhen using the Linux kernel ftrace data source, losses can occur in the 185*6dbdd20aSAndroid Build Coastguard Workerkernel -> userspace path if the `traced_probes` process gets blocked for too 186*6dbdd20aSAndroid Build Coastguard Workerlong. 187*6dbdd20aSAndroid Build Coastguard Worker 188*6dbdd20aSAndroid Build Coastguard WorkerAt the trace proto level, losses in this path are recorded: 189*6dbdd20aSAndroid Build Coastguard Worker* In the [`FtraceCpuStats`][FtraceCpuStats] messages, emitted both at the 190*6dbdd20aSAndroid Build Coastguard Worker beginning and end of the trace. If the `overrun` field is non-zero, data has 191*6dbdd20aSAndroid Build Coastguard Worker been lost. 192*6dbdd20aSAndroid Build Coastguard Worker* In the [`FtraceEventBundle.lost_events`][FtraceEventBundle] field. This allows 193*6dbdd20aSAndroid Build Coastguard Worker to locate precisely the point where data loss happened. 194*6dbdd20aSAndroid Build Coastguard Worker 195*6dbdd20aSAndroid Build Coastguard WorkerAt the TraceProcessor SQL level, this data is available in the `stats` table: 196*6dbdd20aSAndroid Build Coastguard Worker 197*6dbdd20aSAndroid Build Coastguard Worker```sql 198*6dbdd20aSAndroid Build Coastguard Worker> select * from stats where name like 'ftrace_cpu_overrun_end' 199*6dbdd20aSAndroid Build Coastguard Workername idx severity source value 200*6dbdd20aSAndroid Build Coastguard Worker-------------------- -------------------- -------------------- ------ ------ 201*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 0 data_loss trace 0 202*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 1 data_loss trace 0 203*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 2 data_loss trace 0 204*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 3 data_loss trace 0 205*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 4 data_loss trace 0 206*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 5 data_loss trace 0 207*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 6 data_loss trace 0 208*6dbdd20aSAndroid Build Coastguard Workerftrace_cpu_overrun_e 7 data_loss trace 0 209*6dbdd20aSAndroid Build Coastguard Worker``` 210*6dbdd20aSAndroid Build Coastguard Worker 211*6dbdd20aSAndroid Build Coastguard WorkerThese losses can be mitigated either increasing 212*6dbdd20aSAndroid Build Coastguard Worker[`TraceConfig.FtraceConfig.buffer_size_kb`][FtraceConfig] 213*6dbdd20aSAndroid Build Coastguard Worker or decreasing 214*6dbdd20aSAndroid Build Coastguard Worker[`TraceConfig.FtraceConfig.drain_period_ms`][FtraceConfig] 215*6dbdd20aSAndroid Build Coastguard Worker 216*6dbdd20aSAndroid Build Coastguard Worker#### Shared memory losses 217*6dbdd20aSAndroid Build Coastguard Worker 218*6dbdd20aSAndroid Build Coastguard WorkerTracing data can be lost in the shared memory due to bursts while traced is 219*6dbdd20aSAndroid Build Coastguard Workerblocked. 220*6dbdd20aSAndroid Build Coastguard Worker 221*6dbdd20aSAndroid Build Coastguard WorkerAt the trace proto level, losses in this path are recorded: 222*6dbdd20aSAndroid Build Coastguard Worker 223*6dbdd20aSAndroid Build Coastguard Worker* In [`TraceStats.BufferStats.trace_writer_packet_loss`][BufferStats]. 224*6dbdd20aSAndroid Build Coastguard Worker* In [`TracePacket.previous_packet_dropped`][TracePacket]. 225*6dbdd20aSAndroid Build Coastguard Worker Caveat: the very first packet emitted by every data source is also marked as 226*6dbdd20aSAndroid Build Coastguard Worker `previous_packet_dropped=true`. This is because the service has no way to 227*6dbdd20aSAndroid Build Coastguard Worker tell if that was the truly first packet or everything else before that was 228*6dbdd20aSAndroid Build Coastguard Worker lost. 229*6dbdd20aSAndroid Build Coastguard Worker 230*6dbdd20aSAndroid Build Coastguard WorkerAt the TraceProcessor SQL level, this data is available in the `stats` table: 231*6dbdd20aSAndroid Build Coastguard Worker```sql 232*6dbdd20aSAndroid Build Coastguard Worker> select * from stats where name = 'traced_buf_trace_writer_packet_loss' 233*6dbdd20aSAndroid Build Coastguard Workername idx severity source value 234*6dbdd20aSAndroid Build Coastguard Worker-------------------- -------------------- -------------------- --------- ----- 235*6dbdd20aSAndroid Build Coastguard Workertraced_buf_trace_wri 0 data_loss trace 0 236*6dbdd20aSAndroid Build Coastguard Worker``` 237*6dbdd20aSAndroid Build Coastguard Worker 238*6dbdd20aSAndroid Build Coastguard Worker#### Central buffer losses 239*6dbdd20aSAndroid Build Coastguard Worker 240*6dbdd20aSAndroid Build Coastguard WorkerData losses in the central buffer can happen for two different reasons: 241*6dbdd20aSAndroid Build Coastguard Worker 242*6dbdd20aSAndroid Build Coastguard Worker1. When using `fill_policy: RING_BUFFER`, older tracing data is overwritten by 243*6dbdd20aSAndroid Build Coastguard Worker virtue of wrapping in the ring buffer. 244*6dbdd20aSAndroid Build Coastguard Worker These losses are recorded, at the trace proto level, in 245*6dbdd20aSAndroid Build Coastguard Worker [`TraceStats.BufferStats.chunks_overwritten`][BufferStats]. 246*6dbdd20aSAndroid Build Coastguard Worker 247*6dbdd20aSAndroid Build Coastguard Worker2. When using `fill_policy: DISCARD`, newer tracing data committed after the 248*6dbdd20aSAndroid Build Coastguard Worker buffer is full is dropped. 249*6dbdd20aSAndroid Build Coastguard Worker These losses are recorded, at the trace proto level, in 250*6dbdd20aSAndroid Build Coastguard Worker [`TraceStats.BufferStats.chunks_discarded`][BufferStats]. 251*6dbdd20aSAndroid Build Coastguard Worker 252*6dbdd20aSAndroid Build Coastguard WorkerAt the TraceProcessor SQL level, this data is available in the `stats` table, 253*6dbdd20aSAndroid Build Coastguard Workerone entry per central buffer: 254*6dbdd20aSAndroid Build Coastguard Worker 255*6dbdd20aSAndroid Build Coastguard Worker```sql 256*6dbdd20aSAndroid Build Coastguard Worker> select * from stats where name = 'traced_buf_chunks_overwritten' or name = 'traced_buf_chunks_discarded' 257*6dbdd20aSAndroid Build Coastguard Workername idx severity source value 258*6dbdd20aSAndroid Build Coastguard Worker-------------------- -------------------- -------------------- ------- ----- 259*6dbdd20aSAndroid Build Coastguard Workertraced_buf_chunks_di 0 info trace 0 260*6dbdd20aSAndroid Build Coastguard Workertraced_buf_chunks_ov 0 data_loss trace 0 261*6dbdd20aSAndroid Build Coastguard Worker``` 262*6dbdd20aSAndroid Build Coastguard Worker 263*6dbdd20aSAndroid Build Coastguard WorkerSummary: the best way to detect and debug data losses is to use Trace Processor 264*6dbdd20aSAndroid Build Coastguard Workerand issue the query: 265*6dbdd20aSAndroid Build Coastguard Worker`select * from stats where severity = 'data_loss' and value != 0` 266*6dbdd20aSAndroid Build Coastguard Worker 267*6dbdd20aSAndroid Build Coastguard Worker## Atomicity and ordering guarantees 268*6dbdd20aSAndroid Build Coastguard Worker 269*6dbdd20aSAndroid Build Coastguard WorkerA "writer sequence" is the sequence of trace packets emitted by a given 270*6dbdd20aSAndroid Build Coastguard WorkerTraceWriter from a data source. In almost all cases 1 data source == 271*6dbdd20aSAndroid Build Coastguard Worker1+ TraceWriter(s). Some data sources that support writing from multiple threads 272*6dbdd20aSAndroid Build Coastguard Workertypically create one TraceWriter per thread. 273*6dbdd20aSAndroid Build Coastguard Worker 274*6dbdd20aSAndroid Build Coastguard Worker* Trace packets written from a sequence are emitted in the trace file in the 275*6dbdd20aSAndroid Build Coastguard Worker same order they have been written. 276*6dbdd20aSAndroid Build Coastguard Worker 277*6dbdd20aSAndroid Build Coastguard Worker* There is no ordering guarantee between packets written by different sequences. 278*6dbdd20aSAndroid Build Coastguard Worker Sequences are, by design, concurrent and more than one linearization is 279*6dbdd20aSAndroid Build Coastguard Worker possible. The service does NOT respect global timestamp ordering across 280*6dbdd20aSAndroid Build Coastguard Worker different sequences. If two packets from two sequences were emitted in 281*6dbdd20aSAndroid Build Coastguard Worker global timestamp order, the service can still emit them in the trace file in 282*6dbdd20aSAndroid Build Coastguard Worker the opposite order. 283*6dbdd20aSAndroid Build Coastguard Worker 284*6dbdd20aSAndroid Build Coastguard Worker* Trace packets are atomic. If a trace packet is emitted in the trace file, it 285*6dbdd20aSAndroid Build Coastguard Worker is guaranteed to be contain all the fields that the data source wrote. If a 286*6dbdd20aSAndroid Build Coastguard Worker trace packet is large and spans across several shared memory buffer pages, the 287*6dbdd20aSAndroid Build Coastguard Worker service will save it in the trace file only if it can observe that all 288*6dbdd20aSAndroid Build Coastguard Worker fragments have been committed without gaps. 289*6dbdd20aSAndroid Build Coastguard Worker 290*6dbdd20aSAndroid Build Coastguard Worker* If a trace packet is lost (e.g. because of wrapping in the ring buffer 291*6dbdd20aSAndroid Build Coastguard Worker or losses in the shared memory buffer), no further trace packet will be 292*6dbdd20aSAndroid Build Coastguard Worker emitted for that sequence, until all packets before are dropped as well. 293*6dbdd20aSAndroid Build Coastguard Worker In other words, if the tracing service ends up in a situation where it sees 294*6dbdd20aSAndroid Build Coastguard Worker packets 1,2,5,6 for a sequence, it will only emit 1, 2. If, however, new 295*6dbdd20aSAndroid Build Coastguard Worker packets (e.g., 7, 8, 9) are written and they overwrite 1, 2, clearing the gap, 296*6dbdd20aSAndroid Build Coastguard Worker the full sequence 5, 6, 7, 8, 9 will be emitted. 297*6dbdd20aSAndroid Build Coastguard Worker This behavior, however, doesn't hold when using [streaming mode] because, 298*6dbdd20aSAndroid Build Coastguard Worker in that case, the periodic read will consume the packets in the buffer and 299*6dbdd20aSAndroid Build Coastguard Worker clear the gaps, allowing the sequence to restart. 300*6dbdd20aSAndroid Build Coastguard Worker 301*6dbdd20aSAndroid Build Coastguard Worker## Incremental state in trace packets 302*6dbdd20aSAndroid Build Coastguard Worker 303*6dbdd20aSAndroid Build Coastguard WorkerIn many cases trace packets are fully independent of each other and can be 304*6dbdd20aSAndroid Build Coastguard Workerprocessed and interpreted without further context. 305*6dbdd20aSAndroid Build Coastguard WorkerIn some cases, however, they can have _incremental state_ and behave similarly 306*6dbdd20aSAndroid Build Coastguard Workerto inter-frame video encoding techniques, where some frames require the keyframe 307*6dbdd20aSAndroid Build Coastguard Workerto be present to be meaningfully decoded. 308*6dbdd20aSAndroid Build Coastguard Worker 309*6dbdd20aSAndroid Build Coastguard WorkerHere are two concrete examples: 310*6dbdd20aSAndroid Build Coastguard Worker 311*6dbdd20aSAndroid Build Coastguard Worker1. Ftrace scheduling slices and /proc/pid scans. ftrace scheduling events are 312*6dbdd20aSAndroid Build Coastguard Worker keyed by thread id. In most cases users want to map those events back to the 313*6dbdd20aSAndroid Build Coastguard Worker parent process (the thread-group). To solve this, when both the 314*6dbdd20aSAndroid Build Coastguard Worker `linux.ftrace` and the `linux.process_stats` data sources are enabled in a 315*6dbdd20aSAndroid Build Coastguard Worker Perfetto trace, the latter does capture process<>thread associations from 316*6dbdd20aSAndroid Build Coastguard Worker the /proc pseudo-filesystem, whenever a new thread-id is seen by ftrace. 317*6dbdd20aSAndroid Build Coastguard Worker A typical trace in this case looks as follows: 318*6dbdd20aSAndroid Build Coastguard Worker ```bash 319*6dbdd20aSAndroid Build Coastguard Worker # From process_stats's /proc scanner. 320*6dbdd20aSAndroid Build Coastguard Worker pid: 610; ppid: 1; cmdline: "/system/bin/surfaceflinger" 321*6dbdd20aSAndroid Build Coastguard Worker 322*6dbdd20aSAndroid Build Coastguard Worker # From ftrace 323*6dbdd20aSAndroid Build Coastguard Worker timestamp: 95054961131912; sched_wakeup: pid: 610; target_cpu: 2; 324*6dbdd20aSAndroid Build Coastguard Worker timestamp: 95054977528943; sched_switch: prev_pid: 610 prev_prio: 98 325*6dbdd20aSAndroid Build Coastguard Worker ``` 326*6dbdd20aSAndroid Build Coastguard Worker The /proc entry is emitted only once per process to avoid bloating the size of 327*6dbdd20aSAndroid Build Coastguard Worker the trace. In lack of data losses this is fine to be able to reconstruct all 328*6dbdd20aSAndroid Build Coastguard Worker scheduling events for that pid. If, however, the process_stats packet gets 329*6dbdd20aSAndroid Build Coastguard Worker dropped in the ring buffer, there will be no way left to work out the process 330*6dbdd20aSAndroid Build Coastguard Worker details for all the other ftrace events that refer to that PID. 331*6dbdd20aSAndroid Build Coastguard Worker 332*6dbdd20aSAndroid Build Coastguard Worker2. The [Track Event library](/docs/instrumentation/track-events) in the Perfetto 333*6dbdd20aSAndroid Build Coastguard Worker SDK makes extensive use of string interning. Most strings and descriptors 334*6dbdd20aSAndroid Build Coastguard Worker (e.g. details about processes / threads) are emitted only once and later 335*6dbdd20aSAndroid Build Coastguard Worker referred to using a monotonic ID. In case a loss of the descriptor packet, 336*6dbdd20aSAndroid Build Coastguard Worker it is not possible to make fully sense of those events. 337*6dbdd20aSAndroid Build Coastguard Worker 338*6dbdd20aSAndroid Build Coastguard WorkerTrace Processor has built-in mechanism that detect loss of interning data and 339*6dbdd20aSAndroid Build Coastguard Workerskips ingesting packets that refer to missing interned strings or descriptors. 340*6dbdd20aSAndroid Build Coastguard Worker 341*6dbdd20aSAndroid Build Coastguard WorkerWhen using tracing in ring-buffer mode, these types of losses are very likely to 342*6dbdd20aSAndroid Build Coastguard Workerhappen. 343*6dbdd20aSAndroid Build Coastguard Worker 344*6dbdd20aSAndroid Build Coastguard WorkerThere are two mitigations for this: 345*6dbdd20aSAndroid Build Coastguard Worker 346*6dbdd20aSAndroid Build Coastguard Worker1. Issuing periodic invalidations of the incremental state via 347*6dbdd20aSAndroid Build Coastguard Worker [`TraceConfig.IncrementalStateConfig.clear_period_ms`][IncrStateConfig]. 348*6dbdd20aSAndroid Build Coastguard Worker This will cause the data sources that make use of incremental state to 349*6dbdd20aSAndroid Build Coastguard Worker periodically drop the interning / process mapping tables and re-emit the 350*6dbdd20aSAndroid Build Coastguard Worker descriptors / strings on the next occurrence. This mitigates quite well the 351*6dbdd20aSAndroid Build Coastguard Worker problem in the context of ring-buffer traces, as long as the 352*6dbdd20aSAndroid Build Coastguard Worker `clear_period_ms` is one order of magnitude lower than the estimated length 353*6dbdd20aSAndroid Build Coastguard Worker of trace data in the central trace buffer. 354*6dbdd20aSAndroid Build Coastguard Worker 355*6dbdd20aSAndroid Build Coastguard Worker2. Recording the incremental state into a dedicated buffer (via 356*6dbdd20aSAndroid Build Coastguard Worker `DataSourceConfig.target_buffer`). This technique is quite commonly used with 357*6dbdd20aSAndroid Build Coastguard Worker in the ftrace + process_stats example mentioned before, recording the 358*6dbdd20aSAndroid Build Coastguard Worker process_stats packet in a dedicated buffer less likely to wrap (ftrace events 359*6dbdd20aSAndroid Build Coastguard Worker are much more frequent than descriptors for new processes). 360*6dbdd20aSAndroid Build Coastguard Worker 361*6dbdd20aSAndroid Build Coastguard Worker## Flushes and windowed trace importing 362*6dbdd20aSAndroid Build Coastguard Worker 363*6dbdd20aSAndroid Build Coastguard WorkerAnother common problem experienced in traces that involve multiple data sources 364*6dbdd20aSAndroid Build Coastguard Workeris the non-synchronous nature of trace commits. As explained in the 365*6dbdd20aSAndroid Build Coastguard Worker[Life of a trace packet](#life-of-a-trace-packet) section above, trace data is 366*6dbdd20aSAndroid Build Coastguard Workercommitted only when a full memory page of the shared memory buffer is filled (or 367*6dbdd20aSAndroid Build Coastguard Workerat when the tracing session ends). In most cases, if data sources produce events 368*6dbdd20aSAndroid Build Coastguard Workerat a regular cadence, pages are filled quite quickly and events are committed 369*6dbdd20aSAndroid Build Coastguard Workerin the central buffers within seconds. 370*6dbdd20aSAndroid Build Coastguard Worker 371*6dbdd20aSAndroid Build Coastguard WorkerIn some other cases, however, a data source can emit events only sporadically. 372*6dbdd20aSAndroid Build Coastguard WorkerImagine the case of a data source that emits events when the display is turned 373*6dbdd20aSAndroid Build Coastguard Workeron/off. Such an infrequent event might end up being staged in the shared memory 374*6dbdd20aSAndroid Build Coastguard Workerbuffer for very long times and can end up being committed in the trace buffer 375*6dbdd20aSAndroid Build Coastguard Workerhours after it happened. 376*6dbdd20aSAndroid Build Coastguard Worker 377*6dbdd20aSAndroid Build Coastguard WorkerAnother scenario where this can happen is when using ftrace and when a 378*6dbdd20aSAndroid Build Coastguard Workerparticular CPU is idle most of the time or gets hot-unplugged (ftrace uses 379*6dbdd20aSAndroid Build Coastguard Workerper-cpu buffers). In this case a CPU might record little-or-no data for several 380*6dbdd20aSAndroid Build Coastguard Workerminutes while the other CPUs pump thousands of new trace events per second. 381*6dbdd20aSAndroid Build Coastguard Worker 382*6dbdd20aSAndroid Build Coastguard WorkerThis causes two side effects that end up breaking user expectations or causing 383*6dbdd20aSAndroid Build Coastguard Workerbugs: 384*6dbdd20aSAndroid Build Coastguard Worker 385*6dbdd20aSAndroid Build Coastguard Worker* The UI can show an abnormally long timeline with a huge gap in the middle. 386*6dbdd20aSAndroid Build Coastguard Worker The packet ordering of events doesn't matter for the UI because events are 387*6dbdd20aSAndroid Build Coastguard Worker sorted by timestamp at import time. The trace in this case will contain very 388*6dbdd20aSAndroid Build Coastguard Worker recent events plus a handful of stale events that happened hours before. The 389*6dbdd20aSAndroid Build Coastguard Worker UI, for correctness, will try to display all events, showing a handful of 390*6dbdd20aSAndroid Build Coastguard Worker early events, followed by a huge temporal gap when nothing happened, 391*6dbdd20aSAndroid Build Coastguard Worker followed by the stream of recent events. 392*6dbdd20aSAndroid Build Coastguard Worker 393*6dbdd20aSAndroid Build Coastguard Worker* When recording long traces, Trace Processor can show import errors of the form 394*6dbdd20aSAndroid Build Coastguard Worker "XXX event out-of-order". This is because. in order to limit the memory usage 395*6dbdd20aSAndroid Build Coastguard Worker at import time, Trace Processor sorts events using a sliding window. If trace 396*6dbdd20aSAndroid Build Coastguard Worker packets are too out-of-order (trace file order vs timestamp order), the 397*6dbdd20aSAndroid Build Coastguard Worker sorting will fail and some packets will be dropped. 398*6dbdd20aSAndroid Build Coastguard Worker 399*6dbdd20aSAndroid Build Coastguard Worker#### Mitigations 400*6dbdd20aSAndroid Build Coastguard Worker 401*6dbdd20aSAndroid Build Coastguard WorkerThe best mitigation for these sort of problems is to specify a 402*6dbdd20aSAndroid Build Coastguard Worker[`flush_period_ms`][TraceConfig] in the trace config (10-30 seconds is usually 403*6dbdd20aSAndroid Build Coastguard Workergood enough for most cases), especially when recording long traces. 404*6dbdd20aSAndroid Build Coastguard Worker 405*6dbdd20aSAndroid Build Coastguard WorkerThis will cause the tracing service to issue periodic flush requests to data 406*6dbdd20aSAndroid Build Coastguard Workersources. A flush requests causes the data source to commit the shared memory 407*6dbdd20aSAndroid Build Coastguard Workerbuffer pages into the central buffer, even if they are not completely full. 408*6dbdd20aSAndroid Build Coastguard WorkerBy default, a flush issued only at the end of the trace. 409*6dbdd20aSAndroid Build Coastguard Worker 410*6dbdd20aSAndroid Build Coastguard WorkerIn case of long traces recorded without `flush_period_ms`, another option is to 411*6dbdd20aSAndroid Build Coastguard Workerpass the `--full-sort` option to `trace_processor_shell` when importing the 412*6dbdd20aSAndroid Build Coastguard Workertrace. Doing so will disable the windowed sorting at the cost of a higher 413*6dbdd20aSAndroid Build Coastguard Workermemory usage (the trace file will be fully buffered in memory before parsing). 414*6dbdd20aSAndroid Build Coastguard Worker 415*6dbdd20aSAndroid Build Coastguard Worker[streaming mode]: /docs/concepts/config#long-traces 416*6dbdd20aSAndroid Build Coastguard Worker[TraceConfig]: /docs/reference/trace-config-proto.autogen#TraceConfig 417*6dbdd20aSAndroid Build Coastguard Worker[FtraceConfig]: /docs/reference/trace-config-proto.autogen#FtraceConfig 418*6dbdd20aSAndroid Build Coastguard Worker[IncrStateConfig]: /docs/reference/trace-config-proto.autogen#FtraceConfig.IncrementalStateConfig 419*6dbdd20aSAndroid Build Coastguard Worker[FtraceCpuStats]: /docs/reference/trace-packet-proto.autogen#FtraceCpuStats 420*6dbdd20aSAndroid Build Coastguard Worker[FtraceEventBundle]: /docs/reference/trace-packet-proto.autogen#FtraceEventBundle 421*6dbdd20aSAndroid Build Coastguard Worker[TracePacket]: /docs/reference/trace-packet-proto.autogen#TracePacket 422*6dbdd20aSAndroid Build Coastguard Worker[BufferStats]: /docs/reference/trace-packet-proto.autogen#TraceStats.BufferStats 423