xref: /aosp_15_r20/external/bcc/tools/criticalstat_example.txt (revision 387f9dfdfa2baef462e92476d413c7bc2470293e)
1Demonstrations of criticalstat: Find long atomic critical sections in the kernel.
2
3criticalstat traces and reports occurrences of atomic critical sections in the
4kernel with useful stacktraces showing the origin of them. Such critical
5sections frequently occur due to use of spinlocks, or if interrupts or
6preemption were explicitly disabled by a driver. IRQ routines in Linux are also
7executed with interrupts disabled. There are many reasons. Such critical
8sections are a source of long latency/responsive issues for real-time systems.
9
10This works by probing the preempt/irq and cpuidle tracepoints in the kernel.
11Since this uses BPF, only the root user can use this tool. Further, the kernel
12has to be built with certain CONFIG options enabled inorder for it to work:
13CONFIG_PREEMPTIRQ_EVENTS before kernel 4.19
14CONFIG_PREEMPTIRQ_TRACEPOINTS in kernel 4.19 and later
15CONFIG_DEBUG_PREEMPT
16CONFIG_PREEMPT_TRACER
17Additionally, the following options should be turned off on older kernels:
18CONFIG_PROVE_LOCKING
19CONFIG_LOCKDEP
20
21USAGE:
22# ./criticalstat -h
23usage: criticalstat [-h] [-p] [-i] [-d DURATION]
24
25Trace long critical sections
26
27optional arguments:
28  -h, --help            Show this help message and exit
29  -p, --preemptoff      Find long sections where preemption was off
30  -i, --irqoff          Find long sections where IRQ was off
31  -d DURATION, --duration DURATION
32                        Duration in uS (microseconds) below which we filter
33
34examples:
35    ./criticalstat          	# run with default options: irq off for more than 100 uS
36    ./criticalstat -p       	# find sections with preemption disabled for more than 100 uS
37    ./criticalstat -d 500   	# find sections with IRQs disabled for more than 500 uS
38    ./criticalstat -p -d 500	# find sections with preemption disabled for more than 500 uS
39
40The tool runs continuously until interrupted by Ctrl-C
41
42
43By default, criticalstat finds IRQ disable sections for > 100us.
44
45# ./criticalstat
46Finding critical section with IRQ disabled for > 100us
47===================================
48TASK: kworker/u16:5 (pid  5903 tid  5903) Total Time: 194.427  us
49
50Section start: __schedule -> schedule
51Section end:   _raw_spin_unlock_irq -> finish_task_switch
52  trace_hardirqs_on+0xdc
53  trace_hardirqs_on+0xdc
54  _raw_spin_unlock_irq+0x18
55  finish_task_switch+0xf0
56  __schedule+0x8c8
57  preempt_schedule_irq+0x38
58  el1_preempt+0x8
59===================================
60
61
62If too many sections are showing up, the user can raise the threshold to only
63show critical sections that are > 500us by passing "-d" option:
64
65# ./criticalstat -d 500
66Finding critical section with IRQ disabled for > 500us
67===================================
68TASK: crtc_commit:111 (pid   246 tid   246) Total Time: 580.730  us
69
70Section start: clk_enable_lock -> clk_enable
71Section end:   _raw_spin_unlock_irqrestore -> clk_enable
72  trace_hardirqs_on+0xdc
73  trace_hardirqs_on+0xdc
74  _raw_spin_unlock_irqrestore+0x24
75  clk_enable+0x80
76  msm_dss_enable_clk+0x7c
77  sde_power_resource_enable+0x578
78  _sde_crtc_vblank_enable_no_lock+0x68
79  sde_crtc_vblank+0x8c
80  sde_kms_enable_vblank+0x18
81  vblank_ctrl_worker+0xd0
82  kthread_worker_fn+0xf8
83  kthread+0x114
84  ret_from_fork+0x10
85===================================
86
87
88If instead of irq disabled sections, we want to see preempt disabled sections,
89then pass the "-p" option. Below we try to find preempt-disabled critical
90sections that are > 500us.
91
92# ./criticalstat -p -d 500
93Finding critical section with preempt disabled for > 500us
94===================================
95TASK: swapper/1 (pid     0 tid     0) Total Time: 618.437  us
96
97Section start: preempt_count_add -> preempt_count_add
98Section end:   preempt_count_sub -> preempt_count_sub
99  trace_preempt_on+0x98
100  trace_preempt_on+0x98
101  preempt_latency_stop+0x164
102  preempt_count_sub+0x50
103  schedule+0x74
104  schedule_preempt_disabled+0x14
105  cpu_startup_entry+0x84
106  secondary_start_kernel+0x1c8
107  [unknown]
108===================================
109
110
111criticalstat -p can also reflect kernel scheduler issues sometimes. These may
112show up as long preempt-off sections if the functions in the scheduler take a
113long time to run (such as pick_next_task_fair which selects the CPU for a task
114
115Follow is a report showing a preempt-off latency of 700us during the schedule
116loop's execution:
117
118===================================
119TASK: irq/296-cs35l36 (pid   666 tid   666) Total Time: 732.657  us
120
121Section start: schedule -> schedule
122Section end:   schedule -> schedule
123  trace_preempt_on+0x98
124  trace_preempt_on+0x98
125  preempt_count_sub+0xa4
126  schedule+0x78
127  schedule_timeout+0x80
128  wait_for_common+0xb4
129  wait_for_completion_timeout+0x28
130  geni_i2c_xfer+0x298
131  __i2c_transfer+0x4e0
132  i2c_transfer+0x8
133  irq_thread_fn+0x2c
134  irq_thread+0x160
135  kthread+0x118
136  ret_from_fork+0x10
137===================================
138
139
140See Also: Linux kernel's preemptoff and irqoff tracers which provide similar
141tracing but with some limitations.
142