1Demonstrations of criticalstat: Find long atomic critical sections in the kernel. 2 3criticalstat traces and reports occurrences of atomic critical sections in the 4kernel with useful stacktraces showing the origin of them. Such critical 5sections frequently occur due to use of spinlocks, or if interrupts or 6preemption were explicitly disabled by a driver. IRQ routines in Linux are also 7executed with interrupts disabled. There are many reasons. Such critical 8sections are a source of long latency/responsive issues for real-time systems. 9 10This works by probing the preempt/irq and cpuidle tracepoints in the kernel. 11Since this uses BPF, only the root user can use this tool. Further, the kernel 12has to be built with certain CONFIG options enabled inorder for it to work: 13CONFIG_PREEMPTIRQ_EVENTS before kernel 4.19 14CONFIG_PREEMPTIRQ_TRACEPOINTS in kernel 4.19 and later 15CONFIG_DEBUG_PREEMPT 16CONFIG_PREEMPT_TRACER 17Additionally, the following options should be turned off on older kernels: 18CONFIG_PROVE_LOCKING 19CONFIG_LOCKDEP 20 21USAGE: 22# ./criticalstat -h 23usage: criticalstat [-h] [-p] [-i] [-d DURATION] 24 25Trace long critical sections 26 27optional arguments: 28 -h, --help Show this help message and exit 29 -p, --preemptoff Find long sections where preemption was off 30 -i, --irqoff Find long sections where IRQ was off 31 -d DURATION, --duration DURATION 32 Duration in uS (microseconds) below which we filter 33 34examples: 35 ./criticalstat # run with default options: irq off for more than 100 uS 36 ./criticalstat -p # find sections with preemption disabled for more than 100 uS 37 ./criticalstat -d 500 # find sections with IRQs disabled for more than 500 uS 38 ./criticalstat -p -d 500 # find sections with preemption disabled for more than 500 uS 39 40The tool runs continuously until interrupted by Ctrl-C 41 42 43By default, criticalstat finds IRQ disable sections for > 100us. 44 45# ./criticalstat 46Finding critical section with IRQ disabled for > 100us 47=================================== 48TASK: kworker/u16:5 (pid 5903 tid 5903) Total Time: 194.427 us 49 50Section start: __schedule -> schedule 51Section end: _raw_spin_unlock_irq -> finish_task_switch 52 trace_hardirqs_on+0xdc 53 trace_hardirqs_on+0xdc 54 _raw_spin_unlock_irq+0x18 55 finish_task_switch+0xf0 56 __schedule+0x8c8 57 preempt_schedule_irq+0x38 58 el1_preempt+0x8 59=================================== 60 61 62If too many sections are showing up, the user can raise the threshold to only 63show critical sections that are > 500us by passing "-d" option: 64 65# ./criticalstat -d 500 66Finding critical section with IRQ disabled for > 500us 67=================================== 68TASK: crtc_commit:111 (pid 246 tid 246) Total Time: 580.730 us 69 70Section start: clk_enable_lock -> clk_enable 71Section end: _raw_spin_unlock_irqrestore -> clk_enable 72 trace_hardirqs_on+0xdc 73 trace_hardirqs_on+0xdc 74 _raw_spin_unlock_irqrestore+0x24 75 clk_enable+0x80 76 msm_dss_enable_clk+0x7c 77 sde_power_resource_enable+0x578 78 _sde_crtc_vblank_enable_no_lock+0x68 79 sde_crtc_vblank+0x8c 80 sde_kms_enable_vblank+0x18 81 vblank_ctrl_worker+0xd0 82 kthread_worker_fn+0xf8 83 kthread+0x114 84 ret_from_fork+0x10 85=================================== 86 87 88If instead of irq disabled sections, we want to see preempt disabled sections, 89then pass the "-p" option. Below we try to find preempt-disabled critical 90sections that are > 500us. 91 92# ./criticalstat -p -d 500 93Finding critical section with preempt disabled for > 500us 94=================================== 95TASK: swapper/1 (pid 0 tid 0) Total Time: 618.437 us 96 97Section start: preempt_count_add -> preempt_count_add 98Section end: preempt_count_sub -> preempt_count_sub 99 trace_preempt_on+0x98 100 trace_preempt_on+0x98 101 preempt_latency_stop+0x164 102 preempt_count_sub+0x50 103 schedule+0x74 104 schedule_preempt_disabled+0x14 105 cpu_startup_entry+0x84 106 secondary_start_kernel+0x1c8 107 [unknown] 108=================================== 109 110 111criticalstat -p can also reflect kernel scheduler issues sometimes. These may 112show up as long preempt-off sections if the functions in the scheduler take a 113long time to run (such as pick_next_task_fair which selects the CPU for a task 114 115Follow is a report showing a preempt-off latency of 700us during the schedule 116loop's execution: 117 118=================================== 119TASK: irq/296-cs35l36 (pid 666 tid 666) Total Time: 732.657 us 120 121Section start: schedule -> schedule 122Section end: schedule -> schedule 123 trace_preempt_on+0x98 124 trace_preempt_on+0x98 125 preempt_count_sub+0xa4 126 schedule+0x78 127 schedule_timeout+0x80 128 wait_for_common+0xb4 129 wait_for_completion_timeout+0x28 130 geni_i2c_xfer+0x298 131 __i2c_transfer+0x4e0 132 i2c_transfer+0x8 133 irq_thread_fn+0x2c 134 irq_thread+0x160 135 kthread+0x118 136 ret_from_fork+0x10 137=================================== 138 139 140See Also: Linux kernel's preemptoff and irqoff tracers which provide similar 141tracing but with some limitations. 142