1*287e80b3SSadaf EbrahimiSQLHIST(1) 2*287e80b3SSadaf Ebrahimi========== 3*287e80b3SSadaf Ebrahimi 4*287e80b3SSadaf EbrahimiNAME 5*287e80b3SSadaf Ebrahimi---- 6*287e80b3SSadaf Ebrahimisqlhist - Tool that uses SQL language to create / show creation of tracefs histograms and synthetic events. 7*287e80b3SSadaf Ebrahimi 8*287e80b3SSadaf EbrahimiSYNOPSIS 9*287e80b3SSadaf Ebrahimi-------- 10*287e80b3SSadaf Ebrahimi*sqlhist* ['OPTIONS'] ['SQL-select-command'] 11*287e80b3SSadaf Ebrahimi 12*287e80b3SSadaf EbrahimiDESCRIPTION 13*287e80b3SSadaf Ebrahimi----------- 14*287e80b3SSadaf EbrahimiThe sqlhist(1) will take an SQL like statement to create tracefs histograms and 15*287e80b3SSadaf Ebrahimisynthetic events that can perform various actions for various handling of the 16*287e80b3SSadaf Ebrahimidata. 17*287e80b3SSadaf Ebrahimi 18*287e80b3SSadaf EbrahimiThe tracefs file system interfaces with the Linux tracing infrastructure that 19*287e80b3SSadaf Ebrahimihas various dynamic and static events through out the kernel. Each of these 20*287e80b3SSadaf Ebrahimievents can have a "histogram" attached to it, where the fields of the event 21*287e80b3SSadaf Ebrahimiwill define the buckets of the histogram. 22*287e80b3SSadaf Ebrahimi 23*287e80b3SSadaf EbrahimiA synthetic event is a way to attach two separate events and use the fields 24*287e80b3SSadaf Ebrahimiand time stamps of those events to create a new dynamic event. This new 25*287e80b3SSadaf Ebrahimidynamic event is call a synthetic event. The fields of each event can have 26*287e80b3SSadaf Ebrahimisimple calculations done on them where, for example, the delta between 27*287e80b3SSadaf Ebrahimia field of one event to a field of the other event can be taken. This also 28*287e80b3SSadaf Ebrahimiworks for the time stamps of the events where the time delta between the 29*287e80b3SSadaf Ebrahimitwo events can also be extracted and placed into the synthetic event. 30*287e80b3SSadaf Ebrahimi 31*287e80b3SSadaf EbrahimiOther actions can be done from the fields of the events. A snapshot can 32*287e80b3SSadaf Ebrahimibe taken of the kernel ring buffer a variable used in the synthetic 33*287e80b3SSadaf Ebrahimievent creating hits a max, or simply changes. 34*287e80b3SSadaf Ebrahimi 35*287e80b3SSadaf EbrahimiThe commands to create histograms and synthetic events are complex and 36*287e80b3SSadaf Ebrahiminot easy to remember. *sqlhist* is used to convert SQL syntax into the 37*287e80b3SSadaf Ebrahimicommands needed to create the histogram or synthetic event. 38*287e80b3SSadaf Ebrahimi 39*287e80b3SSadaf EbrahimiThe *SQL-select-command* is a SQL string defined by *tracefs_sql*(3). 40*287e80b3SSadaf Ebrahimi 41*287e80b3SSadaf EbrahimiNote, this must be run as root (or sudo) as interacting with the tracefs 42*287e80b3SSadaf Ebrahimidirectory requires root privilege, unless the *-t* option is given with 43*287e80b3SSadaf Ebrahimia copy of the _tracefs_ directory and its events. 44*287e80b3SSadaf Ebrahimi 45*287e80b3SSadaf EbrahimiThe *sqlhist* is a simple program where its code actual exists in the 46*287e80b3SSadaf Ebrahimi*tracefs_sql*(3) man page. 47*287e80b3SSadaf Ebrahimi 48*287e80b3SSadaf EbrahimiOPTIONS 49*287e80b3SSadaf Ebrahimi------- 50*287e80b3SSadaf Ebrahimi*-n* 'name':: 51*287e80b3SSadaf Ebrahimi The name of the synthetic event to create. This event can then be 52*287e80b3SSadaf Ebrahimi used like any other event, and enabled via *trace-cmd*(1). 53*287e80b3SSadaf Ebrahimi 54*287e80b3SSadaf Ebrahimi*-t* 'tracefs-dir':: 55*287e80b3SSadaf Ebrahimi In order to test this out as non root user, a copy of the tracefs directory 56*287e80b3SSadaf Ebrahimi can be used, and passing that directory with this option will allow 57*287e80b3SSadaf Ebrahimi the program to work. Obviously, *-e* will not work as non-root because 58*287e80b3SSadaf Ebrahimi it will not be able to execute. 59*287e80b3SSadaf Ebrahimi 60*287e80b3SSadaf Ebrahimi # mkdir /tmp/tracing 61*287e80b3SSadaf Ebrahimi # cp -r /sys/kernel/tracing/events /tmp/tracing 62*287e80b3SSadaf Ebrahimi # exit 63*287e80b3SSadaf Ebrahimi $ ./sqlhist -t /tmp/tracing ... 64*287e80b3SSadaf Ebrahimi 65*287e80b3SSadaf Ebrahimi*-e*:: 66*287e80b3SSadaf Ebrahimi Not only display the commands to create the histogram, but also execute them. 67*287e80b3SSadaf Ebrahimi This requires root privilege. 68*287e80b3SSadaf Ebrahimi 69*287e80b3SSadaf Ebrahimi*-f* 'file':: 70*287e80b3SSadaf Ebrahimi Instead of reading the SQL commands from the command line, read them from 71*287e80b3SSadaf Ebrahimi _file_. If _file_ is '-' then read from standard input. 72*287e80b3SSadaf Ebrahimi 73*287e80b3SSadaf Ebrahimi*-m* 'var':: 74*287e80b3SSadaf Ebrahimi Do the given action when the variable _var_ hits a new maximum. This can 75*287e80b3SSadaf Ebrahimi not be used with *-c*. 76*287e80b3SSadaf Ebrahimi 77*287e80b3SSadaf Ebrahimi*-c* 'var':: 78*287e80b3SSadaf Ebrahimi Do the given action when the variable _var_ changes its value. This can 79*287e80b3SSadaf Ebrahimi not be used with *-m*. 80*287e80b3SSadaf Ebrahimi 81*287e80b3SSadaf Ebrahimi*-s*:: 82*287e80b3SSadaf Ebrahimi Perform a snapshot instead of calling the synthetic event. 83*287e80b3SSadaf Ebrahimi 84*287e80b3SSadaf Ebrahimi*-T*:: 85*287e80b3SSadaf Ebrahimi Perform both a snapshot and trace the synthetic event. 86*287e80b3SSadaf Ebrahimi 87*287e80b3SSadaf Ebrahimi*-S* 'fields[,fields]':: 88*287e80b3SSadaf Ebrahimi Save the given fields. The fields must be fields of the "end" event given 89*287e80b3SSadaf Ebrahimi in the *SQL-select-command* 90*287e80b3SSadaf Ebrahimi 91*287e80b3SSadaf Ebrahimi*-B* 'instance':: 92*287e80b3SSadaf Ebrahimi For simple statements that only produce a histogram, the instance given here 93*287e80b3SSadaf Ebrahimi will be where the histogram will be created. This is ignored for full synthetic 94*287e80b3SSadaf Ebrahimi event creation, as sythetic events have a global affect on all tracing instances, 95*287e80b3SSadaf Ebrahimi where as, histograms only affect a single instance. 96*287e80b3SSadaf Ebrahimi 97*287e80b3SSadaf EbrahimiEXAMPLES 98*287e80b3SSadaf Ebrahimi-------- 99*287e80b3SSadaf Ebrahimi 100*287e80b3SSadaf EbrahimiCreate the sqlhist executable: 101*287e80b3SSadaf Ebrahimi 102*287e80b3SSadaf Ebrahimi[source, c] 103*287e80b3SSadaf Ebrahimi-- 104*287e80b3SSadaf Ebrahimi man tracefs_sql | sed -ne '/^EXAMPLE/,/FILES/ { /EXAMPLE/d ; /FILES/d ; p}' > sqlhist.c 105*287e80b3SSadaf Ebrahimi gcc -o sqlhist sqlhist.c `pkg-config --cflags --libs libtracefs` 106*287e80b3SSadaf Ebrahimi-- 107*287e80b3SSadaf Ebrahimi 108*287e80b3SSadaf EbrahimiAs described above, for testing purposes, make a copy of the event directory: 109*287e80b3SSadaf Ebrahimi[source, c] 110*287e80b3SSadaf Ebrahimi-- 111*287e80b3SSadaf Ebrahimi $ mkdir /tmp/tracing 112*287e80b3SSadaf Ebrahimi $ sudo cp -r /sys/kernel/tracing/events /tmp/tracing/ 113*287e80b3SSadaf Ebrahimi $ sudo chmod -R 0644 /tmp/tracing/ 114*287e80b3SSadaf Ebrahimi-- 115*287e80b3SSadaf Ebrahimi 116*287e80b3SSadaf EbrahimiFor an example of simple histogram output using the copy of the tracefs directory. 117*287e80b3SSadaf Ebrahimi[source, c] 118*287e80b3SSadaf Ebrahimi-- 119*287e80b3SSadaf Ebrahimi $ ./sqlhist -t /tmp/tracing/ 'SELECT CAST(call_site as SYM-OFFSET), bytes_req, CAST(bytes_alloc AS _COUNTER_) FROM kmalloc' 120*287e80b3SSadaf Ebrahimi-- 121*287e80b3SSadaf Ebrahimi 122*287e80b3SSadaf EbrahimiProduces the output: 123*287e80b3SSadaf Ebrahimi[source, c] 124*287e80b3SSadaf Ebrahimi-- 125*287e80b3SSadaf Ebrahimi echo 'hist:keys=call_site.sym-offset,bytes_req:vals=bytes_alloc' > /sys/kernel/tracing/events/kmem/kmalloc/trigger 126*287e80b3SSadaf Ebrahimi-- 127*287e80b3SSadaf Ebrahimi 128*287e80b3SSadaf EbrahimiWhich could be used by root: 129*287e80b3SSadaf Ebrahimi[source, c] 130*287e80b3SSadaf Ebrahimi-- 131*287e80b3SSadaf Ebrahimi # echo 'hist:keys=call_site.sym-offset,bytes_req:vals=bytes_alloc' > /sys/kernel/tracing/events/kmem/kmalloc/trigger 132*287e80b3SSadaf Ebrahimi # cat /sys/kernel/tracing/events/kmem/kmalloc/hist 133*287e80b3SSadaf Ebrahimi# event histogram 134*287e80b3SSadaf Ebrahimi# 135*287e80b3SSadaf Ebrahimi# trigger info: hist:keys=call_site.sym-offset,bytes_req:vals=hitcount,bytes_alloc:sort=hitcount:size=2048 [active] 136*287e80b3SSadaf Ebrahimi# 137*287e80b3SSadaf Ebrahimi 138*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813f8d8a] load_elf_phdrs+0x4a/0xb0 , bytes_req: 728 } hitcount: 1 bytes_alloc: 1024 139*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc0c69e74] nf_ct_ext_add+0xd4/0x1d0 [nf_conntrack] , bytes_req: 128 } hitcount: 1 bytes_alloc: 128 140*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff818355e6] dma_resv_get_fences+0xf6/0x440 , bytes_req: 8 } hitcount: 1 bytes_alloc: 8 141*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc06dc73f] intel_gt_get_buffer_pool+0x15f/0x290 [i915] , bytes_req: 424 } hitcount: 1 bytes_alloc: 512 142*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813f8d8a] load_elf_phdrs+0x4a/0xb0 , bytes_req: 616 } hitcount: 1 bytes_alloc: 1024 143*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff8161a44c] __sg_alloc_table+0x11c/0x180 , bytes_req: 32 } hitcount: 1 bytes_alloc: 32 144*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc070749d] shmem_get_pages+0xad/0x5d0 [i915] , bytes_req: 16 } hitcount: 1 bytes_alloc: 16 145*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc07507f5] intel_framebuffer_create+0x25/0x60 [i915] , bytes_req: 408 } hitcount: 1 bytes_alloc: 512 146*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc06fc20f] eb_parse+0x34f/0x910 [i915] , bytes_req: 408 } hitcount: 1 bytes_alloc: 512 147*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc0700ebd] i915_gem_object_get_pages_internal+0x5d/0x270 [i915] , bytes_req: 16 } hitcount: 1 bytes_alloc: 16 148*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc0771188] intel_frontbuffer_get+0x38/0x220 [i915] , bytes_req: 400 } hitcount: 1 bytes_alloc: 512 149*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff8161a44c] __sg_alloc_table+0x11c/0x180 , bytes_req: 128 } hitcount: 1 bytes_alloc: 128 150*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813f8f45] load_elf_binary+0x155/0x1680 , bytes_req: 28 } hitcount: 1 bytes_alloc: 32 151*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc07038c8] __assign_mmap_offset+0x208/0x3d0 [i915] , bytes_req: 288 } hitcount: 1 bytes_alloc: 512 152*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813737b2] alloc_bprm+0x32/0x2f0 , bytes_req: 416 } hitcount: 1 bytes_alloc: 512 153*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813f9027] load_elf_binary+0x237/0x1680 , bytes_req: 64 } hitcount: 1 bytes_alloc: 64 154*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff8161a44c] __sg_alloc_table+0x11c/0x180 , bytes_req: 64 } hitcount: 1 bytes_alloc: 64 155*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc040ffe7] drm_vma_node_allow+0x27/0xe0 [drm] , bytes_req: 40 } hitcount: 2 bytes_alloc: 128 156*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813cda98] __do_sys_timerfd_create+0x58/0x1c0 , bytes_req: 336 } hitcount: 2 bytes_alloc: 1024 157*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff818355e6] dma_resv_get_fences+0xf6/0x440 , bytes_req: 40 } hitcount: 2 bytes_alloc: 128 158*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff8139b75a] single_open+0x2a/0xa0 , bytes_req: 32 } hitcount: 2 bytes_alloc: 64 159*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff815df715] bio_kmalloc+0x25/0x80 , bytes_req: 136 } hitcount: 2 bytes_alloc: 384 160*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc071e5cd] i915_vma_work+0x1d/0x50 [i915] , bytes_req: 416 } hitcount: 3 bytes_alloc: 1536 161*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff81390d0d] alloc_fdtable+0x4d/0x100 , bytes_req: 56 } hitcount: 3 bytes_alloc: 192 162*287e80b3SSadaf Ebrahimi{ call_site: [ffffffffc06ff65f] i915_gem_do_execbuffer+0x158f/0x2440 [i915] , bytes_req: 16 } hitcount: 4 bytes_alloc: 64 163*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff8137713c] alloc_pipe_info+0x5c/0x230 , bytes_req: 384 } hitcount: 5 bytes_alloc: 2560 164*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff813771b4] alloc_pipe_info+0xd4/0x230 , bytes_req: 640 } hitcount: 5 bytes_alloc: 5120 165*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff81834cdb] dma_resv_list_alloc+0x1b/0x40 , bytes_req: 40 } hitcount: 6 bytes_alloc: 384 166*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff81834cdb] dma_resv_list_alloc+0x1b/0x40 , bytes_req: 56 } hitcount: 9 bytes_alloc: 576 167*287e80b3SSadaf Ebrahimi{ call_site: [ffffffff8120086e] tracing_map_sort_entries+0x9e/0x3e0 , bytes_req: 24 } hitcount: 60 bytes_alloc: 1920 168*287e80b3SSadaf Ebrahimi 169*287e80b3SSadaf EbrahimiTotals: 170*287e80b3SSadaf Ebrahimi Hits: 122 171*287e80b3SSadaf Ebrahimi Entries: 30 172*287e80b3SSadaf Ebrahimi Dropped: 0 173*287e80b3SSadaf Ebrahimi-- 174*287e80b3SSadaf Ebrahimi 175*287e80b3SSadaf EbrahimiNote, although the examples use uppercase for the SQL keywords, they do not have 176*287e80b3SSadaf Ebrahimito be. 'SELECT' could also be 'select' or even 'sElEcT'. 177*287e80b3SSadaf Ebrahimi 178*287e80b3SSadaf EbrahimiBy using the full SQL language, synthetic events can be made and processed. 179*287e80b3SSadaf EbrahimiFor example, using *sqlhist* along with *trace-cmd*(1), wake up latency can 180*287e80b3SSadaf Ebrahimibe recorded by creating a synthetic event by attaching the _sched_waking_ 181*287e80b3SSadaf Ebrahimiand the _sched_switch_ events. 182*287e80b3SSadaf Ebrahimi 183*287e80b3SSadaf Ebrahimi[source, c] 184*287e80b3SSadaf Ebrahimi-- 185*287e80b3SSadaf Ebrahimi # sqlhist -n wakeup_lat -e -T -m lat 'SELECT end.next_comm AS comm, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) AS lat FROM ' \ 186*287e80b3SSadaf Ebrahimi 'sched_waking AS start JOIN sched_switch AS end ON start.pid = end.next_pid WHERE end.next_prio < 100 && end.next_comm == "cyclictest"' 187*287e80b3SSadaf Ebrahimi # trace-cmd start -e all -e wakeup_lat -R stacktrace 188*287e80b3SSadaf Ebrahimi # cyclictest -l 1000 -p80 -i250 -a -t -q -m -d 0 -b 1000 --tracemark 189*287e80b3SSadaf Ebrahimi # trace-cmd show -s | tail -30 190*287e80b3SSadaf Ebrahimi <idle>-0 [002] dNh4 23454.902246: sched_wakeup: comm=cyclictest pid=12272 prio=120 target_cpu=002 191*287e80b3SSadaf Ebrahimi <idle>-0 [005] ...1 23454.902246: cpu_idle: state=4294967295 cpu_id=5 192*287e80b3SSadaf Ebrahimi <idle>-0 [007] d..1 23454.902246: cpu_idle: state=0 cpu_id=7 193*287e80b3SSadaf Ebrahimi <idle>-0 [002] dNh1 23454.902247: hrtimer_expire_exit: hrtimer=0000000037956dc2 194*287e80b3SSadaf Ebrahimi <idle>-0 [005] d..1 23454.902248: cpu_idle: state=0 cpu_id=5 195*287e80b3SSadaf Ebrahimi <idle>-0 [002] dNh1 23454.902248: write_msr: 6e0, value 4866ce957272 196*287e80b3SSadaf Ebrahimi <idle>-0 [006] ...1 23454.902248: cpu_idle: state=4294967295 cpu_id=6 197*287e80b3SSadaf Ebrahimi <idle>-0 [002] dNh1 23454.902249: local_timer_exit: vector=236 198*287e80b3SSadaf Ebrahimi <idle>-0 [006] d..1 23454.902250: cpu_idle: state=0 cpu_id=6 199*287e80b3SSadaf Ebrahimi <idle>-0 [002] .N.1 23454.902250: cpu_idle: state=4294967295 cpu_id=2 200*287e80b3SSadaf Ebrahimi <idle>-0 [002] dN.1 23454.902251: rcu_utilization: Start context switch 201*287e80b3SSadaf Ebrahimi <idle>-0 [002] dN.1 23454.902252: rcu_utilization: End context switch 202*287e80b3SSadaf Ebrahimi <idle>-0 [001] ...1 23454.902252: cpu_idle: state=4294967295 cpu_id=1 203*287e80b3SSadaf Ebrahimi <idle>-0 [002] dN.3 23454.902253: prandom_u32: ret=3692516021 204*287e80b3SSadaf Ebrahimi <idle>-0 [001] d..1 23454.902254: cpu_idle: state=0 cpu_id=1 205*287e80b3SSadaf Ebrahimi <idle>-0 [002] d..2 23454.902254: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=cyclictest next_pid=12275 next_prio=19 206*287e80b3SSadaf Ebrahimi <idle>-0 [002] d..4 23454.902256: wakeup_lat: next_comm=cyclictest lat=17 207*287e80b3SSadaf Ebrahimi <idle>-0 [002] d..5 23454.902258: <stack trace> 208*287e80b3SSadaf Ebrahimi => trace_event_raw_event_synth 209*287e80b3SSadaf Ebrahimi => action_trace 210*287e80b3SSadaf Ebrahimi => event_hist_trigger 211*287e80b3SSadaf Ebrahimi => event_triggers_call 212*287e80b3SSadaf Ebrahimi => trace_event_buffer_commit 213*287e80b3SSadaf Ebrahimi => trace_event_raw_event_sched_switch 214*287e80b3SSadaf Ebrahimi => __traceiter_sched_switch 215*287e80b3SSadaf Ebrahimi => __schedule 216*287e80b3SSadaf Ebrahimi => schedule_idle 217*287e80b3SSadaf Ebrahimi => do_idle 218*287e80b3SSadaf Ebrahimi => cpu_startup_entry 219*287e80b3SSadaf Ebrahimi => secondary_startup_64_no_verify 220*287e80b3SSadaf Ebrahimi-- 221*287e80b3SSadaf Ebrahimi 222*287e80b3SSadaf EbrahimiHere's the options for *sqlhist* explained: 223*287e80b3SSadaf Ebrahimi 224*287e80b3SSadaf Ebrahimi *-n wakeup_lat* :: 225*287e80b3SSadaf Ebrahimi Name the synthetic event to use *wakeup_lat*. 226*287e80b3SSadaf Ebrahimi 227*287e80b3SSadaf Ebrahimi *-e*:: 228*287e80b3SSadaf Ebrahimi Execute the commands that are printed. 229*287e80b3SSadaf Ebrahimi 230*287e80b3SSadaf Ebrahimi *-T*:: 231*287e80b3SSadaf Ebrahimi Perform both a trace action and then a snapshot action (swap the buffer into the static 'snapshot' buffer). 232*287e80b3SSadaf Ebrahimi 233*287e80b3SSadaf Ebrahimi *-m lat*:: 234*287e80b3SSadaf Ebrahimi Trigger the actions whenever 'lat' hits a new maximum value. 235*287e80b3SSadaf Ebrahimi 236*287e80b3SSadaf EbrahimiNow a breakdown of the SQL statement: 237*287e80b3SSadaf Ebrahimi[source, c] 238*287e80b3SSadaf Ebrahimi-- 239*287e80b3SSadaf Ebrahimi 'SELECT end.next_comm AS comm, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) AS lat FROM ' \ 240*287e80b3SSadaf Ebrahimi 'sched_waking AS start JOIN sched_switch AS end ON start.pid = end.next_pid WHERE end.next_prio < 100 && end.next_comm == "cyclictest"' 241*287e80b3SSadaf Ebrahimi-- 242*287e80b3SSadaf Ebrahimi *end.next_comm AS comm*:: 243*287e80b3SSadaf Ebrahimi Save the 'sched_switch' field *next_comm* and place it into the *comm* field of the 'wakeup_lat' synthetic event. 244*287e80b3SSadaf Ebrahimi 245*287e80b3SSadaf Ebrahimi *(end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) AS lat*:: 246*287e80b3SSadaf Ebrahimi Take the delta of the time stamps from the 'sched_switch' event and the 'sched_waking' event. 247*287e80b3SSadaf Ebrahimi As time stamps are usually recorded in nanoseconds, *TIMESTAMP* would give the full nanosecond time stamp, 248*287e80b3SSadaf Ebrahimi but here, the *TIMESTAMP_USECS* will truncate it into microseconds. The value is saved in the 249*287e80b3SSadaf Ebrahimi variable *lat*, which will also be recorded in the synthetic event. 250*287e80b3SSadaf Ebrahimi 251*287e80b3SSadaf Ebrahimi *FROM 'sched_waking' AS start JOIN sched_switch AS end ON start.pid = end.next_pid*:: 252*287e80b3SSadaf Ebrahimi Create the synthetic event by joining _sched_waking_ to _sched_switch_, matching 253*287e80b3SSadaf Ebrahimi the _sched_waking_ 'pid' field with the _sched_switch_ 'next_pid' field. 254*287e80b3SSadaf Ebrahimi Also make *start* an alias for _sched_waking_ and *end* an alias for _sched_switch_ 255*287e80b3SSadaf Ebrahimi which then an use *start* and *end* as a subsitute for _sched_waking_ and _sched_switch_ 256*287e80b3SSadaf Ebrahimi respectively through out the rest of the SQL statement. 257*287e80b3SSadaf Ebrahimi 258*287e80b3SSadaf Ebrahimi *WHERE end.next_prio < 100 && end.next_comm == "cyclictest"*:: 259*287e80b3SSadaf Ebrahimi Filter the logic where it executes only if _sched_waking_ 'next_prio' field 260*287e80b3SSadaf Ebrahimi is less than 100. (Note, in the Kernel, priorities are inverse, and the real-time 261*287e80b3SSadaf Ebrahimi priorities are represented from 0-100 where 0 is the highest priority). 262*287e80b3SSadaf Ebrahimi Also only trace when the 'next_comm' (the task scheduling in) of the _sched_switch_ 263*287e80b3SSadaf Ebrahimi event has the name "cyclictest". 264*287e80b3SSadaf Ebrahimi 265*287e80b3SSadaf EbrahimiFor the *trace-cmd*(3) command: 266*287e80b3SSadaf Ebrahimi[source, c] 267*287e80b3SSadaf Ebrahimi-- 268*287e80b3SSadaf Ebrahimi trace-cmd start -e all -e wakeup_lat -R stacktrace 269*287e80b3SSadaf Ebrahimi-- 270*287e80b3SSadaf Ebrahimi 271*287e80b3SSadaf Ebrahimi *trace-cmd start*:: 272*287e80b3SSadaf Ebrahimi Enables tracing (does not record to a file). 273*287e80b3SSadaf Ebrahimi 274*287e80b3SSadaf Ebrahimi *-e all*:: 275*287e80b3SSadaf Ebrahimi Enable all events 276*287e80b3SSadaf Ebrahimi 277*287e80b3SSadaf Ebrahimi *-e wakeup_lat -R stacktrace*:: 278*287e80b3SSadaf Ebrahimi have the "wakeup_lat" event (our synthetic event) enable the *stacktrace* trigger, were 279*287e80b3SSadaf Ebrahimi for every instance of the "wakeup_lat" event, a kernel stack trace will be recorded 280*287e80b3SSadaf Ebrahimi in the ring buffer. 281*287e80b3SSadaf Ebrahimi 282*287e80b3SSadaf EbrahimiAfter calling *cyclictest* (a real-time tool to measure wakeup latency), read the snapshot 283*287e80b3SSadaf Ebrahimibuffer. 284*287e80b3SSadaf Ebrahimi 285*287e80b3SSadaf Ebrahimi *trace-cmd show -s*:: 286*287e80b3SSadaf Ebrahimi *trace-cmd show* reads the kernel ring buffer, and the *-s* option will read the *snapshot* 287*287e80b3SSadaf Ebrahimi buffer instead of the normal one. 288*287e80b3SSadaf Ebrahimi 289*287e80b3SSadaf Ebrahimi[source, c] 290*287e80b3SSadaf Ebrahimi-- 291*287e80b3SSadaf Ebrahimi <idle>-0 [002] d..4 23454.902256: wakeup_lat: next_comm=cyclictest lat=17 292*287e80b3SSadaf Ebrahimi-- 293*287e80b3SSadaf Ebrahimi We see on the "wakeup_lat" event happened on CPU 2, with a wake up latency 17 microseconds. 294*287e80b3SSadaf Ebrahimi 295*287e80b3SSadaf EbrahimiThis can be extracted into a *trace.dat* file that *trace-cmd*(3) can read and do further 296*287e80b3SSadaf Ebrahimianalysis, as well as *kernelshark*. 297*287e80b3SSadaf Ebrahimi 298*287e80b3SSadaf Ebrahimi[source, c] 299*287e80b3SSadaf Ebrahimi-- 300*287e80b3SSadaf Ebrahimi # trace-cmd extract -s 301*287e80b3SSadaf Ebrahimi # trace-cmd report --cpu 2 | tail -30 302*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902238: prandom_u32: ret=1633425088 303*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902239: sched_wakeup: cyclictest:12275 [19] CPU:002 304*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902241: hrtimer_expire_exit: hrtimer=0xffffbbd68286fe60 305*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902241: hrtimer_cancel: hrtimer=0xffffbbd6826efe70 306*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902242: hrtimer_expire_entry: hrtimer=0xffffbbd6826efe70 now=23455294430750 function=hrtimer_wakeup/0x0 307*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902243: sched_waking: comm=cyclictest pid=12272 prio=120 target_cpu=002 308*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902244: prandom_u32: ret=1102749734 309*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902246: sched_wakeup: cyclictest:12272 [120] CPU:002 310*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902247: hrtimer_expire_exit: hrtimer=0xffffbbd6826efe70 311*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902248: write_msr: 6e0, value 4866ce957272 312*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902249: local_timer_exit: vector=236 313*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902250: cpu_idle: state=4294967295 cpu_id=2 314*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902251: rcu_utilization: Start context switch 315*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902252: rcu_utilization: End context switch 316*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902253: prandom_u32: ret=3692516021 317*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902254: sched_switch: swapper/2:0 [120] R ==> cyclictest:12275 [19] 318*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902256: wakeup_lat: next_comm=cyclictest lat=17 319*287e80b3SSadaf Ebrahimi <idle>-0 [002] 23454.902258: kernel_stack: <stack trace > 320*287e80b3SSadaf Ebrahimi=> trace_event_raw_event_synth (ffffffff8121a0db) 321*287e80b3SSadaf Ebrahimi=> action_trace (ffffffff8121e9fb) 322*287e80b3SSadaf Ebrahimi=> event_hist_trigger (ffffffff8121ca8d) 323*287e80b3SSadaf Ebrahimi=> event_triggers_call (ffffffff81216c72) 324*287e80b3SSadaf Ebrahimi=> trace_event_buffer_commit (ffffffff811f7618) 325*287e80b3SSadaf Ebrahimi=> trace_event_raw_event_sched_switch (ffffffff8110fda4) 326*287e80b3SSadaf Ebrahimi=> __traceiter_sched_switch (ffffffff8110d449) 327*287e80b3SSadaf Ebrahimi=> __schedule (ffffffff81c02002) 328*287e80b3SSadaf Ebrahimi=> schedule_idle (ffffffff81c02c86) 329*287e80b3SSadaf Ebrahimi=> do_idle (ffffffff8111e898) 330*287e80b3SSadaf Ebrahimi=> cpu_startup_entry (ffffffff8111eba9) 331*287e80b3SSadaf Ebrahimi=> secondary_startup_64_no_verify (ffffffff81000107) 332*287e80b3SSadaf Ebrahimi-- 333*287e80b3SSadaf Ebrahimi 334*287e80b3SSadaf EbrahimiBUGS 335*287e80b3SSadaf Ebrahimi---- 336*287e80b3SSadaf Ebrahimi 337*287e80b3SSadaf EbrahimiAs *sqlhist* is just example code from a man page, it is guaranteed to contain 338*287e80b3SSadaf Ebrahimilots of bugs. For one thing, not all error paths are covered properly. 339*287e80b3SSadaf Ebrahimi 340*287e80b3SSadaf EbrahimiSEE ALSO 341*287e80b3SSadaf Ebrahimi-------- 342*287e80b3SSadaf Ebrahimitrace-cmd(1), tracefs_sql(3) 343*287e80b3SSadaf Ebrahimi 344*287e80b3SSadaf EbrahimiAUTHOR 345*287e80b3SSadaf Ebrahimi------ 346*287e80b3SSadaf EbrahimiWritten by Steven Rostedt, <[email protected]> 347*287e80b3SSadaf Ebrahimi 348*287e80b3SSadaf EbrahimiRESOURCES 349*287e80b3SSadaf Ebrahimi--------- 350*287e80b3SSadaf Ebrahimihttps://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/ 351*287e80b3SSadaf Ebrahimi 352*287e80b3SSadaf EbrahimiCOPYING 353*287e80b3SSadaf Ebrahimi------- 354*287e80b3SSadaf EbrahimiCopyright \(C) 2021 , Inc. Free use of this software is granted under 355*287e80b3SSadaf Ebrahimithe terms of the GNU Public License (GPL). 356*287e80b3SSadaf Ebrahimi 357