1TRACE-CMD-RECORD(1) 2=================== 3 4NAME 5---- 6trace-cmd-record - record a trace from the Ftrace Linux internal tracer 7 8SYNOPSIS 9-------- 10*trace-cmd record* ['OPTIONS'] ['command'] 11 12DESCRIPTION 13----------- 14The trace-cmd(1) record command will set up the Ftrace Linux kernel tracer to 15record the specified plugins or events that happen while the 'command' 16executes. If no command is given, then it will record until the user hits 17Ctrl-C. 18 19The record command of trace-cmd will set up the Ftrace tracer to start tracing 20the various events or plugins that are given on the command line. It will then 21create a number of tracing processes (one per CPU) that will start recording 22from the kernel ring buffer straight into temporary files. When the command is 23complete (or Ctrl-C is hit) all the files will be combined into a trace.dat 24file that can later be read (see trace-cmd-report(1)). 25 26OPTIONS 27------- 28*-p* 'tracer':: 29 Specify a tracer. Tracers usually do more than just trace an event. 30 Common tracers are: *function*, *function_graph*, *preemptirqsoff*, 31 *irqsoff*, *preemptoff* and *wakeup*. A tracer must be supported by the 32 running kernel. To see a list of available tracers, see trace-cmd-list(1). 33 34*-e* 'event':: 35 Specify an event to trace. Various static trace points have been added to 36 the Linux kernel. They are grouped by subsystem where you can enable all 37 events of a given subsystem or specify specific events to be enabled. The 38 'event' is of the format "subsystem:event-name". You can also just specify 39 the subsystem without the ':event-name' or the event-name without the 40 "subsystem:". Using "-e sched_switch" will enable the "sched_switch" event 41 where as, "-e sched" will enable all events under the "sched" subsystem. 42 43 The 'event' can also contain glob expressions. That is, "*stat*" will 44 select all events (or subsystems) that have the characters "stat" in their 45 names. 46 47 The keyword 'all' can be used to enable all events. 48 49*-a*:: 50 Every event that is being recorded has its output format file saved 51 in the output file to be able to display it later. But if other 52 events are enabled in the trace without trace-cmd's knowledge, the 53 formats of those events will not be recorded and trace-cmd report will 54 not be able to display them. If this is the case, then specify the 55 *-a* option and the format for all events in the system will be saved. 56 57*-T*:: 58 Enable a stacktrace on each event. For example: 59 60 <idle>-0 [003] 58549.289091: sched_switch: kworker/0:1:0 [120] R ==> trace-cmd:2603 [120] 61 <idle>-0 [003] 58549.289092: kernel_stack: <stack trace> 62=> schedule (ffffffff814b260e) 63=> cpu_idle (ffffffff8100a38c) 64=> start_secondary (ffffffff814ab828) 65 66*--func-stack*:: 67 Enable a stack trace on all functions. Note this is only applicable 68 for the "function" plugin tracer, and will only take effect if the 69 -l option is used and succeeds in limiting functions. If the function 70 tracer is not filtered, and the stack trace is enabled, you can live 71 lock the machine. 72 73*-f* 'filter':: 74 Specify a filter for the previous event. This must come after a *-e*. This 75 will filter what events get recorded based on the content of the event. 76 Filtering is passed to the kernel directly so what filtering is allowed 77 may depend on what version of the kernel you have. Basically, it will 78 let you use C notation to check if an event should be processed or not. 79 80---------------------------------------- 81 ==, >=, <=, >, <, &, |, && and || 82---------------------------------------- 83 84 The above are usually safe to use to compare fields. 85 86*--no-filter*:: 87 Do not filter out the trace-cmd threads. By default, the threads are 88 filtered out to not be traced by events. This option will have the trace-cmd 89 threads also be traced. 90 91*-R* 'trigger':: 92 Specify a trigger for the previous event. This must come after a *-e*. 93 This will add a given trigger to the given event. To only enable the trigger 94 and not the event itself, then place the event after the *-v* option. 95 96 See Documentation/trace/events.txt in the Linux kernel source for more 97 information on triggers. 98 99*-v*:: 100 This will cause all events specified after it on the command line to not 101 be traced. This is useful for selecting a subsystem to be traced but to 102 leave out various events. For Example: "-e sched -v -e "\*stat\*"" will 103 enable all events in the sched subsystem except those that have "stat" in 104 their names. 105 106 Note: the *-v* option was taken from the way grep(1) inverts the following 107 matches. 108 109*-F*:: 110 This will filter only the executable that is given on the command line. If 111 no command is given, then it will filter itself (pretty pointless). 112 Using *-F* will let you trace only events that are caused by the given 113 command. 114 115*-P* 'pid':: 116 Similar to *-F* but lets you specify a process ID to trace. 117 118*-c*:: 119 Used with either *-F* (or *-P* if kernel supports it) to trace the process' 120 children too. 121 122*--user*:: 123 Execute the specified *command* as given user. 124 125*-C* 'clock':: 126 Set the trace clock to "clock". 127 128 Use trace-cmd(1) list -C to see what clocks are available. 129 130*-o* 'output-file':: 131 By default, trace-cmd report will create a 'trace.dat' file. You can 132 specify a different file to write to with the *-o* option. 133 134*-l* 'function-name':: 135 This will limit the 'function' and 'function_graph' tracers to only trace 136 the given function name. More than one *-l* may be specified on the 137 command line to trace more than one function. This supports both full 138 regex(3) parsing, or basic glob parsing. If the filter has only alphanumeric, 139 '_', '*', '?' and '.' characters, then it will be parsed as a basic glob. 140 to force it to be a regex, prefix the filter with '^' or append it with '$' 141 and it will then be parsed as a regex. 142 143*-g* 'function-name':: 144 This option is for the function_graph plugin. It will graph the given 145 function. That is, it will only trace the function and all functions that 146 it calls. You can have more than one *-g* on the command line. 147 148*-n* 'function-name':: 149 This has the opposite effect of *-l*. The function given with the *-n* 150 option will not be traced. This takes precedence, that is, if you include 151 the same function for both *-n* and *-l*, it will not be traced. 152 153*-d*:: 154 Some tracer plugins enable the function tracer by default. Like the 155 latency tracers. This option prevents the function tracer from being 156 enabled at start up. 157 158*-D*:: 159 The option *-d* will try to use the function-trace option to disable the 160 function tracer (if available), otherwise it defaults to the proc file: 161 /proc/sys/kernel/ftrace_enabled, but will not touch it if the function-trace 162 option is available. The *-D* option will disable both the ftrace_enabled 163 proc file as well as the function-trace option if it exists. 164 165 Note, this disable function tracing for all users, which includes users 166 outside of ftrace tracers (stack_tracer, perf, etc). 167 168*-O* 'option':: 169 Ftrace has various options that can be enabled or disabled. This allows 170 you to set them. Appending the text 'no' to an option disables it. 171 For example: "-O nograph-time" will disable the "graph-time" Ftrace 172 option. 173 174*-s* 'interval':: 175 The processes that trace-cmd creates to record from the ring buffer need 176 to wake up to do the recording. Setting the 'interval' to zero will cause 177 the processes to wakeup every time new data is written into the buffer. 178 But since Ftrace is recording kernel activity, the act of this processes 179 going back to sleep may cause new events into the ring buffer which will 180 wake the process back up. This will needlessly add extra data into the 181 ring buffer. 182 183 The 'interval' metric is microseconds. The default is set to 1000 (1 ms). 184 This is the time each recording process will sleep before waking up to 185 record any new data that was written to the ring buffer. 186 187*-r* 'priority':: 188 The priority to run the capture threads at. In a busy system the trace 189 capturing threads may be staved and events can be lost. This increases 190 the priority of those threads to the real time (FIFO) priority. 191 But use this option with care, it can also change the behaviour of 192 the system being traced. 193 194*-b* 'size':: 195 This sets the ring buffer size to 'size' kilobytes. Because the Ftrace 196 ring buffer is per CPU, this size is the size of each per CPU ring buffer 197 inside the kernel. Using "-b 10000" on a machine with 4 CPUs will make 198 Ftrace have a total buffer size of 40 Megs. 199 200*-B* 'buffer-name':: 201 If the kernel supports multiple buffers, this will add a buffer with 202 the given name. If the buffer name already exists, that buffer is just 203 reset and will not be deleted at the end of record execution. If the 204 buffer is created, it will be removed at the end of execution (unless 205 the *-k* is set, or 'start' command was used). 206 207 After a buffer name is stated, all events added after that will be 208 associated with that buffer. If no buffer is specified, or an event 209 is specified before a buffer name, it will be associated with the 210 main (toplevel) buffer. 211 212 trace-cmd record -e sched -B block -e block -B time -e timer sleep 1 213 214 The above is will enable all sched events in the main buffer. It will 215 then create a 'block' buffer instance and enable all block events within 216 that buffer. A 'time' buffer instance is created and all timer events 217 will be enabled for that event. 218 219*-m* 'size':: 220 The max size in kilobytes that a per cpu buffer should be. Note, due 221 to rounding to page size, the number may not be totally correct. 222 Also, this is performed by switching between two buffers that are half 223 the given size thus the output may not be of the given size even if 224 much more was written. 225 226 Use this to prevent running out of diskspace for long runs. 227 228*-M* 'cpumask':: 229 Set the cpumask for to trace. It only affects the last buffer instance 230 given. If supplied before any buffer instance, then it affects the 231 main buffer. The value supplied must be a hex number. 232 233 trace-cmd record -p function -M c -B events13 -e all -M 5 234 235 If the -M is left out, then the mask stays the same. To enable all 236 CPUs, pass in a value of '-1'. 237 238*-k*:: 239 By default, when trace-cmd is finished tracing, it will reset the buffers 240 and disable all the tracing that it enabled. This option keeps trace-cmd 241 from disabling the tracer and reseting the buffer. This option is useful for 242 debugging trace-cmd. 243 244 Note: usually trace-cmd will set the "tracing_on" file back to what it 245 was before it was called. This option will leave that file set to zero. 246 247*-i*:: 248 By default, if an event is listed that trace-cmd does not find, it 249 will exit with an error. This option will just ignore events that are 250 listed on the command line but are not found on the system. 251 252*-N* 'host:port':: 253 If another machine is running "trace-cmd listen", this option is used to 254 have the data sent to that machine with UDP packets. Instead of writing 255 to an output file, the data is sent off to a remote box. This is ideal for 256 embedded machines with little storage, or having a single machine that 257 will keep all the data in a single repository. 258 259 Note: This option is not supported with latency tracer plugins: 260 wakeup, wakeup_rt, irqsoff, preemptoff and preemptirqsoff 261 262*-V* 'cid:port':: 263 If recording on a guest VM and the host is running *trace-cmd listen* with 264 the *-V* option as well, or if this is recording on the host, and a guest 265 in running *trace-cmd listen* with the *-V* option, then connect to the 266 listener (the same as connecting with the *-N* option via the network). 267 This has the same limitations as the *-N* option above with respect to 268 latency tracer plugins. 269 270*-t*:: 271 This option is used with *-N*, when there's a need to send the live data 272 with TCP packets instead of UDP. Although TCP is not nearly as fast as 273 sending the UDP packets, but it may be needed if the network is not that 274 reliable, the amount of data is not that intensive, and a guarantee is 275 needed that all traced information is transfered successfully. 276 277*-q* | *--quiet*:: 278 For use with recording an application. Suppresses normal output 279 (except for errors) to allow only the application's output to be displayed. 280 281*--date*:: 282 With the *--date* option, "trace-cmd" will write timestamps into the 283 trace buffer after it has finished recording. It will then map the 284 timestamp to gettimeofday which will allow wall time output from the 285 timestamps reading the created 'trace.dat' file. 286 287*--max-graph-depth* 'depth':: 288 Set the maximum depth the function_graph tracer will trace into a function. 289 A value of one will only show where userspace enters the kernel but not any 290 functions called in the kernel. The default is zero, which means no limit. 291 292*--cmdlines-size* 'size':: 293 Set the number of entries the kernel tracing file "saved_cmdlines" can 294 contain. This file is a circular buffer which stores the mapping between 295 cmdlines and PIDs. If full, it leads to unresolved cmdlines ("<...>") within 296 the trace. The kernel default value is 128. 297 298*--module* 'module':: 299 Filter a module's name in function tracing. It is equivalent to adding 300 ':mod:module' after all other functions being filtered. If no other function 301 filter is listed, then all modules functions will be filtered in the filter. 302 303 '--module snd' is equivalent to '-l :mod:snd' 304 305 '--module snd -l "*jack*"' is equivalent to '-l "*jack*:mod:snd"' 306 307 '--module snd -n "*"' is equivalent to '-n :mod:snd' 308 309*--proc-map*:: 310 Save the traced process address map into the trace.dat file. The traced 311 processes can be specified using the option *-P*, or as a given 'command'. 312 313*--profile*:: 314 With the *--profile* option, "trace-cmd" will enable tracing that can 315 be used with trace-cmd-report(1) --profile option. If a tracer *-p* is 316 not set, and function graph depth is supported by the kernel, then 317 the function_graph tracer will be enabled with a depth of one (only 318 show where userspace enters into the kernel). It will also enable 319 various tracepoints with stack tracing such that the report can show 320 where tasks have been blocked for the longest time. 321 322 See trace-cmd-profile(1) for more details and examples. 323 324*-G*:: 325 Set interrupt (soft and hard) events as global (associated to CPU 326 instead of tasks). Only works for --profile. 327 328*-H* 'event-hooks':: 329 Add custom event matching to connect any two events together. When not 330 used with *--profile*, it will save the parameter and this will be 331 used by trace-cmd report --profile, too. That is: 332 333 trace-cmd record -H hrtimer_expire_entry,hrtimer/hrtimer_expire_exit,hrtimer,sp 334 trace-cmd report --profile 335 336 Will profile hrtimer_expire_entry and hrtimer_expire_ext times. 337 338 See trace-cmd-profile(1) for format. 339 340*-S*:: (for --profile only) 341 Only enable the tracer or events speficied on the command line. 342 With this option, the function_graph tracer is not enabled, nor are 343 any events (like sched_switch), unless they are specifically specified 344 on the command line (i.e. -p function -e sched_switch -e sched_wakeup) 345 346*--ts-offset offset*:: 347 Add an offset for the timestamp in the trace.dat file. This will add a 348 offset option into the trace.dat file such that a trace-cmd report will 349 offset all the timestamps of the events by the given offset. The offset 350 is in raw units. That is, if the event timestamps are in nanoseconds 351 the offset will also be in nanoseconds even if the displayed units are 352 in microseconds. 353 354*--tsync-interval*:: 355 Set the loop interval, in ms, for timestamps synchronization with guests: 356 If a negative number is specified, timestamps synchronization is disabled 357 If 0 is specified, no loop is performed - timestamps offset is calculated only twice," 358 at the beginning and at the end of the trace. 359 Timestamps synchronization with guests works only if there is support for VSOCK.\n" 360 361*--tsc2nsec*:: 362 Convert the current clock to nanoseconds, using tsc multiplier and shift from the Linux 363 kernel's perf interface. This option does not change the trace clock, just assumes that 364 the tsc multiplier and shift are applicable for the selected clock. You may use the 365 "-C tsc2nsec" clock, if not sure what clock to select. 366*--stderr*:: 367 Have output go to stderr instead of stdout, but the output of the command 368 executed will not be changed. This is useful if you want to monitor the 369 output of the command being executed, but not see the output from trace-cmd. 370 371*--poll*:: 372 Waiting for data to be available on the trace ring-buffers may trigger 373 IPIs. This might generate unacceptable trace noise when tracing low latency 374 or real time systems. The poll option forces trace-cmd to use O_NONBLOCK. 375 Traces are extracted by busy waiting, which will hog the CPUs, so only use 376 when really needed. 377 378*--name*:: 379 Give a specific name for the current agent being processed. Used after *-A* to 380 give the guest being traced a name. Useful when using the vsocket ID instead of 381 a name of the guest. 382 383*--verbose*[='level']:: 384 Set the log level. Supported log levels are "none", "critical", "error", "warning", 385 "info", "debug", "all" or their identifiers "0", "1", "2", "3", "4", "5", "6". Setting the log 386 level to specific value enables all logs from that and all previous levels. 387 The level will default to "info" if one is not specified. 388 389 Example: enable all critical, error and warning logs 390 391 trace-cmd record --verbose=warning 392 393*--file-version*:: 394 Desired version of the output file. Supported versions are 6 or 7. 395 396*--compression*:: 397 Compression of the trace output file, one of these strings can be passed: 398 399 'any' - auto select the best available compression algorithm 400 401 'none' - do not compress the trace file 402 403 'name' - the name of the desired compression algorithms. Available algorithms can be listed with 404 trace-cmd list -c 405 406EXAMPLES 407-------- 408 409The basic way to trace all events: 410 411------------------------------ 412 # trace-cmd record -e all ls > /dev/null 413 # trace-cmd report 414 trace-cmd-13541 [003] 106260.693809: filemap_fault: address=0x128122 offset=0xce 415 trace-cmd-13543 [001] 106260.693809: kmalloc: call_site=81128dd4 ptr=0xffff88003dd83800 bytes_req=768 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_ZERO 416 ls-13545 [002] 106260.693809: kfree: call_site=810a7abb ptr=0x0 417 ls-13545 [002] 106260.693818: sys_exit_write: 0x1 418 419 420------------------------------ 421 422 423 424To use the function tracer with sched switch tracing: 425 426------------------------------ 427 # trace-cmd record -p function -e sched_switch ls > /dev/null 428 # trace-cmd report 429 ls-13587 [002] 106467.860310: function: hrtick_start_fair <-- pick_next_task_fair 430 ls-13587 [002] 106467.860313: sched_switch: prev_comm=trace-cmd prev_pid=13587 prev_prio=120 prev_state=R ==> next_comm=trace-cmd next_pid=13583 next_prio=120 431 trace-cmd-13585 [001] 106467.860314: function: native_set_pte_at <-- __do_fault 432 trace-cmd-13586 [003] 106467.860314: function: up_read <-- do_page_fault 433 ls-13587 [002] 106467.860317: function: __phys_addr <-- schedule 434 trace-cmd-13585 [001] 106467.860318: function: _raw_spin_unlock <-- __do_fault 435 ls-13587 [002] 106467.860320: function: native_load_sp0 <-- __switch_to 436 trace-cmd-13586 [003] 106467.860322: function: down_read_trylock <-- do_page_fault 437 438 439------------------------------ 440 441Here is a nice way to find what interrupts have the highest latency: 442------------------------------------------ 443 # trace-cmd record -p function_graph -e irq_handler_entry -l do_IRQ sleep 10 444 # trace-cmd report 445 <idle>-0 [000] 157412.933969: funcgraph_entry: | do_IRQ() { 446 <idle>-0 [000] 157412.933974: irq_handler_entry: irq=48 name=eth0 447 <idle>-0 [000] 157412.934004: funcgraph_exit: + 36.358 us | } 448 <idle>-0 [000] 157413.895004: funcgraph_entry: | do_IRQ() { 449 <idle>-0 [000] 157413.895011: irq_handler_entry: irq=48 name=eth0 450 <idle>-0 [000] 157413.895026: funcgraph_exit: + 24.014 us | } 451 <idle>-0 [000] 157415.891762: funcgraph_entry: | do_IRQ() { 452 <idle>-0 [000] 157415.891769: irq_handler_entry: irq=48 name=eth0 453 <idle>-0 [000] 157415.891784: funcgraph_exit: + 22.928 us | } 454 <idle>-0 [000] 157415.934869: funcgraph_entry: | do_IRQ() { 455 <idle>-0 [000] 157415.934874: irq_handler_entry: irq=48 name=eth0 456 <idle>-0 [000] 157415.934906: funcgraph_exit: + 37.512 us | } 457 <idle>-0 [000] 157417.888373: funcgraph_entry: | do_IRQ() { 458 <idle>-0 [000] 157417.888381: irq_handler_entry: irq=48 name=eth0 459 <idle>-0 [000] 157417.888398: funcgraph_exit: + 25.943 us | } 460 461 462------------------------------------------ 463 464An example of the profile: 465------------------------------------------ 466 # trace-cmd record --profile sleep 1 467 # trace-cmd report --profile --comm sleep 468task: sleep-21611 469 Event: sched_switch:R (1) Total: 99442 Avg: 99442 Max: 99442 Min:99442 470 <stack> 1 total:99442 min:99442 max:99442 avg=99442 471 => ftrace_raw_event_sched_switch (0xffffffff8105f812) 472 => __schedule (0xffffffff8150810a) 473 => preempt_schedule (0xffffffff8150842e) 474 => ___preempt_schedule (0xffffffff81273354) 475 => cpu_stop_queue_work (0xffffffff810b03c5) 476 => stop_one_cpu (0xffffffff810b063b) 477 => sched_exec (0xffffffff8106136d) 478 => do_execve_common.isra.27 (0xffffffff81148c89) 479 => do_execve (0xffffffff811490b0) 480 => SyS_execve (0xffffffff811492c4) 481 => return_to_handler (0xffffffff8150e3c8) 482 => stub_execve (0xffffffff8150c699) 483 Event: sched_switch:S (1) Total: 1000506680 Avg: 1000506680 Max: 1000506680 Min:1000506680 484 <stack> 1 total:1000506680 min:1000506680 max:1000506680 avg=1000506680 485 => ftrace_raw_event_sched_switch (0xffffffff8105f812) 486 => __schedule (0xffffffff8150810a) 487 => schedule (0xffffffff815084b8) 488 => do_nanosleep (0xffffffff8150b22c) 489 => hrtimer_nanosleep (0xffffffff8108d647) 490 => SyS_nanosleep (0xffffffff8108d72c) 491 => return_to_handler (0xffffffff8150e3c8) 492 => tracesys_phase2 (0xffffffff8150c304) 493 Event: sched_wakeup:21611 (1) Total: 30326 Avg: 30326 Max: 30326 Min:30326 494 <stack> 1 total:30326 min:30326 max:30326 avg=30326 495 => ftrace_raw_event_sched_wakeup_template (0xffffffff8105f653) 496 => ttwu_do_wakeup (0xffffffff810606eb) 497 => ttwu_do_activate.constprop.124 (0xffffffff810607c8) 498 => try_to_wake_up (0xffffffff8106340a) 499------------------------------------------ 500 501SEE ALSO 502-------- 503trace-cmd(1), trace-cmd-report(1), trace-cmd-start(1), trace-cmd-stop(1), 504trace-cmd-extract(1), trace-cmd-reset(1), trace-cmd-split(1), 505trace-cmd-list(1), trace-cmd-listen(1), trace-cmd-profile(1) 506 507AUTHOR 508------ 509Written by Steven Rostedt, <[email protected]> 510 511RESOURCES 512--------- 513https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/ 514 515COPYING 516------- 517Copyright \(C) 2010 Red Hat, Inc. Free use of this software is granted under 518the terms of the GNU Public License (GPL). 519 520