Demonstrations of funclatency, the Linux eBPF/bcc version.


Timing the do_sys_open() kernel function until Ctrl-C:

# ./funclatency do_sys_open
Tracing do_sys_open... Hit Ctrl-C to end.
^C
     nsecs              : count    distribution
         0 -> 1          : 0       |                                      |
         2 -> 3          : 0       |                                      |
         4 -> 7          : 0       |                                      |
         8 -> 15         : 0       |                                      |
        16 -> 31         : 0       |                                      |
        32 -> 63         : 0       |                                      |
        64 -> 127        : 0       |                                      |
       128 -> 255        : 0       |                                      |
       256 -> 511        : 0       |                                      |
       512 -> 1023       : 0       |                                      |
      1024 -> 2047       : 0       |                                      |
      2048 -> 4095       : 124     |****************                      |
      4096 -> 8191       : 291     |**************************************|
      8192 -> 16383      : 36      |****                                  |
     16384 -> 32767      : 16      |**                                    |
     32768 -> 65535      : 8       |*                                     |
     65536 -> 131071     : 0       |                                      |
    131072 -> 262143     : 0       |                                      |
    262144 -> 524287     : 0       |                                      |
    524288 -> 1048575    : 0       |                                      |
   1048576 -> 2097151    : 0       |                                      |
   2097152 -> 4194303    : 1       |                                      |

avg = 13746 nsecs, total: 6543360 nsecs, count: 476

Detaching...

The output shows a histogram of function latency (call time), measured from
when the function began executing (was called) to when it finished (returned).

This example output shows that most of the time, do_sys_open() took between
2048 and 65536 nanoseconds (2 to 65 microseconds). The peak of the distribution
shows 291 calls of between 4096 and 8191 nanoseconds. There was also one
occurrence, an outlier, in the 2 to 4 millisecond range.

How this works: the function entry and return are traced using kernel kprobes
and kretprobes. A timestamp is collected on entry, the delta time is calculated
on return, and the delta is then bucketized and stored as an in-kernel
histogram for efficiency. The histogram is visible in the output: it's the
"count" column; everything else is decoration. Only the count column is copied
to user level on output. This is an efficient way to time kernel functions and
examine their latency distribution.
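To make the bucketing concrete, here is a user-space Python sketch of the same
power-of-2 (log2) histogram idea. This is not the tool's actual BPF code, and
the helper names (log2_bucket, render) are invented for illustration: the real
tool computes the bucket index in-kernel and only copies counts out.

```python
def log2_bucket(nsecs):
    # Bucket 0 holds values 0-1, bucket 1 holds 2-3, bucket 2 holds 4-7,
    # and so on: the index is the position of the value's highest set bit.
    return max(nsecs.bit_length() - 1, 0)

def render(counts, width=38):
    # Turn per-bucket counts into "low -> high : count |stars|" rows,
    # scaling the bar to the largest bucket, like the output above.
    peak = max(counts) or 1
    rows = []
    for i, count in enumerate(counts):
        low = 0 if i == 0 else 1 << i
        high = (1 << (i + 1)) - 1
        bar = '*' * (count * width // peak)
        rows.append(f"{low:>10} -> {high:<10}: {count:<8}|{bar:<{width}}|")
    return rows
```

At the 38-column bar scale of the run above, the 124-count bucket (2048 ->
4095) gets 124 * 38 // 291 = 16 stars, matching the output, and the summary
line is integer arithmetic over the raw samples: 6543360 // 476 == 13746.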


Now trace a user function, pthread_mutex_lock in libpthread, to determine if
there is considerable lock contention:

# ./funclatency pthread:pthread_mutex_lock -p $(pidof contentions)
Tracing 1 function for "pthread:pthread_mutex_lock"... Hit Ctrl-C to end.
     nsecs              : count    distribution
         0 -> 1          : 0       |                                        |
         2 -> 3          : 0       |                                        |
         4 -> 7          : 0       |                                        |
         8 -> 15         : 0       |                                        |
        16 -> 31         : 0       |                                        |
        32 -> 63         : 0       |                                        |
        64 -> 127        : 0       |                                        |
       128 -> 255        : 0       |                                        |
       256 -> 511        : 0       |                                        |
       512 -> 1023       : 0       |                                        |
      1024 -> 2047       : 0       |                                        |
      2048 -> 4095       : 508967  |****************************************|
      4096 -> 8191       : 70072   |*****                                   |
      8192 -> 16383      : 27686   |**                                      |
     16384 -> 32767      : 5075    |                                        |
     32768 -> 65535      : 2318    |                                        |
     65536 -> 131071     : 581     |                                        |
    131072 -> 262143     : 38      |                                        |
    262144 -> 524287     : 5       |                                        |
    524288 -> 1048575    : 1       |                                        |
   1048576 -> 2097151    : 9       |                                        |

avg = 4317 nsecs, total: 2654426112 nsecs, count: 614752

Detaching...

It seems that most calls to pthread_mutex_lock completed rather quickly (in
under 4us), but there were some cases of considerable contention, sometimes
over a full millisecond.


Run a quick-and-dirty profiler over all the functions in an executable:

# ./funclatency /home/user/primes:* -p $(pidof primes) -F
Tracing 15 functions for "/home/user/primes:*"... Hit Ctrl-C to end.
^C

Function = is_prime [6556]
     nsecs              : count    distribution
         0 -> 1          : 0       |                                        |
         2 -> 3          : 0       |                                        |
         4 -> 7          : 0       |                                        |
         8 -> 15         : 0       |                                        |
        16 -> 31         : 0       |                                        |
        32 -> 63         : 0       |                                        |
        64 -> 127        : 0       |                                        |
       128 -> 255        : 0       |                                        |
       256 -> 511        : 0       |                                        |
       512 -> 1023       : 0       |                                        |
      1024 -> 2047       : 0       |                                        |
      2048 -> 4095       : 1495322 |****************************************|
      4096 -> 8191       : 95744   |**                                      |
      8192 -> 16383      : 9926    |                                        |
     16384 -> 32767      : 3070    |                                        |
     32768 -> 65535      : 1415    |                                        |
     65536 -> 131071     : 112     |                                        |
    131072 -> 262143     : 9       |                                        |
    262144 -> 524287     : 3       |                                        |
    524288 -> 1048575    : 0       |                                        |
   1048576 -> 2097151    : 8       |                                        |

Function = insert_result [6556]
     nsecs              : count    distribution
         0 -> 1          : 0       |                                        |
         2 -> 3          : 0       |                                        |
         4 -> 7          : 0       |                                        |
         8 -> 15         : 0       |                                        |
        16 -> 31         : 0       |                                        |
        32 -> 63         : 0       |                                        |
        64 -> 127        : 0       |                                        |
       128 -> 255        : 0       |                                        |
       256 -> 511        : 0       |                                        |
       512 -> 1023       : 0       |                                        |
      1024 -> 2047       : 0       |                                        |
      2048 -> 4095       : 111047  |****************************************|
      4096 -> 8191       : 3998    |*                                       |
      8192 -> 16383      : 720     |                                        |
     16384 -> 32767      : 238     |                                        |
     32768 -> 65535      : 106     |                                        |
     65536 -> 131071     : 5       |                                        |
    131072 -> 262143     : 4       |                                        |

avg = 3404 nsecs, total: 5862276096 nsecs, count: 1721727

Detaching...

From the results, we can see that the is_prime function has something
resembling an exponential distribution: very few numbers take a very long time
to test, while most are verified as prime or composite in under 4us. The
insert_result function exhibits a similar phenomenon, likely due to contention
over a shared results container.


Now vfs_read() is traced, and a microseconds histogram printed:

# ./funclatency -u vfs_read
Tracing vfs_read... Hit Ctrl-C to end.
^C
     usecs              : count    distribution
         0 -> 1          : 1143    |**************************************|
         2 -> 3          : 420     |*************                         |
         4 -> 7          : 159     |*****                                 |
         8 -> 15         : 295     |*********                             |
        16 -> 31         : 25      |                                      |
        32 -> 63         : 5       |                                      |
        64 -> 127        : 1       |                                      |
       128 -> 255        : 0       |                                      |
       256 -> 511        : 0       |                                      |
       512 -> 1023       : 0       |                                      |
      1024 -> 2047       : 1       |                                      |
      2048 -> 4095       : 0       |                                      |
      4096 -> 8191       : 5       |                                      |
      8192 -> 16383      : 0       |                                      |
     16384 -> 32767      : 0       |                                      |
     32768 -> 65535      : 0       |                                      |
     65536 -> 131071     : 7       |                                      |
    131072 -> 262143     : 7       |                                      |
    262144 -> 524287     : 3       |                                      |
    524288 -> 1048575    : 7       |                                      |

avg = 4229 nsecs, total: 8789145 nsecs, count: 2078

Detaching...

This shows a bimodal distribution. Many vfs_read() calls were faster than 15
microseconds; however, there was also a small handful between 65 milliseconds
and 1 second, seen at the bottom of the table. These are likely network reads
from SSH, waiting on interactive keystrokes.


Tracing do_nanosleep() in milliseconds:

# ./funclatency -m do_nanosleep
Tracing do_nanosleep... Hit Ctrl-C to end.
^C
     msecs              : count    distribution
         0 -> 1          : 0       |                                      |
         2 -> 3          : 0       |                                      |
         4 -> 7          : 0       |                                      |
         8 -> 15         : 0       |                                      |
        16 -> 31         : 0       |                                      |
        32 -> 63         : 0       |                                      |
        64 -> 127        : 0       |                                      |
       128 -> 255        : 0       |                                      |
       256 -> 511        : 0       |                                      |
       512 -> 1023       : 328     |**************************************|
      1024 -> 2047       : 0       |                                      |
      2048 -> 4095       : 0       |                                      |
      4096 -> 8191       : 32      |***                                   |
      8192 -> 16383      : 0       |                                      |
     16384 -> 32767      : 0       |                                      |
     32768 -> 65535      : 2       |                                      |

avg = 1510 nsecs, total: 546816 nsecs, count: 326

Detaching...

This looks like it has found threads that are sleeping every 1, 5, and 60
seconds.


An interval can be provided using -i, and timestamps added using -T. For
example, tracing vfs_read() latency in milliseconds and printing output
every 5 seconds:

# ./funclatency -mTi 5 vfs_read
Tracing vfs_read... Hit Ctrl-C to end.

20:10:08
     msecs              : count    distribution
         0 -> 1          : 1500    |*************************************+|
         2 -> 3          : 3       |                                      |
         4 -> 7          : 1       |                                      |
         8 -> 15         : 2       |                                      |
        16 -> 31         : 0       |                                      |
        32 -> 63         : 0       |                                      |
        64 -> 127        : 4       |                                      |
       128 -> 255        : 3       |                                      |
       256 -> 511        : 1       |                                      |
       512 -> 1023       : 7       |                                      |

avg = 5 nsecs, total: 8259 nsecs, count: 1521

20:10:13
     msecs              : count    distribution
         0 -> 1          : 1251    |*************************************+|
         2 -> 3          : 3       |                                      |
         4 -> 7          : 2       |                                      |
         8 -> 15         : 0       |                                      |
        16 -> 31         : 2       |                                      |
        32 -> 63         : 3       |                                      |
        64 -> 127        : 5       |                                      |
       128 -> 255        : 5       |                                      |
       256 -> 511        : 3       |                                      |
       512 -> 1023       : 6       |                                      |
      1024 -> 2047       : 2       |                                      |

avg = 9 nsecs, total: 11736 nsecs, count: 1282

20:10:18
     msecs              : count    distribution
         0 -> 1          : 1265    |*************************************+|
         2 -> 3          : 0       |                                      |
         4 -> 7          : 5       |                                      |
         8 -> 15         : 9       |                                      |
        16 -> 31         : 7       |                                      |
        32 -> 63         : 1       |                                      |
        64 -> 127        : 2       |                                      |
       128 -> 255        : 3       |                                      |
       256 -> 511        : 5       |                                      |
       512 -> 1023       : 5       |                                      |
      1024 -> 2047       : 0       |                                      |
      2048 -> 4095       : 1       |                                      |

avg = 8 nsecs, total: 11219 nsecs, count: 1303

^C
20:10:20
     msecs              : count    distribution
         0 -> 1          : 249     |*************************************+|
         2 -> 3          : 0       |                                      |
         4 -> 7          : 0       |                                      |
         8 -> 15         : 1       |                                      |
        16 -> 31         : 0       |                                      |
        32 -> 63         : 0       |                                      |
        64 -> 127        : 0       |                                      |
       128 -> 255        : 0       |                                      |
       256 -> 511        : 0       |                                      |
       512 -> 1023       : 1       |                                      |

avg = 4 nsecs, total: 1029 nsecs, count: 251

Detaching...


A single process can be traced, which filters in-kernel for efficiency. Here,
the vfs_read() function is timed as milliseconds for PID 17064, which is a
bash shell:

# ./funclatency -mp 17064 vfs_read
Tracing vfs_read... Hit Ctrl-C to end.
^C
     msecs              : count    distribution
         0 -> 1          : 1       |**                                    |
         2 -> 3          : 0       |                                      |
         4 -> 7          : 0       |                                      |
         8 -> 15         : 1       |**                                    |
        16 -> 31         : 2       |*****                                 |
        32 -> 63         : 0       |                                      |
        64 -> 127        : 13      |**************************************|
       128 -> 255        : 10      |*****************************         |
       256 -> 511        : 4       |***********                           |

avg = 153 nsecs, total: 4765 nsecs, count: 31

Detaching...

The distribution between 64 and 511 milliseconds shows keystroke latency.


The -F option can be used to print a histogram per function. Eg:

# ./funclatency -uF 'vfs_r*'
Tracing 5 functions for "vfs_r*"... Hit Ctrl-C to end.
^C

Function = vfs_read
     usecs              : count    distribution
         0 -> 1          : 1044    |****************************************|
         2 -> 3          : 383     |**************                          |
         4 -> 7          : 76      |**                                      |
         8 -> 15         : 41      |*                                       |
        16 -> 31         : 26      |                                        |
        32 -> 63         : 0       |                                        |
        64 -> 127        : 1       |                                        |
       128 -> 255        : 0       |                                        |
       256 -> 511        : 0       |                                        |
       512 -> 1023       : 0       |                                        |
      1024 -> 2047       : 0       |                                        |
      2048 -> 4095       : 4       |                                        |
      4096 -> 8191       : 2       |                                        |
      8192 -> 16383      : 0       |                                        |
     16384 -> 32767      : 0       |                                        |
     32768 -> 65535      : 2       |                                        |
     65536 -> 131071     : 5       |                                        |
    131072 -> 262143     : 5       |                                        |
    262144 -> 524287     : 3       |                                        |
    524288 -> 1048575    : 7       |                                        |

Function = vfs_rename
     usecs              : count    distribution
         0 -> 1          : 2       |****                                    |
         2 -> 3          : 2       |****                                    |
         4 -> 7          : 2       |****                                    |
         8 -> 15         : 0       |                                        |
        16 -> 31         : 6       |*************                           |
        32 -> 63         : 18      |****************************************|

avg = 5087 nsecs, total: 8287001 nsecs, count: 1629

Detaching...


USAGE message:

# ./funclatency -h
usage: funclatency [-h] [-p PID] [-i INTERVAL] [-d DURATION] [-T] [-u] [-m]
                   [-F] [-r] [-l LEVEL] [-v]
                   pattern

Time functions and print latency as a histogram

positional arguments:
  pattern               search expression for functions

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     trace this PID only
  -i INTERVAL, --interval INTERVAL
                        summary interval, in seconds
  -d DURATION, --duration DURATION
                        total duration of trace, in seconds
  -T, --timestamp       include timestamp on output
  -u, --microseconds    microsecond histogram
  -m, --milliseconds    millisecond histogram
  -F, --function        show a separate histogram per function
  -r, --regexp          use regular expressions. Default is "*" wildcards
                        only.
  -l LEVEL, --level LEVEL
                        set the level of nested or recursive functions
  -v, --verbose         print the BPF program (for debugging purposes)

examples:
    ./funclatency do_sys_open       # time the do_sys_open() kernel function
    ./funclatency c:read            # time the read() C library function
    ./funclatency -u vfs_read       # time vfs_read(), in microseconds
    ./funclatency -m do_nanosleep   # time do_nanosleep(), in milliseconds
    ./funclatency -i 2 -d 10 c:open # output every 2 seconds, for duration 10s
    ./funclatency -mTi 5 vfs_read   # output every 5 seconds, with timestamps
    ./funclatency -p 181 vfs_read   # time process 181 only
    ./funclatency 'vfs_fstat*'      # time both vfs_fstat() and vfs_fstatat()
    ./funclatency 'c:*printf'       # time the *printf family of functions
    ./funclatency -F 'vfs_r*'       # show one histogram per matched function
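
The default "*" wildcard semantics (as opposed to -r regular expressions) can
be approximated with Python's standard fnmatch module. This is an illustrative
sketch, not funclatency's actual matching code; match_funcs and the symbol
list are invented for the example.

```python
import fnmatch
import re

# Hypothetical sketch: expand a shell-style "*" wildcard pattern into a
# regular expression and filter a list of candidate function names.
def match_funcs(pattern, names):
    rx = re.compile(fnmatch.translate(pattern))
    return [n for n in names if rx.match(n)]

symbols = ["vfs_read", "vfs_readlink", "vfs_rename", "vfs_write",
           "vfs_fstat", "vfs_fstatat"]
print(match_funcs("vfs_r*", symbols))      # ['vfs_read', 'vfs_readlink', 'vfs_rename']
print(match_funcs("vfs_fstat*", symbols))  # ['vfs_fstat', 'vfs_fstatat']
```

Note that "vfs_fstat*" matches both vfs_fstat and vfs_fstatat, mirroring the
'vfs_fstat*' example above; a trailing "*" can match the empty string.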