1Demonstrations of syscount, the Linux/eBPF version. 2 3 4syscount summarizes syscall counts across the system or a specific process, 5with optional latency information. It is very useful for general workload 6characterization, for example: 7 8# syscount 9Tracing syscalls, printing top 10... Ctrl+C to quit. 10[09:39:04] 11SYSCALL COUNT 12write 10739 13read 10584 14wait4 1460 15nanosleep 1457 16select 795 17rt_sigprocmask 689 18clock_gettime 653 19rt_sigaction 128 20futex 86 21ioctl 83 22^C 23 24These are the top 10 entries; you can get more by using the -T switch. Here, 25the output indicates that the write and read syscalls were very common, followed 26immediately by wait4, nanosleep, and so on. By default, syscount counts across 27the entire system, but we can point it to a specific process of interest: 28 29# syscount -p $(pidof dd) 30Tracing syscalls, printing top 10... Ctrl+C to quit. 31[09:40:21] 32SYSCALL COUNT 33read 7878397 34write 7878397 35^C 36 37Indeed, dd's workload is a bit easier to characterize. Occasionally, the count 38of syscalls is not enough, and you'd also want an aggregate latency: 39 40# syscount -L 41Tracing syscalls, printing top 10... Ctrl+C to quit. 42[09:41:32] 43SYSCALL COUNT TIME (us) 44select 16 3415860.022 45nanosleep 291 12038.707 46ftruncate 1 122.939 47write 4 63.389 48stat 1 23.431 49fstat 1 5.088 50[unknown: 321] 32 4.965 51timerfd_settime 1 4.830 52ioctl 3 4.802 53kill 1 4.342 54^C 55 56The select and nanosleep calls are responsible for a lot of time, but remember 57these are blocking calls. This output was taken from a mostly idle system. Note 58the "unknown" entry -- syscall 321 is the bpf() syscall, which is not in the 59table used by this tool (borrowed from strace sources). 60 61Another direction would be to understand which processes are making a lot of 62syscalls, thus responsible for a lot of activity. This is what the -P switch 63does: 64 65# syscount -P 66Tracing syscalls, printing top 10... Ctrl+C to quit. 67[09:58:13] 68PID COMM COUNT 6913820 vim 548 7030216 sshd 149 7129633 bash 72 7225188 screen 70 7325776 mysqld 30 7431285 python 10 75529 systemd-udevd 9 761 systemd 8 77494 systemd-journal 5 78^C 79 80This is again from a mostly idle system over an interval of a few seconds. 81 82Sometimes, you'd only care about failed syscalls -- these are the ones that 83might be worth investigating with follow-up tools like opensnoop, execsnoop, 84or trace. Use the -x switch for this; the following example also demonstrates 85the -i switch, for printing at predefined intervals: 86 87# syscount -x -i 5 88Tracing failed syscalls, printing top 10... Ctrl+C to quit. 89[09:44:16] 90SYSCALL COUNT 91futex 13 92getxattr 10 93stat 8 94open 6 95wait4 3 96access 2 97[unknown: 321] 1 98 99[09:44:21] 100SYSCALL COUNT 101futex 12 102getxattr 10 103[unknown: 321] 2 104wait4 1 105access 1 106pause 1 107^C 108 109Similar to -x/--failures, sometimes you only care about certain syscall 110errors like EPERM or ENONET -- these are the ones that might be worth 111investigating with follow-up tools like opensnoop, execsnoop, or 112trace. Use the -e/--errno switch for this; the following example also 113demonstrates the -e switch, for printing ENOENT failures at predefined intervals: 114 115# syscount -e ENOENT -i 5 116Tracing syscalls, printing top 10... Ctrl+C to quit. 117[13:15:57] 118SYSCALL COUNT 119stat 4669 120open 1951 121access 561 122lstat 62 123openat 42 124readlink 8 125execve 4 126newfstatat 1 127 128[13:16:02] 129SYSCALL COUNT 130lstat 18506 131stat 13087 132open 2907 133access 412 134openat 19 135readlink 12 136execve 7 137connect 6 138unlink 1 139rmdir 1 140^C 141 142Sometimes, you'd only care about a single syscall rather than all syscalls. 143Use the --syscall option for this; the following example also demonstrates 144the --syscall option, for printing at predefined intervals: 145 146# syscount --syscall stat -i 1 147Tracing syscall 'stat'... Ctrl+C to quit. 148[12:51:06] 149SYSCALL COUNT 150stat 310 151 152[12:51:07] 153SYSCALL COUNT 154stat 316 155^C 156 157USAGE: 158# syscount -h 159usage: syscount.py [-h] [-p PID] [-t TID] [-i INTERVAL] [-d DURATION] [-T TOP] 160 [-x] [-e ERRNO] [-L] [-m] [-P] [-l] [--syscall SYSCALL] 161 162Summarize syscall counts and latencies. 163 164optional arguments: 165 -h, --help show this help message and exit 166 -p PID, --pid PID trace only this pid 167 -t TID, --tid TID trace only this tid 168 -c PPID, --ppid PPID trace only child of this pid 169 -i INTERVAL, --interval INTERVAL 170 print summary at this interval (seconds) 171 -d DURATION, --duration DURATION 172 total duration of trace, in seconds 173 -T TOP, --top TOP print only the top syscalls by count or latency 174 -x, --failures trace only failed syscalls (return < 0) 175 -e ERRNO, --errno ERRNO 176 trace only syscalls that return this error (numeric or 177 EPERM, etc.) 178 -L, --latency collect syscall latency 179 -m, --milliseconds display latency in milliseconds (default: 180 microseconds) 181 -P, --process count by process and not by syscall 182 -l, --list print list of recognized syscalls and exit 183 --syscall SYSCALL trace this syscall only (use option -l to get all 184 recognized syscalls) 185