Lua Tools for BCC
-----------------

This directory contains Lua tooling for [BCC][bcc]
(the BPF Compiler Collection).

BCC is a toolkit for creating userspace and kernel tracing programs. By
default, it comes with a library `libbcc`, some example tooling and a Python
frontend for the library.

Here we present an alternate frontend for `libbcc` implemented in LuaJIT. This
lets you write the userspace part of your tracer in Lua instead of Python.

Since LuaJIT is a JIT-compiled language, tracers implemented in `bcc-lua`
exhibit significantly reduced overhead compared to their Python equivalents.
This is particularly noticeable in tracers that actively use the table APIs to
get information from the kernel.

If your tracer makes extensive use of `BPF_MAP_TYPE_PERF_EVENT_ARRAY` or
`BPF_MAP_TYPE_HASH`, you may find the performance characteristics of this
implementation very appealing: LuaJIT can compile much of the event-processing
call chain to native code, and this wrapper has been designed to benefit from
such JIT compilation.

## Quickstart Guide

The following instructions assume Ubuntu 18.04 LTS.

1. Clone this repository

   ```
   $ git clone https://github.com/iovisor/bcc.git
   $ cd bcc/
   ```

2. As per the [Ubuntu - Binary](https://github.com/iovisor/bcc/blob/master/INSTALL.md#ubuntu---binary) installation instructions, install the required upstream stable and signed packages

   ```
   $ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4052245BD4284CDD
   $ echo "deb https://repo.iovisor.org/apt/$(lsb_release -cs) $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/iovisor.list
   $ sudo apt-get update
   $ sudo apt-get install bcc-tools libbcc-examples linux-headers-$(uname -r)
   ```

3. Install LuaJIT and the corresponding development files

   ```
   $ sudo apt-get install luajit luajit-5.1-dev
   ```

4. Test one of the examples to ensure `libbcc` is properly installed

   ```
   $ sudo src/lua/bcc-probe examples/lua/task_switch.lua
   ```

## LuaJIT BPF compiler

It is also possible to write Lua functions and compile them transparently to BPF bytecode. Here is a simple socket filter example:

```lua
local S = require('syscall')
local bpf = require('bpf')
local map = bpf.map('array', 256)
-- Kernel-space part of the program
local prog = assert(bpf(function ()
  local proto = pkt.ip.proto  -- Get byte (ip.proto) from frame at [23]
  xadd(map[proto], 1)         -- Increment packet count
end))
-- User-space part of the program
local sock = assert(bpf.socket('lo', prog))
for i=1,10 do
  local icmp, udp, tcp = map[1], map[17], map[6]
  print('TCP', tcp, 'UDP', udp, 'ICMP', icmp, 'packets')
  S.sleep(1)
end
```

Another application of BPF programs is attaching to probes for [perf event tracing][tracing]. This lets you trace events inside the kernel (or user space) and then collect results, for example a histogram of `sendto()` latency, off-CPU time stack traces, syscall latency, and so on. While kernel probes and perf events have an unstable ABI, a dynamic language can create and use the proper type based on the tracepoint ABI at runtime.

The runtime automatically recognizes reads that need a helper to access the data.
The type casts denote the source of the objects. For example, the [bashreadline][bashreadline] example prints the commands entered in all running bash shells:

```lua
local ffi = require('ffi')
local bpf = require('bpf')
-- Perf event map
local sample_t = 'struct { uint64_t pid; char str[80]; }'
local events = bpf.map('perf_event_array')
-- Kernel-space part of the program
bpf.uprobe('/bin/bash:readline', function (ptregs)
  local sample = ffi.new(sample_t)
  sample.pid = pid_tgid()
  ffi.copy(sample.str, ffi.cast('char *', ptregs.ax)) -- Cast `ax` to string pointer and copy to buffer
  perf_submit(events, sample) -- Write sample to perf event map
end, true, -1, 0)
-- User-space part of the program
local log = events:reader(nil, 0, sample_t) -- Must specify PID or CPU_ID to observe
while true do
  log:block()               -- Wait until event reader is readable
  for _,e in log:read() do  -- Collect available reader events
    print(tonumber(e.pid), ffi.string(e.str))
  end
end
```

Here the cast to `struct pt_regs` flags the source of the data as probe arguments, which means any pointer derived
from this structure points to kernel memory and a helper is needed to access it. Casting `ptregs.ax` to a pointer is then required for `ffi.copy` semantics; otherwise it would be treated as a `u64` and only its value would be
copied. Type detection is automatic most of the time (socket filters and `bpf.tracepoint`), but not with uprobes and kprobes.

### Installation

```bash
$ luarocks install bpf
```

### Examples

See the `examples/lua` directory.
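
The user-space half of the bashreadline example relies only on standard LuaJIT FFI calls, so the decoding step can be tried without loading a BPF program. The sketch below fills a sample by hand and reads it back the same way the reader loop does. The PID and command string are made-up values, and `decode` is a hypothetical helper name that is not part of the `bpf` module.

```lua
local ffi = require('ffi')

-- Same layout as `sample_t` in the bashreadline example
local sample_ct = ffi.typeof('struct { uint64_t pid; char str[80]; }')

-- Hypothetical helper: turn one sample into plain Lua values
local function decode(e)
  -- tonumber() converts the uint64_t cdata to a Lua number,
  -- ffi.string() reads the buffer up to the first NUL byte
  return tonumber(e.pid), ffi.string(e.str)
end

-- Build a fake event in user space (values are illustrative only)
local sample = sample_ct()        -- zero-initialized
sample.pid = 4242
ffi.copy(sample.str, 'ls -la')    -- copies the string plus its NUL terminator

local pid, cmd = decode(sample)
print(pid, cmd)
```

Keeping the decoding in a small function like this makes it easy to check the struct layout in isolation before attaching a real probe.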

### Helpers

* `print(...)` is a wrapper for `bpf_trace_printk`; the output can be read with `cat /sys/kernel/debug/tracing/trace_pipe`
* `bit.*` library **is** supported (`lshift, rshift, arshift, bnot, band, bor, bxor`)
* `math.*` library is *partially* supported (`log2, log, log10`)
* `ffi.cast()` is implemented (including structures and arrays)
* `ffi.new(...)` allocates memory on the stack; initializers are NYI (not yet implemented)
* `ffi.copy(...)` copies memory (possibly using helpers) between stack/kernel/registers
* `ntoh(x[, width])` - convert from network to host byte order
* `hton(x[, width])` - convert from host to network byte order
* `xadd(dst, inc)` - exclusive (atomic) add, the equivalent of `*dst += inc` if Lua had a `+=` operator

Below is a list of BPF-specific helpers:

* `time()` - return current monotonic time in nanoseconds (uses `bpf_ktime_get_ns`)
* `cpu()` - return current CPU number (uses `bpf_get_smp_processor_id`)
* `pid_tgid()` - return caller `tgid << 32 | pid` (uses `bpf_get_current_pid_tgid`)
* `uid_gid()` - return caller `gid << 32 | uid` (uses `bpf_get_current_uid_gid`)
* `comm(var)` - write current process name (uses `bpf_get_current_comm`)
* `perf_submit(map, var)` - submit variable to perf event array BPF map
* `stack_id(map, flags)` - return stack trace identifier from stack trace BPF map
* `load_bytes(off, var)` - helper for direct packet access with `skb_load_bytes()`

### Current state

* Not all LuaJIT bytecode opcodes are supported *(notable mentions below)*
* Closures (`UCLO`) will probably never be supported, although you can use upvalues inside a compiled function
* Type narrowing is opportunistic. Numbers are 64-bit by default, but 64-bit immediate loads are not supported (e.g. `local x = map[ffi.cast('uint64_t', 1000)]`)
* Tail calls (`CALLT`) and iterators (`ITERI`) are NYI (as of now)
* Arbitrary ctype **is** supported both for map keys and values
* Basic optimisations such as constant propagation, partial DCE, liveness analysis and speculative register allocation are implemented, but there is no control flow analysis yet. This means the compiler can see when values are used and where dead stores occur, but there is no rewriter pass to eliminate them.
* No register sub-allocations, no aggressive use of caller-saved `R1-5`, no aggressive narrowing (this would require variable range assertions and variable relationships)
* Slices whose length is not 1/2/4/8 are NYI (this requires allocating memory on the stack and using a pointer type)


[bcc]: https://github.com/iovisor/bcc
[tracing]: http://www.brendangregg.com/blog/2016-03-05/linux-bpf-superpowers.html
[bashreadline]: http://www.brendangregg.com/blog/2016-02-08/linux-ebpf-bcc-uprobes.html
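
The Helpers list describes `pid_tgid()` as returning `tgid << 32 | pid`. A minimal user-space sketch of splitting such a value back into its two halves with LuaJIT 64-bit cdata arithmetic might look like the following. `unpack_pid_tgid` is a hypothetical name, not part of the `bpf` module, and the input value is fabricated for illustration.

```lua
local ffi = require('ffi')

-- Hypothetical helper: split a packed `tgid << 32 | pid` value
local function unpack_pid_tgid(v)
  local packed = ffi.cast('uint64_t', v)
  -- Unsigned 64-bit division and modulo stand in for shift and mask
  local tgid = tonumber(packed / 0x100000000ULL)  -- upper 32 bits
  local pid  = tonumber(packed % 0x100000000ULL)  -- lower 32 bits
  return tgid, pid
end

-- Illustrative value only: tgid 1234 in the upper half, pid 1240 in the lower
local packed = 1234ULL * 0x100000000ULL + 1240ULL
local tgid, pid = unpack_pid_tgid(packed)
print(tgid, pid)
```

Going by the same list, a value from `uid_gid()` would unpack the same way, with `gid` in the upper half and `uid` in the lower half.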