1# The afl-fuzz approach 2 3AFL++ is a brute-force fuzzer coupled with an exceedingly simple but rock-solid 4instrumentation-guided genetic algorithm. It uses a modified form of edge 5coverage to effortlessly pick up subtle, local-scale changes to program control 6flow. 7 8Note: If you are interested in a more current up-to-date deep dive how AFL++ 9works then we commend this blog post: 10[https://blog.ritsec.club/posts/afl-under-hood/](https://blog.ritsec.club/posts/afl-under-hood/) 11 12Simplifying a bit, the overall algorithm can be summed up as: 13 141) Load user-supplied initial test cases into the queue. 15 162) Take the next input file from the queue. 17 183) Attempt to trim the test case to the smallest size that doesn't alter the 19 measured behavior of the program. 20 214) Repeatedly mutate the file using a balanced and well-researched variety of 22 traditional fuzzing strategies. 23 245) If any of the generated mutations resulted in a new state transition recorded 25 by the instrumentation, add mutated output as a new entry in the queue. 26 276) Go to 2. 28 29The discovered test cases are also periodically culled to eliminate ones that 30have been obsoleted by newer, higher-coverage finds; and undergo several other 31instrumentation-driven effort minimization steps. 32 33As a side result of the fuzzing process, the tool creates a small, 34self-contained corpus of interesting test cases. These are extremely useful for 35seeding other, labor- or resource-intensive testing regimes - for example, for 36stress-testing browsers, office applications, graphics suites, or closed-source 37tools. 38 39The fuzzer is thoroughly tested to deliver out-of-the-box performance far 40superior to blind fuzzing or coverage-only tools. 41 42## Understanding the status screen 43 44This section provides an overview of the status screen - plus tips for 45troubleshooting any warnings and red text shown in the UI. 46 47For the general instruction manual, see [README.md](README.md). 48 49### A note about colors 50 51The status screen and error messages use colors to keep things readable and 52attract your attention to the most important details. For example, red almost 53always means "consult this doc" :-) 54 55Unfortunately, the UI will only render correctly if your terminal is using 56traditional un*x palette (white text on black background) or something close to 57that. 58 59If you are using inverse video, you may want to change your settings, say: 60 61- For GNOME Terminal, go to `Edit > Profile` preferences, select the "colors" 62 tab, and from the list of built-in schemes, choose "white on black". 63- For the MacOS X Terminal app, open a new window using the "Pro" scheme via the 64 `Shell > New Window` menu (or make "Pro" your default). 65 66Alternatively, if you really like your current colors, you can edit config.h to 67comment out USE_COLORS, then do `make clean all`. 68 69We are not aware of any other simple way to make this work without causing other 70side effects - sorry about that. 71 72With that out of the way, let's talk about what's actually on the screen... 73 74### The status bar 75 76``` 77american fuzzy lop ++3.01a (default) [fast] {0} 78``` 79 80The top line shows you which mode afl-fuzz is running in (normal: "american 81fuzzy lop", crash exploration mode: "peruvian rabbit mode") and the version of 82AFL++. Next to the version is the banner, which, if not set with -T by hand, 83will either show the binary name being fuzzed, or the -M/-S main/secondary name 84for parallel fuzzing. Second to last is the power schedule mode being run 85(default: fast). Finally, the last item is the CPU id. 86 87### Process timing 88 89``` 90 +----------------------------------------------------+ 91 | run time : 0 days, 8 hrs, 32 min, 43 sec | 92 | last new find : 0 days, 0 hrs, 6 min, 40 sec | 93 | last uniq crash : none seen yet | 94 | last uniq hang : 0 days, 1 hrs, 24 min, 32 sec | 95 +----------------------------------------------------+ 96``` 97 98This section is fairly self-explanatory: it tells you how long the fuzzer has 99been running and how much time has elapsed since its most recent finds. This is 100broken down into "paths" (a shorthand for test cases that trigger new execution 101patterns), crashes, and hangs. 102 103When it comes to timing: there is no hard rule, but most fuzzing jobs should be 104expected to run for days or weeks; in fact, for a moderately complex project, 105the first pass will probably take a day or so. Every now and then, some jobs 106will be allowed to run for months. 107 108There's one important thing to watch out for: if the tool is not finding new 109paths within several minutes of starting, you're probably not invoking the 110target binary correctly and it never gets to parse the input files that are 111thrown at it; other possible explanations are that the default memory limit 112(`-m`) is too restrictive and the program exits after failing to allocate a 113buffer very early on; or that the input files are patently invalid and always 114fail a basic header check. 115 116If there are no new paths showing up for a while, you will eventually see a big 117red warning in this section, too :-) 118 119### Overall results 120 121``` 122 +-----------------------+ 123 | cycles done : 0 | 124 | total paths : 2095 | 125 | uniq crashes : 0 | 126 | uniq hangs : 19 | 127 +-----------------------+ 128``` 129 130The first field in this section gives you the count of queue passes done so far 131- that is, the number of times the fuzzer went over all the interesting test 132 cases discovered so far, fuzzed them, and looped back to the very beginning. 133 Every fuzzing session should be allowed to complete at least one cycle; and 134 ideally, should run much longer than that. 135 136As noted earlier, the first pass can take a day or longer, so sit back and 137relax. 138 139To help make the call on when to hit `Ctrl-C`, the cycle counter is color-coded. 140It is shown in magenta during the first pass, progresses to yellow if new finds 141are still being made in subsequent rounds, then blue when that ends - and 142finally, turns green after the fuzzer hasn't been seeing any action for a longer 143while. 144 145The remaining fields in this part of the screen should be pretty obvious: 146there's the number of test cases ("paths") discovered so far, and the number of 147unique faults. The test cases, crashes, and hangs can be explored in real-time 148by browsing the output directory, see 149[#interpreting-output](#interpreting-output). 150 151### Cycle progress 152 153``` 154 +-------------------------------------+ 155 | now processing : 1296 (61.86%) | 156 | paths timed out : 0 (0.00%) | 157 +-------------------------------------+ 158``` 159 160This box tells you how far along the fuzzer is with the current queue cycle: it 161shows the ID of the test case it is currently working on, plus the number of 162inputs it decided to ditch because they were persistently timing out. 163 164The "*" suffix sometimes shown in the first line means that the currently 165processed path is not "favored" (a property discussed later on). 166 167### Map coverage 168 169``` 170 +--------------------------------------+ 171 | map density : 10.15% / 29.07% | 172 | count coverage : 4.03 bits/tuple | 173 +--------------------------------------+ 174``` 175 176The section provides some trivia about the coverage observed by the 177instrumentation embedded in the target binary. 178 179The first line in the box tells you how many branch tuples already were hit, in 180proportion to how much the bitmap can hold. The number on the left describes the 181current input; the one on the right is the value for the entire input corpus. 182 183Be wary of extremes: 184 185- Absolute numbers below 200 or so suggest one of three things: that the program 186 is extremely simple; that it is not instrumented properly (e.g., due to being 187 linked against a non-instrumented copy of the target library); or that it is 188 bailing out prematurely on your input test cases. The fuzzer will try to mark 189 this in pink, just to make you aware. 190- Percentages over 70% may very rarely happen with very complex programs that 191 make heavy use of template-generated code. Because high bitmap density makes 192 it harder for the fuzzer to reliably discern new program states, we recommend 193 recompiling the binary with `AFL_INST_RATIO=10` or so and trying again (see 194 [env_variables.md](env_variables.md)). The fuzzer will flag high percentages 195 in red. Chances are, you will never see that unless you're fuzzing extremely 196 hairy software (say, v8, perl, ffmpeg). 197 198The other line deals with the variability in tuple hit counts seen in the 199binary. In essence, if every taken branch is always taken a fixed number of 200times for all the inputs that were tried, this will read `1.00`. As we manage to 201trigger other hit counts for every branch, the needle will start to move toward 202`8.00` (every bit in the 8-bit map hit), but will probably never reach that 203extreme. 204 205Together, the values can be useful for comparing the coverage of several 206different fuzzing jobs that rely on the same instrumented binary. 207 208### Stage progress 209 210``` 211 +-------------------------------------+ 212 | now trying : interest 32/8 | 213 | stage execs : 3996/34.4k (11.62%) | 214 | total execs : 27.4M | 215 | exec speed : 891.7/sec | 216 +-------------------------------------+ 217``` 218 219This part gives you an in-depth peek at what the fuzzer is actually doing right 220now. It tells you about the current stage, which can be any of: 221 222- calibration - a pre-fuzzing stage where the execution path is examined to 223 detect anomalies, establish baseline execution speed, and so on. Executed very 224 briefly whenever a new find is being made. 225- trim L/S - another pre-fuzzing stage where the test case is trimmed to the 226 shortest form that still produces the same execution path. The length (L) and 227 stepover (S) are chosen in general relationship to file size. 228- bitflip L/S - deterministic bit flips. There are L bits toggled at any given 229 time, walking the input file with S-bit increments. The current L/S variants 230 are: `1/1`, `2/1`, `4/1`, `8/8`, `16/8`, `32/8`. 231- arith L/8 - deterministic arithmetics. The fuzzer tries to subtract or add 232 small integers to 8-, 16-, and 32-bit values. The stepover is always 8 bits. 233- interest L/8 - deterministic value overwrite. The fuzzer has a list of known 234 "interesting" 8-, 16-, and 32-bit values to try. The stepover is 8 bits. 235- extras - deterministic injection of dictionary terms. This can be shown as 236 "user" or "auto", depending on whether the fuzzer is using a user-supplied 237 dictionary (`-x`) or an auto-created one. You will also see "over" or 238 "insert", depending on whether the dictionary words overwrite existing data or 239 are inserted by offsetting the remaining data to accommodate their length. 240- havoc - a sort-of-fixed-length cycle with stacked random tweaks. The 241 operations attempted during this stage include bit flips, overwrites with 242 random and "interesting" integers, block deletion, block duplication, plus 243 assorted dictionary-related operations (if a dictionary is supplied in the 244 first place). 245- splice - a last-resort strategy that kicks in after the first full queue cycle 246 with no new paths. It is equivalent to 'havoc', except that it first splices 247 together two random inputs from the queue at some arbitrarily selected 248 midpoint. 249- sync - a stage used only when `-M` or `-S` is set (see 250 [fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores)). 251 No real fuzzing is involved, but the tool scans the output from other fuzzers 252 and imports test cases as necessary. The first time this is done, it may take 253 several minutes or so. 254 255The remaining fields should be fairly self-evident: there's the exec count 256progress indicator for the current stage, a global exec counter, and a benchmark 257for the current program execution speed. This may fluctuate from one test case 258to another, but the benchmark should be ideally over 500 execs/sec most of the 259time - and if it stays below 100, the job will probably take very long. 260 261The fuzzer will explicitly warn you about slow targets, too. If this happens, 262see the [best_practices.md#improving-speed](best_practices.md#improving-speed) 263for ideas on how to speed things up. 264 265### Findings in depth 266 267``` 268 +--------------------------------------+ 269 | favored paths : 879 (41.96%) | 270 | new edges on : 423 (20.19%) | 271 | total crashes : 0 (0 unique) | 272 | total tmouts : 24 (19 unique) | 273 +--------------------------------------+ 274``` 275 276This gives you several metrics that are of interest mostly to complete nerds. 277The section includes the number of paths that the fuzzer likes the most based on 278a minimization algorithm baked into the code (these will get considerably more 279air time), and the number of test cases that actually resulted in better edge 280coverage (versus just pushing the branch hit counters up). There are also 281additional, more detailed counters for crashes and timeouts. 282 283Note that the timeout counter is somewhat different from the hang counter; this 284one includes all test cases that exceeded the timeout, even if they did not 285exceed it by a margin sufficient to be classified as hangs. 286 287### Fuzzing strategy yields 288 289``` 290 +-----------------------------------------------------+ 291 | bit flips : 57/289k, 18/289k, 18/288k | 292 | byte flips : 0/36.2k, 4/35.7k, 7/34.6k | 293 | arithmetics : 53/2.54M, 0/537k, 0/55.2k | 294 | known ints : 8/322k, 12/1.32M, 10/1.70M | 295 | dictionary : 9/52k, 1/53k, 1/24k | 296 |havoc/splice : 1903/20.0M, 0/0 | 297 |py/custom/rq : unused, 53/2.54M, unused | 298 | trim/eff : 20.31%/9201, 17.05% | 299 +-----------------------------------------------------+ 300``` 301 302This is just another nerd-targeted section keeping track of how many paths were 303netted, in proportion to the number of execs attempted, for each of the fuzzing 304strategies discussed earlier on. This serves to convincingly validate 305assumptions about the usefulness of the various approaches taken by afl-fuzz. 306 307The trim strategy stats in this section are a bit different than the rest. The 308first number in this line shows the ratio of bytes removed from the input files; 309the second one corresponds to the number of execs needed to achieve this goal. 310Finally, the third number shows the proportion of bytes that, although not 311possible to remove, were deemed to have no effect and were excluded from some of 312the more expensive deterministic fuzzing steps. 313 314Note that when deterministic mutation mode is off (which is the default because 315it is not very efficient) the first five lines display "disabled (default, 316enable with -D)". 317 318Only what is activated will have counter shown. 319 320### Path geometry 321 322``` 323 +---------------------+ 324 | levels : 5 | 325 | pending : 1570 | 326 | pend fav : 583 | 327 | own finds : 0 | 328 | imported : 0 | 329 | stability : 100.00% | 330 +---------------------+ 331``` 332 333The first field in this section tracks the path depth reached through the guided 334fuzzing process. In essence: the initial test cases supplied by the user are 335considered "level 1". The test cases that can be derived from that through 336traditional fuzzing are considered "level 2"; the ones derived by using these as 337inputs to subsequent fuzzing rounds are "level 3"; and so forth. The maximum 338depth is therefore a rough proxy for how much value you're getting out of the 339instrumentation-guided approach taken by afl-fuzz. 340 341The next field shows you the number of inputs that have not gone through any 342fuzzing yet. The same stat is also given for "favored" entries that the fuzzer 343really wants to get to in this queue cycle (the non-favored entries may have to 344wait a couple of cycles to get their chance). 345 346Next is the number of new paths found during this fuzzing section and imported 347from other fuzzer instances when doing parallelized fuzzing; and the extent to 348which identical inputs appear to sometimes produce variable behavior in the 349tested binary. 350 351That last bit is actually fairly interesting: it measures the consistency of 352observed traces. If a program always behaves the same for the same input data, 353it will earn a score of 100%. When the value is lower but still shown in purple, 354the fuzzing process is unlikely to be negatively affected. If it goes into red, 355you may be in trouble, since AFL++ will have difficulty discerning between 356meaningful and "phantom" effects of tweaking the input file. 357 358Now, most targets will just get a 100% score, but when you see lower figures, 359there are several things to look at: 360 361- The use of uninitialized memory in conjunction with some intrinsic sources of 362 entropy in the tested binary. Harmless to AFL, but could be indicative of a 363 security bug. 364- Attempts to manipulate persistent resources, such as left over temporary files 365 or shared memory objects. This is usually harmless, but you may want to 366 double-check to make sure the program isn't bailing out prematurely. Running 367 out of disk space, SHM handles, or other global resources can trigger this, 368 too. 369- Hitting some functionality that is actually designed to behave randomly. 370 Generally harmless. For example, when fuzzing sqlite, an input like `select 371 random();` will trigger a variable execution path. 372- Multiple threads executing at once in semi-random order. This is harmless when 373 the 'stability' metric stays over 90% or so, but can become an issue if not. 374 Here's what to try: 375 * Use afl-clang-fast from [instrumentation](../instrumentation/) - it uses a 376 thread-local tracking model that is less prone to concurrency issues, 377 * See if the target can be compiled or run without threads. Common 378 `./configure` options include `--without-threads`, `--disable-pthreads`, or 379 `--disable-openmp`. 380 * Replace pthreads with GNU Pth (https://www.gnu.org/software/pth/), which 381 allows you to use a deterministic scheduler. 382- In persistent mode, minor drops in the "stability" metric can be normal, 383 because not all the code behaves identically when re-entered; but major dips 384 may signify that the code within `__AFL_LOOP()` is not behaving correctly on 385 subsequent iterations (e.g., due to incomplete clean-up or reinitialization of 386 the state) and that most of the fuzzing effort goes to waste. 387 388The paths where variable behavior is detected are marked with a matching entry 389in the `<out_dir>/queue/.state/variable_behavior/` directory, so you can look 390them up easily. 391 392### CPU load 393 394``` 395 [cpu: 25%] 396``` 397 398This tiny widget shows the apparent CPU utilization on the local system. It is 399calculated by taking the number of processes in the "runnable" state, and then 400comparing it to the number of logical cores on the system. 401 402If the value is shown in green, you are using fewer CPU cores than available on 403your system and can probably parallelize to improve performance; for tips on how 404to do that, see 405[fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores). 406 407If the value is shown in red, your CPU is *possibly* oversubscribed, and running 408additional fuzzers may not give you any benefits. 409 410Of course, this benchmark is very simplistic; it tells you how many processes 411are ready to run, but not how resource-hungry they may be. It also doesn't 412distinguish between physical cores, logical cores, and virtualized CPUs; the 413performance characteristics of each of these will differ quite a bit. 414 415If you want a more accurate measurement, you can run the `afl-gotcpu` utility 416from the command line. 417 418## Interpreting output 419 420See [#understanding-the-status-screen](#understanding-the-status-screen) for 421information on how to interpret the displayed stats and monitor the health of 422the process. Be sure to consult this file especially if any UI elements are 423highlighted in red. 424 425The fuzzing process will continue until you press Ctrl-C. At a minimum, you want 426to allow the fuzzer to at least one queue cycle without any new finds, which may 427take anywhere from a couple of hours to a week or so. 428 429There are three subdirectories created within the output directory and updated 430in real-time: 431 432- queue/ - test cases for every distinctive execution path, plus all the 433 starting files given by the user. This is the synthesized corpus. 434 435 Before using this corpus for any other purposes, you can shrink 436 it to a smaller size using the afl-cmin tool. The tool will find 437 a smaller subset of files offering equivalent edge coverage. 438 439- crashes/ - unique test cases that cause the tested program to receive a fatal 440 signal (e.g., SIGSEGV, SIGILL, SIGABRT). The entries are grouped by 441 the received signal. 442 443- hangs/ - unique test cases that cause the tested program to time out. The 444 default time limit before something is classified as a hang is the 445 larger of 1 second and the value of the -t parameter. The value can 446 be fine-tuned by setting AFL_HANG_TMOUT, but this is rarely 447 necessary. 448 449Crashes and hangs are considered "unique" if the associated execution paths 450involve any state transitions not seen in previously-recorded faults. If a 451single bug can be reached in multiple ways, there will be some count inflation 452early in the process, but this should quickly taper off. 453 454The file names for crashes and hangs are correlated with the parent, 455non-faulting queue entries. This should help with debugging. 456 457## Visualizing 458 459If you have gnuplot installed, you can also generate some pretty graphs for any 460active fuzzing task using afl-plot. For an example of how this looks like, see 461[https://lcamtuf.coredump.cx/afl/plot/](https://lcamtuf.coredump.cx/afl/plot/). 462 463You can also manually build and install afl-plot-ui, which is a helper utility 464for showing the graphs generated by afl-plot in a graphical window using GTK. 465You can build and install it as follows: 466 467```shell 468sudo apt install libgtk-3-0 libgtk-3-dev pkg-config 469cd utils/plot_ui 470make 471cd ../../ 472sudo make install 473``` 474 475To learn more about remote monitoring and metrics visualization with StatsD, see 476[rpc_statsd.md](rpc_statsd.md). 477 478### Addendum: status and plot files 479 480For unattended operation, some of the key status screen information can be also 481found in a machine-readable format in the fuzzer_stats file in the output 482directory. This includes: 483 484- `start_time` - unix time indicating the start time of afl-fuzz 485- `last_update` - unix time corresponding to the last update of this file 486- `run_time` - run time in seconds to the last update of this file 487- `fuzzer_pid` - PID of the fuzzer process 488- `cycles_done` - queue cycles completed so far 489- `cycles_wo_finds` - number of cycles without any new paths found 490- `time_wo_finds` - longest time in seconds no new path was found 491- `execs_done` - number of execve() calls attempted 492- `execs_per_sec` - overall number of execs per second 493- `corpus_count` - total number of entries in the queue 494- `corpus_favored` - number of queue entries that are favored 495- `corpus_found` - number of entries discovered through local fuzzing 496- `corpus_imported` - number of entries imported from other instances 497- `max_depth` - number of levels in the generated data set 498- `cur_item` - currently processed entry number 499- `pending_favs` - number of favored entries still waiting to be fuzzed 500- `pending_total` - number of all entries waiting to be fuzzed 501- `corpus_variable` - number of test cases showing variable behavior 502- `stability` - percentage of bitmap bytes that behave consistently 503- `bitmap_cvg` - percentage of edge coverage found in the map so far 504- `saved_crashes` - number of unique crashes recorded 505- `saved_hangs` - number of unique hangs encountered 506- `last_find` - seconds since the last find was found 507- `last_crash` - seconds since the last crash was found 508- `last_hang` - seconds since the last hang was found 509- `execs_since_crash` - execs since the last crash was found 510- `exec_timeout` - the -t command line value 511- `slowest_exec_ms` - real time of the slowest execution in ms 512- `peak_rss_mb` - max rss usage reached during fuzzing in MB 513- `edges_found` - how many edges have been found 514- `var_byte_count` - how many edges are non-deterministic 515- `afl_banner` - banner text (e.g., the target name) 516- `afl_version` - the version of AFL++ used 517- `target_mode` - default, persistent, qemu, unicorn, non-instrumented 518- `command_line` - full command line used for the fuzzing session 519 520Most of these map directly to the UI elements discussed earlier on. 521 522On top of that, you can also find an entry called `plot_data`, containing a 523plottable history for most of these fields. If you have gnuplot installed, you 524can turn this into a nice progress report with the included `afl-plot` tool. 525 526### Addendum: automatically sending metrics with StatsD 527 528In a CI environment or when running multiple fuzzers, it can be tedious to log 529into each of them or deploy scripts to read the fuzzer statistics. Using 530`AFL_STATSD` (and the other related environment variables `AFL_STATSD_HOST`, 531`AFL_STATSD_PORT`, `AFL_STATSD_TAGS_FLAVOR`) you can automatically send metrics 532to your favorite StatsD server. Depending on your StatsD server, you will be 533able to monitor, trigger alerts, or perform actions based on these metrics 534(e.g.: alert on slow exec/s for a new build, threshold of crashes, time since 535last crash > X, etc.). 536 537The selected metrics are a subset of all the metrics found in the status and in 538the plot file. The list is the following: `cycle_done`, `cycles_wo_finds`, 539`execs_done`,`execs_per_sec`, `corpus_count`, `corpus_favored`, `corpus_found`, 540`corpus_imported`, `max_depth`, `cur_item`, `pending_favs`, `pending_total`, 541`corpus_variable`, `saved_crashes`, `saved_hangs`, `total_crashes`, 542`slowest_exec_ms`, `edges_found`, `var_byte_count`, `havoc_expansion`. Their 543definitions can be found in the addendum above. 544 545When using multiple fuzzer instances with StatsD, it is *strongly* recommended 546to setup the flavor (`AFL_STATSD_TAGS_FLAVOR`) to match your StatsD server. This 547will allow you to see individual fuzzer performance, detect bad ones, see the 548progress of each strategy...