# Custom Mutators in AFL++

This file describes how you can implement custom mutations to be used in AFL++.
For now, we support C/C++ libraries and Python modules, collectively referred to
as custom mutators.

There is also experimental support for Rust in `custom_mutators/rust`. For
documentation, refer to that directory. Run `cargo doc -p custom_mutator --open`
in that directory to view the documentation in your web browser.

Implemented by
- C/C++ library (`*.so`): Khaled Yakdan from Code Intelligence
- Python module: Christian Holler from Mozilla

## 1) Introduction

Custom mutators can be passed to `afl-fuzz` to perform custom mutations on test
cases beyond those available in AFL++. For example, they can enable
structure-aware fuzzing by using libraries that perform mutations according to a
given grammar.

The custom mutator is passed to `afl-fuzz` via the `AFL_CUSTOM_MUTATOR_LIBRARY`
or `AFL_PYTHON_MODULE` environment variable, and must export a fuzz function.
AFL++ also supports multiple custom mutators, which can be specified in the
same `AFL_CUSTOM_MUTATOR_LIBRARY` environment variable like this:

```bash
export AFL_CUSTOM_MUTATOR_LIBRARY="full/path/to/mutator_first.so;full/path/to/mutator_second.so"
```

For details, see [APIs](#2-apis) and [Usage](#3-usage).

The custom mutation stage is set to be the first non-deterministic stage (right
before the havoc stage).

Note: If `AFL_CUSTOM_MUTATOR_ONLY` is set, all mutations will solely be
performed with the custom mutator.

## 2) APIs

**IMPORTANT NOTE**: If you use our C/C++ API and you want to increase the size
of an `**out_buf` buffer, you have to use `afl_realloc()` for this, so include
`include/alloc-inl.h` - otherwise afl-fuzz will crash when trying to free
your buffers.


C/C++:

```c
void *afl_custom_init(afl_state_t *afl, unsigned int seed);
unsigned int afl_custom_fuzz_count(void *data, const unsigned char *buf, size_t buf_size);
void afl_custom_splice_optout(void *data);
size_t afl_custom_fuzz(void *data, unsigned char *buf, size_t buf_size, unsigned char **out_buf, unsigned char *add_buf, size_t add_buf_size, size_t max_size);
const char *afl_custom_describe(void *data, size_t max_description_len);
size_t afl_custom_post_process(void *data, unsigned char *buf, size_t buf_size, unsigned char **out_buf);
int afl_custom_init_trim(void *data, unsigned char *buf, size_t buf_size);
size_t afl_custom_trim(void *data, unsigned char **out_buf);
int afl_custom_post_trim(void *data, unsigned char success);
size_t afl_custom_havoc_mutation(void *data, unsigned char *buf, size_t buf_size, unsigned char **out_buf, size_t max_size);
unsigned char afl_custom_havoc_mutation_probability(void *data);
unsigned char afl_custom_queue_get(void *data, const unsigned char *filename);
void afl_custom_fuzz_send(void *data, const unsigned char *buf, size_t buf_size);
unsigned char afl_custom_queue_new_entry(void *data, const unsigned char *filename_new_queue, const unsigned char *filename_orig_queue);
const char *afl_custom_introspection(void *data);
void afl_custom_deinit(void *data);
```

Python:

```python
def init(seed):
    pass

def fuzz_count(buf):
    return cnt

def splice_optout():
    pass

def fuzz(buf, add_buf, max_size):
    return mutated_out

def describe(max_description_length):
    return "description_of_current_mutation"

def post_process(buf):
    return out_buf

def init_trim(buf):
    return cnt

def trim():
    return out_buf

def post_trim(success):
    return next_index

def havoc_mutation(buf, max_size):
    return mutated_out

def havoc_mutation_probability():
    return probability # int in [0, 100]

def queue_get(filename):
    return True

def fuzz_send(buf):
    pass

def queue_new_entry(filename_new_queue, filename_orig_queue):
    return False

def introspection():
    return string

def deinit():  # optional for Python
    pass
```

### Custom Mutation

- `init` (optional in Python):

  This method is called when AFL++ starts up and is used to seed the RNG and
  set up buffers and state.

- `queue_get` (optional):

  This method determines whether AFL++ should fuzz the current queue entry at
  all. The decision applies to all defined custom mutators as well as to
  AFL++'s own mutators.

- `fuzz_count` (optional):

  When a queue entry is selected to be fuzzed, afl-fuzz selects the number of
  fuzzing attempts with this input based on a few factors. If the custom
  mutator instead wants to decide itself how often it is called for a specific
  queue entry, implement this function. This function is most useful if
  `AFL_CUSTOM_MUTATOR_ONLY` is **not** used.

- `splice_optout` (optional):

  If this function is present, no splicing target is passed to the `fuzz`
  function. This saves time if splicing data is not needed by the custom
  fuzzing function.
  This function is never called; it only needs to be present to take effect.

- `fuzz` (optional):

  This method performs your custom mutations on a given input.
  The `add_buf` is the contents of another queue item that can be used for
  splicing - or anything else - and can also be ignored. If you are not
  using this additional data, then define `splice_optout` (see above).
  Returning a length of 0 is valid and is interpreted as skipping this
  one mutation result.
  For non-Python: the returned output buffer is under **your** memory
  management!

- `describe` (optional):

  When this function is called, it shall describe the current test case,
  generated by the last mutation.
  This will be called, for example, to name
  the written test case file after a crash occurred. Using it can help to
  reproduce crashing mutations.

- `havoc_mutation` and `havoc_mutation_probability` (optional):

  `havoc_mutation` performs a single custom mutation on a given input. This
  mutation is stacked with other mutations in havoc. The other method,
  `havoc_mutation_probability`, returns the probability that `havoc_mutation`
  is called in havoc. By default, it is 6%.

- `post_process` (optional):

  In some cases, the format of the mutated data returned from the custom
  mutator is not suitable for directly executing the target. For example, when
  using libprotobuf-mutator, the data returned is in a protobuf format which
  corresponds to a given grammar. In order to execute the target, the protobuf
  data must be converted to the plain-text format expected by the target. In
  such scenarios, the user can define the `post_process` function. This
  function then transforms the data into the format expected by the API before
  the target is executed.

  In Python, this can return any object that implements the buffer protocol
  and supports PyBUF_SIMPLE. These include bytes, bytearray, etc.

  You can decide in the `post_process` function not to send the mutated data
  to the target, e.g., if it is too short, too corrupted, etc. If so,
  return a NULL buffer and zero length (or a 0 length string in Python).

  NOTE: Do not make any random changes to the data in this function!

  PERFORMANCE for C/C++: If possible, make the changes in-place (i.e., modify
  the contents of `buf` directly and return it as `*out_buf = buf`).

- `fuzz_send` (optional):

  This method can be used if you want to send data to the target yourself,
  e.g., via IPC. This replaces some usage of utils/afl_proxy but requires
  that you start the target with afl-fuzz.
  Example: [custom_mutators/examples/custom_send.c](../custom_mutators/examples/custom_send.c)

- `queue_new_entry` (optional):

  This method is called after adding a new test case to the queue. If the
  contents of the file were changed, return True, otherwise False.

- `introspection` (optional):

  This method is called after a new queue entry, crash, or timeout is
  discovered if compiled with INTROSPECTION. The custom mutator can then
  return a string (const char *) that reports the exact mutations used.

- `deinit` (optional in Python):

  The last method to be called, deinitializing the state.

Note that there are also three functions for trimming as described in the next
section.

### Trimming Support

The generic trimming routines implemented in AFL++ can easily destroy the
structure of complex formats, possibly leading to a point where you have a lot
of test cases in the queue that your Python module cannot process anymore but
your target application still accepts. This is especially the case when your
target can process a part of the input (causing coverage) and then errors out on
the remaining input.

In such cases, it makes sense to implement a custom trimming routine. The API
consists of multiple methods because after each trimming step, we have to go
back into the C code to check if the coverage bitmap is still the same for the
trimmed input. Here's a quick API description:

- `init_trim` (optional):

  This method is called at the start of each trimming operation and receives
  the initial buffer. It should return the number of iteration steps possible
  on this input (e.g., if your input has n elements and you want to remove
  them one by one, return n; if you do a binary search, return log(n); and so
  on).

  If your trimming algorithm does not allow you to determine the number of
  (remaining) steps easily (esp.
  while running), then you can alternatively
  return 1 here and always return 0 in `post_trim` until you are finished and
  no steps remain. In that case, returning 1 in `post_trim` will end the
  trimming routine. The current index and the maximum number of iterations
  are only used to show progress.

- `trim` (optional):

  This method is called for each trimming operation. It doesn't have any
  arguments because there is already the initial buffer from `init_trim` and
  we can memorize the current state in the data variables. This can also save
  reparsing steps for each iteration. It should return the trimmed input
  buffer.

- `post_trim` (optional):

  This method is called after each trim operation to inform you if your
  trimming step was successful or not (in terms of coverage). If you receive a
  failure here, you should reset your input to the last known good state. In
  any case, this method must return the next trim iteration index (from 0 to
  the maximum number of steps you returned in `init_trim`).

Omitting any of the three trimming methods will cause custom trimming to be
disabled and trigger a fallback to the built-in default trimming routine.

### Environment Variables

Optionally, the following environment variables are supported:

- `AFL_CUSTOM_MUTATOR_ONLY`

  Disable all other mutation stages. This can prevent broken test cases (those
  that your Python module can't work with anymore) from filling up your queue.
  Best combined with a custom trimming routine (see above) because trimming can
  cause the same test breakage as havoc and splice.

- `AFL_PYTHON_ONLY`

  Deprecated and removed, use `AFL_CUSTOM_MUTATOR_ONLY` instead.

- `AFL_DEBUG`

  When combined with `AFL_NO_UI`, this causes the C trimming code to emit
  additional messages about the performance and actions of your custom
  trimmer.
  Use this to see if it works :)

## 3) Usage

### Prerequisite

For Python mutators, the Python 3 or 2 development package is required. On
Debian/Ubuntu/Kali it can be installed like this:

```bash
sudo apt install python3-dev
# or
sudo apt install python-dev
```

Then, AFL++ can be compiled with Python support. The AFL++ Makefile detects
Python3 through `python-config`/`python3-config` if it is in the PATH and
compiles `afl-fuzz` with the feature if available.

Note: for some distributions, you might also need the package `python[3]-apt`.
In case your setup is different, set the necessary variables like this:
`PYTHON_INCLUDE=/path/to/python/include LDFLAGS=-L/path/to/python/lib make`.

### Helpers

For C/C++ custom mutators, `afl_custom_init()` receives a pointer to
`afl_state_t *afl`, which contains all the information that you need.
Note that if you access it, you need to recompile your custom mutator whenever
you update AFL++ because the structure might have changed!

For mutators written in Python, Rust, Go, etc., a few environment variables
are set to help you get started:

`AFL_CUSTOM_INFO_PROGRAM` - the program name of the target that is executed.
If your custom mutator is used with modes like QEMU (`-Q`), this will still
contain the target program, not afl-qemu-trace.

`AFL_CUSTOM_INFO_PROGRAM_INPUT` - if the `-f` parameter is used with afl-fuzz,
then its value is found in this environment variable.

`AFL_CUSTOM_INFO_PROGRAM_ARGV` - this contains the parameters given to the
target program and still has the `@@` identifier in there.

Note: If `AFL_CUSTOM_INFO_PROGRAM_INPUT` is empty and `AFL_CUSTOM_INFO_PROGRAM_ARGV`
is either empty or does not contain `@@`, then the target gets the input via
`stdin`.

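
The rule in the note above can be sketched in Python as follows. This is only an illustration: the function name `target_input_mode` and its return values are hypothetical and not part of AFL++; only the environment variable names are taken from this document.

```python
import os


def target_input_mode(env=None):
    """Guess how the target receives its input.

    Hypothetical helper (not an AFL++ API). Returns "file", "argv" or "stdin".
    """
    if env is None:
        env = os.environ
    program_input = env.get("AFL_CUSTOM_INFO_PROGRAM_INPUT", "")
    program_argv = env.get("AFL_CUSTOM_INFO_PROGRAM_ARGV", "")
    if program_input:
        # afl-fuzz was started with -f, so the input is written to that file.
        return "file"
    if "@@" in program_argv:
        # The @@ placeholder is still present in the target's arguments.
        return "argv"
    # Neither -f nor @@ was used: the target reads from stdin.
    return "stdin"
```

A mutator could call `target_input_mode()` once in `init` and keep the result in its state.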

`AFL_CUSTOM_INFO_OUT` - This is the output directory for this fuzzer instance,
so if `afl-fuzz` was called with `-o out -S foobar`, then this will be set to
`out/foobar`.

### Custom Mutator Preparation

For C/C++ mutators, the source code must be compiled as a shared object:

```bash
gcc -shared -Wall -O3 example.c -o example.so
```

Note that if you specify multiple custom mutators, the corresponding functions
will be called in the order in which they are specified. E.g., the
`post_process` function of `example_first.so` will be called first, followed by
that of `example_second.so`.

### Run

C/C++

```bash
export AFL_CUSTOM_MUTATOR_LIBRARY="/full/path/to/example_first.so;/full/path/to/example_second.so"
afl-fuzz /path/to/program
```

Python

```bash
export PYTHONPATH=`dirname /full/path/to/example.py`
export AFL_PYTHON_MODULE=example
afl-fuzz /path/to/program
```

## 4) Example

See [example.c](../custom_mutators/examples/example.c) and
[example.py](../custom_mutators/examples/example.py).

## 5) Other Resources

- AFL libprotobuf mutator
  - [bruce30262/libprotobuf-mutator_fuzzing_learning](https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator)
  - [thebabush/afl-libprotobuf-mutator](https://github.com/thebabush/afl-libprotobuf-mutator)
- [XML Fuzzing@NullCon 2017](https://www.agarri.fr/docs/XML_Fuzzing-NullCon2017-PUBLIC.pdf)
  - [A bug detected by AFL + XML-aware mutators](https://bugs.chromium.org/p/chromium/issues/detail?id=930663)