*This document was originally written for a broad audience, and it was*
*determined that it'd be good to hold in Bionic's docs, too. Due to the*
*ever-changing nature of code, it tries to link to a stable tag of*
*Bionic's libc, rather than the live code in Bionic. Same for Clang.*
*Reader beware. :)*

# The Anatomy of Clang FORTIFY

## Objective

The intent of this document is to run through the minutiae of how Clang FORTIFY
actually works in Bionic at the time of writing. Other FORTIFY implementations
that target Clang should use very similar mechanics. This document exists in part
because many Clang-specific features serve multiple purposes simultaneously, so
getting up-to-speed on how things function can be quite difficult.

## Background

FORTIFY is a broad suite of extensions to libc aimed at catching misuses of
common library functions. Textually, these extensions exist purely in libc, but
all implementations of FORTIFY rely heavily on C language extensions in order
to function at all.

Broadly, FORTIFY implementations try to guard against many misuses of C
standard(-ish) libraries:
- Buffer overruns in functions where pointers+sizes are passed (e.g., `memcpy`,
  `poll`), or where sizes exist implicitly (e.g., `strcpy`).
- Arguments with incorrect values passed to libc functions (e.g.,
  out-of-bounds bits in `umask`).
- Missing arguments to functions (e.g., `open()` with `O_CREAT`, but no mode
  bits).

FORTIFY is traditionally enabled by passing `-D_FORTIFY_SOURCE=N` to your
compiler. `N==0` disables FORTIFY, whereas `N==1`, `N==2`, and `N==3` enable
increasingly strict versions of it. In general, FORTIFY doesn't require user
code changes; that said, some code patterns
are [incompatible with stricter versions of FORTIFY checking]. This is largely
because FORTIFY has significant flexibility in what it considers to be an
"out-of-bounds" access.

FORTIFY implementations use a mix of compiler diagnostics and runtime checks to
flag and/or mitigate the impacts of the misuses mentioned above.

Further, given FORTIFY's design, the effectiveness of FORTIFY is a function of
-- among other things -- the optimization level you're compiling your code at.
Many FORTIFY implementations are implicitly disabled when building with `-O0`,
since FORTIFY's design for both Clang and GCC relies on optimizations in order
to provide useful run-time checks. For the purpose of this document, all
analysis of FORTIFY functions and commentary on builtins assume that code is
being built with some optimization level > `-O0`.

### A note on GCC

This document talks specifically about Bionic's FORTIFY implementation targeted
at Clang. While GCC also provides a set of language extensions necessary to
implement FORTIFY, these tools are different from what Clang offers. This
divergence is an artifact of Clang and GCC's differing architecture as
compilers.

Textually, quite a bit can be shared between a FORTIFY implementation for GCC
and one for Clang (e.g., see [ChromeOS' Glibc patch]), but this kind of sharing
requires things like macros that expand to unbalanced braces depending on your
compiler:

```c
/*
 * Highly simplified; if you're interested in FORTIFY's actual implementation,
 * please see the patch linked above.
 */
#ifdef __clang__
# define FORTIFY_PRECONDITIONS
# define FORTIFY_FUNCTION_END
#else
# define FORTIFY_PRECONDITIONS {
# define FORTIFY_FUNCTION_END }
#endif

/*
 * FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN is not defined, due to its
 * complexity and irrelevance. It turns into a compile-time warning if the
 * compiler can determine `*buf` has fewer than `size` bytes available.
 */

char *getcwd(char *buf, size_t size)
FORTIFY_PRECONDITIONS
  FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN(buf, size, "`buf` is too smol.")
{
  // Actual shared function implementation goes here.
}
FORTIFY_FUNCTION_END
```

All talk of GCC-focused implementations and how to merge Clang and GCC
implementations is out-of-scope for this doc, however.

## The Life of a Clang FORTIFY Function

As referenced in the Background section, FORTIFY performs many different checks
for many functions. This section intends to go through real-world examples of
FORTIFY functions in Bionic, breaking down how each part of these functions
works, and how the pieces fit together to provide FORTIFY-like functionality.

While FORTIFY implementations may differ between stdlibs, they broadly follow
the same patterns when implementing their checks for Clang, and they try to
make similar promises with respect to FORTIFY compiling to be zero-overhead in
some cases, etc. Moreover, while this document specifically examines Bionic,
many stdlibs will operate _very similarly_ to Bionic in their Clang FORTIFY
implementations.

**In general, when reading the below, be prepared for exceptions, subtlety, and
corner cases. The individual function breakdowns below try to not offer
redundant information. Each one focuses on different aspects of FORTIFY.**

### Terminology

Because FORTIFY should be mostly transparent to developers, there are inherent
naming collisions here: `memcpy(x, y, z)` turns into fundamentally different
generated code depending on the value of `_FORTIFY_SOURCE`. Further, said
`memcpy` call with `_FORTIFY_SOURCE` enabled needs to be able to refer to the
`memcpy` that would have been called, had `_FORTIFY_SOURCE` been disabled.
Hence, the following convention is followed in the subsections below for all
prose (namely, multiline code blocks are exempted from this):

- Standard library function names preceded by `__builtin_` refer to the use of
  the function with `_FORTIFY_SOURCE` disabled.
- Standard library function names without a prefix refer to the use of the
  function with `_FORTIFY_SOURCE` enabled.

This convention also applies in `clang`. `__builtin_memcpy` will always call
`memcpy` as though `_FORTIFY_SOURCE` were disabled.

## Breakdown of `mempcpy`

The [FORTIFY'ed version of `mempcpy`] is a full, featureful example of a
FORTIFY'ed function from Bionic. From the user's perspective, it supports a few
things:
- Producing a compile-time error if the number of bytes to copy trivially
  exceeds the number of bytes available at the destination pointer.
- If the `mempcpy` has the potential to write to more bytes than what is
  available at the destination, a run-time check is inserted to crash the
  program if more bytes are written than what is allowed.
- Compiling away to be zero overhead when none of the buffer sizes can be
  determined at compile-time[^1].

The declaration in Bionic's headers for `__builtin_mempcpy` is:
```c
void* mempcpy(void* __dst, const void* __src, size_t __n) __INTRODUCED_IN(23);
```

Which is annotated with nothing special, so it will be ignored.

The [source for `mempcpy`] in Bionic's headers is:
```c
__BIONIC_FORTIFY_INLINE
void* mempcpy(void* const dst __pass_object_size0, const void* src, size_t copy_amount)
        __overloadable
        __clang_error_if(__bos_unevaluated_lt(__bos0(dst), copy_amount),
                         "'mempcpy' called with size bigger than buffer") {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __bos0(dst);
    if (!__bos_trivially_ge(bos_dst, copy_amount)) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

Expanding some of the important macros here, this function expands to roughly:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
          __builtin_object_size(dst, 0) != -1 && __builtin_object_size(dst, 0) < copy_amount,
          "'mempcpy' called with size bigger than buffer",
          "error"))) {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (!(__bos_trivially_ge(bos_dst, copy_amount))) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

So let's walk through this step by step, to see how FORTIFY does what it says on
the tin here.

[^1]: "Zero overhead" in a way [similar to C++11's `std::unique_ptr`]: this will
turn into a direct call to `__builtin_mempcpy` (or an optimized form thereof) with
no other surrounding checks at runtime. However, the additional complexity may
hinder optimizations that are performed before the optimizer can prove that the
`if (...) { ... }` can be optimized out.
Depending on how late this happens,
the additional complexity may skew inlining costs, hide opportunities for e.g.,
`memcpy` coalescing, etc.

### How does Clang select `mempcpy`?

First, it's critical to notice that `mempcpy` is marked `overloadable`. This
function is a `static inline __attribute__((always_inline))` overload of
`__builtin_mempcpy`:
- `__attribute__((overloadable))` allows us to perform overloading in C.
- `__attribute__((overloadable))` mangles all calls to functions marked with
  `__attribute__((overloadable))`.
- `__attribute__((overloadable))` allows exactly one function signature with a
  given name to not be marked with `__attribute__((overloadable))`. Calls to
  this overload will not be mangled.

Second, one might note that this `mempcpy` implementation has the same C-level
signature as `__builtin_mempcpy`. `pass_object_size` is a Clang attribute that
is generally needed by FORTIFY, but it carries the side-effect that functions
may be overloaded simply on the presence (or absence) of
`pass_object_size` attributes. Given two overloads of a function that only
differ on the presence of `pass_object_size` attributes, the candidate with
`pass_object_size` attributes is preferred.

Finally, the prior paragraph gets thrown out if one tries to take the address of
`mempcpy`. It is impossible to take the address of a function with one or more
parameters that are annotated with `pass_object_size`. Hence,
`&__builtin_mempcpy == &mempcpy`. Further, because this is an issue of overload
resolution, `(&mempcpy)(x, y, z);` is functionally identical to
`__builtin_mempcpy(x, y, z);`.

All of this accomplishes the following:
- Direct calls to `mempcpy` should call the FORTIFY-protected `mempcpy`.
- Indirect calls to `&mempcpy` should call `__builtin_mempcpy`.

### How does Clang offer compile-time diagnostics?

Once one is convinced that the FORTIFY-enabled overload of `mempcpy` will be
selected for direct calls, Clang's `diagnose_if` and `__builtin_object_size` do
all of the work from there.

Subtleties here primarily fall out of the discussion in the above section about
`&__builtin_mempcpy == &mempcpy`:
```c
#define _FORTIFY_SOURCE 2
#include <string.h>
void example_code() {
  char buf[4]; // ...Assume sizeof(char) == 1.
  const char input_buf[] = "Hello, World";
  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.

  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
  __builtin_mempcpy(buf, input_buf, 5); // No compile-time error.
  (&mempcpy)(buf, input_buf, 5); // No compile-time error, since __builtin_mempcpy is selected.
}
```

Otherwise, the rest of this subsection is dedicated to preliminary discussion
about `__builtin_object_size`.

Clang's frontend can do one of two things with `__builtin_object_size(p, n)`:
- Evaluate it as a constant.
  - This can either mean declaring that the number of bytes at `p` is definitely
    impossible to know, so the default value is used, or the number of bytes at
    `p` can be known without optimizations.
- Declare that the expression cannot form a constant, and lower it to
  `@llvm.objectsize`, which is discussed in depth later.

In the examples above, since `diagnose_if` is evaluated with context from the
caller, Clang should be able to trivially determine that `buf` refers to a
`char` array with 4 elements.

The primary consequence of the above is that diagnostics can only be emitted if
no optimizations are required to detect a broken code pattern.
To be specific,
Clang's constexpr evaluator must be able to determine the logical object that
any given pointer points to in order to fold `__builtin_object_size` to a
constant, non-default answer:

```c
#define _FORTIFY_SOURCE 2
#include <string.h>
void example_code() {
  char buf[4]; // ...Assume sizeof(char) == 1.
  const char input_buf[] = "Hello, World";
  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.
  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
  char *buf_ptr = buf;
  mempcpy(buf_ptr, input_buf, 5); // No compile-time error; `buf_ptr`'s target can't be determined.
}
```

### How does Clang insert run-time checks?

This section expands on the following statement: FORTIFY has zero runtime cost
in instances where there is no chance of catching a bug at run-time. Otherwise,
it introduces a tiny additional run-time cost to ensure that functions aren't
misused.

In prior sections, the following was established:
- `overloadable` and `pass_object_size` prompt Clang to always select this
  overload of `mempcpy` over `__builtin_mempcpy` for direct calls.
- If a call to `mempcpy` was trivially broken, Clang would produce a
  compile-time error, rather than producing a binary.

Hence, the case we're interested in here is one where Clang's frontend selected
a FORTIFY'ed function's implementation for a function call, but was unable to
find anything seriously wrong with said function call. Since the frontend is
powerless to detect bugs at this point, our focus shifts to the mechanisms that
LLVM uses to support FORTIFY.
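
Before moving on, the selection rules recapped above can be observed with a toy
function. This is only a sketch: `toy_size`, `direct_call`, `indirect_call`,
and `selection_demo` are names made up for this illustration, not anything from
Bionic. The Clang-only attributes are guarded so that other compilers still
build the file, in which case only the plain function exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "libc" function: it has no idea how big its argument is. */
size_t toy_size(const void *p) {
  (void)p;
  return (size_t)-1;
}

#if defined(__clang__)
/* FORTIFY-style overload: same C-level signature, plus pass_object_size. */
static __inline__ __attribute__((always_inline))
size_t toy_size(const void *const p __attribute__((pass_object_size(0))))
    __attribute__((overloadable)) {
  return __builtin_object_size(p, 0);
}
#endif

/* Direct call: Clang prefers the pass_object_size overload, so it sees buf. */
size_t direct_call(void) {
  char buf[8];
  return toy_size(buf);
}

/* Address-of: only the plain function can have its address taken. */
size_t indirect_call(void) {
  char buf[8];
  return (&toy_size)(buf);
}

void selection_demo(void) {
  /* The indirect call always reaches the plain, size-unaware function. */
  assert(indirect_call() == (size_t)-1);
}
```

With Clang, `direct_call()` is expected to report the 8 bytes of `buf`, while
`indirect_call()` reports "unknown" because it lands in the plain function.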

Going back to Bionic's `mempcpy` implementation, we have the following (ignoring
`diagnose_if` and assuming run-time checks are enabled):
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable)) {
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (bos_dst != -1 &&
        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

In other words, we have a `static`, `always_inline` function which:
- If `__builtin_object_size(dst, 0)` cannot be determined (in which case, it
  returns -1), calls `__builtin_mempcpy`.
- Otherwise, if `copy_amount` can be folded to a constant, and if
  `__builtin_object_size(dst, 0) >= copy_amount`, calls `__builtin_mempcpy`.
- Otherwise, calls `__builtin___mempcpy_chk`.

How can this be "zero overhead"? Let's focus on the following part of the
function:

```c
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (bos_dst != -1 &&
        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
```

If Clang's frontend cannot determine a value for `__builtin_object_size`, Clang
lowers it to LLVM's `@llvm.objectsize` intrinsic. The `@llvm.objectsize`
invocation corresponding to `__builtin_object_size(p, 0)` is guaranteed to
always fold to a constant value by the time LLVM emits machine code.

Hence, `bos_dst` is guaranteed to be a constant; if it's -1, the above branch
can be eliminated entirely, since it folds to `if (false && ...)`. Further, the
RHS of the `&&` in this branch has us call `__builtin_mempcpy` if `copy_amount`
is a known value no greater than `bos_dst` (yet another constant value).
Therefore,
the entire condition is always knowable when LLVM is done with LLVM IR-level
optimizations, so no condition is ever emitted to machine code in practice.

#### Why is "zero overhead" in quotes? Why is `unique_ptr` relevant?

`__builtin_object_size` and `__builtin_constant_p` are forced to be constants
after most optimizations take place. Until LLVM replaces both of these with
constants and optimizes them out, we have additional branches and function calls
in our IR. This can have negative effects, such as distorting inlining costs and
inhibiting optimizations that are conservative around branches in control-flow.

So FORTIFY is free in these cases _in isolation from any of the code around it_.
Due to its implementation, it may impact the optimizations that occur on code
around the literal call to the FORTIFY-hardened libc function.

`unique_ptr` was just the first thing that came to the author's mind for "the
type should be zero cost with any level of optimization enabled, but edge-cases
might make it only-mostly-free to use."

### How is checking actually performed?

In cases where checking can be performed (e.g., where we call
`__builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);`), Bionic provides [an
implementation for `__mempcpy_chk`]. This is:

```cpp
extern "C" void* __mempcpy_chk(void* dst, const void* src, size_t count, size_t dst_len) {
  __check_count("mempcpy", "count", count);
  __check_buffer_access("mempcpy", "write into", count, dst_len);
  return mempcpy(dst, src, count);
}
```
This function itself boils down to a few small branches which abort the program
if they fail, and a direct call to `__builtin_mempcpy`.
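
To make "a few small branches which abort the program" concrete, here is a
minimal sketch of the same shape in portable C. It is not Bionic's code:
`write_fits`, `toy_mempcpy_chk`, and `toy_chk_demo` are hypothetical names, and
the sketch assumes the only property worth checking is that `count` does not
exceed the destination size handed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical predicate: does a write of `count` bytes fit in `dst_len`? */
int write_fits(size_t count, size_t dst_len) {
  return count <= dst_len;
}

/* Hypothetical stand-in for a `_chk` entry point. */
void *toy_mempcpy_chk(void *dst, const void *src, size_t count, size_t dst_len) {
  if (!write_fits(count, dst_len)) {
    abort(); /* A real implementation would report what went wrong first. */
  }
  memcpy(dst, src, count);
  return (char *)dst + count; /* mempcpy returns one past the last byte written. */
}

void toy_chk_demo(void) {
  char buf[4];
  /* Copying exactly sizeof(buf) bytes passes the check. */
  void *end = toy_mempcpy_chk(buf, "abcd", sizeof(buf), sizeof(buf));
  assert(end == (void *)(buf + sizeof(buf)));
}
```

The check costs one comparison per call, which is why it is only worth emitting
when the compiler could not already prove the call safe.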

### Wrapping up

In the above breakdown, it was shown how Clang and Bionic work together to:
- represent FORTIFY-hardened overloads of functions,
- report misuses of stdlib functions at compile-time, and
- insert run-time checks for uses of functions that might be incorrect, but only
  if we have the potential of proving the incorrectness of these.

## Breakdown of `open`

In Bionic, the [FORTIFY'ed implementation of `open`] is quite large. Much like
`mempcpy`, the `__builtin_open` declaration is simple:

```c
int open(const char* __path, int __flags, ...);
```

With some macros expanded, the FORTIFY-hardened header implementation is:
```c
int __open_2(const char*, int);
int __open_real(const char*, int, ...) __asm__("open");

#define __open_modes_useful(flags) (((flags) & O_CREAT) || ((flags) & O_TMPFILE) == O_TMPFILE)

static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
          __open_modes_useful(flags),
          "'open' called with O_CREAT or O_TMPFILE, but missing mode",
          "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    return __open_2(pathname, flags);
#else
    return __open_real(pathname, flags);
#endif
}
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
    return __open_real(pathname, flags, modes);
}
```

Which may be a lot to take in.

Before diving too deeply, please note that the remainder of these subsections
assume that the programmer didn't make any egregious typos. Moreover, there's no
real way that Bionic tries to prevent calls to `open` like
`open("foo", 0, "how do you convert a const char[N] to mode_t?");`. The only
real C-compatible solution the author can think of is "stamp out many overloads
to catch sort-of-common instances of this very uncommon typo." This isn't great.

More directly, no effort is made below to recognize calls that, due to
incompatible argument types, cannot go to any `open` implementation other than
`__builtin_open`, since it's recognized right here. :)

### Implementation breakdown

This `open` implementation does a few things:
- Turns calls to `open` with too many arguments into a compile-time error.
- Diagnoses calls to `open` with missing modes at compile-time and run-time
  (both cases turn into errors).
- Emits warnings on calls to `open` with useless mode bits, unless the mode bits
  are all 0.

One common bit of code not explained below is the `__open_real` declaration above:
```c
int __open_real(const char*, int, ...) __asm__("open");
```

This exists as a way for us to call `__builtin_open` without needing Clang to
have a pre-defined `__builtin_open` function.

#### Compile-time error on too many arguments

```c
static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));
```

Which matches most calls to `open` that supply too many arguments, since
`int(const char *, int, ...)` matches less strongly than
`int(const char *, int, mode_t, ...)` for calls where the 3rd arg can be
converted to `mode_t` without too much effort.
Because of the `diagnose_if`
attribute, all of these calls turn into compile-time errors.

#### Compile-time or run-time error on missing arguments
The following overload handles all two-argument calls to `open`.
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
          __open_modes_useful(flags),
          "'open' called with O_CREAT or O_TMPFILE, but missing mode",
          "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    return __open_2(pathname, flags);
#else
    return __open_real(pathname, flags);
#endif
}
```

Like `mempcpy`, `diagnose_if` handles emitting a compile-time error if the call
to `open` is broken in a way that's visible to Clang's frontend. This
essentially boils down to "`open` is being called with a `flags` value that
requires mode bits to be set."
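
The predicate behind that diagnostic is ordinary C and can be sketched on its
own. The helper below mirrors the `__open_modes_useful` macro shown earlier;
`flags_need_mode` and `flags_need_mode_demo` are hypothetical names, and the
`#ifdef` is an assumption made so the sketch also builds where `O_TMPFILE` is
not exposed by `<fcntl.h>`.

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper mirroring `__open_modes_useful`; not Bionic's code. */
int flags_need_mode(int flags) {
  if (flags & O_CREAT) return 1;
#ifdef O_TMPFILE
  /* O_TMPFILE can be a multi-bit value (on Linux it includes the O_DIRECTORY
   * bit), so compare against the whole mask rather than testing any one bit. */
  if ((flags & O_TMPFILE) == O_TMPFILE) return 1;
#endif
  return 0;
}

void flags_need_mode_demo(void) {
  assert(flags_need_mode(O_WRONLY | O_CREAT)); /* A mode argument is required. */
  assert(!flags_need_mode(O_RDONLY));          /* A mode argument is pointless. */
}
```

The compile-time diagnostic can only apply this predicate when `flags` is a
constant that Clang's frontend can see at the call site.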

If that fails to catch a bug, we [unconditionally call `__open_2`], which
performs a run-time check:
```c
int __open_2(const char* pathname, int flags) {
  if (needs_mode(flags)) __fortify_fatal("open: called with O_CREAT/O_TMPFILE but no mode");
  return FDTRACK_CREATE_NAME("open", __openat(AT_FDCWD, pathname, force_O_LARGEFILE(flags), 0));
}
```

#### Compile-time warning if modes are pointless

Finally, we have the following `open` overload:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
    return __open_real(pathname, flags, modes);
}
```

This simply issues a warning if Clang's frontend can determine that `modes`
isn't necessary. Due to conventions in existing code, a `modes` value of `0` is
not diagnosed.

#### What about `&open`?
One yet-unaddressed aspect of the above is how `&open` works. This is thankfully
a short answer:
- It happens that `open` takes a parameter of type `const char*`.
- It happens that `pass_object_size` -- an attribute only applicable to
  parameters of type `T*` -- makes it impossible to take the address of a
  function.

Since Clang doesn't support a "this function should never have its address
taken" attribute, Bionic uses the next best thing: `pass_object_size`. :)

## Breakdown of `poll`

(Preemptively: at the time of writing, Clang has no literal `__builtin_poll`
builtin. `__builtin_poll` is referenced below to remain consistent with the
convention established in the Terminology section.)

Bionic's `poll` implementation is closest to `mempcpy` above, though it has a
few interesting aspects worth examining.

The [full header implementation of `poll`] is, with some macros expanded:
```c
#define __bos_fd_count_trivially_safe(bos_val, fds, fd_count)              \
  (((bos_val) == -1) ||                                                    \
   (__builtin_constant_p(fd_count) &&                                      \
    (bos_val) >= sizeof(*fds) * (fd_count)))

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
    __attribute__((overloadable))
    __attribute__((diagnose_if(
       __builtin_object_size(fds, 1) != -1 && __builtin_object_size(fds, 1) < sizeof(*fds) * fd_count,
       "in call to 'poll', fd_count is larger than the given buffer",
       "error"))) {
  size_t bos_fds = __builtin_object_size(fds, 1);
  if (!__bos_fd_count_trivially_safe(bos_fds, fds, fd_count)) {
    return __poll_chk(fds, fd_count, timeout, bos_fds);
  }
  return (&poll)(fds, fd_count, timeout);
}
```

To get the commonality with `mempcpy` and `open` out of the way:
- This function is an overload of `__builtin_poll`.
- The signature is the same, modulo the presence of a `pass_object_size`
  attribute. Hence, for direct calls, overload resolution will always prefer it
  over `__builtin_poll`. Taking the address of `poll` is forbidden, so all
  references to `&poll` actually reference `__builtin_poll`.
- When `fds` is too small to hold `fd_count` `pollfd`s, Clang will emit a
  compile-time error if possible using `diagnose_if`.
- If this can't be observed until run-time, `__poll_chk` verifies this.
- When `fd_count` is a constant according to `__builtin_constant_p`, this always
  compiles into `__poll_chk` for always-broken calls to `poll`, or
  `__builtin_poll` for always-safe calls to `poll`.
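
The size comparison at the heart of these checks can be sketched as a plain C
helper. This is an illustration rather than Bionic's `__poll_chk`:
`pollfds_fit` and `pollfds_fit_demo` are hypothetical names, and dividing
instead of multiplying is a choice made here so that a very large `fd_count`
cannot wrap the arithmetic.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical helper; `bos` is a value reported by __builtin_object_size. */
int pollfds_fit(size_t bos, nfds_t fd_count) {
  if (bos == (size_t)-1) return 1; /* Size unknown: there is nothing to check. */
  /* Divide rather than multiply so a huge fd_count cannot overflow. */
  return fd_count <= bos / sizeof(struct pollfd);
}

void pollfds_fit_demo(void) {
  struct pollfd fds[2];
  assert(pollfds_fit(sizeof(fds), 2));  /* Exactly fills the array. */
  assert(!pollfds_fit(sizeof(fds), 3)); /* One element too many. */
}
```

A `bos` of `(size_t)-1` means "unknown", matching the convention used by
`__builtin_object_size` when `(N & 2) == 0`.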

The critical bits to highlight here are on this line:
```c
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
```

And this line:
```c
  return (&poll)(fds, fd_count, timeout);
```

Starting with the simplest, we call `__builtin_poll` with `(&poll)(...);`. As
referenced above, taking the address of an overloaded function where all but one
overload has a `pass_object_size` attribute on one or more parameters always
resolves to the function without any `pass_object_size` attributes.

The other line deserves a section. The subtlety of it is almost entirely in the
use of `pass_object_size(1)` instead of `pass_object_size(0)` on the `fds`
parameter, and the corresponding use of `__builtin_object_size(fds, 1);` in the
body of `poll`.

### Subtleties of `__builtin_object_size(p, N)`

Earlier in this document, it was said that a full description of each
attribute/builtin necessary to power FORTIFY was out of scope. This is... only
somewhat the case when we talk about `__builtin_object_size` and
`pass_object_size`, especially when their second argument is `1`.

#### tl;dr
`__builtin_object_size(p, N)` and `pass_object_size(N)`, where `(N & 1) == 1`,
can only be accurately determined by Clang. LLVM's `@llvm.objectsize` intrinsic
ignores the value of `N & 1`, since handling `(N & 1) == 1` accurately requires
data that's currently entirely inaccessible to LLVM, and that is difficult to
preserve through LLVM's optimization passes.

`pass_object_size`'s "lifting" of the evaluation of
`__builtin_object_size(p, N)` to the caller is critical, since it allows Clang
full visibility into the expression passed to e.g., `poll(&foo->bar, baz, qux)`.
It's not a perfect solution, but it allows `N == 1` to be fully accurate in at
least some cases.
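
For a concrete picture of what the low bit of `N` changes, here is a minimal
sketch that queries both modes for a pointer into a struct member. All names
(`Pair`, `whole_object_bytes`, `subobject_bytes`, `write_is_flagged`,
`pair_demo`) are made up for this illustration, and the sketch assumes the
compiler can see the object directly, which is the easy case for a frontend.

```c
#include <assert.h>
#include <stddef.h>

struct Pair {
  char first[4];
  char second[4];
};

struct Pair pair;

/* Mode 0: bytes from `pair.first` to the end of the whole variable. */
size_t whole_object_bytes(void) { return __builtin_object_size(pair.first, 0); }

/* Mode 1: bytes from `pair.first` to the end of the `first` member only. */
size_t subobject_bytes(void) { return __builtin_object_size(pair.first, 1); }

/* Hypothetical check in the style of a FORTIFY'ed write. */
int write_is_flagged(size_t bos, size_t count) {
  return bos != (size_t)-1 && bos < count;
}

void pair_demo(void) {
  /* A 5-byte write into `first` is only caught by the subobject-aware size. */
  assert(write_is_flagged(sizeof(pair.first), 5));
  assert(!write_is_flagged(sizeof(struct Pair), 5));
}
```

When the frontend resolves both queries, mode 0 is expected to report
`sizeof(struct Pair)` and mode 1 `sizeof(pair.first)`, so a write that spills
from `first` into `second` is only flagged in mode 1.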

#### Background
Clang's implementation of `__builtin_object_size` aims to be compatible with
GCC's, which has [a decent bit of documentation]. Put simply,
`__builtin_object_size(p, N)` is intended to evaluate at compile-time how many
bytes can be accessed after `p` in a well-defined way. Straightforward examples
of this are:
```c
char buf[8];
assert(__builtin_object_size(buf, N) == 8);
assert(__builtin_object_size(buf + 1, N) == 7);
```

This should hold for all values of N that are valid to pass to
`__builtin_object_size`. The `N` value of `__builtin_object_size` is a mask of
settings.

##### (N & 2) == ?

This is mostly for completeness' sake; in Bionic's FORTIFY implementation, N is
always either 0 or 1.

If there are multiple possible values of `p` in a call to
`__builtin_object_size(p, N)`, the second bit in `N` determines the behavior of
the compiler. If `(N & 2) == 0`, `__builtin_object_size` should return the
greatest size across all possible values of `p`. Otherwise, it should
return the least possible value. For example:

```c
char smol_buf[7];
char buf[8];
char *p = rand() ? smol_buf : buf;
assert(__builtin_object_size(p, 0) == 8);
assert(__builtin_object_size(p, 2) == 7);
```

##### (N & 1) == 0

`__builtin_object_size(p, 0)` is more or less as simple as the example in the
Background section directly above. When Clang attempts to evaluate
`__builtin_object_size(p, 0);` and when LLVM tries to determine the result of a
corresponding `@llvm.objectsize` call, they search for the storage underlying
the pointer in question. If that can be determined, Clang or LLVM can provide an
answer; otherwise, they cannot.

##### (N & 1) == 1, and the true magic of `pass_object_size`

`__builtin_object_size(p, 1)` has a less uniform implementation between LLVM and
Clang.
According to GCC's documentation, "If the least significant bit [of
__builtin_object_size's second argument] is clear, objects are whole variables,
if it is set, a closest surrounding subobject is considered the object a pointer
points to."

The "closest surrounding subobject" means that `(N & 1) == 1` depends on type
information in order to operate in many cases. Consider the following examples:
```c
struct Foo {
  int a;
  int b;
};

struct Foo foo;
assert(__builtin_object_size(&foo, 0) == sizeof(foo));
assert(__builtin_object_size(&foo, 1) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 0) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 1) == sizeof(int));

struct Foo foos[2];
assert(__builtin_object_size(&foos[0], 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0], 1) == sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 1) == sizeof(int));
```

...And perhaps somewhat surprisingly:
```c
void example(struct Foo *foo) {
  // (As a reminder, `-1` is "I don't know" when `(N & 2) == 0`.)
  assert(__builtin_object_size(foo, 0) == -1);
  assert(__builtin_object_size(foo, 1) == -1);
  assert(__builtin_object_size(&foo->a, 0) == -1);
  assert(__builtin_object_size(&foo->a, 1) == sizeof(int));
}
```

In Clang, [this type-aware requirement poses problems for us]: Clang's frontend
knows everything we could possibly want about the types of variables, but
optimizations are only performed by LLVM. LLVM has no reliable source for C or
C++ data types, so calls to `__builtin_object_size(p, N)` that cannot be
resolved by Clang are lowered to the equivalent of
`__builtin_object_size(p, N & ~1)` in LLVM IR.

Moreover, Clang's frontend is the best-equipped part of the compiler to
accurately determine the answer for `__builtin_object_size(p, N)`, given we know
what `p` is. LLVM is the best-equipped part of the compiler to determine the
value of `p`. Since the frontend runs before LLVM does, this ordering is
unfortunate.

This is where `pass_object_size(N)` comes in. To summarize [the docs for
`pass_object_size`], it evaluates `__builtin_object_size(p, N)` within the
context of the caller of the function annotated with `pass_object_size`, and
passes the value of that into the callee as an invisible parameter. All calls to
`__builtin_object_size(parameter, N)` are substituted with references to this
invisible parameter.

Putting this plainly, Clang's frontend struggles to evaluate the following:
```c
int foo(void *p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // The frontend can't figure this interprocedural objectsize out, so it gets
  // lowered to LLVM, which determines that the answer here is sizeof(k).
  int baz = foo(&k.i);
}
```

However, with the magic of `pass_object_size`, we get one level of inlining to
look through:
```c
int foo(void *const __attribute__((pass_object_size(1))) p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // Due to pass_object_size, this is equivalent to:
  // int baz = foo(&k.i, __builtin_object_size(&k.i, 1));
  // ...and `int foo(void *)` is actually equivalent to:
  // int foo(void *const, size_t size) {
  //   return size;
  // }
  int baz = foo(&k.i);
}
```

So we can obtain an accurate result in this case.

##### What about pass_object_size(0)?
It's sort of tangential, but if you find yourself wondering about the utility of
`pass_object_size(0)` ... it's somewhat split.
`pass_object_size(0)` in Bionic's FORTIFY exists mostly for visual consistency,
simplicity, and as a useful way to have, e.g., `&mempcpy` ==
`&__builtin_mempcpy`.

Outside of these fringe benefits, all of the functions with
`pass_object_size(0)` on parameters are marked with `always_inline`, so
"lifting" the `__builtin_object_size` call isn't ultimately very helpful. In
theory, users can always have something like:

```c
// In some_header.h
// This function does cool and interesting things with the
// `__builtin_object_size` of its parameter, and is able to work with that as
// though the function were defined inline.
void out_of_line_function(void *__attribute__((pass_object_size(0))));
```

That said, the author isn't aware of uses like this in practice, beyond a few
folks on LLVM's mailing list seeming interested in trying it someday.

#### Wrapping up
In the (long) section above, two things were covered:
- The use of `(&poll)(...);` is a convenient shorthand for calling
  `__builtin_poll`.
- `__builtin_object_size(p, N)` with `(N & 1) == 1` is not easy for Clang to
  answer accurately, since it relies on type info only available in the
  frontend, and it sometimes relies on optimizations only available in the
  middle-end. `pass_object_size` helps mitigate this.

## Miscellaneous Notes
The above should be a roughly comprehensive view of how FORTIFY works in the
real world. The main thing it fails to mention is the use of
[the `diagnose_as_builtin` attribute] in Clang.

As time has moved on, Clang has increasingly gained support for emitting
warnings that were previously emitted by FORTIFY machinery.
`diagnose_as_builtin` allows us to remove the `diagnose_if`s from some of the
`static inline` overloads of stdlib functions above, so Clang may diagnose them
instead.
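
To make the shape of such an annotation concrete, here is a minimal sketch.
`checked_copy` and `checked_copy_demo` are made-up names, not Bionic functions,
and the `__has_attribute` guard is an assumption added so that the snippet also
builds on compilers that lack the attribute. It is not how Bionic's headers are
written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Fallback for compilers without __has_attribute (an assumption of this
// sketch, not something taken from Bionic).
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

// The indices say which parameter of the annotated function plays the role of
// each parameter of the builtin, in the builtin's own order.
#if __has_attribute(diagnose_as_builtin)
# define DIAGNOSE_AS_MEMCPY \
    __attribute__((diagnose_as_builtin(__builtin_memcpy, 1, 2, 3)))
#else
# define DIAGNOSE_AS_MEMCPY
#endif

// Hypothetical wrapper: where the attribute is supported, Clang may apply its
// own memcpy diagnostics to calls of this function.
DIAGNOSE_AS_MEMCPY
static inline void *checked_copy(void *dst, const void *src, size_t n) {
  return memcpy(dst, src, n);
}

// A well-formed call behaves exactly like memcpy.
void checked_copy_demo(void) {
  char src[4] = "abc";
  char dst[4] = {0};
  checked_copy(dst, src, sizeof(src));
  assert(memcmp(dst, src, sizeof(src)) == 0);
}
```

If the wrapper took its parameters in a different order than `memcpy`, only the
index list would change; the builtin named in the attribute stays the same.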

Clang's built-in diagnostics are often better than `diagnose_if` diagnostics,
since Clang can format its diagnostics to include e.g., information about the
sizes of buffers in a suspect call to a function. `diagnose_if` can only have
the compiler output constant strings.

[ChromeOS' Glibc patch]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/90fa9b27731db10a6010c7f7c25b24028145b091/sys-libs/glibc/files/local/glibc-2.33/0007-glibc-add-clang-style-FORTIFY.patch
[FORTIFY'ed implementation of `open`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/fcntl.h#41
[FORTIFY'ed version of `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/string.h#45
[a decent bit of documentation]: https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[an implementation for `__mempcpy_chk`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/fortify.cpp#501
[full header implementation of `poll`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/poll.h#43
[incompatible with stricter versions of FORTIFY checking]: https://godbolt.org/z/fGfEYxfnf
[similar to C++11's `std::unique_ptr`]: https://stackoverflow.com/questions/58339165/why-can-a-t-be-passed-in-register-but-a-unique-ptrt-cannot
[source for `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/string.h#55
[the `diagnose_as_builtin` attribute]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#diagnose-as-builtin
[the docs for `pass_object_size`]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#pass-object-size-pass-dynamic-object-size
[this type-aware requirement poses problems for us]:
https://github.com/llvm/llvm-project/issues/55742
[unconditionally call `__open_2`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/open.cpp#70