*This document was originally written for a broad audience, and it was*
*determined that it'd be good to hold in Bionic's docs, too. Due to the*
*ever-changing nature of code, it tries to link to a stable tag of*
*Bionic's libc, rather than the live code in Bionic. Same for Clang.*
*Reader beware. :)*

# The Anatomy of Clang FORTIFY

## Objective

The intent of this document is to run through the minutiae of how Clang FORTIFY
actually works in Bionic at the time of writing. Other FORTIFY implementations
that target Clang should use very similar mechanics. This document exists in part
because many Clang-specific features serve multiple purposes simultaneously, so
getting up-to-speed on how things function can be quite difficult.

## Background

FORTIFY is a broad suite of extensions to libc aimed at catching misuses of
common library functions. Textually, these extensions exist purely in libc, but
all implementations of FORTIFY rely heavily on C language extensions in order
to function at all.

Broadly, FORTIFY implementations try to guard against many misuses of C
standard(-ish) libraries:
- Buffer overruns in functions where pointers+sizes are passed (e.g., `memcpy`,
  `poll`), or where sizes exist implicitly (e.g., `strcpy`).
- Arguments with incorrect values passed to libc functions (e.g.,
  out-of-bounds bits in `umask`).
- Missing arguments to functions (e.g., `open()` with `O_CREAT`, but no mode
  bits).

FORTIFY is traditionally enabled by passing `-D_FORTIFY_SOURCE=N` to your
compiler. `N==0` disables FORTIFY, whereas `N==1`, `N==2`, and `N==3` enable
increasingly strict versions of it. In general, FORTIFY doesn't require user
code changes; that said, some code patterns
are [incompatible with stricter versions of FORTIFY checking]. This is largely
because FORTIFY has significant flexibility in what it considers to be an
"out-of-bounds" access.

FORTIFY implementations use a mix of compiler diagnostics and runtime checks to
flag and/or mitigate the impacts of the misuses mentioned above.

Further, given FORTIFY's design, the effectiveness of FORTIFY is a function of
-- among other things -- the optimization level you're compiling your code at.
Many FORTIFY implementations are implicitly disabled when building with `-O0`,
since FORTIFY's design for both Clang and GCC relies on optimizations in order
to provide useful run-time checks. For the purpose of this document, all
analysis of FORTIFY functions and commentary on builtins assume that code is
being built with some optimization level > `-O0`.

### A note on GCC

This document talks specifically about Bionic's FORTIFY implementation targeted
at Clang. While GCC also provides a set of language extensions necessary to
implement FORTIFY, these tools are different from what Clang offers. This
divergence is an artifact of Clang and GCC's differing architecture as
compilers.

Textually, quite a bit can be shared between a FORTIFY implementation for GCC
and one for Clang (e.g., see [ChromeOS' Glibc patch]), but this kind of sharing
requires things like macros that expand to unbalanced braces depending on your
compiler:

```c
/*
 * Highly simplified; if you're interested in FORTIFY's actual implementation,
 * please see the patch linked above.
 */
#ifdef __clang__
# define FORTIFY_PRECONDITIONS
# define FORTIFY_FUNCTION_END
#else
# define FORTIFY_PRECONDITIONS {
# define FORTIFY_FUNCTION_END }
#endif

/*
 * FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN is not defined, due to its
 * complexity and irrelevance. It turns into a compile-time warning if the
 * compiler can determine `*buf` has fewer than `size` bytes available.
 */

char *getcwd(char *buf, size_t size)
FORTIFY_PRECONDITIONS
  FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN(buf, size, "`buf` is too smol.")
{
  // Actual shared function implementation goes here.
}
FORTIFY_FUNCTION_END
```

All talk of GCC-focused implementations and how to merge Clang and GCC
implementations is out-of-scope for this doc, however.

## The Life of a Clang FORTIFY Function

As referenced in the Background section, FORTIFY performs many different checks
for many functions. This section intends to go through real-world examples of
FORTIFY functions in Bionic, breaking down how each part of these functions
works and how the pieces fit together to provide FORTIFY-like functionality.

While FORTIFY implementations may differ between stdlibs, they broadly follow
the same patterns when implementing their checks for Clang, and they try to
make similar promises with respect to FORTIFY compiling to be zero-overhead in
some cases, etc. Moreover, while this document specifically examines Bionic,
many stdlibs will operate _very similarly_ to Bionic in their Clang FORTIFY
implementations.

**In general, when reading the below, be prepared for exceptions, subtlety, and
corner cases. The individual function breakdowns below try to not offer
redundant information. Each one focuses on different aspects of FORTIFY.**

### Terminology

Because FORTIFY should be mostly transparent to developers, there are inherent
naming collisions here: `memcpy(x, y, z)` turns into fundamentally different
generated code depending on the value of `_FORTIFY_SOURCE`. Further, said
`memcpy` call with `_FORTIFY_SOURCE` enabled needs to be able to refer to the
`memcpy` that would have been called, had `_FORTIFY_SOURCE` been disabled.
Hence, the following convention is followed in the subsections below for all
prose (namely, multiline code blocks are exempted from this):

- Standard library function names preceded by `__builtin_` refer to the use of
  the function with `_FORTIFY_SOURCE` disabled.
- Standard library function names without a prefix refer to the use of the
  function with `_FORTIFY_SOURCE` enabled.

This convention also applies in `clang`. `__builtin_memcpy` will always call
`memcpy` as though `_FORTIFY_SOURCE` were disabled.

## Breakdown of `mempcpy`

The [FORTIFY'ed version of `mempcpy`] is a full, featureful example of a
FORTIFY'ed function from Bionic. From the user's perspective, it supports a few
things:
- Producing a compile-time error if the number of bytes to copy trivially
  exceeds the number of bytes available at the destination pointer.
- If the `mempcpy` has the potential to write to more bytes than what is
  available at the destination, a run-time check is inserted to crash the
  program if more bytes are written than what is allowed.
- Compiling away to be zero overhead when none of the buffer sizes can be
  determined at compile-time[^1].

The declaration in Bionic's headers for `__builtin_mempcpy` is:
```c
void* mempcpy(void* __dst, const void* __src, size_t __n) __INTRODUCED_IN(23);
```

Which is annotated with nothing special, so it will be ignored.

The [source for `mempcpy`] in Bionic's headers is:
```c
__BIONIC_FORTIFY_INLINE
void* mempcpy(void* const dst __pass_object_size0, const void* src, size_t copy_amount)
        __overloadable
        __clang_error_if(__bos_unevaluated_lt(__bos0(dst), copy_amount),
                         "'mempcpy' called with size bigger than buffer") {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __bos0(dst);
    if (!__bos_trivially_ge(bos_dst, copy_amount)) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

Expanding some of the important macros here, this function expands to roughly:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __builtin_object_size(dst, 0) != -1 && __builtin_object_size(dst, 0) < copy_amount,
            "'mempcpy' called with size bigger than buffer",
            "error"))) {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (!(__bos_trivially_ge(bos_dst, copy_amount))) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

So let's walk through this step by step, to see how FORTIFY does what it says on
the tin here.

[^1]: "Zero overhead" in a way [similar to C++11's `std::unique_ptr`]: this will
turn into a direct call to `__builtin_mempcpy` (or an optimized form thereof)
with no other surrounding checks at runtime. However, the additional complexity
may hinder optimizations that are performed before the optimizer can prove that
the `if (...) { ... }` can be optimized out. Depending on how late this happens,
the additional complexity may skew inlining costs, hide opportunities for e.g.,
`memcpy` coalescing, etc.

### How does Clang select `mempcpy`?

First, it's critical to notice that `mempcpy` is marked `overloadable`. This
function is a `static inline __attribute__((always_inline))` overload of
`__builtin_mempcpy`:
- `__attribute__((overloadable))` allows us to perform overloading in C.
- `__attribute__((overloadable))` mangles all calls to functions marked with
  `__attribute__((overloadable))`.
- `__attribute__((overloadable))` allows exactly one function signature with a
  given name to not be marked with `__attribute__((overloadable))`. Calls to
  this overload will not be mangled.

Second, one might note that this `mempcpy` implementation has the same C-level
signature as `__builtin_mempcpy`. `pass_object_size` is a Clang attribute that
is generally needed by FORTIFY, but it carries the side-effect that functions
may be overloaded simply on the presence (or lack of presence) of
`pass_object_size` attributes. Given two overloads of a function that only
differ on the presence of `pass_object_size` attributes, the candidate with
`pass_object_size` attributes is preferred.

Finally, the prior paragraph gets thrown out if one tries to take the address of
`mempcpy`. It is impossible to take the address of a function with one or more
parameters that are annotated with `pass_object_size`. Hence,
`&__builtin_mempcpy == &mempcpy`. Further, because this is an issue of overload
resolution, `(&mempcpy)(x, y, z);` is functionally identical to
`__builtin_mempcpy(x, y, z);`.

All of this accomplishes the following:
- Direct calls to `mempcpy` should call the FORTIFY-protected `mempcpy`.
- Indirect calls to `&mempcpy` should call `__builtin_mempcpy`.
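
As a rough, self-contained illustration of those two outcomes, the sketch below
uses a hypothetical `demo_copy` function (not part of Bionic) with one plain
definition and one `pass_object_size` overload. It assumes Clang for the
overload; on other compilers the overload is compiled out, so both call styles
land on the plain function.

```c
#include <assert.h>
#include <stddef.h>

enum demo_impl { DEMO_PLAIN = 0, DEMO_FORTIFIED = 1 };

/* Hypothetical stand-in for the plain libc function (the `__builtin_` role). */
enum demo_impl demo_copy(char *dst, size_t n) {
    (void)dst;
    (void)n;
    return DEMO_PLAIN;
}

#if defined(__clang__)
/*
 * Hypothetical stand-in for the FORTIFY'ed overload. It differs from the
 * function above only by `pass_object_size`, which is enough for Clang to
 * treat it as a distinct, preferred overload.
 */
static inline __attribute__((always_inline))
enum demo_impl demo_copy(char *const dst __attribute__((pass_object_size(0))), size_t n)
        __attribute__((overloadable)) {
    (void)dst;
    (void)n;
    return DEMO_FORTIFIED;
}
#endif

/* A direct call: overload resolution prefers the `pass_object_size` candidate. */
enum demo_impl demo_direct_call(void) {
    char buf[4];
    return demo_copy(buf, sizeof(buf));
}

/* A call through the function's address: only the plain function qualifies. */
enum demo_impl demo_address_call(void) {
    char buf[4];
    return (&demo_copy)(buf, sizeof(buf));
}

/* What a direct call is expected to reach on the current compiler. */
enum demo_impl demo_expected_direct(void) {
#if defined(__clang__)
    return DEMO_FORTIFIED;
#else
    return DEMO_PLAIN;
#endif
}

void demo_check_selection(void) {
    assert(demo_direct_call() == demo_expected_direct());
    assert(demo_address_call() == DEMO_PLAIN);
}
```

Calling `demo_check_selection()` exercises both paths; the `(&demo_copy)(...)`
spelling is the same idiom the FORTIFY'ed headers use to reach the unchecked
function.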

### How does Clang offer compile-time diagnostics?

Once one is convinced that the FORTIFY-enabled overload of `mempcpy` will be
selected for direct calls, Clang's `diagnose_if` and `__builtin_object_size` do
all of the work from there.

Subtleties here primarily fall out of the discussion in the above section about
`&__builtin_mempcpy == &mempcpy`:
```c
#define _FORTIFY_SOURCE 2
#include <string.h>
void example_code() {
  char buf[4]; // ...Assume sizeof(char) == 1.
  const char input_buf[] = "Hello, World";
  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.

  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
  __builtin_mempcpy(buf, input_buf, 5); // No compile-time error.
  (&mempcpy)(buf, input_buf, 5); // No compile-time error, since __builtin_mempcpy is selected.
}
```

Otherwise, the rest of this subsection is dedicated to preliminary discussion
about `__builtin_object_size`.

Clang's frontend can do one of two things with `__builtin_object_size(p, n)`:
- Evaluate it as a constant.
  - This can either mean declaring that the number of bytes at `p` is definitely
    impossible to know, so the default value is used, or the number of bytes at
    `p` can be known without optimizations.
- Declare that the expression cannot form a constant, and lower it to
  `@llvm.objectsize`, which is discussed in depth later.

In the examples above, since `diagnose_if` is evaluated with context from the
caller, Clang should be able to trivially determine that `buf` refers to a
`char` array with 4 elements.

The primary consequence of the above is that diagnostics can only be emitted if
no optimizations are required to detect a broken code pattern. To be specific,
Clang's constexpr evaluator must be able to determine the logical object that
any given pointer points to in order to fold `__builtin_object_size` to a
constant, non-default answer:

```c
#define _FORTIFY_SOURCE 2
#include <string.h>
void example_code() {
  char buf[4]; // ...Assume sizeof(char) == 1.
  const char input_buf[] = "Hello, World";
  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.
  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
  char *buf_ptr = buf;
  mempcpy(buf_ptr, input_buf, 5); // No compile-time error; `buf_ptr`'s target can't be determined.
}
```
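
To make the frontend/optimizer split concrete, here is a small sketch with
hypothetical function names that returns what `__builtin_object_size` reports
for the two spellings above. The first value should be fixed by the frontend;
the second depends on the optimizer, so it may come back as the "unknown" value
`(size_t)-1` in unoptimized builds.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_UNKNOWN_SIZE ((size_t)-1)

/* The argument names the array itself, so no optimization is needed. */
size_t demo_size_via_array(void) {
    char buf[4];
    return __builtin_object_size(buf, 0);
}

/*
 * The argument is a pointer variable. The frontend does not track what it
 * points to; any concrete answer has to come from the optimizer.
 */
size_t demo_size_via_pointer(void) {
    char buf[4];
    char *buf_ptr = buf;
    return __builtin_object_size(buf_ptr, 0);
}

void demo_check_sizes(void) {
    assert(demo_size_via_array() == 4);
    size_t via_pointer = demo_size_via_pointer();
    /* Either answer is acceptable here; it depends on the optimization level. */
    assert(via_pointer == 4 || via_pointer == DEMO_UNKNOWN_SIZE);
}
```

Because `diagnose_if` conditions are evaluated by the frontend only, the second
form is the one that slips past compile-time checking.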

### How does Clang insert run-time checks?

This section expands on the following statement: FORTIFY has zero runtime cost
in instances where there is no chance of catching a bug at run-time. Otherwise,
it introduces a tiny additional run-time cost to ensure that functions aren't
misused.

In prior sections, the following was established:
- `overloadable` and `pass_object_size` prompt Clang to always select this
  overload of `mempcpy` over `__builtin_mempcpy` for direct calls.
- If a call to `mempcpy` was trivially broken, Clang would produce a
  compile-time error, rather than producing a binary.

Hence, the case we're interested in here is one where Clang's frontend selected
a FORTIFY'ed function's implementation for a function call, but was unable to
find anything seriously wrong with said function call. Since the frontend is
powerless to detect bugs at this point, our focus shifts to the mechanisms that
LLVM uses to support FORTIFY.

Going back to Bionic's `mempcpy` implementation, we have the following (ignoring
diagnose_if and assuming run-time checks are enabled):
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable)) {
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (bos_dst != -1 &&
        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

In other words, we have a `static`, `always_inline` function which:
- If `__builtin_object_size(dst, 0)` cannot be determined (in which case, it
  returns -1), calls `__builtin_mempcpy`.
- Otherwise, if `copy_amount` can be folded to a constant, and if
  `__builtin_object_size(dst, 0) >= copy_amount`, calls `__builtin_mempcpy`.
- Otherwise, calls `__builtin___mempcpy_chk`.

How can this be "zero overhead"? Let's focus on the following part of the
function:

```c
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (bos_dst != -1 &&
        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
```

If Clang's frontend cannot determine a value for `__builtin_object_size`, Clang
lowers it to LLVM's `@llvm.objectsize` intrinsic. The `@llvm.objectsize`
invocation corresponding to `__builtin_object_size(p, 0)` is guaranteed to
always fold to a constant value by the time LLVM emits machine code.

Hence, `bos_dst` is guaranteed to be a constant; if it's -1, the above branch
can be eliminated entirely, since it folds to `if (false && ...)`. Further, the
RHS of the `&&` in this branch has us call `__builtin_mempcpy` if `copy_amount`
is a known value no greater than `bos_dst` (yet another constant value).
Therefore, the entire condition is always knowable when LLVM is done with LLVM
IR-level optimizations, so no condition is ever emitted to machine code in
practice.
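
The three outcomes can also be modeled as an ordinary function. This is only a
sketch: `demo_select_copy_path` and its `amount_is_constant` flag are
hypothetical, with the flag standing in for
`__builtin_constant_p(copy_amount)`. In the real header the compiler folds these
inputs, so no such function exists at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_UNKNOWN_SIZE ((size_t)-1)

enum demo_copy_path { DEMO_PLAIN_COPY, DEMO_CHECKED_COPY };

/* Mirrors the branch structure of the FORTIFY'ed `mempcpy` body shown above. */
enum demo_copy_path demo_select_copy_path(size_t bos_dst,
                                          bool amount_is_constant,
                                          size_t copy_amount) {
    /* Unknown destination size: nothing to check against. */
    if (bos_dst == DEMO_UNKNOWN_SIZE) {
        return DEMO_PLAIN_COPY;
    }
    /* Known size and a constant amount that fits: provably safe. */
    if (amount_is_constant && bos_dst >= copy_amount) {
        return DEMO_PLAIN_COPY;
    }
    /* Known size, but the amount is unknown or too large: check at run time. */
    return DEMO_CHECKED_COPY;
}

void demo_check_paths(void) {
    assert(demo_select_copy_path(DEMO_UNKNOWN_SIZE, false, 100) == DEMO_PLAIN_COPY);
    assert(demo_select_copy_path(8, true, 8) == DEMO_PLAIN_COPY);
    assert(demo_select_copy_path(8, false, 4) == DEMO_CHECKED_COPY);
    assert(demo_select_copy_path(8, true, 9) == DEMO_CHECKED_COPY);
}
```

The last case (a constant amount that is too large) would normally be rejected
by `diagnose_if` before any code is generated; it is included only to complete
the table.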

#### Why is "zero overhead" in quotes? Why is `unique_ptr` relevant?

`__builtin_object_size` and `__builtin_constant_p` are forced to be constants
after most optimizations take place. Until LLVM replaces both of these with
constants and optimizes them out, we have additional branches and function calls
in our IR. This can have negative effects, such as distorting inlining costs and
inhibiting optimizations that are conservative around branches in control-flow.

So FORTIFY is free in these cases _in isolation of any of the code around it_.
Due to its implementation, it may impact the optimizations that occur on code
around the literal call to the FORTIFY-hardened libc function.

`unique_ptr` was just the first thing that came to the author's mind for "the
type should be zero cost with any level of optimization enabled, but edge-cases
might make it only-mostly-free to use."

### How is checking actually performed?

In cases where checking can be performed (e.g., where we call
`__builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);`), Bionic provides [an
implementation for `__mempcpy_chk`]. This is:

```c
extern "C" void* __mempcpy_chk(void* dst, const void* src, size_t count, size_t dst_len) {
  __check_count("mempcpy", "count", count);
  __check_buffer_access("mempcpy", "write into", count, dst_len);
  return mempcpy(dst, src, count);
}
```
This function itself boils down to a few small branches which abort the program
if they fail, and a direct call to `__builtin_mempcpy`.
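
A minimal sketch of such a checked entry point is shown below. The names
`demo_fortify_fatal`, `demo_mempcpy_chk` and `demo_checked_copy_ok` are
hypothetical, the error message is invented for illustration, and portable
`memcpy` stands in for `mempcpy`; Bionic's own helpers differ in detail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a fatal-error helper: report, then abort. */
static void demo_fortify_fatal(const char *fn, size_t count, size_t dst_len) {
    fprintf(stderr, "%s: prevented %zu-byte write into %zu-byte buffer\n",
            fn, count, dst_len);
    abort();
}

/* Checked copy: `dst_len` is the object size the caller computed for `dst`. */
void *demo_mempcpy_chk(void *dst, const void *src, size_t count, size_t dst_len) {
    if (count > dst_len) {
        demo_fortify_fatal("mempcpy", count, dst_len);
    }
    memcpy(dst, src, count);
    /* mempcpy returns a pointer just past the last byte written. */
    return (char *)dst + count;
}

/* Returns 1 if an in-bounds copy succeeds and yields the expected end pointer. */
int demo_checked_copy_ok(void) {
    char dst[8] = {0};
    const char src[] = "abc";
    void *end = demo_mempcpy_chk(dst, src, sizeof(src), sizeof(dst));
    assert(end == (void *)(dst + sizeof(src)));
    return strcmp(dst, "abc") == 0;
}
```

Passing a `count` larger than `dst_len` to `demo_mempcpy_chk` would abort the
process instead of writing out of bounds.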

### Wrapping up

In the above breakdown, it was shown how Clang and Bionic work together to:
- represent FORTIFY-hardened overloads of functions,
- report misuses of stdlib functions at compile-time, and
- insert run-time checks for uses of functions that might be incorrect, but only
  if we have the potential of proving the incorrectness of these.

## Breakdown of open

In Bionic, the [FORTIFY'ed implementation of `open`] is quite large. Much like
`mempcpy`, the `__builtin_open` declaration is simple:

```c
int open(const char* __path, int __flags, ...);
```

With some macros expanded, the FORTIFY-hardened header implementation is:
```c
int __open_2(const char*, int);
int __open_real(const char*, int, ...) __asm__("open");

#define __open_modes_useful(flags) (((flags) & O_CREAT) || ((flags) & O_TMPFILE) == O_TMPFILE)

static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __open_modes_useful(flags),
            "'open' called with O_CREAT or O_TMPFILE, but missing mode",
            "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    return __open_2(pathname, flags);
#else
    return __open_real(pathname, flags);
#endif
}
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
    return __open_real(pathname, flags, modes);
}
```

Which may be a lot to take in.

Before diving too deeply, please note that the remainder of these subsections
assume that the programmer didn't make any egregious typos. Moreover, there's no
real way that Bionic tries to prevent calls to `open` like
`open("foo", 0, "how do you convert a const char[N] to mode_t?");`. The only
real C-compatible solution the author can think of is "stamp out many overloads
to catch sort-of-common instances of this very uncommon typo." This isn't great.

More directly, no effort is made below to recognize calls that, due to
incompatible argument types, cannot go to any `open` implementation other than
`__builtin_open`, since it's recognized right here. :)

### Implementation breakdown

This `open` implementation does a few things:
- Turns calls to `open` with too many arguments into a compile-time error.
- Diagnoses calls to `open` with missing modes at compile-time and run-time
  (both cases turn into errors).
- Emits warnings on calls to `open` with useless mode bits, unless the mode bits
  are all 0.

One common bit of code not explained below is the `__open_real` declaration above:
```c
int __open_real(const char*, int, ...) __asm__("open");
```

This exists as a way for us to call `__builtin_open` without needing Clang to
have a pre-defined `__builtin_open` function.
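
The same renaming trick works for any libc function. As a hedged illustration,
the sketch below gives `strlen` a second name; `demo_real_strlen` is a
hypothetical identifier, and the bare symbol name `"strlen"` assumes an ELF
target such as Linux or Android, where C symbols carry no prefix.

```c
#include <assert.h>
#include <stddef.h>

/*
 * A new C-level name whose calls are emitted against the symbol "strlen".
 * No definition is provided here; the linker resolves it to libc's strlen.
 */
size_t demo_real_strlen(const char *s) __asm__("strlen");

/* Callers use the alias like any other function. */
size_t demo_length_of(const char *s) {
    return demo_real_strlen(s);
}

void demo_check_alias(void) {
    assert(demo_length_of("fortify") == 7);
}
```

Since the alias is a distinct identifier, it is unaffected by any overloads
declared under the original name, which is the property `__open_real` relies on.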

#### Compile-time error on too many arguments

```c
static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));
```

This overload matches most calls to `open` that supply too many arguments, since
`int(const char *, int, ...)` matches less strongly than
`int(const char *, int, mode_t, ...)` for calls where the 3rd arg can be
converted to `mode_t` without too much effort. Because of the `diagnose_if`
attribute, all of these calls turn into compile-time errors.

#### Compile-time or run-time error on missing arguments
The following overload handles all two-argument calls to `open`.
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __open_modes_useful(flags),
            "'open' called with O_CREAT or O_TMPFILE, but missing mode",
            "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    return __open_2(pathname, flags);
#else
    return __open_real(pathname, flags);
#endif
}
```

Like `mempcpy`, `diagnose_if` handles emitting a compile-time error if the call
to `open` is broken in a way that's visible to Clang's frontend. This
essentially boils down to "`open` is being called with a `flags` value that
requires mode bits to be set."

If that fails to catch a bug, we [unconditionally call `__open_2`], which
performs a run-time check:
```c
int __open_2(const char* pathname, int flags) {
  if (needs_mode(flags)) __fortify_fatal("open: called with O_CREAT/O_TMPFILE but no mode");
  return FDTRACK_CREATE_NAME("open", __openat(AT_FDCWD, pathname, force_O_LARGEFILE(flags), 0));
}
```
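
Both the `diagnose_if` condition and the run-time check hinge on the same
question: do these flags require a mode? A sketch of that predicate follows,
under the assumption that it mirrors the `__open_modes_useful` macro shown
earlier; the function name `demo_open_needs_mode` is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* asks glibc to expose O_TMPFILE; unused elsewhere */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* True when `flags` asks open() to create a file, so mode bits are required. */
bool demo_open_needs_mode(int flags) {
    if (flags & O_CREAT) {
        return true;
    }
#ifdef O_TMPFILE
    /*
     * On Linux, O_TMPFILE is a multi-bit value that includes O_DIRECTORY,
     * so all of its bits must be present, not just any one of them.
     */
    if ((flags & O_TMPFILE) == O_TMPFILE) {
        return true;
    }
#endif
    return false;
}

void demo_check_needs_mode(void) {
    assert(demo_open_needs_mode(O_WRONLY | O_CREAT));
    assert(!demo_open_needs_mode(O_RDONLY));
    assert(!demo_open_needs_mode(O_RDWR | O_APPEND));
#ifdef O_TMPFILE
    assert(demo_open_needs_mode(O_TMPFILE | O_RDWR));
#endif
}
```

When `flags` is a compile-time constant, the same expression can be evaluated
by the frontend, which is what lets `diagnose_if` reject the call outright.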

#### Compile-time warning if modes are pointless

Finally, we have the following `open` overload:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
    return __open_real(pathname, flags, modes);
}
```

This simply issues a warning if Clang's frontend can determine that `modes`
isn't necessary. Due to conventions in existing code, a `modes` value of `0` is
not diagnosed.

#### What about `&open`?
One yet-unaddressed aspect of the above is how `&open` works. This is thankfully
a short answer:
- It happens that `open` takes a parameter of type `const char*`.
- It happens that `pass_object_size` -- an attribute only applicable to
  parameters of type `T*` -- makes it impossible to take the address of a
  function.

Since Clang doesn't support a "this function should never have its address
taken" attribute, Bionic uses the next best thing: `pass_object_size`. :)

## Breakdown of poll

(Preemptively: at the time of writing, Clang has no literal `__builtin_poll`
builtin. `__builtin_poll` is referenced below to remain consistent with the
convention established in the Terminology section.)

Bionic's `poll` implementation is closest to `mempcpy` above, though it has a
few interesting aspects worth examining.

The [full header implementation of `poll`] is, with some macros expanded:
```c
#define __bos_fd_count_trivially_safe(bos_val, fds, fd_count) \
  ((bos_val) == -1 || \
    (__builtin_constant_p(fd_count) && \
    (bos_val) >= sizeof(*fds) * (fd_count)))

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
    __attribute__((overloadable))
    __attribute__((diagnose_if(
       __builtin_object_size(fds, 1) != -1 && __builtin_object_size(fds, 1) < sizeof(*fds) * fd_count,
        "in call to 'poll', fd_count is larger than the given buffer",
        "error"))) {
  size_t bos_fds = __builtin_object_size(fds, 1);
  if (!__bos_fd_count_trivially_safe(bos_fds, fds, fd_count)) {
    return __poll_chk(fds, fd_count, timeout, bos_fds);
  }
  return (&poll)(fds, fd_count, timeout);
}
```

To get the commonality with `mempcpy` and `open` out of the way:
- This function is an overload with `__builtin_poll`.
- The signature is the same, modulo the presence of a `pass_object_size`
  attribute. Hence, for direct calls, overload resolution will always prefer it
  over `__builtin_poll`. Taking the address of `poll` is forbidden, so all
  references to `&poll` actually reference `__builtin_poll`.
- When `fds` is too small to hold `fd_count` `pollfd`s, Clang will emit a
  compile-time error if possible using `diagnose_if`.
- If this can't be observed until run-time, `__poll_chk` verifies this.
- When `fd_count` is a constant according to `__builtin_constant_p`, this always
  compiles into `__poll_chk` for always-broken calls to `poll`, or
  `__builtin_poll` for always-safe calls to `poll`.

The critical bits to highlight here are on this line:
```c
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
```

And this line:
```c
  return (&poll)(fds, fd_count, timeout);
```

Starting with the simplest, we call `__builtin_poll` with `(&poll)(...);`. As
referenced above, taking the address of an overloaded function where all but one
overload has a `pass_object_size` attribute on one or more parameters always
resolves to the function without any `pass_object_size` attributes.

The other line deserves a section. The subtlety of it is almost entirely in the
use of `pass_object_size(1)` instead of `pass_object_size(0)` on the `fds`
parameter, and the corresponding use of `__builtin_object_size(fds, 1);` in the
body of `poll`.
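
Before getting into `pass_object_size(1)`, it may help to see what the size
comparison in this function amounts to. The sketch below is a hypothetical model
(`demo_pollfds_fit` is not a Bionic function): it treats `(size_t)-1` as "size
unknown" and divides rather than multiplies, so that a huge `fd_count` cannot
overflow the comparison.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_UNKNOWN_SIZE ((size_t)-1)

/* True when a buffer of `bos_fds` bytes can hold `fd_count` pollfd entries. */
bool demo_pollfds_fit(size_t bos_fds, nfds_t fd_count) {
    /* With no size information there is nothing to reject. */
    if (bos_fds == DEMO_UNKNOWN_SIZE) {
        return true;
    }
    /* Compare element counts; dividing avoids overflow in a multiplication. */
    return fd_count <= bos_fds / sizeof(struct pollfd);
}

void demo_check_pollfds_fit(void) {
    assert(demo_pollfds_fit(2 * sizeof(struct pollfd), 2));
    assert(!demo_pollfds_fit(2 * sizeof(struct pollfd), 3));
    assert(demo_pollfds_fit(DEMO_UNKNOWN_SIZE, 1000));
    assert(!demo_pollfds_fit(0, 1));
}
```

A run-time checker in the spirit of `__poll_chk` would abort when a predicate
like this returns false, and otherwise forward to the real `poll`.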

### Subtleties of __builtin_object_size(p, N)

Earlier in this document, it was said that a full description of each
attribute/builtin necessary to power FORTIFY was out of scope. This is... only
somewhat the case when we talk about `__builtin_object_size` and
`pass_object_size`, especially when their second argument is `1`.

#### tl;dr
`__builtin_object_size(p, N)` and `pass_object_size(N)`, where `(N & 1) == 1`,
can only be accurately determined by Clang. LLVM's `@llvm.objectsize` intrinsic
ignores the value of `N & 1`, since handling `(N & 1) == 1` accurately requires
data that's currently entirely inaccessible to LLVM, and that is difficult to
preserve through LLVM's optimization passes.

`pass_object_size`'s "lifting" of the evaluation of
`__builtin_object_size(p, N)` to the caller is critical, since it allows Clang
full visibility into the expression passed to e.g., `poll(&foo->bar, baz, qux)`.
It's not a perfect solution, but it allows `N == 1` to be fully accurate in at
least some cases.

#### Background
Clang's implementation of `__builtin_object_size` aims to be compatible with
GCC's, which has [a decent bit of documentation]. Put simply,
`__builtin_object_size(p, N)` is intended to evaluate at compile-time how many
bytes can be accessed after `p` in a well-defined way. Straightforward examples
of this are:
```c
char buf[8];
assert(__builtin_object_size(buf, N) == 8);
assert(__builtin_object_size(buf + 1, N) == 7);
```

This should hold for all values of N that are valid to pass to
`__builtin_object_size`. The `N` value of `__builtin_object_size` is a mask of
settings.

##### (N & 2) == ?

This is mostly for completeness' sake; in Bionic's FORTIFY implementation, N is
always either 0 or 1.

If there are multiple possible values of `p` in a call to
`__builtin_object_size(p, N)`, the second bit in `N` determines the behavior of
the compiler. If `(N & 2) == 0`, `__builtin_object_size` should return the
greatest size among the possible values of `p`. Otherwise, it should return the
least such size. For example:

```c
char smol_buf[7];
char buf[8];
char *p = rand() ? smol_buf : buf;
assert(__builtin_object_size(p, 0) == 8);
assert(__builtin_object_size(p, 2) == 7);
```

##### (N & 1) == 0

`__builtin_object_size(p, 0)` is more or less as simple as the example in the
Background section directly above. When Clang attempts to evaluate
`__builtin_object_size(p, 0);` and when LLVM tries to determine the result of a
corresponding `@llvm.objectsize` call, they search for the storage underlying
the pointer in question. If that can be determined, Clang or LLVM can provide an
answer; otherwise, they cannot.

##### (N & 1) == 1, and the true magic of pass_object_size

`__builtin_object_size(p, 1)` has a less uniform implementation between LLVM and
Clang. According to GCC's documentation, "If the least significant bit [of
__builtin_object_size's second argument] is clear, objects are whole variables,
if it is set, a closest surrounding subobject is considered the object a pointer
points to."

The "closest surrounding subobject," means that `(N & 1) == 1` depends on type
information in order to operate in many cases. Consider the following examples:
```c
struct Foo {
  int a;
  int b;
};

struct Foo foo;
assert(__builtin_object_size(&foo, 0) == sizeof(foo));
assert(__builtin_object_size(&foo, 1) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 0) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 1) == sizeof(int));

struct Foo foos[2];
assert(__builtin_object_size(&foos[0], 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0], 1) == sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 1) == sizeof(int));
```
717
718...And perhaps somewhat surprisingly:
719```c
720void example(struct Foo *foo) {
721  // (As a reminder, `-1` is "I don't know" when `(N & 2) == 0`.)
722  assert(__builtin_object_size(foo, 0) == -1);
723  assert(__builtin_object_size(foo, 1) == -1);
724  assert(__builtin_object_size(foo->a, 0) == -1);
725  assert(__builtin_object_size(foo->a, 1) == sizeof(int));
726}
727```

In Clang, [this type-aware requirement poses problems for us]: Clang's frontend
knows everything we could possibly want about the types of variables, but
optimizations are only performed by LLVM. LLVM has no reliable source for C or
C++ data types, so calls to `__builtin_object_size(p, N)` that cannot be
resolved by Clang are lowered to the equivalent of
`__builtin_object_size(p, N & ~1)` in LLVM IR.

Moreover, Clang's frontend is the best-equipped part of the compiler to
accurately determine the answer for `__builtin_object_size(p, N)`, given we know
what `p` is. LLVM is the best-equipped part of the compiler to determine the
value of `p`. This ordering issue is unfortunate.

This is where `pass_object_size(N)` comes in. To summarize [the docs for
`pass_object_size`], it evaluates `__builtin_object_size(p, N)` within the
context of the caller of the function annotated with `pass_object_size`, and
passes the value of that into the callee as an invisible parameter. All calls to
`__builtin_object_size(parameter, N)` are substituted with references to this
invisible parameter.

Putting this plainly, Clang's frontend struggles to evaluate the following:
```c
int foo(void *p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // The frontend can't figure this interprocedural objectsize out, so it gets
  // lowered to LLVM, which determines that the answer here is sizeof(k).
  int baz = foo(&k.i);
}
```

However, with the magic of `pass_object_size`, we get one level of inlining to
look through:
```c
int foo(void *const __attribute__((pass_object_size(1))) p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // Due to pass_object_size, this is equivalent to:
  // int baz = foo(&k.i, __builtin_object_size(&k.i, 1));
  // ...and `int foo(void *)` is actually equivalent to:
  // int foo(void *const, size_t size) {
  //   return size;
  // }
  int baz = foo(&k.i);
}
```

So we can obtain an accurate result in this case.

##### What about pass_object_size(0)?
It's sort of tangential, but if you find yourself wondering about the utility of
`pass_object_size(0)` ... it's somewhat split. `pass_object_size(0)` in Bionic's
FORTIFY exists mostly for visual consistency, simplicity, and as a useful way to
have e.g., `&mempcpy` == `&__builtin_mempcpy`.

Outside of these fringe benefits, all of the functions with
`pass_object_size(0)` on parameters are marked with `always_inline`, so
"lifting" the `__builtin_object_size` call isn't ultimately very helpful. In
theory, users can always have something like:

```c
// In some_header.h
// This function does cool and interesting things with the
// `__builtin_object_size` of its parameter, and is able to work with that as
// though the function were defined inline.
void out_of_line_function(void *__attribute__((pass_object_size(0))));
```

Though the author isn't aware of uses like this in practice, beyond a few folks
on LLVM's mailing list seeming interested in trying it someday.
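
For the curious, here is a rough sketch of what such an out-of-line function
could look like. Nothing here comes from Bionic: `zero_at_most`,
`ZERO_AT_MOST`, and `demo_zero_small_buffer` are hypothetical names, and the
`#else` branch is an assumed fallback that spells out the hidden size argument
by hand for compilers without `pass_object_size`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(pass_object_size)
// Clang evaluates __builtin_object_size(dst, 0) at each call site and passes
// it as a hidden argument, so this works even though the function is never
// inlined.
__attribute__((noinline)) size_t
zero_at_most(void *const __attribute__((pass_object_size(0))) dst, size_t n) {
  size_t cap = __builtin_object_size(dst, 0);
  size_t len = n < cap ? n : cap;
  memset(dst, 0, len);
  return len;
}
#define ZERO_AT_MOST(dst, n) zero_at_most((dst), (n))
#else
// Assumed fallback: pass the object size explicitly.
__attribute__((noinline)) size_t
zero_at_most(void *dst, size_t cap, size_t n) {
  size_t len = n < cap ? n : cap;
  memset(dst, 0, len);
  return len;
}
#define ZERO_AT_MOST(dst, n) \
  zero_at_most((dst), __builtin_object_size((dst), 0), (n))
#endif

// Usage: asks for 32 bytes, but only sizeof(buf) bytes get written.
size_t demo_zero_small_buffer(void) {
  char buf[8];
  return ZERO_AT_MOST(buf, 32);
}
```

The callee never sees `buf`'s declaration, yet it can still clamp the write,
because the size was computed where the storage was visible.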

#### Wrapping up
In the (long) section above, two things were covered:
- The use of `(&poll)(...);` is a convenient shorthand for calling
  `__builtin_poll`.
- `__builtin_object_size(p, N)` with `(N & 1) == 1` is not easy for Clang to
  answer accurately, since it relies on type info only available in the
  frontend, and it sometimes relies on optimizations only available in the
  middle-end. `pass_object_size` helps mitigate this.

## Miscellaneous Notes
The above should be a roughly comprehensive view of how FORTIFY works in the
real world. The main thing it fails to mention is the use of
[the `diagnose_as_builtin` attribute] in Clang.

As time has moved on, Clang has increasingly gained support for emitting
warnings that were previously emitted by FORTIFY machinery.
`diagnose_as_builtin` allows us to remove the `diagnose_if`s from some of the
`static inline` overloads of stdlib functions above, so Clang may diagnose them
instead.

Clang's built-in diagnostics are often better than `diagnose_if` diagnostics,
since Clang can format its diagnostics to include e.g., information about the
sizes of buffers in a suspect call to a function. `diagnose_if` can only have
the compiler output constant strings.
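
To make the "constant strings" limitation concrete, here is a small sketch.
`copy_name`, `WARN_IF`, and `demo_copy_name` are hypothetical and not taken
from Bionic; the macro is assumed to expand to nothing on compilers without
`diagnose_if`. The warning text is a fixed literal, so it has no way of
mentioning the sizes involved in the offending call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(diagnose_if)
#define WARN_IF(cond, msg) __attribute__((diagnose_if(cond, msg, "warning")))
#else
#define WARN_IF(cond, msg)
#endif

// The condition may refer to the parameters, but the message is a plain
// string literal: it can't say what `dst_size` actually was.
size_t copy_name(char *dst, size_t dst_size, const char *src)
    WARN_IF(dst_size == 0, "copy_name called with a zero-sized destination");

// Copies as much of `src` as fits, always NUL-terminating. Returns the number
// of characters copied, not counting the terminator.
size_t copy_name(char *dst, size_t dst_size, const char *src) {
  if (dst_size == 0) return 0;
  size_t len = strlen(src);
  if (len >= dst_size) len = dst_size - 1;
  memcpy(dst, src, len);
  dst[len] = '\0';
  return len;
}

// Usage: a well-formed call triggers no diagnostic and truncates to fit.
size_t demo_copy_name(void) {
  char name[4];
  return copy_name(name, sizeof(name), "bionic");
}
```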

[ChromeOS' Glibc patch]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/90fa9b27731db10a6010c7f7c25b24028145b091/sys-libs/glibc/files/local/glibc-2.33/0007-glibc-add-clang-style-FORTIFY.patch
[FORTIFY'ed implementation of `open`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/fcntl.h#41
[FORTIFY'ed version of `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/string.h#45
[a decent bit of documentation]: https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[an implementation for `__mempcpy_chk`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/fortify.cpp#501
[full header implementation of `poll`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/poll.h#43
[incompatible with stricter versions of FORTIFY checking]: https://godbolt.org/z/fGfEYxfnf
[similar to C++11's `std::unique_ptr`]: https://stackoverflow.com/questions/58339165/why-can-a-t-be-passed-in-register-but-a-unique-ptrt-cannot
[source for `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/string.h#55
[the `diagnose_as_builtin` attribute]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#diagnose-as-builtin
[the docs for `pass_object_size`]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#pass-object-size-pass-dynamic-object-size
[this type-aware requirement poses problems for us]: https://github.com/llvm/llvm-project/issues/55742
[unconditionally call `__open_2`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/open.cpp#70