# Address Sanitizer

Memory safety is hard to achieve. We, as humans, are bound to make mistakes in
our code. While it may be straightforward to detect memory corruption bugs in
a few lines of code, it becomes quite challenging to find those bugs in a
massive codebase. In such cases, 'Address Sanitizer' may prove to be useful and
could help save time.

[Address Sanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer),
also known as ASan, is a runtime memory debugger designed to find
out-of-bounds accesses and use-after-scope bugs. coreboot has a built-in
Address Sanitizer. Therefore, it is advised to take advantage of this debugging
tool while working on large patches. This helps to ensure code
quality and makes runtime code more robust.

## Types of errors detected
ASan in coreboot catches the following types of memory bugs:

### Stack buffer overflow
Example stack-out-of-bounds:
```c
void foo(void)
{
	int stack_array[5] = {0};
	int i, out;

	/* Bug: i runs up to 9, but the last valid index is 4. */
	for (i = 0; i < 10; i++)
		out = stack_array[i];
}
```
In this example, the array has 5 elements, but it is read beyond the last
valid index, 4.

### Global buffer overflow
Example global-out-of-bounds:
```c
#include <string.h>

char a[] = "I use coreboot";

void foo(void)
{
        char b[] = "proprietary BIOS";

        /* Bug: only 9 bytes remain in a[] from offset 6, but 17 are copied. */
        strcpy(a + 6, b);
}
```
In this example,
> well, you are replacing coreboot with proprietary BIOS. In any case, that's
> an "error".

Let's come to the memory bug. The string 'a' has length 14, so the array holds
15 bytes including the terminating NUL. `strcpy()` writes the 17 bytes of 'b'
starting at offset 6, which runs 8 bytes past the end of 'a'.
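
The size of the overrun follows from the array sizes alone. The sketch below
only illustrates that arithmetic; `overflow_bytes()` and `example_overflow()`
are hypothetical helpers and not part of coreboot.

```c
#include <stddef.h>

/* Same strings as in the example above. */
static const char a[] = "I use coreboot";   /* 14 characters + NUL = 15 bytes */
static const char b[] = "proprietary BIOS"; /* 16 characters + NUL = 17 bytes */

/*
 * Hypothetical helper: number of bytes that strcpy(dst + offset, src) would
 * write past the end of a dst buffer of dst_size bytes, given that src
 * occupies src_size bytes including its terminating NUL.
 */
size_t overflow_bytes(size_t dst_size, size_t offset, size_t src_size)
{
	size_t room = dst_size > offset ? dst_size - offset : 0;

	return src_size > room ? src_size - room : 0;
}

/* For the example: 15 - 6 = 9 bytes of room, 17 - 9 = 8 bytes of overflow. */
size_t example_overflow(void)
{
	return overflow_bytes(sizeof(a), 6, sizeof(b));
}
```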

### Use after scope
Example use-after-scope:
```c
volatile int *p = 0;

void foo(void) {
  {
    int x = 0;
    p = &x;
  }
  /* Bug: x is out of scope here, so p is dangling. */
  *p = 5;
}
```
In this example, the value 5 is written through a dangling pointer. 'p' still
holds the address of 'x', but the lifetime of 'x' ended with its enclosing
block, so that stack slot no longer belongs to 'x'.

## Using ASan

In order to enable ASan on a supported platform,
select `Address sanitizer support` from the `General setup` menu while
configuring coreboot.

Then build coreboot and run the image as usual. If your code contains any of the
above-mentioned memory bugs, ASan will report them in the console log as shown
below:
```text
ASan: <bug type> in <ip>
<access type> of <access size> bytes at addr <access address>
```
where,

`bug type` is either `stack-out-of-bounds`, `global-out-of-bounds` or
`use-after-scope`,

`ip` is the address of the last good instruction before the bad access,

`access type` is either `Read` or `Write`,

`access size` is the number of bytes read or written, and

`access address` is the memory location being accessed when the error
occurs.
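
Because the report has a fixed two-line layout, its fields are easy to extract
mechanically. The following is a minimal sketch assuming exactly the format
shown above; `struct asan_report` and `parse_asan_report()` are hypothetical
names introduced for illustration and do not exist in coreboot.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed ASan report. */
struct asan_report {
	char bug_type[32];	/* e.g. "stack-out-of-bounds" */
	unsigned long ip;	/* <ip> */
	char access_type[8];	/* "Read" or "Write" */
	unsigned long size;	/* <access size> */
	unsigned long addr;	/* <access address> */
};

/* Returns 0 on success, -1 if either line does not match the format. */
int parse_asan_report(const char *line1, const char *line2,
		      struct asan_report *r)
{
	memset(r, 0, sizeof(*r));

	/* %lx accepts the leading "0x" printed in the log. */
	if (sscanf(line1, "ASan: %31s in %lx", r->bug_type, &r->ip) != 2)
		return -1;
	if (sscanf(line2, "%7s of %lu bytes at addr %lx",
		   r->access_type, &r->size, &r->addr) != 3)
		return -1;
	return 0;
}
```

The parsed `ip` is the value that has to be normalized before it can be matched
against the symbols of the stage.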

Next, you have to use `ip` to retrieve the instruction which causes the error.
Since stages in coreboot are relocated, you need to normalize `ip`. For this,
first subtract the start address of the stage from `ip`. Then, read the section
headers from the `<stage>.debug` file to determine the offset of the text
segment, i.e. the VMA of the `.text` section. Add this offset to the difference
you calculated earlier. Let's call the resultant address `ip'`.

Next, read the contents of the symbol table and search for the function with
the highest address that does not exceed `ip'`. This is the function in which
our memory bug is present. Let's denote the address of this function by `ip''`.

Finally, read the assembly contents of the object file where this function is
present. Look for the affected function. Here, the instruction which exists at
the offset `ip' - ip''` corresponds to the address `ip`. Therefore, the very
next instruction is the one which causes the error.
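
The address arithmetic in these steps can be sketched as two small helpers.
This is only an illustration under the assumption of 32-bit addresses, as on
x86; the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* ip': runtime address translated into the address space of <stage>.debug. */
uint32_t normalize_ip(uint32_t ip, uint32_t stage_start, uint32_t text_vma)
{
	return ip - stage_start + text_vma;
}

/* ip' - ip'': offset of the reported instruction inside its function. */
uint32_t offset_in_function(uint32_t ip_norm, uint32_t func_addr)
{
	return ip_norm - func_addr;
}
```

With the values from the ramstage walkthrough later in this document,
`normalize_ip(0x7f7432fd, 0x7f72c000, 0x00e00000)` gives 0x00e172fd, and
`offset_in_function(0x00e172fd, 0x00e1729b)` gives 0x62.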

To see ASan in action, let's take an example. Suppose there is a
stack-out-of-bounds error in cbfs.c that we aren’t aware of and we want ASan
to help us detect it.
```c
int cbfs_boot_region_device(struct region_device *rdev)
{
	int array[5], i;
	boot_device_init();

	for (i = 10; i > 0; i--)
		array[i] = i;

	return vboot_locate_cbfs(rdev) &&
	       fmap_locate_area_as_rdev("COREBOOT", rdev);
}
```
First, we enable ASan from the configuration menu as shown above. Then, we
build coreboot and run the image.

ASan reports the following error in the console log:
```text
ASan: stack-out-of-bounds in 0x7f7432fd
Write of 4 bytes at addr 0x7f7c2ac8
```
Here 0x7f7432fd is `ip`, i.e. the address of the last good instruction before
the bad access. First we have to normalize this address as stated above.
As per the console log, this error happened in ramstage and the stage starts
at 0x7f72c000. So, let’s look at the section headers of ramstage from
`ramstage.debug`.
```text
$ objdump -h build/cbfs/fallback/ramstage.debug

build/cbfs/fallback/ramstage.debug:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00070b20  00e00000  00e00000  00001000  2**12
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .ctors        0000036c  00e70b20  00e70b20  00071b20  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
  2 .data         0001c8f4  00e70e8c  00e70e8c  00071e8c  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
  3 .bss          00012940  00e8d780  00e8d780  0008e780  2**7
                  ALLOC
  4 .heap         00004000  00ea00c0  00ea00c0  0008e780  2**0
                  ALLOC
```
As you can see, the offset of the text segment (the VMA of `.text`) is
0x00e00000. Let's subtract the start address of the stage from `ip` and add
this offset to the difference. The resultant address, i.e. `ip'`, is
0x7f7432fd - 0x7f72c000 + 0x00e00000 = 0x00e172fd.

Next, we read the contents of the symbol table and search for the function with
the highest address that does not exceed 0x00e172fd.
```text
$ nm -n build/cbfs/fallback/ramstage.debug
........
........
00e17116 t _GLOBAL__sub_I_65535_1_gfx_get_init_done
00e17129 t tohex16
00e171db T cbfs_load_and_decompress
00e1729b T cbfs_boot_region_device
00e17387 T cbfs_boot_locate
00e1740d T cbfs_boot_map_with_leak
00e174ef T cbfs_boot_map_optionrom
........
........
```
The symbol that contains 0x00e172fd is `cbfs_boot_region_device` and
its address, i.e. `ip''`, is 0x00e1729b.
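
One way to make this search precise is to pick the last symbol whose address
does not exceed `ip'`. The sketch below does that over the excerpt of the
`nm -n` output shown above. `struct sym`, `find_symbol()` and
`lookup_example()` are hypothetical names, and a real symbol table would also
contain data symbols that need to be filtered out.

```c
#include <stddef.h>
#include <stdint.h>

struct sym {
	uint32_t addr;
	const char *name;
};

/* Excerpt of the `nm -n` output above, already sorted by address. */
static const struct sym symtab[] = {
	{ 0x00e17116, "_GLOBAL__sub_I_65535_1_gfx_get_init_done" },
	{ 0x00e17129, "tohex16" },
	{ 0x00e171db, "cbfs_load_and_decompress" },
	{ 0x00e1729b, "cbfs_boot_region_device" },
	{ 0x00e17387, "cbfs_boot_locate" },
	{ 0x00e1740d, "cbfs_boot_map_with_leak" },
	{ 0x00e174ef, "cbfs_boot_map_optionrom" },
};

/*
 * Return the last symbol whose address is <= addr, or NULL if addr lies
 * below the first symbol. Relies on the table being sorted by address.
 */
const struct sym *find_symbol(const struct sym *tab, size_t n, uint32_t addr)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < n && tab[i].addr <= addr; i++)
		best = &tab[i];
	return best;
}

/* Look up an address in the excerpt above. */
const struct sym *lookup_example(uint32_t addr)
{
	return find_symbol(symtab, sizeof(symtab) / sizeof(symtab[0]), addr);
}
```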

Now, as we know the affected function, let's read the assembly contents of
`cbfs_boot_region_device()` which is present in `cbfs.o` to find the faulty
instruction.
```text
$ objdump -d build/ramstage/lib/cbfs.o
........
........
  51:   e8 fc ff ff ff          call   52 <cbfs_boot_region_device+0x52>
  56:   83 ec 0c                sub    $0xc,%esp
  59:   57                      push   %edi
  5a:   83 ef 04                sub    $0x4,%edi
  5d:   e8 fc ff ff ff          call   5e <cbfs_boot_region_device+0x5e>
  62:   83 c4 10                add    $0x10,%esp
  65:   89 5f 04                mov    %ebx,0x4(%edi)
  68:   4b                      dec    %ebx
  69:   75 eb                   jne    56 <cbfs_boot_region_device+0x56>
........
........
```
Here, we look for the instruction present at the offset 0x62, i.e.
`ip' - ip''` = 0x00e172fd - 0x00e1729b.
The instruction is `add $0x10,%esp` and it corresponds to
`for (i = 10; i > 0; i--)` in our code. It means the very next instruction,
i.e. `mov %ebx,0x4(%edi)`, is the one that causes the error. Now, as we look at
the C code of `cbfs_boot_region_device()` again, we find that this instruction
corresponds to `array[i] = i`.

Voilà! We just caught the memory bug using ASan.

## Supported platforms
Presently, the following architectures support ASan in ramstage:
```{eval-rst}
+------------------+--------------------------------+
| Architecture     | Notes                          |
+==================+================================+
| x86              | Support for all x86 platforms  |
+------------------+--------------------------------+
```

And in romstage ASan is available on the following platforms:
```{eval-rst}
+---------------------+-----------------------------+
| Platform            | Notes                       |
+=====================+=============================+
| QEMU i440-fx        |                             |
+---------------------+-----------------------------+
| Intel Apollo Lake   |                             |
+---------------------+-----------------------------+
| Intel Haswell       |                             |
+---------------------+-----------------------------+
```
Alternatively, you can use `grep` to view the list of platforms that support
ASan in romstage:

    $ git grep "select HAVE_ASAN_IN_ROMSTAGE"

If the x86 platform you are using is not listed here, there is
still a chance that it supports ASan in romstage.

To test it, select `HAVE_ASAN_IN_ROMSTAGE` from the Kconfig file in the
platform's dedicated directory. Then, enable ASan from the config menu as
indicated in the previous section.

If you are able to build coreboot without any errors and boot cleanly, that
means the platform supports ASan in romstage. In that case, please upload a
patch to Gerrit selecting this config option, using the 'ASan' topic. Also, add
the platform name to the table.

However, if you run into compilation errors or a linker error saying that
the cache is full, additional steps need to be taken to enable ASan in
romstage on the platform. Compile errors can usually be resolved easily, so
ASan in romstage still has a good chance of being supported. A full cache,
however, indicates that enabling ASan in romstage is far more work or may even
be impossible.

## Future work
### Heap buffer overflow
Presently, ASan doesn't detect out-of-bounds accesses for objects allocated
on the heap.

To add support for this type of memory bug, you have to make sure that
whenever a block of memory is allocated on the heap, the surrounding areas
(redzones) are poisoned. Correspondingly, these redzones should be unpoisoned
when the memory block is de-allocated.
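
The idea can be sketched with a toy allocator and a toy shadow array. This is
not coreboot's allocator or its ASan runtime: every name below is hypothetical,
the arena is a plain static buffer that is never reused, and the shadow encoding
is simplified to one flag per 8-byte granule, so an overrun that stays within
the last, partially used granule of a block would go unnoticed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE		8u	/* bytes of memory per shadow byte */
#define REDZONE		16u	/* redzone size, a multiple of GRANULE */
#define ARENA_SZ	1024u

static uint8_t arena[ARENA_SZ];			/* toy heap */
static uint8_t shadow[ARENA_SZ / GRANULE];	/* 0 = accessible, 1 = poisoned */
static size_t next_free;			/* bump-allocator cursor */

/* Mark len bytes starting at arena offset off; both are GRANULE-aligned. */
static void set_shadow(size_t off, size_t len, uint8_t val)
{
	memset(&shadow[off / GRANULE], val, len / GRANULE);
}

/* Allocate size bytes (rounded up to GRANULE) with a redzone on each side. */
void *toy_malloc(size_t size)
{
	size_t body = (size + GRANULE - 1) & ~(size_t)(GRANULE - 1);
	size_t start = next_free;

	if (start + body + 2 * REDZONE > ARENA_SZ)
		return NULL;
	set_shadow(start, REDZONE, 1);			/* left redzone */
	set_shadow(start + REDZONE + body, REDZONE, 1);	/* right redzone */
	next_free = start + body + 2 * REDZONE;
	return &arena[start + REDZONE];
}

/* Unpoison the redzones around a block returned by toy_malloc(). */
void toy_free(void *ptr, size_t size)
{
	size_t body = (size + GRANULE - 1) & ~(size_t)(GRANULE - 1);
	size_t start = (size_t)((uint8_t *)ptr - arena) - REDZONE;

	set_shadow(start, REDZONE, 0);
	set_shadow(start + REDZONE + body, REDZONE, 0);
}

/* What an instrumented load or store would check before touching addr. */
int toy_is_poisoned(const void *addr)
{
	return shadow[(size_t)((const uint8_t *)addr - arena) / GRANULE] != 0;
}
```

After `p = toy_malloc(16)`, an access to `p[16]` or `p[-1]` lands in a poisoned
redzone, which is the condition a heap-aware ASan would report.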

### ASan on other architectures
The following points should help when adding support for ASan to other
architectures like ARM or RISC-V:

* Enabling ASan in ramstage on other architectures should be easy. You just
have to make sure the shadow memory is initialized as early as possible when
ramstage is loaded. This can be done by making a function call to `asan_init()`
at the appropriate place.

* For romstage, you have to find out if there is enough room in the cache to fit
the shadow memory region. For this, find the boundary linker symbols for the
region you'd want to run ASan on, excluding the hardware mapped addresses.
Then define a new linker section named `asan_shadow` of size
`(_end - _start) >> 3`, where `_start` and `_end` are the linker symbols you
found earlier. This section should be appended to the region already occupied
by the coreboot program. Now build coreboot. If you don't see any errors while
building with the current translation function, ASan can be enabled on that
platform.

* The shadow region we currently use consumes memory equal to 1/8th of the
program memory. So, if you run into a linker error saying that the memory is
full, you'll have to use a more compact shadow region. In that case, the
translation function could be something like
`shadow = (mem >> 7) | shadow_offset`. Since the stack buffers are protected by
the compiler, you'll also have to create a GCC patch forcing it to use the new
translation function for this particular architecture.

* Once you are sure that the architecture supports ASan in ramstage, select
`HAVE_ASAN_IN_RAMSTAGE` from the Kconfig file of that architecture. Similarly,
if the platform supports ASan in romstage, select `HAVE_ASAN_IN_ROMSTAGE` from
the platform's dedicated Kconfig file.
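
The shadow size and the two translation functions mentioned in the list above
can be written out as follows. This is a sketch only: the helper names and
macros are hypothetical, and the form of the 1/8 mapping,
`(mem >> 3) + shadow_offset`, is an assumption based on the usual ASan scheme
rather than something stated in this document.

```c
#include <stdint.h>

#define SHADOW_SCALE_SHIFT		3	/* 1 shadow byte per 8 bytes */
#define COMPACT_SHADOW_SCALE_SHIFT	7	/* 1 shadow byte per 128 bytes */

/* Size of the asan_shadow section for the region [start, end). */
uintptr_t shadow_size(uintptr_t start, uintptr_t end)
{
	return (end - start) >> SHADOW_SCALE_SHIFT;
}

/* Assumed form of the 1/8 mapping: shadow = (mem >> 3) + shadow_offset. */
uintptr_t mem_to_shadow(uintptr_t mem, uintptr_t shadow_offset)
{
	return (mem >> SHADOW_SCALE_SHIFT) + shadow_offset;
}

/* The more compact mapping from the list: shadow = (mem >> 7) | shadow_offset. */
uintptr_t mem_to_shadow_compact(uintptr_t mem, uintptr_t shadow_offset)
{
	return (mem >> COMPACT_SHADOW_SCALE_SHIFT) | shadow_offset;
}
```

For a 32 KiB region, `shadow_size()` yields 4 KiB, whereas a shift of 7 would
need only 256 bytes of shadow at the cost of coarser tracking.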

### Post-processing script
Unlike Linux, coreboot doesn't have a `%pS` printk format to resolve a pointer
to its symbolic name. Therefore, we normalise the pointer address manually to
determine the name of the affected function and further use it to find the
instruction which causes the error.

A custom script can be written to automate this process.