# Tutorial

The simplest use for the STG tools is to extract, store and compare ABI
representations.

This tutorial uses long options throughout. Equivalent short options can be
found in the manual pages for [`stg`](stg.md) and [`stgdiff`](stgdiff.md). Both
tools understand `-` as a shorthand for `/dev/stdout`.

<details>
<summary>Working Example - code and compilation</summary>

This small code sample will be used as a working example. Copy it into a file
called `tree.c`.

```c
struct N {
  struct N * left;
  struct N * right;
  int value;
};

unsigned int count(struct N * tree) {
  return tree ? count(tree->left) + count(tree->right) + 1 : 0;
}

int sum(struct N * tree) {
  return tree ? sum(tree->left) + sum(tree->right) + tree->value : 0;
}
```

Compile it:

```shell
gcc -Wall -Wextra -g -c tree.c -o tree.o
```

</details>

## Extraction from ELF / DWARF

`stg` is the tool for extracting ABI representations, though it can do more
sophisticated things as well. The simplest invocation of `stg` looks something
like this:

```shell
stg --elf library.so --output library.stg
```

Adding the `--annotate` option can be useful, especially if trying to debug ABI
issues or when experimenting with the tools, like now.

If the output consists of just symbols and you get a warning about missing DWARF
information, this means that `library.so` has no DWARF debugging information.
For meaningful results, `stg` should be run on an *unstripped* ELF file which
may require build system adjustments.
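
One rough way to check for DWARF before running `stg` is to look for a
`.debug_info` section in the file. The sketch below assumes that binutils'
`readelf` is installed. The helper name `has_dwarf` is made up for illustration
and is not part of STG, and the check is only a heuristic (it will not notice
debug information kept in separate files, for example).

```shell
# Hypothetical helper (not part of STG): succeeds if the given ELF file has a
# .debug_info section, which suggests it still carries DWARF information.
has_dwarf() {
  readelf --section-headers "$1" 2>/dev/null | grep -q '\.debug_info'
}
```

It could be used as `has_dwarf library.so || echo "no DWARF found"` before
extracting the ABI.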

<details>
<summary>Working Example - ABI extraction</summary>

Run this:

```shell
stg --elf tree.o --annotate --output -
```

And you should get something like this:

<details>
<summary>Output</summary>

```proto
version: 0x00000002
root_id: 0x84ea5130  # interface
pointer_reference {
  id: 0x32b38621
  kind: POINTER
  pointee_type_id: 0xe08efe1a  # struct N
}
primitive {
  id: 0x4585663f
  name: "unsigned int"
  encoding: UNSIGNED_INTEGER
  bytesize: 0x00000004
}
primitive {
  id: 0x6720d32f
  name: "int"
  encoding: SIGNED_INTEGER
  bytesize: 0x00000004
}
member {
  id: 0x35cbdb23
  name: "left"
  type_id: 0x32b38621  # struct N*
}
member {
  id: 0x0b440ffb
  name: "right"
  type_id: 0x32b38621  # struct N*
  offset: 64
}
member {
  id: 0xa06f75d5
  name: "value"
  type_id: 0x6720d32f  # int
  offset: 128
}
struct_union {
  id: 0xe08efe1a
  kind: STRUCT
  name: "N"
  definition {
    bytesize: 24
    member_id: 0x35cbdb23  # struct N* left
    member_id: 0x0b440ffb  # struct N* right
    member_id: 0xa06f75d5  # int value
  }
}
function {
  id: 0x912c02a7
  return_type_id: 0x6720d32f  # int
  parameter_id: 0x32b38621  # struct N*
}
function {
  id: 0xc2779f73
  return_type_id: 0x4585663f  # unsigned int
  parameter_id: 0x32b38621  # struct N*
}
elf_symbol {
  id: 0xbb237197
  name: "count"
  is_defined: true
  symbol_type: FUNCTION
  type_id: 0xc2779f73  # unsigned int(struct N*)
  full_name: "count"
}
elf_symbol {
  id: 0x4fdeca38
  name: "sum"
  is_defined: true
  symbol_type: FUNCTION
  type_id: 0x912c02a7  # int(struct N*)
  full_name: "sum"
}
interface {
  id: 0x84ea5130
  symbol_id: 0xbb237197  # unsigned int count(struct N*)
  symbol_id: 0x4fdeca38  # int sum(struct N*)
}
```

</details>

</details>

## Filtering

One issue when first starting to manage the ABI of a
binary is the wish to
restrict the interface surface to just the necessary minimum. Any superfluous
symbols or type definitions in the ABI representation can result in spurious ABI
differences in reports later on.

When it comes to the symbols exposed, it's common to control symbol
*visibility*. Type definitions can be either exposed in public header files or
hidden in private header files, with perhaps only public forward declarations,
but this does not remove any type definitions in the DWARF information.

STG provides filtering facilities for both symbols and types, for example:

```shell
stg --files '*.h' --elf library.so --output library.stg
```

This will ensure that definitions of any types defined outside any header files,
and perhaps used as opaque pointer handles, are omitted from the ABI
representation. If you separate public and private headers, then use an
appropriate glob pattern that distinguishes the two.

Sets of symbol or file names can be read from a file. In this example, all
symbols whose names begin with `api_`, except those in the `obsolete` file, are
kept.

```shell
stg --symbols 'api_* & ! :obsolete' --elf library.so --output library.stg
```

For historical reasons, the literal filter file format is compatible with
libabigail's symbol list one, but this is subject to change.

```ini
[list]
  # one symbol per line
  foo # comments, whitespace and empty lines are all ignored
  bar

  baz
```

<details>
<summary>Working Example - filtering the ABI</summary>

Let's say that `struct N` is supposed to be an opaque type that user code only
gets pointers to and, additionally, the function `count` should be excluded from
the ABI (perhaps due to an argument over its return type). We can exclude the
definition of `struct N`, along with that of any other types defined in
`tree.c`, using a file filter.
The symbol can be excluded by name.

Run this:

```shell
stg --elf tree.o --files '*.h' --symbols '!count' --output -
```

The result should be something like this:

<details>
<summary>Output</summary>

```proto
version: 0x00000002
root_id: 0x84ea5130
pointer_reference {
  id: 0x26944aa7
  kind: POINTER
  pointee_type_id: 0xb011cc02
}
primitive {
  id: 0x6720d32f
  name: "int"
  encoding: SIGNED_INTEGER
  bytesize: 0x00000004
}
struct_union {
  id: 0xb011cc02
  kind: STRUCT
  name: "N"
}
function {
  id: 0x9425f186
  return_type_id: 0x6720d32f
  parameter_id: 0x26944aa7
}
elf_symbol {
  id: 0x4fdeca38
  name: "sum"
  is_defined: true
  symbol_type: FUNCTION
  type_id: 0x9425f186
  full_name: "sum"
}
interface {
  id: 0x84ea5130
  symbol_id: 0x4fdeca38
}
```

</details>

</details>

## ABI Comparison

`stgdiff` is the tool for comparing ABI representations and reporting
differences, though it has some other, more specialised, uses. The simplest
invocation of `stgdiff` looks something like this:

```shell
stgdiff --stg old/library.stg new/library.stg --output -
```

This will report ABI differences in the default (`small`) format.

<details>
<summary>Working Example - ABI differences - small format</summary>

The function `sum` has a type that depends on `struct N`. Any change to either
might affect the ABI exposed via `sum`. For example, if the type of the `value`
member is changed to `short` and the file is recompiled, STG can detect this
difference.

First rerun the STG extraction, specifying `--output tree-old.stg`. Make the
source code change, recompile and extract the ABI with `--output tree-new.stg`.
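
As a sketch, these steps might be scripted as follows. It assumes GNU `sed` for
the in-place edit (changing `tree.c` by hand works just as well). The wrapper
name `extract_old_and_new` is purely illustrative, and any arguments given to
it are passed to both `stg` runs so that the same filters apply before and
after the change.

```shell
# sed expression for the source change: 'int value;' becomes 'short value;'.
edit='s/int value;/short value;/'

# Illustrative wrapper (not part of STG) around the steps described above.
# Extra arguments, such as filter options, are passed to both stg runs.
extract_old_and_new() {
  stg --elf tree.o "$@" --output tree-old.stg &&  # ABI before the change
  sed --in-place "$edit" tree.c &&                # GNU sed; or edit by hand
  gcc -Wall -Wextra -g -c tree.c -o tree.o &&     # recompile
  stg --elf tree.o "$@" --output tree-new.stg     # ABI after the change
}
```

For example, `extract_old_and_new --symbols '!count'` would keep `count` out of
both ABI representations.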

Then run this:

```shell
stgdiff --stg tree-old.stg tree-new.stg --output -
```

To get this:

```text
type 'struct N' changed
  member changed from 'int value' to 'short int value'
    type changed from 'int' to 'short int'

```

</details>

The `small` format omits parts of the ABI graph which haven't changed.[^1] To
see all impacted nodes, use `--format flat` instead.

[^1]: The similarly named `short` format goes a bit further and will omit and
      summarise certain repetitive differences.

<details>
<summary>Working Example - ABI differences - flat format</summary>

```text
function symbol 'int sum(struct N*)' changed
  type 'int(struct N*)' changed
    parameter 1 type 'struct N*' changed
      pointed-to type 'struct N' changed

type 'struct N' changed
  member 'struct N* left' changed
    type 'struct N*' changed
      pointed-to type 'struct N' changed
  member 'struct N* right' changed
    type 'struct N*' changed
      pointed-to type 'struct N' changed
  member changed from 'int value' to 'short int value'
    type changed from 'int' to 'short int'

```

</details>

And if you really want to see more of the graph structure, use `--format plain`.

<details>
<summary>Working Example - ABI differences - plain format</summary>

```text
function symbol 'int sum(struct N*)' changed
  type 'int(struct N*)' changed
    parameter 1 type 'struct N*' changed
      pointed-to type 'struct N' changed
        member 'struct N* left' changed
          type 'struct N*' changed
            pointed-to type 'struct N' changed
              (being reported)
        member 'struct N* right' changed
          type 'struct N*' changed
            pointed-to type 'struct N' changed
              (being reported)
        member changed from 'int value' to 'short int value'
          type changed from 'int' to 'short int'

```

</details>

Or just use `--format viz` which generates input for
[Graphviz](https://graphviz.org/).

<details>
<summary>Working Example - ABI differences - viz format</summary>

```dot
digraph "ABI diff" {
  "0" [shape=rectangle, label="'interface'"]
  "1" [label="'int sum(struct N*)'"]
  "2" [label="'int(struct N*)'"]
  "3" [label="'struct N*'"]
  "4" [shape=rectangle, label="'struct N'"]
  "5" [label="'struct N* left'"]
  "5" -> "3" [label=""]
  "4" -> "5" [label=""]
  "6" [label="'struct N* right'"]
  "6" -> "3" [label=""]
  "4" -> "6" [label=""]
  "7" [label="'int value' → 'short int value'"]
  "8" [color=red, label="'int' → 'short int'"]
  "7" -> "8" [label=""]
  "4" -> "7" [label=""]
  "3" -> "4" [label="pointed-to"]
  "2" -> "3" [label="parameter 1"]
  "1" -> "2" [label=""]
  "0" -> "1" [label=""]
}
```

</details>
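
To turn `viz` output into an image, it can be piped straight into Graphviz's
`dot` command, since `stgdiff` accepts `-` as its output. This is only a sketch:
it assumes Graphviz is installed, and the wrapper name `render_abi_diff` is
illustrative rather than anything provided by STG.

```shell
# Illustrative wrapper (not part of STG): render an ABI diff as an SVG image.
# Usage: render_abi_diff OLD.stg NEW.stg OUT.svg
render_abi_diff() {
  stgdiff --stg "$1" "$2" --format viz --output - | dot -Tsvg -o "$3"
}
```

With the files from the working example, this might be run as
`render_abi_diff tree-old.stg tree-new.stg tree-diff.svg`.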