# `.pte` file format

ExecuTorch `.pte` program files are serialized as modified binary flatbuffer
files with optional data segments appended.

```
             ┌───────────────────────────────────┐
             │Standard flatbuffer header         │
             ├───────────────────────────────────┤
Optional ──> │ExecuTorch extended header         │
             ├───────────────────────────────────┤
             │Flatbuffer-serialized program data │
             │                                   │
             │                                   │
          ┌─ ├───────────────────────────────────┤
          │  │Padding                            │
          │  ├───────────────────────────────────┤
          │  │Segment data                       │
          │  │                                   │
          │  │                                   │
          │  ├───────────────────────────────────┤
          │  │Padding                            │
Optional ─┤  ├───────────────────────────────────┤
          │  │Segment data                       │
          │  │                                   │
          │  │                                   │
          │  ├───────────────────────────────────┤
          │  │Padding                            │
          │  ├───────────────────────────────────┤
          │  │...                                │
          └─ └───────────────────────────────────┘
```

## Compatibility

See the [Runtime Compatibility Policy](
https://github.com/pytorch/executorch/tree/main/runtime/COMPATIBILITY.md) for
details about the compatibility guarantees between the `.pte` format and the
ExecuTorch runtime.

## Headers

Program files can be recognized by the magic string at byte offset 4, beginning
with `ET` and followed by two ASCII decimal digits.

Program files may have an optional extended header at byte offset 8, recognized
by the magic string beginning with `eh` and followed by two ASCII decimal
digits. This header includes the size of the flatbuffer-encoded core program
data, and the starting offset of the segments that may follow the program data.
Note that this header is ExecuTorch-specific, but even when present it does not
upset most flatbuffer-parsing code (apart from the rarely-used
`GetBufferStartFromRootPointer()`).

All numbers are little-endian, regardless of the host system.

Header layout:
```
[0..3] uint32_t byte offset to the beginning of the flatbuffer root table.
[4..7] File magic bytes: "ET" followed by two ASCII decimal digits. The digits
       will change if the binary format of this file is changed in a
       non-backwards-compatible way.
Optional extended header:
|  [8..11] Extended header magic bytes: "eh" followed by two ASCII decimal
|          digits. The digits will change if the binary format of this header is
|          changed in a non-backwards-compatible way.
| [12..15] uint32_t size of this extended header in bytes, including the magic
|          header and this size field. Fields can be added to this header in
|          the future by increasing this size. This size does not include any
|          padding that may follow the header.
| [16..23] uint64_t size of the flatbuffer-encoded program data, starting from
|          byte offset zero above. I.e., it includes these headers.
| [24..31] uint64_t offset (from byte offset zero above) to the start of the
|          first segment, or zero if there are no segments.
|  [32..?] Any zero-padding necessary to preserve the alignment of the data
|          that follows.
End of optional extended header.
```

Example:
```
        Offset to flatbuffer root (0x38)
        |            File magic ("ET??")
        |            |            Extended header magic ("eh??")
        |            |            |            Extended header size (0x18)
        vvvvvvvvvvv  vvvvvvvvvvv  vvvvvvvvvvv  vvvvvvvvvvv
0x0000  38 00 00 00  45 54 3F 3F  65 68 3F 3F  18 00 00 00
0x0010  F0 02 00 00  00 00 00 00  00 10 00 00  00 00 00 00
        ^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^
        |                         Offset to segments (0x1000)
        Size of program flatbuffer data (0x2f0)
```

## Program data

See `//executorch/schema/program.fbs` for the Program flatbuffer schema.

The flatbuffer-encoded program data follows the headers. By embedding the size
of this region in the extended header, clients can read only the program data
without reading in segment data. This is useful because program data typically
sticks around for the lifetime of a model, while the large segment data is often
freeable after model initialization.

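The header layout described above can be decoded with a few fixed-offset
little-endian reads. The following is a minimal Python sketch, not the
ExecuTorch runtime's implementation: `ExtendedHeader`, `has_magic`, and
`parse_pte_header` are hypothetical names introduced here for illustration, and
the digits `12` and `00` in `EXAMPLE` are arbitrary stand-ins for the `??`
placeholders in the example dump.

```python
import struct
from typing import NamedTuple, Optional, Tuple


class ExtendedHeader(NamedTuple):
    header_size: int          # bytes [12..15]: size of the extended header
    program_size: int         # bytes [16..23]: counted from byte offset zero
    segment_base_offset: int  # bytes [24..31]: zero means "no segments"


def has_magic(magic: bytes, prefix: bytes) -> bool:
    """Return True if `magic` is `prefix` plus two ASCII decimal digits."""
    return len(magic) == 4 and magic[:2] == prefix and magic[2:].isdigit()


def parse_pte_header(data: bytes) -> Tuple[int, Optional[ExtendedHeader]]:
    """Parse the leading bytes of a .pte file (hypothetical helper)."""
    if len(data) < 8 or not has_magic(data[4:8], b"ET"):
        raise ValueError("not a .pte program file")
    # "<" selects little-endian with no implicit padding between fields.
    (root_offset,) = struct.unpack_from("<I", data, 0)
    if len(data) < 32 or not has_magic(data[8:12], b"eh"):
        return root_offset, None  # the extended header is optional
    fields = struct.unpack_from("<IQQ", data, 12)
    return root_offset, ExtendedHeader(*fields)


# The example header above, with assumed digits in place of "??".
EXAMPLE = bytes.fromhex(
    "38000000" "45543132" "65683030" "18000000"
    "f002000000000000" "0010000000000000"
)
```

When an extended header is present, a client that only needs the program data
could read the first `program_size` bytes of the file and skip the segments
entirely, since that size is measured from byte offset zero.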

## Segment data

The first segment starts at the offset embedded in the extended header.
Segments are typically aligned to 4096 or some other power of 2 that matches
the target system's memory page size. This makes it easier to use `mmap()`
if desired.

The `Program.segments` array in the program data contains size/offset
information about the segments that optionally follow. Offsets in this array are
relative to the segment offset in the extended header.
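
Because the offsets in `Program.segments` are relative, locating a segment in
the file means adding the segment offset from the extended header. The Python
sketch below illustrates that arithmetic under stated assumptions:
`SegmentEntry` and its `offset`/`size` field names, `segment_file_ranges`, and
`is_page_aligned` are hypothetical and are not taken from `program.fbs`.
Treating a zero base offset with a non-empty segment list as an error is a
choice made for this sketch.

```python
from typing import List, NamedTuple, Tuple


class SegmentEntry(NamedTuple):
    offset: int  # relative to the segment offset in the extended header
    size: int    # segment length in bytes


def segment_file_ranges(
    base: int, segments: List[SegmentEntry]
) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges measured from the start of the file."""
    if base == 0:
        # A zero base offset in the extended header means "no segments".
        if segments:
            raise ValueError("segments listed but no segment base offset")
        return []
    return [(base + s.offset, base + s.offset + s.size) for s in segments]


def is_page_aligned(offset: int, page_size: int = 4096) -> bool:
    """Check that an absolute offset falls on a page boundary."""
    return offset % page_size == 0
```

With the base offset `0x1000` from the header example, an entry at relative
offset 0 would start at absolute file offset `0x1000`.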