xref: /aosp_15_r20/external/leveldb/doc/table_format.md (revision 9507f98c5f32dee4b5f9e4a38cd499f3ff5c4490)
1leveldb File format
2===================
3
4    <beginning_of_file>
5    [data block 1]
6    [data block 2]
7    ...
8    [data block N]
9    [meta block 1]
10    ...
11    [meta block K]
12    [metaindex block]
13    [index block]
14    [Footer]        (fixed size; starts at file_size - sizeof(Footer))
15    <end_of_file>
16
17The file contains internal pointers.  Each such pointer is called
18a BlockHandle and contains the following information:
19
20    offset:   varint64
21    size:     varint64
22
23See [varints](https://developers.google.com/protocol-buffers/docs/encoding#varints)
24for an explanation of varint64 format.
25
261.  The sequence of key/value pairs in the file are stored in sorted
27order and partitioned into a sequence of data blocks.  These blocks
28come one after another at the beginning of the file.  Each data block
29is formatted according to the code in `block_builder.cc`, and then
30optionally compressed.
31
322. After the data blocks we store a bunch of meta blocks.  The
33supported meta block types are described below.  More meta block types
34may be added in the future.  Each meta block is again formatted using
35`block_builder.cc` and then optionally compressed.
36
373. A "metaindex" block.  It contains one entry for every other meta
38block where the key is the name of the meta block and the value is a
39BlockHandle pointing to that meta block.
40
414. An "index" block.  This block contains one entry per data block,
42where the key is a string >= last key in that data block and before
43the first key in the successive data block.  The value is the
44BlockHandle for the data block.
45
465. At the very end of the file is a fixed length footer that contains
47the BlockHandle of the metaindex and index blocks as well as a magic number.
48
49        metaindex_handle: char[p];     // Block handle for metaindex
50        index_handle:     char[q];     // Block handle for index
51        padding:          char[40-p-q];// zeroed bytes to make fixed length
52                                       // (40==2*BlockHandle::kMaxEncodedLength)
53        magic:            fixed64;     // == 0xdb4775248b80fb57 (little-endian)
54
55## "filter" Meta Block
56
57If a `FilterPolicy` was specified when the database was opened, a
58filter block is stored in each table.  The "metaindex" block contains
59an entry that maps from `filter.<N>` to the BlockHandle for the filter
60block where `<N>` is the string returned by the filter policy's
61`Name()` method.
62
63The filter block stores a sequence of filters, where filter i contains
64the output of `FilterPolicy::CreateFilter()` on all keys that are stored
65in a block whose file offset falls within the range
66
67    [ i*base ... (i+1)*base-1 ]
68
69Currently, "base" is 2KB.  So for example, if blocks X and Y start in
70the range `[ 0KB .. 2KB-1 ]`, all of the keys in X and Y will be
71converted to a filter by calling `FilterPolicy::CreateFilter()`, and the
72resulting filter will be stored as the first filter in the filter
73block.
74
75The filter block is formatted as follows:
76
77    [filter 0]
78    [filter 1]
79    [filter 2]
80    ...
81    [filter N-1]
82
83    [offset of filter 0]                  : 4 bytes
84    [offset of filter 1]                  : 4 bytes
85    [offset of filter 2]                  : 4 bytes
86    ...
87    [offset of filter N-1]                : 4 bytes
88
89    [offset of beginning of offset array] : 4 bytes
90    lg(base)                              : 1 byte
91
92The offset array at the end of the filter block allows efficient
93mapping from a data block offset to the corresponding filter.
94
95## "stats" Meta Block
96
97This meta block contains a bunch of stats.  The key is the name
98of the statistic.  The value contains the statistic.
99
100TODO(postrelease): record following stats.
101
102    data size
103    index size
104    key size (uncompressed)
105    value size (uncompressed)
106    number of entries
107    number of data blocks
108