Decompressor Permissiveness to Invalid Data
===========================================

This document describes the behavior of the reference decompressor in cases
where it accepts formally invalid data instead of reporting an error.

While the reference decompressor *must* decode any compliant frame following
the specification, its ability to detect erroneous data is on a best-effort
basis: the decoder may accept input data that is formally invalid
when doing so poses no risk to the decoder and when detecting it would cost
too much in complexity or speed.

In practice, the vast majority of invalid data are detected, if only because
many corruption events are dangerous for the decoder process (such as
requesting an out-of-bound memory access) and many more are easy to check.

This document lists a few known cases where invalid data was formerly accepted
by the decoder, and what has changed since.


Offset == 0
-----------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 0000 4500 0008 0002 002f 430b ae`
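
To make the example frame easier to read, the following is a minimal Python
sketch that splits it into its frame header and first block header, following
the field layout of the Zstandard format specification. The function name
`parse_example_frame` is hypothetical, and the sketch only handles the simple
header used here (`Frame_Header_Descriptor` equal to `0x00`); it is not a
general frame parser.

```python
ZSTD_MAGIC = 0xFD2FB528
BLOCK_TYPES = ("Raw", "RLE", "Compressed", "Reserved")

# The example frame quoted above (bytes.fromhex skips the spaces).
EXAMPLE_FRAME = bytes.fromhex("28b5 2ffd 0000 4500 0008 0002 002f 430b ae")


def parse_example_frame(data):
    """Split a frame with a minimal header into header fields and first block.

    Hypothetical helper: only Frame_Header_Descriptor == 0x00 is supported,
    i.e. a Window_Descriptor byte and no other optional header fields.
    """
    magic = int.from_bytes(data[0:4], "little")
    if magic != ZSTD_MAGIC:
        raise ValueError("not a Zstandard frame")
    descriptor = data[4]
    if descriptor != 0x00:
        raise NotImplementedError("sketch only handles descriptor 0x00")
    window_descriptor = data[5]

    # Block_Header is 3 bytes, little-endian:
    # bit 0 = Last_Block, bits 1-2 = Block_Type, bits 3-23 = Block_Size.
    block_header = int.from_bytes(data[6:9], "little")
    block_size = block_header >> 3
    return {
        "window_descriptor": window_descriptor,
        "last_block": bool(block_header & 1),
        "block_type": BLOCK_TYPES[(block_header >> 1) & 0b11],
        "block_size": block_size,
        "block": data[9:9 + block_size],
    }


info = parse_example_frame(EXAMPLE_FRAME)
print(info["block_type"], info["block_size"], info["block"].hex())
```

Under these assumptions, the frame holds a single compressed block, and the
invalid sequence is carried in that block's Sequences section.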

If a sequence is decoded with `literals_length = 0` and `offset_value = 3`
while `Repeated_Offset_1 = 1`, the computed offset will be `0`, which is
invalid.
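
The arithmetic behind this case can be sketched as follows. This is an
illustrative Python rendering of the repeat-offset rules in the format
specification, not the reference decoder's code; the function name
`resolve_offset` and the tuple representation of the repeated offsets are
assumptions made for the example.

```python
def resolve_offset(offset_value, literals_length, repeated_offsets):
    """Return the match offset for one sequence (hypothetical helper).

    repeated_offsets is (Repeated_Offset_1, Repeated_Offset_2, Repeated_Offset_3).
    offset_value is expected to be >= 1.
    """
    rep1, rep2, rep3 = repeated_offsets
    if offset_value > 3:
        # Regular offset: stored with a bias of 3.
        return offset_value - 3
    if literals_length != 0:
        # Values 1..3 select Repeated_Offset_1..3 directly.
        return (rep1, rep2, rep3)[offset_value - 1]
    # literals_length == 0 shifts the meaning of the repeat codes by one.
    if offset_value == 1:
        return rep2
    if offset_value == 2:
        return rep3
    # offset_value == 3: Repeated_Offset_1 minus one byte.
    return rep1 - 1


# The case described above: literals_length = 0, offset_value = 3,
# Repeated_Offset_1 = 1 (the other two values are arbitrary here).
print(resolve_offset(3, 0, (1, 4, 8)))
```

With `Repeated_Offset_1 = 1`, the last branch yields `1 - 1 = 0`, which cannot
be a valid match offset.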

The reference decompressor up to v1.5.5 processes this case as if the computed
offset were `1`, including inserting `1` into the repeated offset list.
This prevents the output buffer from remaining uninitialized, thus denying a
potential attack vector from an untrusted source.
However, in the rare case where this scenario is the outcome of a
transmission or storage error, the decoder relies on the checksum to detect
the error.

In newer versions, this case is always detected and reported as a corruption error.
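
The difference between the two behaviors can be sketched in Python as below.
This is only an illustration under stated assumptions: `CorruptionError`,
`decode_rep1_minus_one`, and the `strict` flag are hypothetical names, and the
history update is simplified to the single case discussed in this section.

```python
class CorruptionError(ValueError):
    """Hypothetical stand-in for the decoder's corruption error."""


def decode_rep1_minus_one(repeated_offsets, strict):
    """Handle offset_value = 3 with literals_length = 0 (sketch only).

    strict=False mimics the behavior described for versions up to v1.5.5;
    strict=True mimics the behavior described for newer versions.
    Returns (offset, updated_repeated_offsets).
    """
    rep1, rep2, _rep3 = repeated_offsets
    offset = rep1 - 1
    if offset == 0:
        if strict:
            raise CorruptionError("computed offset is 0")
        # Permissive path: behave as if the computed offset were 1.
        offset = 1
    # The new offset goes to the front of the repeated offset list.
    return offset, (offset, rep1, rep2)


print(decode_rep1_minus_one((1, 4, 8), strict=False))
try:
    decode_rep1_minus_one((1, 4, 8), strict=True)
except CorruptionError as err:
    print("rejected:", err)
```

In the permissive path the match still copies from a valid position, which is
why the output buffer never stays uninitialized.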


Non-zero reserved bits
----------------------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

The Sequences section of each block has a header, and one of its elements is a
byte, which describes the compression mode of each symbol.
This byte contains 2 reserved bits which must be set to zero.

The reference decompressor up to v1.5.5 just ignores these 2 bits.
This behavior has no consequence for the rest of the frame decoding process.

In newer versions, the 2 reserved bits are actively checked for value zero,
and the decoder reports a corruption error if they are not.
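
As an illustration, the byte in question can be decoded as in the Python
sketch below. The bit layout follows the `Symbol_Compression_Modes`
description in the format specification (two bits per symbol type, with the
two lowest bits reserved); the names `parse_symbol_compression_modes`,
`check_reserved`, and `CorruptionError` are hypothetical and do not come from
the reference decoder.

```python
class CorruptionError(ValueError):
    """Hypothetical stand-in for the decoder's corruption error."""


MODE_NAMES = ("Predefined_Mode", "RLE_Mode", "FSE_Compressed_Mode", "Repeat_Mode")


def parse_symbol_compression_modes(byte, check_reserved):
    """Decode the mode byte of a Sequences section header (sketch only).

    check_reserved=False mimics ignoring the reserved bits (up to v1.5.5);
    check_reserved=True mimics rejecting non-zero reserved bits (newer versions).
    """
    reserved = byte & 0b11
    if check_reserved and reserved != 0:
        raise CorruptionError("reserved bits must be zero")
    return {
        "Literals_Lengths_Mode": MODE_NAMES[(byte >> 6) & 0b11],
        "Offsets_Mode": MODE_NAMES[(byte >> 4) & 0b11],
        "Match_Lengths_Mode": MODE_NAMES[(byte >> 2) & 0b11],
    }


# Same modes, with and without the reserved bits set.
print(parse_symbol_compression_modes(0x9C, check_reserved=True))
print(parse_symbol_compression_modes(0x9F, check_reserved=False))
```

Because the reserved bits do not feed into any of the three mode fields,
ignoring them leaves the decoded modes unchanged, which matches the
observation that the older behavior has no effect on the rest of the frame.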