# Introduction

In hb-subset, serialization is the process of writing the subsetted font
tables out to actual bytes in the final format. All serialization works
through an object called the serialize context
([hb_serialize_context_t](https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-serialize.hh)).

Internally, the serialize context holds a fixed-size memory buffer. For simple
tables, the final bytes are written into the buffer sequentially to produce
the final serialized bytes.

## Simple Tables

Simple tables are tables that do not use offset graphs.

To write a struct into the serialization context, first call an
allocation method on the context, which requests a writable array of bytes of
a fixed size. If the requested array does not exceed the bounds of the fixed
buffer, the serializer returns a pointer to the next unwritten portion
of the buffer. The struct is then cast onto the returned pointer and values
are written to the struct's fields.

Internally, the serialization context ends up looking like this:

```
+-------+-------+-----+-------+--------------+
| Obj 1 | Obj 2 | ... | Obj N | Unused Space |
+-------+-------+-----+-------+--------------+
```

Here, Obj N is the object currently being written.
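
The allocate-then-cast pattern described above can be sketched with a toy bump allocator. This is only an illustration under stated assumptions: `ToyBuffer`, `ToyRecord`, and `write_record` are hypothetical names that do not exist in HarfBuzz, and the real `hb_serialize_context_t` does considerably more bookkeeping.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy model of a fixed-size serialization buffer; not the HarfBuzz class.
struct ToyBuffer {
  unsigned char *start;
  unsigned char *head;  // Next unwritten byte.
  unsigned char *end;

  ToyBuffer (void *mem, size_t size)
      : start (static_cast<unsigned char *> (mem)), head (start), end (start + size) {}

  // Returns a zeroed slice of `size` bytes, or nullptr if it would not fit.
  void *allocate (size_t size) {
    if (size > static_cast<size_t> (end - head)) return nullptr;
    void *ret = head;
    std::memset (ret, 0, size);
    head += size;
    return ret;
  }

  size_t used () const { return static_cast<size_t> (head - start); }
};

// Hypothetical record made of byte arrays, so it has no padding and no
// alignment requirement, much like a big-endian font table struct.
struct ToyRecord {
  uint8_t version[2];
  uint8_t count[2];
};

// Allocates a ToyRecord at the head of the buffer and fills in its fields.
inline ToyRecord *write_record (ToyBuffer &buf, uint16_t version, uint16_t count) {
  ToyRecord *rec = static_cast<ToyRecord *> (buf.allocate (sizeof (ToyRecord)));
  if (!rec) return nullptr;  // The request would exceed the fixed buffer.
  rec->version[0] = static_cast<uint8_t> (version >> 8);
  rec->version[1] = static_cast<uint8_t> (version & 0xFF);
  rec->count[0] = static_cast<uint8_t> (count >> 8);
  rec->count[1] = static_cast<uint8_t> (count & 0xFF);
  return rec;
}
```

Each successful call to `write_record` appends one object to the buffer, matching the left-to-right layout in the diagram. A request that does not fit leaves the buffer untouched and returns a null pointer.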

## Complex Tables

Complex tables are made up of graphs of objects, where offsets are used
to form the edges of the graphs. Each object is a contiguous slice of bytes
that contains zero or more offsets pointing to other objects.

In this case, the serialization buffer has a different layout:

```
|- in progress objects -|              |--- packed objects --|
+-----------+-----------+--------------+-------+-----+-------+
|  Obj n+2  |  Obj n+1  | Unused Space | Obj n | ... | Obj 0 |
+-----------+-----------+--------------+-------+-----+-------+
|----------------------->              <---------------------|
```

The buffer holds two stacks:

1. In-progress objects are held in a stack that starts at the start of the buffer
   and grows towards the end of the buffer.

2. Packed objects are held in a stack that starts at the end of the buffer
   and grows towards the start of the buffer.

Once the object on the top of the in-progress stack has been fully written,
its bytes are popped from the in-progress stack and copied to the top of
the packed objects stack. In the example above, finalizing Obj n+1
would result in the following state:

```
+---------+--------------+---------+-------+-----+-------+
| Obj n+2 | Unused Space | Obj n+1 | Obj n | ... | Obj 0 |
+---------+--------------+---------+-------+-----+-------+
```

Each packed object is associated with an ID: its zero-based position in the packed
objects stack. In this example, Obj 0 would have an ID of 0.
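
The two-stack layout can be modelled in a few lines. The sketch below is a toy, not the HarfBuzz implementation: `ToyTwoStack` and its members are hypothetical names, and it follows the ID numbering described in this document (the first packed object gets ID 0).

```c++
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model of one buffer holding two stacks that grow towards each other.
class ToyTwoStack {
 public:
  explicit ToyTwoStack (size_t size) : buf_ (size), head_ (0), tail_ (size) {}

  // Starts a new in-progress object at the current head.
  void push () { starts_.push_back (head_); }

  // Appends bytes to the object on top of the in-progress stack.
  bool write (const void *data, size_t len) {
    if (starts_.empty () || len > tail_ - head_) return false;
    if (len) std::memcpy (buf_.data () + head_, data, len);
    head_ += len;
    return true;
  }

  // Moves the top in-progress object onto the packed stack and returns its ID.
  unsigned pop_pack () {
    assert (!starts_.empty ());
    size_t start = starts_.back ();
    starts_.pop_back ();
    size_t len = head_ - start;
    tail_ -= len;
    // The source and destination may overlap when the buffer is nearly full.
    if (len) std::memmove (buf_.data () + tail_, buf_.data () + start, len);
    head_ = start;
    packed_.push_back (tail_);
    return static_cast<unsigned> (packed_.size () - 1);
  }

  size_t head () const { return head_; }
  size_t tail () const { return tail_; }
  size_t packed_offset (unsigned id) const { return packed_.at (id); }
  const unsigned char *data () const { return buf_.data (); }

 private:
  std::vector<unsigned char> buf_;
  size_t head_;                 // End of the in-progress stack.
  size_t tail_;                 // Start of the packed stack.
  std::vector<size_t> starts_;  // Start position of each in-progress object.
  std::vector<size_t> packed_;  // Start position of each packed object, by ID.
};
```

Pushing a parent, pushing a child, and then calling `pop_pack ()` twice packs the child first and the parent second, so the parent ends up in front of the child in the packed region.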

During serialization, offsets that link from one object to another are stored
using object IDs. The serialize context maintains a list of links between
objects. Each link records the parent object ID, the child object ID, the position
of the offset field within the parent object, and the width of the offset.

Links are always added to the current in-progress object, and you can only link to
objects that have been packed and thus have an ID.

### Object De-duplication

An important optimization in packing offset graphs is de-duplicating equivalent objects. If
two or more parent objects point to child objects that are equivalent, the child only needs
to be encoded once and all of the parents can point to the same child. This can significantly
reduce the final size of a serialized graph.

While packing an in-progress object, the serialization context checks whether any existing packed
object is equivalent to the object being packed. Here, equivalence means that the objects have
exactly the same bytes and that all of their links are equivalent. If an equivalent object is found,
the in-progress object is discarded and not copied to the packed object stack. The object ID of
the equivalent object is returned instead, so parent objects will link to the existing
equivalent object.

To find equivalent objects, the serialization context maintains a hashmap from object to the canonical
object ID.
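
As a rough illustration of this lookup, the sketch below keys a hash map on an object's bytes plus its links. All names (`ToyChildLink`, `ToyObject`, `ToyObjectHash`, `ToyPacker`) are hypothetical, and the hash function is a simple stand-in; the way HarfBuzz actually hashes and compares objects may differ.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// A link as seen from the parent: where the offset lives and what it points to.
struct ToyChildLink {
  unsigned position;  // Byte position of the offset field in the parent.
  unsigned width;     // Width of the offset field in bytes.
  unsigned child_id;  // ID of the already packed child object.
  bool operator== (const ToyChildLink &o) const {
    return position == o.position && width == o.width && child_id == o.child_id;
  }
};

struct ToyObject {
  std::string bytes;
  std::vector<ToyChildLink> links;
  // Equivalent means: same bytes and equivalent links.
  bool operator== (const ToyObject &o) const {
    return bytes == o.bytes && links == o.links;
  }
};

struct ToyObjectHash {
  size_t operator() (const ToyObject &obj) const {
    size_t h = std::hash<std::string> () (obj.bytes);
    for (const ToyChildLink &l : obj.links) {
      h = h * 31 + l.position;
      h = h * 31 + l.width;
      h = h * 31 + l.child_id;
    }
    return h;
  }
};

class ToyPacker {
 public:
  // Returns the ID of an equivalent packed object if one exists; otherwise
  // stores the object and returns its newly assigned ID.
  unsigned pack (const ToyObject &obj) {
    auto it = canonical_.find (obj);
    if (it != canonical_.end ()) return it->second;  // Duplicate: reuse the ID.
    unsigned id = static_cast<unsigned> (packed_.size ());
    packed_.push_back (obj);
    canonical_.emplace (obj, id);
    assert (canonical_.size () == packed_.size ());
    return id;
  }

  size_t packed_count () const { return packed_.size (); }

 private:
  std::vector<ToyObject> packed_;
  std::unordered_map<ToyObject, unsigned, ToyObjectHash> canonical_;
};
```

Because children are packed before their parents, two parents with identical bytes only compare equal if their children were themselves de-duplicated to the same IDs, which lets whole equivalent subgraphs collapse from the leaves upward.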

### Link Resolution

Once all objects have been packed, the next step is to assign actual values to all of the offset
fields. Prior to this point, all links in the graph have been recorded using object IDs. For each
link, the resolver computes the offset between the parent and child and writes the offset into
the serialization buffer at the appropriate location.
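
A minimal sketch of this step follows, assuming that offsets are measured from the start of the parent object, that they are stored big-endian (as in OpenType), and that every child is placed after its parent. `ToyLink`, `write_be`, and `resolve_links` are hypothetical names, not HarfBuzz functions.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyLink {
  unsigned parent_id;
  unsigned child_id;
  unsigned position;  // Byte position of the offset field within the parent.
  unsigned width;     // Width of the offset field in bytes.
};

// Writes `value` big-endian into `width` bytes starting at `dst`.
inline void write_be (uint8_t *dst, uint64_t value, unsigned width) {
  assert (width >= 1 && width <= 8);
  for (unsigned i = 0; i < width; i++)
    dst[i] = static_cast<uint8_t> (value >> (8 * (width - 1 - i)));
}

// `object_start[id]` is the byte position of each packed object in `buffer`.
inline void resolve_links (std::vector<uint8_t> &buffer,
                           const std::vector<size_t> &object_start,
                           const std::vector<ToyLink> &links) {
  for (const ToyLink &link : links) {
    size_t parent = object_start[link.parent_id];
    size_t child = object_start[link.child_id];
    assert (child >= parent);  // Assumption: children follow their parents.
    write_be (&buffer[parent + link.position], child - parent, link.width);
  }
}
```

For a 12-byte buffer laid out as a 5-byte root at position 0, two 3-byte children at positions 5 and 8, and a shared 1-byte leaf at position 11, the root's two offset fields resolve to 5 and 8, and the children's leaf offsets resolve to 6 and 3.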

### Offset Overflow Resolution

If, during link resolution, the resolver finds that an offset's value would exceed what can be encoded
in that offset field, link resolution is aborted and the offset overflow resolver is invoked.
That process is documented [here](repacker.md).
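
The overflow condition itself is a simple range check. The helper below is a hypothetical sketch that assumes unsigned offset fields; it is not the check HarfBuzz uses internally.

```c++
#include <cassert>
#include <cstdint>

// Returns true if `offset` can be stored in an unsigned field of `width` bytes.
inline bool offset_fits (uint64_t offset, unsigned width) {
  assert (width >= 1);
  if (width >= 8) return true;  // Every uint64_t value fits in 8 bytes.
  return offset < (uint64_t (1) << (8 * width));
}
```

Under this model a 16-bit offset field can reach a child at most 65535 bytes past the start of its parent, which is why large tables built from 16-bit offsets are the usual source of overflows.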

### Example of Complex Serialization

If we wanted to serialize the following graph:

```
a--b--d
 \   /
   c
```

The serializer would be called like this:

```c++
struct child {
  char name;
  Offset16To<char> leaf;
};

struct root {
  char name;
  Offset16To<child> child_1;
  Offset16To<child> child_2;
};

// The context writes into a caller-provided, fixed-size buffer.
// The buffer size used here is arbitrary.
char buffer[1024];
hb_serialize_context_t serializer (buffer, sizeof (buffer));
hb_serialize_context_t *ctx = &serializer;
ctx->start_serialize<char> ();

// Object A: pushed first, but packed last, after both of its children.
ctx->push ();
root* a = ctx->start_embed<root> ();
ctx->extend_min (a);
a->name = 'a';

// Object B.
ctx->push ();
child* b = ctx->start_embed<child> ();
ctx->extend_min (b);
b->name = 'b';

// Object D, the leaf of B.
ctx->push ();
*ctx->allocate_size<char> (1) = 'd';
unsigned d_id = ctx->pop_pack ();

// B is the current in-progress object again, so the link is added to B.
ctx->add_link (b->leaf, d_id);
unsigned b_id = ctx->pop_pack ();

// Object C.
ctx->push ();
child* c = ctx->start_embed<child> ();
ctx->extend_min (c);
c->name = 'c';

// Object D again, this time as the leaf of C.
ctx->push ();
*ctx->allocate_size<char> (1) = 'd';
d_id = ctx->pop_pack (); // Serializer will automatically de-dup this with the previous 'd'

ctx->add_link (c->leaf, d_id);
unsigned c_id = ctx->pop_pack ();

// Object A's links:
ctx->add_link (a->child_1, b_id);
ctx->add_link (a->child_2, c_id);
ctx->pop_pack ();

// Resolves all recorded links into actual offsets.
ctx->end_serialize ();
```