# Design Sketch: Protocol Buffers <=> Emboss Translation

## Overview

There are many tools that operate on Protocol Buffer objects ("Protos").
Providing a way to translate between Protos and Emboss structures would allow
those tools to be used without writing a tedious translation layer.


## Defining an Equivalent Proto `message`

For each Emboss `struct`, `bits`, `enum`, and primitive type, there would need
to be some equivalent Proto encoding -- likely a `message` for each `struct` or
`bits`, a Proto `enum` inside a `message` for each `enum` (see below), and a
Proto primitive type for each Emboss primitive type.

There are two basic ways that the Proto definition could be generated:

1.  Human-Authored `.proto` Definitions:

    This requires more human effort when trying to use Emboss structures as
    Protos, likely approaching the level of effort to just hand-write a
    translation layer. It *might* make it easier to use an existing Proto
    definition.

    It would also require significantly more flexibility, and therefore more
    complexity, in the Emboss compiler.

2.  Emboss Generates a `.proto` File:

    This option is likely to create slightly "unnatural" Proto definitions (see
    below for more details), but requires very little human effort to create a
    translation to a Proto.

    Escape hatches for "partially hand-coded" translations should be
    considered, even if they are not implemented in the first pass at Emboss
    <=> Proto translation.

Because a human always has the option to hand code their own translation, this
document will assume option 2: the Emboss compiler generates a Proto
definition.


### Proto2 vs Proto3

The current state of Google Protocol Buffers is a bit messy, with both "version
2" ("Proto2") and "version 3" ("Proto3") Protocol Buffers. Proto2 and Proto3
can (mostly) freely interoperate -- Proto2 files can import and use messages
from Proto3 files and vice versa -- and both have long-term support guarantees
from Google. Differences between Proto2 and Proto3 are highlighted below: it
is not clear whether Emboss should generate Proto2, Proto3, or both (via a flag
or file-level property).


### Primitive Types

#### `Int`, `UInt`

`Int` and `UInt` can map to Proto's `int32`, `int64`, `uint32`, and `uint64`.
Smaller integers can be extended to the next-largest Proto integer size.
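
The widening rule for `Int` and `UInt` can be sketched as a small helper. This
is only an illustration: `ProtoIntTypeFor` is a hypothetical name, not part of
the Emboss compiler, and it assumes Emboss integers of 1 to 64 bits.

```c++
#include <string>

// Hypothetical helper: picks the smallest Proto integer type that can hold an
// Emboss Int or UInt of the given width.  Assumes 1 <= bits <= 64.
std::string ProtoIntTypeFor(int bits, bool is_signed) {
  const bool fits_in_32 = bits <= 32;
  if (is_signed) return fits_in_32 ? "int32" : "int64";
  return fits_in_32 ? "uint32" : "uint64";
}
```

Under this sketch, a 16-bit `Int` would be emitted as `int32`, and a 48-bit
`UInt` as `uint64`.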


#### `Float`

`Float` maps to Proto's `float` and `double`.


#### `Flag`

`Flag` maps to Proto's `bool`.


#### (Future) Emboss String/Blob Type

A future Emboss string or blob type would translate to Proto's `string` or
`bytes`. It is likely that an Emboss "string" will be `bytes` in Proto, since
Emboss is unlikely to enforce UTF-8 compliance.

Note that Proto (version 2 only?) C++ does not enforce UTF-8 compliance on
`string`, which can lead to crashes when the message is decoded in Python,
Java, or another language that properly enforces string encoding.


### Arrays

Unidimensional arrays map neatly to `repeated` Proto fields.

Multidimensional arrays must be handled with a wrapper `message` at each
dimension after the first.

Because of the way that Proto wire format works (see [Translation Between
Emboss View and Proto Wire Format](#between-emboss-view-and-proto-wire-format),
below), there is a slight technical advantage to wrapping the outermost array
in its own message.
This does make the (Proto) API a bit awkward, but not too
bad:

```c++
auto element = structure.array_field().v(2);
auto nested_element = structure.array_2d_field().v(2).v(1);
```

vs

```c++
auto element = structure.array_field(2);
auto nested_element = structure.array_2d_field(2).v(1);
```


### Conditional Fields

In Proto2, conditional fields map fairly well to the concept of "presence" for
fields. Proto2 allows non-present fields to be read -- returning the default
value for that field -- but this is not an issue for Emboss, which can easily
generate the appropriate <code>has_*field*()</code> calls.

Proto3 does not track existence for primitive types the way that Proto2 does.
The "recommended" workaround is to use standardized wrapper types
(`google.protobuf.FloatValue`, `google.protobuf.Int32Value`, etc.), which
introduce an extra layer. There is a second workaround, related to the slightly
weird way that Proto handles `oneof`: if the primitive field is inside a
`oneof`, then it is *not* always present.
A `oneof` may contain a single
member, so primitive-typed fields could be generated as something like:

```
message Foo {
  oneof field_1_oneof {
    int32 field_1 = 1;
  }
}
```

Note that in Emboss, changing a field from unconditionally present to
conditionally present is (usually) a backwards-compatible change.


### (Future) Emboss Union Construct

An Emboss union construct would be necessary to take advantage of runtime space
savings from using a Proto `oneof`.


### `struct` and `bits`

`struct` and `bits` map neatly to `message`, with few issues.


#### Anonymous `bits`

Anonymous `bits` get "flattened" so that their fields appear to be part of their
enclosing structure. This should be handled reasonably well via treating
read-write virtual fields as members of the `message`, and by suppressing the
"private" fields, such as anonymous `bits`.


#### Proto Field IDs

Proto requires each field to have a unique tag ID.
We propose that, for fields
with a fixed start location, the start location + 1 is used for a default tag
ID: since a change to a field's start location would be a breaking change to the
Emboss definition, it should be reasonably stable. For fields with a variable
start location, virtual fields, or where the programmer wants a specific tag,
the attribute `[(proto) id]` can be used to specify the ID.

The "+ 1" is required since `0` is not a valid Proto tag ID.


### `enum`

The Emboss `enum` construct does not map cleanly to the Proto `enum` construct,
with different issues in Proto2 vs Proto3.

Common to both, the names of Proto `enum` values are hoisted into the same
namespace as the `enum` itself (consistent with C's handling of `enum`),
which means that multiple `enum`s in the same context cannot hold the same value
name. This can be handled -- somewhat awkwardly -- by wrapping the `enum` in a
"namespace" `message`, like:

```
message SomeEnum {
  enum SomeEnum {
    VALUE1 = 1;
    VALUE2 = 2;
  }
}
```

Additionally, Proto `enum` values must fit in an `int32`, whereas Emboss `enum`
values may require up to a `uint64`.
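
A generator would need to detect Emboss `enum` values that cannot be
represented in a Proto `enum`. A minimal sketch of that check, assuming the
compiler holds each value as either a signed or an unsigned 64-bit integer (both
function names are hypothetical):

```c++
#include <cstdint>
#include <limits>

// Hypothetical check for a signed Emboss enum value: true if it fits in the
// int32 range that Proto enum values are limited to.
bool SignedValueFitsInProtoEnum(std::int64_t value) {
  return value >= std::numeric_limits<std::int32_t>::min() &&
         value <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical check for values that are only representable as unsigned
// 64-bit integers.
bool UnsignedValueFitsInProtoEnum(std::uint64_t value) {
  return value <= static_cast<std::uint64_t>(
                      std::numeric_limits<std::int32_t>::max());
}
```

What to do with values that fail the check (reject the `.emb`, or fall back to
an integer field) is a separate design decision.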

Proto2: In Proto2, `enum`s are closed: unknown values are ignored on message
parse, so `enum` fields can never have an unknown value at runtime. Emboss
`enum`s, much like C `enum`s, can hold unknown values.

Proto3: In Proto3, `enum`s are open, like Emboss `enum`s, but every Proto3
`enum` must have a first entry whose value is `0`. In order to avoid
compatibility issues, Emboss should emit a well-known name for the `0` value in
every case. There is a second issue in Proto3: there is no "has" bit for enum
fields, so conditional enum fields have to be wrapped in a `message`.
(TODO(bolms): are Proto3 `enum`s signed, unsigned, or either?)

Thus, for Proto2, `enum`s would produce something like:

```
message SomeEnum {
  enum SomeEnum {
    VALUE1 = 1;
    VALUE2 = 2;
  }
  oneof value_oneof {
    SomeEnum value = 1;
    int64 integer_value = 2;
  }
}
```

which would be included in structures as:

```
message SomeStruct {
  optional SomeEnum some_enum = 1;  // NOT SomeEnum.SomeEnum
}
```

For Proto3, the situation ends up similar:

```
message SomeEnum {
  enum SomeEnum {
    DEFAULT = 0;
    VALUE1 = 1;
    VALUE2 = 2;
  }
  SomeEnum value = 1;
}

message SomeStruct {
  optional SomeEnum some_enum = 1;  // NOT SomeEnum.SomeEnum
}
```


#### `enum` Name Restrictions

Proto enforces a (very slightly) stricter rule for the names of values within
an `enum` than Emboss does: they must not collide *even when translated to
CamelCase*.

For example, Emboss allows:

```
enum Foo:
  BAR_1_1 = 2
  BAR_11  = 11
```

When translated to CamelCase, `BAR_1_1` and `BAR_11` both become `Bar11`, and
thus are not allowed to be part of the same `enum` in Proto.

It may be sufficient to require `.emb` authors to update their `enum`s when
attempting to compile to Proto.


### Bookkeeping Fields

Emboss structures often have "bookkeeping" fields that are either irrelevant to
typical Proto consumers, or place unusual restrictions.

For example, fields which are used to calculate the offset of other fields are
generally not useful to Proto consumers:

```
struct Foo:
  0 [+4]  UInt  header_length (h)
  h [+4]  UInt  first_body_message
```

**These fields would still need to be set correctly when translating *from*
Proto to Emboss.**

Some of the pain could likely be mitigated via a [default
values](#default_values.md) feature, when implemented.

Field-length fields are somewhat trickier:

```
struct Foo:
  0 [+4]  UInt      message_length (m)
  4 [+m]  UInt:8[]  message_bytes
```

In Proto, `message_length` becomes an implicit part of `message_bytes`, since
`message_bytes` knows its own length. For simple cases, as above, we
can likely have the Emboss compiler "just figure it out" and fold
`message_length` into `message_bytes`. For more complex cases, we will
probably need to have explicit annotations (`[(proto) set_length_by: x =
some_expression]`), or just require applications using the Proto side to set
length fields correctly.
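
For the simple length-prefixed case, the Proto-to-Emboss direction amounts to
deriving `message_length` from the size of `message_bytes`. The sketch below
writes that layout by hand rather than through generated Emboss code, and
assumes little-endian byte order, which the example `struct` does not specify;
`WriteFoo` is a hypothetical name.

```c++
#include <cstdint>
#include <vector>

// Hypothetical translation step: builds the bytes of `Foo` from the contents
// of a Proto `bytes` field.  `message_length` is never supplied by the caller;
// it is derived from the payload size.  Little-endian layout is an assumption.
std::vector<std::uint8_t> WriteFoo(
    const std::vector<std::uint8_t>& message_bytes) {
  const std::uint32_t length =
      static_cast<std::uint32_t>(message_bytes.size());
  std::vector<std::uint8_t> out;
  out.reserve(4 + message_bytes.size());
  // Bytes 0..3: message_length, least significant byte first.
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<std::uint8_t>((length >> (8 * i)) & 0xFF));
  }
  // Bytes 4..4+m: message_bytes.
  out.insert(out.end(), message_bytes.begin(), message_bytes.end());
  return out;
}
```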

A similar problem happens with "message type" fields:

```
struct Foo:
  0 [+4]  MessageType  message_type (mt)
  if mt == MessageType.BAR:
    4 [+8]  Bar  bar
  if mt == MessageType.BAZ:
    4 [+16]  Baz  baz
  # ...
```

This will probably be easier to handle with a `union` construct in Emboss.
Again, "complex" cases will probably have to be handled by application code.


## Translation

### Between Emboss View and Proto In-Memory Format

Translation should be relatively straightforward; when going from Emboss to
Proto, the problem is roughly equivalent to serializing a View to text, and for
Proto to Emboss it is roughly equivalent to deserializing a View from text.

One minor difference is that the *deserialization* from Proto must occur in
dependency order, while serialization can happen in any order. In Emboss text
format, *serialization* happens in dependency order, and deserialization happens
in whatever order is specified in the text.

As with deserialization from text, it is possible for the Proto message to
include untranslatable entries (e.g., an Emboss `Int:16` would be stored in a
Proto `int32`; a too-large value in the Proto `message` should be rejected).
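
The `Int:16` case could be handled with a range check of roughly this shape.
This is a sketch only: `TryNarrowToInt16` is a hypothetical name, and real
generated code would presumably report the failure through whatever error
mechanism the translation API ends up using.

```c++
#include <cstdint>

// Hypothetical conversion for a Proto int32 field backing an Emboss Int:16.
// Returns false, leaving *out untouched, if the value is untranslatable.
bool TryNarrowToInt16(std::int32_t proto_value, std::int16_t* out) {
  if (proto_value < INT16_MIN || proto_value > INT16_MAX) return false;
  *out = static_cast<std::int16_t>(proto_value);
  return true;
}
```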


### Between Emboss View and Proto Wire Format

Since the Proto wire format is extremely stable and documented, it would be
possible for Emboss to emit code to directly translate between Emboss structs
and proto wire format.

*Serialization* is relatively straightforward; except for arrays, the code
structure is almost identical to the text serialization code structure.

*Deserialization* is problematic. First and foremost, Proto does not specify an
order in which the fields of a structure will be serialized, so it is entirely
possible for the Emboss view to see a dependent field before its prerequisite
(e.g., have a variable-offset field before the offset specifier field).
Secondly, Proto repeated fields aren't really "arrays"; on the wire, other
fields can appear *in between* elements of repeated fields. For Emboss, this
means that every array in the structure would have to maintain a cursor during
deserialization.

It *may* still be desirable to support serialization without trying to support
deserialization, or to support deserialization for a subset of structures, so
that we can send protos to/from microcontrollers: this would be an alternative
to Nanopb for some cases.
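
To give a sense of how small the serialization side can be, here is a sketch of
emitting a single unsigned integer field in Proto wire format. The function
names are hypothetical and this is not generated Emboss code; it only covers
varint-typed fields (wire type 0) holding unsigned values.

```c++
#include <cstdint>
#include <vector>

// Appends `value` as a Proto base-128 varint: seven bits per byte, least
// significant group first, with the high bit set on every byte but the last.
void AppendVarint(std::uint64_t value, std::vector<std::uint8_t>* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<std::uint8_t>(value | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<std::uint8_t>(value));
}

// Appends one varint-typed field.  The key is (field_number << 3) | wire_type,
// and the wire type for varints is 0.
void AppendVarintField(std::uint32_t field_number, std::uint64_t value,
                       std::vector<std::uint8_t>* out) {
  AppendVarint(static_cast<std::uint64_t>(field_number) << 3, out);
  AppendVarint(value, out);
}
```

Calling `AppendVarintField(1, 150, &buffer)` on an empty buffer yields the three
bytes `08 96 01`.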


### Between Emboss View and [Nanopb](https://github.com/nanopb/nanopb)

In order to translate between Emboss views and Protos on microcontrollers and
other limited-memory devices, it may make sense to generate Emboss <=> Nanopb
code. On top of the standard Proto generator, we would have to implement a
Nanopb options file generator, and translation code.


## Miscellaneous Notes

### Overlays

Emboss was designed with the notion that some backends would need their own
attributes -- for example, the `[(cpp) namespace]` attribute, and here there
are a number of `[(proto)]` attributes.

However, adding back-end-specific attributes still requires changes to be made
directly to the `.emb` file, which may be inconvenient for `.emb`s from third
parties.

Ideally, one could write an "overlay file," like:

```
message Foo
  [(proto) attr = value]

  field
    [(proto) field_attr = value]
```

This is not needed for a first pass at a Proto back end, but should be
considered.


### Generating an `.emb` From a `.proto`

There are cases where it would be useful to generate a microcontroller-friendly
representation of an existing Proto, rather than the other way around.

For most `message`s, it would be relatively straightforward to generate a
`struct`, like:

```
message Foo {
  optional int32 bar = 1;
  optional bool baz = 2;
  optional string qux = 3;
}
```

to:

```
struct Foo:
  0 [+4]  bits:
    0 [+1]  Flag  has_bar
    1 [+1]  Flag  has_baz
    if has_baz:
      2 [+1]  Flag  baz
    3 [+1]  Flag  has_qux

  if has_bar:
    4 [+4]  Int:32  bar

  if has_qux:
    8 [+4]   UInt:32  qux_offset
    12 [+4]  UInt:32  qux_length
    qux_offset [+qux_length]  UInt:8[]  qux
```

The main issue is that it would be difficult to maintain equivalent
backwards-compatibility guarantees to the ones that Proto provides as messages
evolve.

Also note that this format is fairly close to the [Cap'n
Proto](https://capnproto.org/) format.