# Emboss User Guide


## Getting Started

First, you must identify a data structure you want to read and write. These are
often documented in hardware manuals a bit like [this one, for the fictional
BN-P-6000404 illuminated button panel](BogoNEL_BN-P-6000404_User_Guide.pdf). We
will use the BN-P-6000404 as an example.


### A Caution

Emboss is still beta software. While we believe that we will not need to make
any more breaking changes before 1.0, you may still encounter bugs and there are
many missing features.

You can contact `[email protected]` with any issues. Emboss is not an
officially supported Google product, but the Emboss authors will try to answer
emails.


### System Requirements

#### Running the Emboss Compiler

The Emboss compiler requires Python 3.8 or later -- the minimum supported
version tracks the support timeline of the Python project. On a Linux-like
system with Python 3 installed in the usual place (`/usr/bin/python3`), you
can run the `embossc` script at the top level on an `.emb` file to generate
C++, like so:

```
embossc --generate cc --output-path path/to/object/dir path/to/input.emb
```

If your project is using Bazel, the `build_defs.bzl` file has an
`emboss_cc_library` rule that you can use from your project.


#### Using the Generated C++ Code

The code generated by Emboss requires a C++11-compliant compiler, and a
reasonably up-to-date standard library. Emboss has been tested with GCC and
Clang, libc++ and libstdc++. In theory, it should work with MSVC, ICC, etc., but
it has not been tested, so there are likely to be bugs.

The generated C++ code lives entirely in a `.h` file, one per `.emb` file. All
of the generated code is in C++ templates or (in a very few cases) `inline`
functions.
The generated code is structured this way in order to implement
"pay-as-you-use" for code size: any functions, methods, or views that are not
used by your code won't end up in your final binary. This is often important
for environments like microcontrollers!

There is an Emboss runtime library (under `runtime/cpp`), which is also
header-only. You will need to add the root of the Emboss source tree to your
`#include` path.

Note that it is *strongly* recommended that you compile your release code with
at least some optimizations: `-Os` or `-O2`. The Emboss generated code leans
fairly heavily on your C++ compiler's inlining and common code elimination to
produce fast, lean compiled code.


#### Contributing to the Compiler

If you want to contribute features or bugfixes to the Emboss compiler itself,
you will need Bazel to run the Emboss test suite.


### Create an `.emb` file

Next, you will need to translate your structures.

```
[$default byte_order: "LittleEndian"]
[(cpp) namespace: "bogonel::bnp6000404"]
```

The BN-P-6000404 uses little-endian numbers, so we can set the default byte
order to `LittleEndian`. There is no particular C++ namespace implied by the
BN-P-6000404 user guide, so we use one that is specific to the BN-P-6000404.

The BN-P-6000404, like many devices with serial interfaces, uses a framed
message system, with a fixed header and a variable message body depending on a
message ID. For the BN-P-6000404, this framing looks like this:

<!-- TODO(bolms): finalize the "magic value initialization" feature, document it
here. -->

```
struct Message:
  -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
  -- user guide.

  0 [+1]  UInt       sync_1
    [requires: this == 0x42]

  1 [+1]  UInt       sync_2
    [requires: this == 0x4E]

  2 [+1]  MessageId  message_id
    -- Type of message

  3 [+1]  UInt       message_length (ml)
    -- Length of message, including header and checksum

  # ... body fields to follow ...
```

We could have chosen to put the header fields into a separate `Header` structure
instead of placing them directly in the `Message` structure.

The `sync_1` and `sync_2` fields are required to have specific magic values, so
we add the appropriate `[requires: ...]` attributes to them. This tells Emboss
that if those fields do not have those values, then the `Message` `struct` is
ill-formed: in the client code, the `Message` will not be `Ok()` if those fields
have the wrong values, and Emboss will not allow wrong values to be written into
those fields using the checked (default) APIs.

Unfortunately, BogoNEL does not provide a nice table of message IDs, but
fortunately there are only a few, so we can gather them from the individual
messages:

```
enum MessageId:
  -- Message type identifiers for the BN-P-6000404.
  IDENTIFICATION       = 0x01
  INTERACTION          = 0x02
  QUERY_IDENTIFICATION = 0x10
  QUERY_BUTTONS        = 0x11
  SET_ILLUMINATION     = 0x12
```

Next, we should translate the individual messages to Emboss.

```
struct Identification:
  -- IDENTIFICATION message, specified in section 5.3.3.

  0 [+4]  UInt       vendor
    # 0x4F474F42 is "BOGO" in ASCII, interpreted as a 4-byte little-endian
    # value.
    [requires: this == 0x4F47_4F42]

  0 [+4]  UInt:8[4]  vendor_ascii
    -- "BOGO" for BogoNEL Corp
    # The `vendor` field really contains the four ASCII characters "BOGO", so we
    # could use a byte array instead of a single UInt. Since it is valid to
    # have overlapping fields, we can have both `vendor` and `vendor_ascii` in
    # our Emboss specification.

  4 [+2]  UInt       firmware_major
    -- Firmware major version

  6 [+2]  UInt       firmware_minor
    -- Firmware minor version
```

<!-- TODO(bolms): fixed-length, ASCIIZ, and variable-length string support? -->

The `Identification` structure is fairly straightforward. In this case, we
provide an alternate view of the `vendor` field via `vendor_ascii`: 0x4F474F42
in little-endian works out to the ASCII characters "BOGO".

Note that `vendor_ascii` uses `UInt:8[4]` for its type, and not `UInt[4]`. For
most fields, we can use plain `UInt` and Emboss will figure out how big the
`UInt` should be, but for an array we must be explicit that we want 8-bit
elements.

```
struct Interaction:
  -- INTERACTION message, specified in section 5.3.4.

  0 [+1]  UInt           number_of_buttons (n)
    -- Number of buttons currently depressed by user

  4 [+n]  ButtonId:8[n]  button_id
    -- ID of pressed button. A number of entries equal to number_of_buttons
    -- will be provided.
```

<!-- TODO(bolms): reserved field support -->

`Interaction` is also fairly straightforward. The only tricky bit is the
`button_id` field: since `Interaction` can return a variable number of button
IDs, depending on how many buttons are currently pressed, the `button_id` field
must have length `n`. It would have been OK to use `[+number_of_buttons]`, but
full field names can get cumbersome, particularly when the length involves a
more complex expression. Instead, we set an *alias* for `number_of_buttons`
using `(n)`, and then use the alias in `button_id`'s length. The `n` alias is
not visible outside of the `Interaction` message, and won't be available in the
generated code, so the short name is not likely to cause confusion.

```
enum ButtonId:
  -- Button IDs, specified in table 5-6.
  BUTTON_A = 0x00
  BUTTON_B = 0x04
  BUTTON_C = 0x08
  BUTTON_D = 0x0C
  BUTTON_E = 0x01
  BUTTON_F = 0x05
  BUTTON_G = 0x09
  BUTTON_H = 0x0D
  BUTTON_I = 0x02
  BUTTON_J = 0x06
  BUTTON_K = 0x0A
  BUTTON_L = 0x0E
  BUTTON_M = 0x03
  BUTTON_N = 0x07
  BUTTON_O = 0x0B
  BUTTON_P = 0x0F
```

We had to prefix all of the button names with `BUTTON_` because Emboss does not
allow single-character enum names.

The QUERY IDENTIFICATION and QUERY BUTTONS messages don't have any fields other
than `checksum`, so we will handle them a bit differently.

```
struct SetIllumination:
  -- SET ILLUMINATION message, specified in section 5.3.7.

  0 [+1]  bits:
    0 [+1]  Flag  red_channel_enable
      -- Enables setting the RED channel.

    1 [+1]  Flag  blue_channel_enable
      -- Enables setting the BLUE channel.

    2 [+1]  Flag  green_channel_enable
      -- Enables setting the GREEN channel.

  1 [+1]  UInt  blink_duty
    -- Sets the proportion of time between time on and time off for blink
    -- feature.
    --
    -- Minimum value = 0 (no illumination)
    --
    -- Maximum value = 240 (constant illumination)
    [requires: 0 <= this <= 240]

  2 [+2]  UInt  blink_period
    -- Sets the blink period, in milliseconds.
    --
    -- Minimum value = 10
    --
    -- Maximum value = 10000
    [requires: 10 <= this <= 10_000]

  4 [+4]  bits:
    0 [+32]  UInt:2[16]  intensity
      -- Intensity values for the unmasked channels. 2 bits of intensity for
      -- each button.
```

`SetIllumination` requires us to use bitfields. The first bitfield is in the
CHANNEL MASK field: rather than making a single `channel_mask` field, Emboss
lets us specify the red, green, and blue channel masks separately.

As with `sync_1` and `sync_2`, we have added `[requires: ...]` to the
`blink_duty` and `blink_period` fields: this time, specifying a range of valid
values.
`[requires: ...]` accepts an arbitrary expression, which can be as
simple or as complex as desired.

It is not clear from BogoNEL's documentation whether "bit 0" means the least
significant or most significant bit of its byte, but a little experimentation
with the device shows that setting the least significant bit causes
`SetIllumination` to set its red channel. Emboss always numbers bits in
bitfields from least significant (bit 0) to most significant.

The other bitfield is the `intensity` array. The BN-P-6000404 uses an array of
2-bit intensity values, so we specify that array.

Finally, we should add all of the sub-messages into `Message`, and also take
care of `checksum`. After making those changes, `Message` looks like:

```
struct Message:
  -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
  -- user guide.

  0    [+1]     UInt             sync_1
    [requires: this == 0x42]

  1    [+1]     UInt             sync_2
    [requires: this == 0x4E]

  2    [+1]     MessageId        message_id
    -- Type of message

  3    [+1]     UInt             message_length (ml)
    -- Length of message, including header and checksum

  if message_id == MessageId.IDENTIFICATION:
    4  [+ml-8]  Identification   identification

  if message_id == MessageId.INTERACTION:
    4  [+ml-8]  Interaction      interaction

  if message_id == MessageId.SET_ILLUMINATION:
    4  [+ml-8]  SetIllumination  set_illumination

  0    [+ml-4]  UInt:8[]         checksummed_bytes

  ml-4 [+4]     UInt             checksum
```

By wrapping the various message types in `if message_id == ...` constructs,
those substructures will only be available when the `message_id` field is set to
the corresponding message type. This kind of selection is used for any
structure field that is only valid some of the time.

The substructures all have the length `ml-8`.
The `ml` is a short alias for the
`message_length` field; these short aliases are available so that the field
types and names don't have to be pushed far to the right. Aliases may only be
used directly in the same structure definition where they are created; they may
not be used elsewhere in an Emboss file, and they are not available in the
generated code. The length is `ml-8` in this case because the `message_length`
includes the 4-byte header and the 4-byte checksum, which are left out of the
substructures.

Note that we simply don't have any subfield for QUERY IDENTIFICATION or QUERY
BUTTONS: since those messages do not have any fields, there is no need for a
zero-byte structure.

We also added the `checksummed_bytes` field as a convenience for computing the
checksum.


### Generate code

Once you have an `.emb`, you will need to generate code from it.

The simplest way to do so is to run the `embossc` tool:

```
embossc -I src --generate cc --output-path generated bogonel.emb
```

The `-I` option adds a directory to the *include path*. The input file -- in
this case, `bogonel.emb` -- must be found somewhere on the include path.

The `--generate` option specifies which back end to use; `cc` is the C++ back
end.

The `--output-path` option specifies where the generated file should be placed.
Note that the output path will include all of the path components of the input
file: if the input file is `x/y/z.emb`, then the path `x/y/z.emb.h` will be
appended to the `--output-path`. Missing directories will be created.


<!-- #### Using Bazel -->

<!-- TODO(bolms): Make this usable from Bazel. -->


### Include the generated C++ code

Emboss generates a single C++ header file from your `.emb` by appending `.h` to
the file name: to use the BogoNEL definitions, you would
`#include "path/to/bogonel.emb.h"` in your C++ code.
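
As a rough illustration of the naming rule described above (the input file's
path, with `.h` appended, is placed under the `--output-path` directory), the
following standalone sketch computes where a generated header should land.
`GeneratedHeaderPath` is a hypothetical helper written for this guide, not part
of Emboss, and it assumes POSIX-style `/` separators.

```c++
#include <string>

// Hypothetical helper (not part of Emboss): mirrors the documented rule that
// the input path plus ".h" is appended to the --output-path directory.
std::string GeneratedHeaderPath(const std::string &output_path,
                                const std::string &input_emb) {
  std::string result = output_path;
  // Add a separator only if the output path does not already end with one.
  if (!result.empty() && result.back() != '/') {
    result += '/';
  }
  return result + input_emb + ".h";
}
```

For the command line shown earlier, `GeneratedHeaderPath("generated",
"bogonel.emb")` yields `generated/bogonel.emb.h`, which is the file your code
would then `#include`.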

Currently, Emboss does not generate a corresponding `.cc` file: the code that
Emboss generates is all templates, which exist in the `.h`. Although the Emboss
maintainers (e.g., bolms@) like the simplicity of generating a single file, this
could change at some point.


### Use the generated C++ code

Emboss generates *views*, which your program can use to read and write existing
arrays of bytes, and which do not take ownership. For example:

```c++
#include <cstdint>

#include "path/to/bogonel.emb.h"

// HandleBrokenMessage(), HandleIdentificationMessage(),
// HandleInteractionMessage(), and Log() are application-defined and are
// assumed to be declared elsewhere.

template <typename View>
bool ChecksumIsCorrect(View message_view);

// Handles BogoNEL BN-P-6000404 device messages from a byte stream. Returns
// the number of bytes that were processed. Unprocessed bytes should be
// passed into the next call.
int HandleBogonelPanelMessages(const char *bytes, int byte_count) {
  auto message_view = bogonel::bnp6000404::MakeMessageView(bytes, byte_count);

  // IsComplete() will return true if the view has enough bytes to fully
  // contain the message; i.e., that byte_count is at least
  // message_view.message_length().Read(), since message_length already
  // includes the header and the checksum.
  if (!message_view.IsComplete()) {
    return 0;
  }

  // If Emboss is happy with the message, we still need to check the checksum:
  // Emboss does not (yet) have support for automatically checking checksums and
  // CRCs.
  if (!message_view.Ok() || !ChecksumIsCorrect(message_view)) {
    // If the message is complete, but not correct, we need to log an error.
    HandleBrokenMessage(message_view);
    return message_view.Size();
  }

  // At this point, we know the message is complete and (basically) OK, so
  // we dispatch it to a message-type-specific handler.
  switch (message_view.message_id().Read()) {
    case bogonel::bnp6000404::MessageId::IDENTIFICATION:
      HandleIdentificationMessage(message_view);
      break;

    case bogonel::bnp6000404::MessageId::INTERACTION:
      HandleInteractionMessage(message_view);
      break;

    case bogonel::bnp6000404::MessageId::QUERY_IDENTIFICATION:
    case bogonel::bnp6000404::MessageId::QUERY_BUTTONS:
    case bogonel::bnp6000404::MessageId::SET_ILLUMINATION:
      Log("Unexpected host to device message type.");
      break;

    default:
      Log("Unknown message type.");
      break;
  }

  return message_view.Size();
}

template <typename View>
bool ChecksumIsCorrect(View message_view) {
  // Sums every byte covered by the `checksummed_bytes` field of `Message`.
  uint32_t checksum = 0;
  for (int i = 0; i < message_view.checksummed_bytes().ElementCount(); ++i) {
    checksum += message_view.checksummed_bytes()[i].Read();
  }
  return checksum == message_view.checksum().Read();
}
```

<!-- TODO(bolms): solidify support for checksums, so that the Ok() call in the
example actually checks them. -->

The `message_view` object in this example is a lightweight object that simply
provides *access* to the bytes in `bytes`. Emboss views are very cheap to
construct because they only contain a couple of pointers and a length -- they do
not copy or take ownership of the underlying bytes. This also means that you
have to keep the underlying bytes alive as long as you are using a view -- you
can't let them go out of scope or delete them.

Views can also be used for writing, if they are given pointers to mutable
memory:

```c++
#include <vector>

void ConstructSetIlluminationMessage(const std::vector<bool> &lit_buttons,
                                     std::vector<char> *result) {
  // The SetIllumination message has a constant size, so SizeInBytes() is
  // available as a static method.
  int length = bogonel::bnp6000404::SetIllumination::SizeInBytes() + 8;
  result->clear();
  result->resize(length);

  auto view = bogonel::bnp6000404::MakeMessageView(result);
  view.sync_1().Write(0x42);
  view.sync_2().Write(0x4E);
  view.message_id().Write(bogonel::bnp6000404::MessageId::SET_ILLUMINATION);
  view.message_length().Write(length);
  view.set_illumination().red_channel_enable().Write(true);
  view.set_illumination().blue_channel_enable().Write(true);
  view.set_illumination().green_channel_enable().Write(true);
  view.set_illumination().blink_duty().Write(240);
  view.set_illumination().blink_period().Write(10000);
  for (int i = 0; i < view.set_illumination().intensity().ElementCount();
       ++i) {
    view.set_illumination().intensity()[i].Write(lit_buttons[i] ? 3 : 0);
  }
}
```


### Use the `.emb` Autoformatter

You can use the `.emb` autoformatter to avoid manual formatting. For now, it is
available at `compiler/front_end/format.py`.

*TODO(bolms): Package the Emboss tools for easy workstation installation.*