.. _module-pw_protobuf:

===========
pw_protobuf
===========
.. pigweed-module::
   :name: pw_protobuf

``pw_protobuf`` provides an expressive interface for encoding and decoding
the Protocol Buffer wire format with a lightweight code and data footprint.

.. note::

   The protobuf module is a work in progress. Wire format encoding and decoding
   are supported, though the APIs are not final. C++ code generation exists for
   encoding and decoding, but is not yet optimized for in-memory decoding.

--------
Overview
--------
Unlike protobuf libraries that require protobuf messages to be represented by
in-memory data structures, ``pw_protobuf`` provides a progressive, flexible API
that allows the user to choose the data storage format and tradeoffs most
suitable for their product on top of the implementation.

The API is designed in three layers, which can be freely intermixed with each
other in your code, depending on point-of-use requirements:

1. Message Structures,
2. Per-Field Writers and Readers,
3. Direct Writers and Readers.

This has a few benefits. The primary one is that it allows the core proto
serialization and deserialization libraries to be relatively small.

.. include:: size_report/protobuf_overview

To demonstrate these layers, we use the following protobuf message definition
in the examples:

.. code-block:: protobuf

   message Customer {
     enum Status {
       NEW = 1;
       ACTIVE = 2;
       INACTIVE = 3;
     }
     int32 age = 1;
     string name = 2;
     Status status = 3;
   }

And the following accompanying options file:

.. code-block:: text

   Customer.name max_size:32

.. toctree::
   :maxdepth: 1

   size_report

Message Structures
==================
The highest level API is based around message structures created through C++
code generation, integrated with Pigweed's build system.

.. warning::

   Message structures only support a subset of protobuf field types. Before
   continuing, refer to :ref:`pw_protobuf-message-limitations` to understand
   what types of protobuf messages can and cannot be represented, and whether
   or not message structures are a suitable choice.

For the ``Customer`` message defined above, code generation produces the
following structure:

.. code-block:: c++

   enum class Customer::Status : uint32_t {
     NEW = 1,
     ACTIVE = 2,
     INACTIVE = 3,

     kNew = NEW,
     kActive = ACTIVE,
     kInactive = INACTIVE,
   };

   struct Customer::Message {
     int32_t age;
     pw::InlineString<32> name;
     Customer::Status status;
   };

Which can be encoded with the code:

.. code-block:: c++

   #include "example_protos/customer.pwpb.h"

   pw::Status EncodeCustomer(Customer::StreamEncoder& encoder) {
     // Designated initializers name the members of Customer::Message.
     return encoder.Write({
       .age = 33,
       .name = "Joe Bloggs",
       .status = Customer::Status::INACTIVE
     });
   }

And decoded into a struct with the code:

.. code-block:: c++

   #include "example_protos/customer.pwpb.h"

   pw::Status DecodeCustomer(Customer::StreamDecoder& decoder) {
     Customer::Message customer{};
     PW_TRY(decoder.Read(customer));
     // Read fields from customer
     return pw::OkStatus();
   }

These structures can be moved, copied, and compared with each other for
equality.

The encoder and decoder code is generic and implemented in the core C++ module.
A small overhead for each message type used in your code describes the structure
to the generic encoder and decoders.

Message comparison
------------------
Message structures implement ``operator==`` and ``operator!=`` for equality and
inequality comparisons. However, these can only work on trivial scalar fields
and fixed-size strings within the message. Fields using a callback are not
considered in the comparison.

To check if the equality operator of a generated message covers all fields,
``pw_protobuf`` provides an ``IsTriviallyComparable`` function.

.. code-block:: c++

   template <typename Message>
   constexpr bool IsTriviallyComparable();

For example, given the following protobuf definitions:

.. code-block:: protobuf

   message Point {
     int32 x = 1;
     int32 y = 2;
   }

   message Label {
     Point point = 1;
     string label = 2;
   }

And the accompanying options file:

.. code-block:: text

   Label.label use_callback:true

The ``Point`` message can be fully compared for equality, but ``Label`` cannot.
``Label`` still defines an ``operator==``, but it ignores the ``label`` string.

.. code-block:: c++

   Point::Message one = {.x = 5, .y = 11};
   Point::Message two = {.x = 5, .y = 11};

   static_assert(pw::protobuf::IsTriviallyComparable<Point::Message>());
   ASSERT_EQ(one, two);
   static_assert(!pw::protobuf::IsTriviallyComparable<Label::Message>());

Buffer Sizes
------------
Initializing a ``MemoryEncoder`` requires that you specify the size of the
buffer to encode to. The code generation includes a ``kMaxEncodedSizeBytes``
constant that represents the maximum encoded size of the protobuf message,
excluding the contents of any field values which require a callback.

.. code-block:: c++

   #include "example_protos/customer.pwpb.h"

   std::byte buffer[Customer::kMaxEncodedSizeBytes];
   Customer::MemoryEncoder encoder(buffer);
   const auto status = encoder.Write({
     .age = 22,
     .name = "Wolfgang Bjornson",
     .status = Customer::Status::ACTIVE
   });

   // Always check the encoder status or return values from Write calls.
   if (!status.ok()) {
     PW_LOG_INFO("Failed to encode proto; %s", encoder.status().str());
   }

In the above example, because the ``name`` field has a ``max_size`` specified
in the accompanying options file, ``kMaxEncodedSizeBytes`` includes the maximum
length of the value for that field.
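
As a rough illustration of how such an upper bound comes together, the sketch
below adds up worst-case wire sizes for the example ``Customer`` message by
hand. It is not the code generator's algorithm, and the generated
``kMaxEncodedSizeBytes`` may differ. The names ``VarintSize``, ``KeySize``, and
``kCustomerUpperBound`` are hypothetical, and the per-field worst cases are
assumptions taken from the protobuf wire format rather than from
``pw_protobuf``.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Number of bytes needed to varint-encode `value` (7 payload bits per byte).
   constexpr size_t VarintSize(uint64_t value) {
     size_t size = 1;
     while (value >= 0x80) {
       value >>= 7;
       ++size;
     }
     return size;
   }

   // A field key is the varint (field_number << 3) | wire_type.
   constexpr size_t KeySize(uint32_t field_number) {
     return VarintSize(uint64_t{field_number} << 3);
   }

   // Assumed worst case for a varint-encoded int32 or enum: negative values
   // are sign-extended to 64 bits on the wire, which takes 10 bytes.
   constexpr size_t kMaxInt32Size = VarintSize(UINT64_MAX);

   // From the options file entry `Customer.name max_size:32`.
   constexpr size_t kNameMaxSize = 32;

   // Hypothetical hand-computed bound; not the generated constant.
   constexpr size_t kCustomerUpperBound =
       KeySize(1) + kMaxInt32Size +                            // age
       KeySize(2) + VarintSize(kNameMaxSize) + kNameMaxSize +  // name
       KeySize(3) + kMaxInt32Size;                             // status

   static_assert(kCustomerUpperBound == 56);

Without a ``max_size`` for ``name``, the ``kNameMaxSize`` term has no known
value, which is why callback fields must be accounted for by the caller.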

Where the maximum length of a field value is not known, indicated by the
structure requiring a callback for that field, the constant includes
all relevant overhead and only requires that you add the length of the field
values.

For example if a ``bytes`` field length is not specified in the options file,
but is known to your code (``kMaxImageDataSize`` in this example being a
constant in your own code), you can simply add it to the generated constant:

.. code-block:: c++

   #include "example_protos/store.pwpb.h"

   const std::byte image_data[kMaxImageDataSize] = { ... };

   Store::Message store{};
   // Calling SetEncoder means we must always extend the buffer size.
   store.image_data.SetEncoder([](Store::StreamEncoder& encoder) {
     return encoder.WriteImageData(image_data);
   });

   std::byte buffer[Store::kMaxEncodedSizeBytes + kMaxImageDataSize];
   Store::MemoryEncoder encoder(buffer);
   const auto status = encoder.Write(store);

   // Always check the encoder status or return values from Write calls.
   if (!status.ok()) {
     PW_LOG_INFO("Failed to encode proto; %s", encoder.status().str());
   }

Or when using a variable number of repeated submessages, where the maximum
number is known to your code but not to the proto, you can add the constants
from one message type to another:

.. code-block:: c++

   #include "example_protos/person.pwpb.h"

   Person::Message grandchild{};
   // Calling SetEncoder means we must always extend the buffer size.
   grandchild.grandparent.SetEncoder([](Person::StreamEncoder& encoder) {
     PW_TRY(encoder.GetGrandparentEncoder().Write(maternal_grandma));
     PW_TRY(encoder.GetGrandparentEncoder().Write(maternal_grandpa));
     PW_TRY(encoder.GetGrandparentEncoder().Write(paternal_grandma));
     PW_TRY(encoder.GetGrandparentEncoder().Write(paternal_grandpa));
     return pw::OkStatus();
   });

   std::byte buffer[Person::kMaxEncodedSizeBytes +
                    Grandparent::kMaxEncodedSizeBytes * 4];
   Person::MemoryEncoder encoder(buffer);
   const auto status = encoder.Write(grandchild);

   // Always check the encoder status or return values from Write calls.
   if (!status.ok()) {
     PW_LOG_INFO("Failed to encode proto; %s", encoder.status().str());
   }

.. warning::
  Encoding to a buffer that is insufficiently large will return
  ``Status::ResourceExhausted()`` from ``Write`` calls, and from the
  encoder's ``status()`` call. Always check the status of calls or the encoder,
  as in the case of error, the encoded data will be invalid.

.. _pw_protobuf-message-limitations:

Limitations
-----------
``pw_protobuf``'s message structure API is incomplete. Generally speaking, it is
reasonable to use for basic messages containing simple inline fields (scalar
types, bytes, and strings) and nested messages of similar form. Beyond this,
certain other types of protobuf specifiers can be used, but may be limited in
how and when they are supported. These cases are described below.

If an object representation of protobuf messages is desired, the Pigweed team
recommends using `Nanopb`_, which is fully supported within Pigweed's build and
RPC systems.

Message structures are eventually intended to be replaced with an alternative
object model. See `SEED-0103 <http://pwrev.dev/133971>`_ for additional
information about how message structures came to be and our future plans.

Protobuf versioning
^^^^^^^^^^^^^^^^^^^
Message structures are generated in accordance with ``proto3`` semantics only.
Unset fields default to their zero values. Explicit field presence is supported
through the use of the ``optional`` specifier. Proto2-only keywords such as
``required`` have no effect.

There is limited preliminary support for Protobuf editions, primarily to allow
editions-based proto files to compile. At this time, the only editions feature
supported by the code generator is ``field_presence``.

``oneof`` fields
^^^^^^^^^^^^^^^^
``oneof`` protobuf fields cannot be inlined within a message structure: they
must be encoded and decoded using callbacks.

``optional`` fields
^^^^^^^^^^^^^^^^^^^
Only scalar fields generate a ``std::optional`` wrapper in the resulting message
structure. Optional submessages or variable-length fields
(``string`` / ``bytes``) must be processed through callbacks, requiring manual
tracking of presence depending on whether or not the callback is invoked.

.. _pw_protobuf-per-field-apis:

Per-Field Writers and Readers
=============================
The middle level API is based around typed methods to write and read each
field of the message directly to the final serialized form, again created
through C++ code generation.

Encoding
--------
Given the same message structure, in addition to the ``Write()`` method that
accepts a message structure, the following additional methods are also
generated in the typed ``StreamEncoder`` class.

These are lightweight wrappers around the core implementation, calling the
underlying methods with the correct field numbers and value types, and result
in no additional binary code over correctly using the core implementation.

.. code-block:: c++

   class Customer::StreamEncoder : pw::protobuf::StreamEncoder {
    public:
     // Message Structure Writer.
     pw::Status Write(const Customer::Message&);

     // Per-Field Typed Writers.
     pw::Status WriteAge(int32_t);

     pw::Status WriteName(std::string_view);
     pw::Status WriteName(const char*, size_t);

     pw::Status WriteStatus(Customer::Status);
   };

So the same encoding method could be written as:

.. code-block:: c++

   #include "example_protos/customer.pwpb.h"

   pw::Status EncodeCustomer(Customer::StreamEncoder& encoder) {
     PW_TRY(encoder.WriteAge(33));
     PW_TRY(encoder.WriteName("Joe Bloggs"sv));
     PW_TRY(encoder.WriteStatus(Customer::Status::INACTIVE));
     return pw::OkStatus();
   }

Pigweed's protobuf encoders encode directly to the wire format of a proto rather
than staging information to a mutable data structure. This means any writes of a
value are final, and can't be referenced or modified as a later step in the
encode process.
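
To make "encoding directly to the wire format" more concrete, here is a minimal
standalone sketch that hand-encodes the example ``Customer`` message using only
the standard library. It is not ``pw_protobuf``'s implementation, and
``AppendVarint``, ``AppendKey``, and ``EncodeCustomerByHand`` are hypothetical
helper names. Each field is appended to the output as soon as it is written, so
there is no intermediate structure to revisit afterwards.

.. code-block:: c++

   #include <cstdint>
   #include <string_view>
   #include <vector>

   // Wire types used by this example (values from the protobuf wire format).
   enum class WireType : uint32_t { kVarint = 0, kDelimited = 2 };

   // Appends `value` as a base-128 varint: 7 bits per byte, least-significant
   // group first, with the high bit set on every byte except the last.
   inline void AppendVarint(std::vector<uint8_t>& out, uint64_t value) {
     while (value >= 0x80) {
       out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
       value >>= 7;
     }
     out.push_back(static_cast<uint8_t>(value));
   }

   // Appends a field key: (field_number << 3) | wire_type, as a varint.
   inline void AppendKey(std::vector<uint8_t>& out,
                         uint32_t field_number,
                         WireType type) {
     AppendVarint(out,
                  (uint64_t{field_number} << 3) | static_cast<uint64_t>(type));
   }

   inline std::vector<uint8_t> EncodeCustomerByHand(int32_t age,
                                                    std::string_view name,
                                                    uint32_t status) {
     std::vector<uint8_t> out;

     // int32 age = 1. Negative values are sign-extended to 64 bits.
     AppendKey(out, 1, WireType::kVarint);
     AppendVarint(out, static_cast<uint64_t>(static_cast<int64_t>(age)));

     // string name = 2. Length prefix followed by the raw bytes.
     AppendKey(out, 2, WireType::kDelimited);
     AppendVarint(out, name.size());
     out.insert(out.end(), name.begin(), name.end());

     // Status status = 3. Enums are written as varints.
     AppendKey(out, 3, WireType::kVarint);
     AppendVarint(out, status);

     return out;
   }

With this sketch, ``age = 33`` is emitted as the two bytes ``0x08 0x21`` before
``name`` is even looked at, which mirrors why a value cannot be changed once it
has been written.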

Casting between generated StreamEncoder types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pw_protobuf guarantees that all generated ``StreamEncoder`` classes can be
converted among each other. It's also safe to convert any ``MemoryEncoder`` to
any other ``StreamEncoder``.

This guarantee exists to facilitate usage of protobuf overlays. Protobuf
overlays are protobuf message definitions that deliberately ensure that
fields defined in one message will not conflict with fields defined in other
messages.

For example:

.. code-block:: protobuf

   // The first half of the overlaid message.
   message BaseMessage {
     uint32 length = 1;
     reserved 2;  // Reserved for Overlay
   }

   // OK: The second half of the overlaid message.
   message Overlay {
     reserved 1;  // Reserved for BaseMessage
     uint32 height = 2;
   }

   // OK: A message that overlays and bundles both types together.
   message Both {
     uint32 length = 1;  // Defined independently by BaseMessage
     uint32 height = 2;  // Defined independently by Overlay
   }

   // BAD: Diverges from BaseMessage's definition, and can cause decode
   // errors/corruption.
   message InvalidOverlay {
     fixed32 length = 1;
   }

The ``StreamEncoderCast<>()`` helper template reduces very messy casting into
a much easier to read syntax:

.. code-block:: c++

   #include "pw_protobuf/encoder.h"
   #include "pw_protobuf_test_protos/full_test.pwpb.h"

   // The encoder writes into the buffer, so it must be a mutable ByteSpan.
   Result<ConstByteSpan> EncodeOverlaid(uint32_t height,
                                        uint32_t length,
                                        ByteSpan encode_buffer) {
     BaseMessage::MemoryEncoder base(encode_buffer);

     // Without StreamEncoderCast<>(), this line would be:
     //   Overlay::StreamEncoder& overlay =
     //       *static_cast<Overlay::StreamEncoder*>(
     //           static_cast<pw::protobuf::StreamEncoder*>(&base));
     Overlay::StreamEncoder& overlay =
         StreamEncoderCast<Overlay::StreamEncoder>(base);
     if (!overlay.WriteHeight(height).ok()) {
       return overlay.status();
     }
     if (!base.WriteLength(length).ok()) {
       return base.status();
     }
     return ConstByteSpan(base);
   }

While this use case is somewhat uncommon, it's a core supported use case of
pw_protobuf.

.. warning::

   Using this to convert one stream encoder to another when the messages
   themselves do not safely overlay will result in corrupt protos. Be careful
   when doing this as there's no compile-time way to detect whether or not two
   messages are meant to overlay.

Decoding
--------
For decoding, in addition to the ``Read()`` method that populates a message
structure, the following additional methods are also generated in the typed
``StreamDecoder`` class.

.. code-block:: c++

   class Customer::StreamDecoder : pw::protobuf::StreamDecoder {
    public:
     // Message Structure Reader.
     pw::Status Read(Customer::Message&);

     // Returns the identity of the current field.
     ::pw::Result<Fields> Field();

     // Per-Field Typed Readers.
     pw::Result<int32_t> ReadAge();

     pw::StatusWithSize ReadName(pw::span<char>);
     BytesReader GetNameReader(); // Read name as a stream of bytes.

     pw::Result<Customer::Status> ReadStatus();
   };

Complete and correct decoding requires looping through the fields, so it is
more complex than encoding or using the message structure.

.. code-block:: c++

   pw::Status DecodeCustomer(Customer::StreamDecoder& decoder) {
     int32_t age;
     char name[32];
     Customer::Status customer_status;

     // Tracks the decoder's progress; kept separate from the decoded
     // Customer::Status field value.
     pw::Status status;
     while ((status = decoder.Next()).ok()) {
       switch (decoder.Field().value()) {
         case Customer::Fields::kAge: {
           PW_TRY_ASSIGN(age, decoder.ReadAge());
           break;
         }
         case Customer::Fields::kName: {
           PW_TRY(decoder.ReadName(name));
           break;
         }
         case Customer::Fields::kStatus: {
           PW_TRY_ASSIGN(customer_status, decoder.ReadStatus());
           break;
         }
       }
     }

     // Next() returns OUT_OF_RANGE once the end of the message is reached.
     return status.IsOutOfRange() ? pw::OkStatus() : status;
   }

.. warning:: ``Fields::SNAKE_CASE`` is deprecated. Use ``Fields::kCamelCase``.

   Transitional support for ``Fields::SNAKE_CASE`` will soon only be available by
   explicitly setting the following GN variable in your project:
   ``pw_protobuf_compiler_GENERATE_LEGACY_ENUM_SNAKE_CASE_NAMES=true``

   This support will be removed after downstream projects have been migrated.

.. _module-pw_protobuf-read:

Reading a single field
----------------------
Sometimes, only a single field from a serialized message needs to be read. In
these cases, setting up a decoder and iterating through the message is a lot of
boilerplate. ``pw_protobuf`` generates convenient ``Find*()`` functions for
most fields in a message which handle this for you.

.. code-block:: c++

   pw::Status ReadCustomerData(pw::ConstByteSpan serialized_customer) {
     pw::Result<int32_t> age = Customer::FindAge(serialized_customer);
     if (!age.ok()) {
       return age.status();
     }

     // This will scan the buffer again from the start, which is less efficient
     // than writing a custom decoder loop.
     pw::Result<std::string_view> name = Customer::FindName(serialized_customer);
     if (!name.ok()) {
       return name.status();
     }

     DoStuff(*age, *name);
     return pw::OkStatus();
   }

The ``Find`` APIs also work with streamed data, as shown below.

.. code-block:: c++

   pw::Status ReadCustomerData(pw::stream::Reader& customer_stream) {
     pw::Result<int32_t> age = Customer::FindAge(customer_stream);
     if (!age.ok()) {
       return age.status();
     }

     // This will begin scanning for `name` from the current position of the
     // stream (following the `age` field). If `name` appeared before `age` in
     // the serialized data, it will not be found.
     //
     // Note that unlike with the buffer APIs, stream Find methods copy `string`
     // and `bytes` fields into a user-provided buffer.
     char name[32];
     pw::StatusWithSize sws = Customer::FindName(customer_stream, name);
     if (!sws.ok()) {
       return sws.status();
     }
     if (sws.size() >= sizeof(name)) {
       return pw::Status::OutOfRange();
     }
     name[sws.size()] = '\0';

     DoStuff(*age, name);
     return pw::OkStatus();
   }

``Find`` APIs can also be used to iterate over occurrences of a repeated field.
For example, given the protobuf definition:

.. code-block:: protobuf

   message Samples {
     repeated uint32 raw_values = 1;
   }

The ``raw_values`` field can be iterated over as follows:

.. code-block:: c++

   pw::Status ReadSamples(pw::ConstByteSpan serialized_samples) {
     pw::protobuf::Uint32Finder raw_values =
         Samples::FindRawValues(serialized_samples);
     for (pw::Result<uint32_t> val = raw_values.Next(); val.ok();
          val = raw_values.Next()) {
       ConsumeValue(*val);
     }
     return pw::OkStatus();
   }

.. note::

   Each call to ``Find*()`` linearly scans through the message. If you have to
   read multiple fields, it is more efficient to instantiate your own decoder as
   described above.


Direct Writers and Readers
==========================
The lowest level API is provided by the core C++ implementation, and requires
the caller to provide the correct field number and value types for encoding, or
check the same when decoding.

Encoding
--------
The two fundamental classes are ``MemoryEncoder``, which directly encodes a
proto to an in-memory buffer, and ``StreamEncoder``, which operates on
``pw::stream::Writer`` objects to serialize proto data.

``StreamEncoder`` allows you to encode a proto to something like ``pw::sys_io``
without needing to build the complete message in memory.

To encode the same message we've used in the examples thus far, we would use
the following parts of the core API:

.. code-block:: c++

   class pw::protobuf::StreamEncoder {
    public:
     Status WriteInt32(uint32_t field_number, int32_t);
     Status WriteUint32(uint32_t field_number, uint32_t);

     Status WriteString(uint32_t field_number, std::string_view);
     Status WriteString(uint32_t field_number, const char*, size_t);

     // And many other methods, see pw_protobuf/encoder.h
   };

Encoding the same message requires that we specify the field numbers, which we
can either hardcode or take from the code-generated ``Fields`` enum by casting
the enumerated type.

.. code-block:: c++

   #include "pw_protobuf/encoder.h"
   #include "example_protos/customer.pwpb.h"

   pw::Status EncodeCustomer(pw::protobuf::StreamEncoder& encoder) {
     PW_TRY(encoder.WriteInt32(static_cast<uint32_t>(Customer::Fields::kAge),
                               33));
     PW_TRY(encoder.WriteString(static_cast<uint32_t>(Customer::Fields::kName),
                                "Joe Bloggs"sv));
     PW_TRY(encoder.WriteUint32(
         static_cast<uint32_t>(Customer::Fields::kStatus),
         static_cast<uint32_t>(Customer::Status::INACTIVE)));
     return pw::OkStatus();
   }

Decoding
--------
``StreamDecoder`` reads data from a ``pw::stream::Reader`` and mirrors the API
of the encoders.

To decode the same message we would use the following parts of the core API:

.. code-block:: c++

   class pw::protobuf::StreamDecoder {
    public:
     // Returns the identity of the current field.
     ::pw::Result<uint32_t> FieldNumber();

     Result<int32_t> ReadInt32();
     Result<uint32_t> ReadUint32();

     StatusWithSize ReadString(pw::span<char>);

     // And many other methods, see pw_protobuf/stream_decoder.h
   };

As with the typed per-field API, complete and correct decoding requires looping
through the fields and checking the field numbers, along with casting types.

.. code-block:: c++

   pw::Status DecodeCustomer(pw::protobuf::StreamDecoder& decoder) {
     int32_t age;
     char name[32];
     Customer::Status customer_status;

     // Tracks the decoder's progress; kept separate from the decoded
     // Customer::Status field value.
     pw::Status status;
     while ((status = decoder.Next()).ok()) {
       switch (decoder.FieldNumber().value()) {
         case static_cast<uint32_t>(Customer::Fields::kAge): {
           PW_TRY_ASSIGN(age, decoder.ReadInt32());
           break;
         }
         case static_cast<uint32_t>(Customer::Fields::kName): {
           PW_TRY(decoder.ReadString(name));
           break;
         }
         case static_cast<uint32_t>(Customer::Fields::kStatus): {
           uint32_t status_value;
           PW_TRY_ASSIGN(status_value, decoder.ReadUint32());
           customer_status = static_cast<Customer::Status>(status_value);
           break;
         }
       }
     }

     // Next() returns OUT_OF_RANGE once the end of the message is reached.
     return status.IsOutOfRange() ? pw::OkStatus() : status;
   }
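
For intuition about what such a decode loop does at the byte level, the
following standalone sketch walks a serialized ``Customer`` by hand using only
the standard library. It is an illustration under simplifying assumptions, not
``pw_protobuf``'s decoder: ``ReadVarint``, ``DecodedCustomer``, and
``DecodeCustomerByHand`` are hypothetical names, and only the varint and
length-delimited wire types are handled.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <optional>
   #include <string_view>

   struct DecodedCustomer {
     int32_t age = 0;
     std::string_view name;  // Points into the input buffer.
     uint32_t status = 0;
   };

   // Reads one varint from the front of `data` and advances past it. Returns
   // std::nullopt if the data is truncated or the varint exceeds 10 bytes.
   inline std::optional<uint64_t> ReadVarint(std::string_view& data) {
     uint64_t value = 0;
     for (size_t i = 0; i < data.size() && i < 10; ++i) {
       const uint8_t byte = static_cast<uint8_t>(data[i]);
       value |= static_cast<uint64_t>(byte & 0x7F) << (7 * i);
       if ((byte & 0x80) == 0) {
         data.remove_prefix(i + 1);
         return value;
       }
     }
     return std::nullopt;
   }

   inline std::optional<DecodedCustomer> DecodeCustomerByHand(
       std::string_view data) {
     DecodedCustomer customer;
     while (!data.empty()) {
       // Each field starts with a key: (field_number << 3) | wire_type.
       const std::optional<uint64_t> key = ReadVarint(data);
       if (!key) return std::nullopt;
       const uint64_t field_number = *key >> 3;
       const uint64_t wire_type = *key & 0x7;

       if (wire_type == 0) {  // Varint
         const std::optional<uint64_t> value = ReadVarint(data);
         if (!value) return std::nullopt;
         if (field_number == 1) customer.age = static_cast<int32_t>(*value);
         if (field_number == 3) customer.status = static_cast<uint32_t>(*value);
       } else if (wire_type == 2) {  // Length-delimited
         const std::optional<uint64_t> length = ReadVarint(data);
         if (!length || *length > data.size()) return std::nullopt;
         const size_t size = static_cast<size_t>(*length);
         if (field_number == 2) customer.name = data.substr(0, size);
         data.remove_prefix(size);
       } else {
         return std::nullopt;  // Other wire types are out of scope here.
       }
     }
     return customer;
   }

Fields with unrecognized numbers are read and discarded, so the loop always
advances to the next key. Fields can also arrive in any order, which is why the
decode examples switch on the field number rather than assuming a sequence.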

Find APIs
---------

.. doxygenfile:: pw_protobuf/public/pw_protobuf/find.h


Handling of packages
====================

Package declarations in ``.proto`` files are converted to namespace
declarations. Unlike ``protoc``'s native C++ codegen, pw_protobuf appends an
additional ``::pwpb`` namespace after the user-specified package name: for
example, ``package my.cool.project`` becomes ``namespace
my::cool::project::pwpb``. We emit a different package name than stated, in
order to avoid clashes for projects that link against multiple C++ proto
libraries in the same library.
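
The mapping can be sketched as a small string transformation. This is only an
illustration of the naming rule stated above, not code from the generator;
``PwpbNamespaceFor`` is a hypothetical helper and it assumes a non-empty
package name.

.. code-block:: c++

   #include <string>
   #include <string_view>

   // Converts a non-empty proto package name such as "my.cool.project" into
   // the corresponding C++ namespace, with the extra "pwpb" component.
   inline std::string PwpbNamespaceFor(std::string_view package) {
     std::string result;
     for (const char c : package) {
       if (c == '.') {
         result += "::";
       } else {
         result += c;
       }
     }
     result += "::pwpb";
     return result;
   }

For ``my.cool.project`` this yields ``my::cool::project::pwpb``, so a generated
``Customer`` in that package would be referred to as
``my::cool::project::pwpb::Customer``.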

..
  TODO: b/258832150 - Remove this section, if possible

In some cases, pw_protobuf codegen may encounter external message references
during parsing, where it is unable to resolve the package name of the message.
In these situations, the codegen is instead forced to emit the package name as
``pw::pwpb_xxx::my::cool::project``, where "pwpb_xxx" is the name of some
unspecified private namespace. Users are expected to manually identify the
intended namespace name of that symbol, as described above, and must not rely
on any such private namespaces, even if they appear in codegen output.

-------
Codegen
-------
pw_protobuf codegen integration is supported in GN, Bazel, and CMake.

``pw_protobuf`` only supports targeting ``proto3`` syntax and its semantics.
Files which specify ``syntax = "proto2"`` will be treated as proto3, and any
proto2-specific options will be ignored.

.. admonition:: Protobuf edition support

   `Protobuf editions <https://protobuf.dev/editions/overview>`_ are the future
   of Protobuf versioning, replacing proto2 and proto3 with more granular code
   generation options. ``pw_protobuf`` has some basic support for editions, but
   it is still a work in progress.

This module's codegen is available through the ``*.pwpb`` sub-target of a
``pw_proto_library`` in GN, CMake, and Bazel. See :ref:`pw_protobuf_compiler's
documentation <module-pw_protobuf_compiler>` for more information on build
system integration for pw_protobuf codegen.

Example ``BUILD.gn``:

.. code-block::

   import("//build_overrides/pigweed.gni")

   import("$dir_pw_build/target_types.gni")
   import("$dir_pw_protobuf_compiler/proto.gni")

   # This target controls where the *.pwpb.h headers end up on the include path.
   # In this example, it's at "pet_daycare_protos/client.pwpb.h".
   pw_proto_library("pet_daycare_protos") {
     sources = [
       "pet_daycare_protos/client.proto",
     ]
   }

   pw_source_set("example_client") {
     sources = [ "example_client.cc" ]
     deps = [
       ":pet_daycare_protos.pwpb",
       dir_pw_bytes,
       dir_pw_stream,
     ]
   }

-------------
Configuration
-------------
``pw_protobuf`` supports the following configuration options.

* ``PW_PROTOBUF_CFG_MAX_VARINT_SIZE``:
  When encoding nested messages, the number of bytes to reserve for the varint
  submessage length. Nested messages are limited in size to the maximum value
  that can be varint-encoded into this reserved space.

  The values that can be set, and their corresponding maximum submessage
  lengths, are outlined below.

  +-------------------+----------------------------------------+
  | MAX_VARINT_SIZE   | Maximum submessage length              |
  +===================+========================================+
  | 1 byte            | 127                                    |
  +-------------------+----------------------------------------+
  | 2 bytes           | 16,383 or < 16KiB                      |
  +-------------------+----------------------------------------+
  | 3 bytes           | 2,097,151 or < 2048KiB                 |
  +-------------------+----------------------------------------+
  | 4 bytes (default) | 268,435,455 or < 256MiB                |
  +-------------------+----------------------------------------+
  | 5 bytes           | 4,294,967,295 or < 4GiB (max uint32_t) |
  +-------------------+----------------------------------------+
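
The limits in the table follow from the varint format, in which each byte
carries 7 bits of payload. The sketch below reproduces them; it is a worked
calculation rather than part of ``pw_protobuf``, and ``MaxSubmessageLength`` is
a hypothetical name.

.. code-block:: c++

   #include <cstdint>

   // Largest length that fits in `varint_size_bytes` bytes of varint (valid
   // for 1 to 5 bytes), capped at the maximum uint32_t value.
   constexpr uint64_t MaxSubmessageLength(unsigned varint_size_bytes) {
     const uint64_t max_varint =
         (uint64_t{1} << (7 * varint_size_bytes)) - 1;
     const uint64_t max_uint32 = 0xFFFFFFFFu;
     return max_varint < max_uint32 ? max_varint : max_uint32;
   }

   static_assert(MaxSubmessageLength(1) == 127);
   static_assert(MaxSubmessageLength(2) == 16383);
   static_assert(MaxSubmessageLength(3) == 2097151);
   static_assert(MaxSubmessageLength(4) == 268435455);
   static_assert(MaxSubmessageLength(5) == 4294967295u);

A 5-byte varint could hold 35 bits, so the last row is bounded by the 32-bit
length type instead of by the varint itself.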

Field Options
=============
``pw_protobuf`` supports the following field options for specifying
protocol-level limitations, rather than code generation parameters (although
they do influence code generation):


* ``max_count``:
  Maximum number of entries for repeated fields.

* ``max_size``:
  Maximum size of ``bytes`` or ``string`` fields.

Even though other proto codegen implementations do not respect these field
options, they can still compile protos which use these options. This is
especially useful for host builds using upstream protoc code generation, where
host software can use the reflection API to query for the options and validate
messages comply with the specified limitations.

.. code-block:: text

   import "pw_protobuf_protos/field_options.proto";

   message Demo {
     string size_limited_string = 1 [(pw.protobuf.pwpb).max_size = 16];
   };

Options Files
=============
Code generation can be configured using a separate ``.pwpb_options`` file placed
alongside the relevant ``.proto`` file.

The format of this file is a series of fully qualified field names, or patterns,
followed by one or more options. Lines starting with ``#`` or ``//`` are
comments, and blank lines are ignored.

Example:

.. code-block::

   // Set an option for a specific field.
   fuzzy_friends.Client.visit_dates max_count:16

   // Set options for multiple fields by wildcard matching.
   fuzzy_friends.Pet.* max_size:32

   // Set multiple options in one go.
   fuzzy_friends.Dog.paws max_count:4 fixed_count:true
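
To illustrate the line format described above, here is a minimal sketch of a
line parser. It is not the parser used by the code generator. ``OptionsLine``
and ``ParseOptionsLine`` are hypothetical names, and the sketch assumes every
option is a single ``name:value`` token without embedded whitespace, as in the
example. It does not expand wildcard patterns.

.. code-block:: c++

   #include <cstddef>
   #include <optional>
   #include <sstream>
   #include <string>
   #include <utility>
   #include <vector>

   struct OptionsLine {
     std::string field_pattern;  // e.g. "fuzzy_friends.Pet.*"
     std::vector<std::pair<std::string, std::string>> options;
   };

   // Returns the parsed line, or std::nullopt for blank lines, comments, and
   // lines that do not match the assumed `pattern name:value ...` shape.
   inline std::optional<OptionsLine> ParseOptionsLine(const std::string& line) {
     std::istringstream tokens(line);
     OptionsLine parsed;
     if (!(tokens >> parsed.field_pattern)) {
       return std::nullopt;  // Blank line.
     }
     if (parsed.field_pattern[0] == '#' ||
         parsed.field_pattern.rfind("//", 0) == 0) {
       return std::nullopt;  // Comment line.
     }

     std::string token;
     while (tokens >> token) {
       const std::size_t colon = token.find(':');
       if (colon == std::string::npos) {
         return std::nullopt;
       }
       parsed.options.emplace_back(token.substr(0, colon),
                                   token.substr(colon + 1));
     }
     if (parsed.options.empty()) {
       return std::nullopt;  // A pattern needs at least one option.
     }
     return parsed;
   }

Applied to the last line of the example, this yields the pattern
``fuzzy_friends.Dog.paws`` with the two options ``max_count`` = ``4`` and
``fixed_count`` = ``true``.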

Options files should be listed as ``inputs`` when defining ``pw_proto_library``,
e.g.

.. code-block::

   pw_proto_library("pet_daycare_protos") {
     sources = [
       "pet_daycare_protos/client.proto",
     ]
     inputs = [
       "pet_daycare_protos/client.pwpb_options",
     ]
   }

Valid options are:

* ``max_count``:
  Maximum number of entries for repeated fields. When set, repeated scalar
  fields will use the ``pw::Vector`` container type instead of a callback.

* ``fixed_count``:
  Specified with ``max_count`` to use a fixed length ``std::array`` container
  instead of ``pw::Vector``.

* ``max_size``:
  Maximum size of ``bytes`` or ``string`` fields. When set, ``bytes`` fields use
  ``pw::Vector`` and ``string`` fields use ``pw::InlineString`` instead of a
  callback.

* ``fixed_size``:
  Specified with ``max_size`` to use a fixed length ``std::array`` container
  instead of ``pw::Vector`` for ``bytes`` fields.

* ``use_callback``:
  Replaces the structure member for the field with a callback function even
  where a simpler type could be used. This can be useful to ignore fields, to
  stop decoding of complex structures if certain values are not as expected, or
  to provide special handling for nested messages.

.. admonition:: Rationale

  The choice of a separate options file, over embedding options within the proto
  file, is driven by the need for proto files to be shared across multiple
  contexts.

  A typical product would require the same proto be used on a hardware
  component, running Pigweed; a server-side component, running on a cloud
  platform; and an app component, running on a phone OS.

  While related, each of these will likely have different source projects and
  build systems.

  Were the Pigweed options embedded in the protos, it would be necessary for
  both the cloud platform and phone OS to be able to ``"import pigweed"`` ---
  and equivalently for options relevant to their platforms in the embedded
  software project.

------------------
Message Structures
------------------
903The C++ code generator creates a ``struct Message`` for each protobuf message
904that can hold the set of values encoded by it, following these rules.
905
906* Scalar fields are represented by their appropriate C++ type.
907
908  .. code-block:: protobuf
909
910     message Customer {
911       int32 age = 1;
912       uint32 birth_year = 2;
913       sint64 rating = 3;
914       bool is_active = 4;
915     }
916
917  .. code-block:: c++
918
919     struct Customer::Message {
920       int32_t age;
921       uint32_t birth_year;
922       int64_t rating;
923       bool is_active;
924     };
925
926* Enumerations are represented by a code generated namespaced proto enum.
927
928  .. code-block:: protobuf
929
930     message Award {
931       enum Service {
932         BRONZE = 1;
933         SILVER = 2;
934         GOLD = 3;
935       }
936       Service service = 1;
937     }
938
939  .. code-block:: c++
940
941     enum class Award::Service : uint32_t {
942       BRONZE = 1,
943       SILVER = 2,
944       GOLD = 3,
945
946       kBronze = BRONZE,
947       kSilver = SILVER,
948       kGold = GOLD,
949     };
950
951     struct Award::Message {
952       Award::Service service;
953     };
954
955  Aliases to the enum values are also included in the "constant" style to match
956  your preferred coding style. These aliases have any common prefix to the
957  enumeration values removed, such that:
958
959  .. code-block:: protobuf
960
961     enum Activity {
962       ACTIVITY_CYCLING = 1;
963       ACTIVITY_RUNNING = 2;
964       ACTIVITY_SWIMMING = 3;
965     }
966
967  .. code-block:: c++
968
969     enum class Activity : uint32_t {
970       ACTIVITY_CYCLING = 1,
971       ACTIVITY_RUNNING = 2,
972       ACTIVITY_SWIMMING = 3,
973
974       kCycling = ACTIVITY_CYCLING,
975       kRunning = ACTIVITY_RUNNING,
976       kSwimming = ACTIVITY_SWIMMING,
977     };
978
979
980* Nested messages are represented by their own ``struct Message`` provided that
981  a reference cycle does not exist.
982
983  .. code-block:: protobuf
984
985     message Sale {
986       Customer customer = 1;
987       Product product = 2;
988     }
989
990  .. code-block:: c++
991
992     struct Sale::Message {
993       Customer::Message customer;
994       Product::Message product;
995     };
996
997* Optional scalar fields are represented by the appropriate C++ type wrapped in
998  ``std::optional``. Optional fields are not encoded when the value is not
999  present.
1000
1001  .. code-block:: protobuf
1002
1003     message Loyalty {
1004       optional int32 points = 1;
1005     }
1006
1007  .. code-block:: c++
1008
1009     struct Loyalty::Message {
1010       std::optional<int32_t> points;
1011     };
1012
1013* Repeated scalar fields are represented by ``pw::Vector`` when the
1014  ``max_count`` option is set for that field, or by ``std::array`` when both
1015  ``max_count`` and ``fixed_count:true`` are set.
1016
1017  The max count is exposed as an UpperCamelCase constant ``k{FieldName}MaxSize``.
1018
1019  .. code-block:: protobuf
1020
1021     message Register {
1022       repeated int32 cash_in = 1;
1023       repeated int32 cash_out = 2;
1024     }
1025
1026  .. code-block:: text
1027
1028     Register.cash_in max_count:32 fixed_count:true
1029     Register.cash_out max_count:64
1030
1031  .. code-block:: c++
1032
1033     namespace Register {
1034       static constexpr size_t kCashInMaxSize = 32;
1035       static constexpr size_t kCashOutMaxSize = 64;
1036     }
1037
1038     struct Register::Message {
1039       std::array<int32_t, kCashInMaxSize> cash_in;
1040       pw::Vector<int32_t, kCashOutMaxSize> cash_out;
1041     };
1042
1043* `bytes` fields are represented by ``pw::Vector`` when the ``max_size`` option
1044  is set for that field, or by ``std::array`` when both ``max_size`` and
1045  ``fixed_size:true`` are set.
1046
1047  The max size is exposed as an UpperCamelCase constant ``k{FieldName}MaxSize``.
1048
1049  .. code-block:: protobuf
1050
1051     message Product {
1052       bytes sku = 1;
1053       bytes serial_number = 2;
1054     }
1055
1056  .. code-block:: text
1057
1058     Product.sku max_size:8 fixed_size:true
1059     Product.serial_number max_size:64
1060
1061  .. code-block:: c++
1062
1063     namespace Product {
1064       static constexpr size_t kSkuMaxSize = 8;
1065       static constexpr size_t kSerialNumberMaxSize = 64;
1066     }
1067
1068     struct Product::Message {
1069       std::array<std::byte, kSkuMaxSize> sku;
1070       pw::Vector<std::byte, kSerialNumberMaxSize> serial_number;
1071     };
1072
1073* `string` fields are represented by a :cpp:type:`pw::InlineString` when the
1074  ``max_size`` option is set for that field. The string can hold up to
1075  ``max_size`` characters, and is always null terminated. The null terminator is
1076  not counted in ``max_size``.
1077
1078  The max size is exposed as an UpperCamelCase constant ``k{FieldName}MaxSize``.
1079
1080  .. code-block:: protobuf
1081
1082     message Employee {
1083       string name = 1;
1084     }
1085
1086  .. code-block:: text
1087
1088     Employee.name max_size:128
1089
1090  .. code-block:: c++
1091
1092     namespace Employee {
1093       static constexpr size_t kNameMaxSize = 128;
1094     }
1095
1096     struct Employee::Message {
1097       pw::InlineString<kNameMaxSize> name;
1098     };
1099
* Nested messages with a dependency cycle, repeated scalar fields without a
  ``max_count`` option set, `bytes` and `string` fields without a ``max_size``
  option set, and repeated nested messages, repeated `bytes`, and repeated
  `string` fields, are represented by a callback.
1104
1105  You set the callback to a custom function for encoding or decoding
1106  before passing the structure to ``Write()`` or ``Read()`` appropriately.
1107
1108  .. code-block:: protobuf
1109
1110     message Store {
1111       Store nearest_store = 1;
1112       repeated int32 employee_numbers = 2;
1113       string directions = 3;
1114       repeated string address = 4;
1115       repeated Employee employees = 5;
1116     }
1117
1118  .. code-block::
1119
1120     // No options set.
1121
1122  .. code-block:: c++
1123
1124     struct Store::Message {
1125       pw::protobuf::Callback<Store::StreamEncoder, Store::StreamDecoder> nearest_store;
1126       pw::protobuf::Callback<Store::StreamEncoder, Store::StreamDecoder> employee_numbers;
1127       pw::protobuf::Callback<Store::StreamEncoder, Store::StreamDecoder> directions;
1128       pw::protobuf::Callback<Store::StreamEncoder, Store::StreamDecoder> address;
1129       pw::protobuf::Callback<Store::StreamEncoder, Store::StreamDecoder> employees;
1130     };
1131
1132  A Callback object can be converted to a ``bool`` indicating whether a callback
1133  is set.
1134
1135* Fields defined within a ``oneof`` group are represented by a ``OneOf``
1136  callback.
1137
1138  .. code-block:: protobuf
1139
1140     message OnlineOrder {
1141       Product product = 1;
1142       Customer customer = 2;
1143
1144       oneof delivery {
1145         Address shipping_address = 3;
1146         Date pickup_date = 4;
1147       }
1148     }
1149
1150  .. code-block::
1151
1152     // No options set.
1153
1154  .. code-block:: c++
1155
1156     struct OnlineOrder::Message {
1157       Product::Message product;
1158       Customer::Message customer;
1159       pw::protobuf::OneOf<OnlineOrder::StreamEncoder,
1160                           OnlineOrder::StreamDecoder,
1161                           OnlineOrder::Fields>
1162         delivery;
1163     };
1164
1165  Encoding a ``oneof`` field is identical to using a regular field callback.
1166  The encode callback will be invoked once when the message is written. Users
1167  must ensure that only a single field is written to the encoder within the
1168  callback.
1169
1170  .. code-block:: c++
1171
1172     OnlineOrder::Message message;
1173     message.delivery.SetEncoder(
1174         [&pickup_date](OnlineOrder::StreamEncoder& encoder) {
1175           return encoder.GetPickupDateEncoder().Write(pickup_date);
1176         });
1177
1178  The ``OneOf`` decoder callback is invoked when reading a message structure
1179  when a field within the ``oneof`` group is encountered. The found field is
1180  passed to the callback.
1181
1182  If multiple fields from the ``oneof`` group are encountered within a ``Read``,
1183  it will fail with a ``DATA_LOSS`` status.
1184
1185  .. code-block:: c++
1186
1187     OnlineOrder::Message message;
1188     message.delivery.SetDecoder(
1189         [this](OnlineOrder::Fields field, OnlineOrder::StreamDecoder& decoder) {
1190           switch (field) {
1191             case OnlineOrder::Fields::kShippingAddress:
1192               PW_TRY(decoder.GetShippingAddressDecoder().Read(&this->shipping_address));
1193               break;
1194             case OnlineOrder::Fields::kPickupDate:
1195               PW_TRY(decoder.GetPickupDateDecoder().Read(&this->pickup_date));
1196               break;
1197             default:
1198               return pw::Status::DataLoss();
1199           }
1200
1201           return pw::OkStatus();
1202         });
1203
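The alias naming rule described above for enumerations can be sketched in
standalone C++. This is only an illustration of the rule as documented; the
helper names ``CommonPrefixLength`` and ``ConstantAlias`` are hypothetical, and
the real code generator may compute the aliases differently.

.. code-block:: c++

   #include <cassert>
   #include <cctype>
   #include <cstddef>
   #include <string>
   #include <vector>

   // Hypothetical helper: length of the longest prefix shared by every name,
   // trimmed back so that only whole underscore-delimited words are removed.
   std::size_t CommonPrefixLength(const std::vector<std::string>& names) {
     if (names.size() < 2) {
       return 0;
     }
     std::size_t common = names[0].size();
     for (const std::string& name : names) {
       std::size_t i = 0;
       while (i < common && i < name.size() && name[i] == names[0][i]) {
         ++i;
       }
       common = i;
     }
     if (common == 0) {
       return 0;
     }
     const std::size_t underscore = names[0].rfind('_', common - 1);
     return underscore == std::string::npos ? 0 : underscore + 1;
   }

   // Hypothetical helper: converts UPPER_SNAKE_CASE to the kCamelCase
   // "constant" style after skipping `prefix_length` characters.
   std::string ConstantAlias(const std::string& name,
                             std::size_t prefix_length) {
     std::string alias = "k";
     bool start_of_word = true;
     for (std::size_t i = prefix_length; i < name.size(); ++i) {
       const unsigned char c = static_cast<unsigned char>(name[i]);
       if (c == '_') {
         start_of_word = true;
         continue;
       }
       alias += static_cast<char>(start_of_word ? std::toupper(c)
                                                : std::tolower(c));
       start_of_word = false;
     }
     return alias;
   }

With the ``Activity`` values shown earlier, the shared prefix is ``ACTIVITY_``,
so ``ACTIVITY_CYCLING`` maps to ``kCycling``. Names without a shared
underscore-delimited prefix, such as ``BRONZE`` and ``SILVER``, are converted
whole.
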
1204Message structures can be copied, but doing so will clear any assigned
1205callbacks. To preserve functions applied to callbacks, ensure that the message
1206structure is moved.
1207
1208Message structures can also be compared with each other for equality. This
1209includes all repeated and nested fields represented by value types, but does not
1210compare any field represented by a callback.
1211
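The copy, move, and equality behavior described above can be illustrated with
a simplified stand-in. ``CallbackField`` and ``StoreMessage`` below are
hypothetical types written for this sketch; they are not the generated code or
the real ``pw::protobuf::Callback``.

.. code-block:: c++

   #include <cassert>
   #include <functional>
   #include <utility>

   // Hypothetical stand-in for a callback-backed field.
   class CallbackField {
    public:
     CallbackField() = default;

     // Copying deliberately drops the stored function.
     CallbackField(const CallbackField&) {}
     CallbackField& operator=(const CallbackField&) {
       fn_ = nullptr;
       return *this;
     }

     // Moving transfers the stored function to the new object.
     CallbackField(CallbackField&&) = default;
     CallbackField& operator=(CallbackField&&) = default;

     void Set(std::function<int()> fn) { fn_ = std::move(fn); }

     // True when a callback is set.
     explicit operator bool() const { return static_cast<bool>(fn_); }

    private:
     std::function<int()> fn_;
   };

   // Hypothetical message structure with one value field and one callback.
   struct StoreMessage {
     int id = 0;
     CallbackField employees;

     // Equality looks at value fields only; callbacks are ignored.
     friend bool operator==(const StoreMessage& a, const StoreMessage& b) {
       return a.id == b.id;
     }
   };

In this sketch a copied ``StoreMessage`` compares equal to the original even
though its ``employees`` callback is empty, while a moved ``StoreMessage``
keeps the callback.
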
1212Reserved-Word Conflicts
1213=======================
1214Generated symbols whose names conflict with reserved C++ keywords or
1215standard-library macros are suffixed with underscores to avoid compilation
1216failures. This can be seen below in ``Channel.operator``, which is mapped to
1217``Channel::Message::operator_`` to avoid conflicting with the ``operator``
1218keyword.
1219
1220.. code-block:: protobuf
1221
1222   message Channel {
1223     int32 bitrate = 1;
1224     float signal_to_noise_ratio = 2;
1225     Company operator = 3;
1226   }
1227
1228.. code-block:: c++
1229
1230   struct Channel::Message {
1231     int32_t bitrate;
1232     float signal_to_noise_ratio;
1233     Company::Message operator_;
1234   };
1235
1236Similarly, as shown in the example below, some POSIX-signal names conflict with
1237macros defined by the standard-library header ``<csignal>`` and therefore
1238require underscore suffixes in the generated code. Note, however, that some
1239signal names are left alone. This is because ``<csignal>`` only defines a subset
1240of the POSIX signals as macros; the rest are perfectly valid identifiers that
1241won't cause any problems unless the user defines custom macros for them. Any
1242naming conflicts caused by user-defined macros are the user's responsibility
1243(https://google.github.io/styleguide/cppguide.html#Preprocessor_Macros).
1244
1245.. code-block:: protobuf
1246
1247   enum PosixSignal {
1248     NONE = 0;
1249     SIGHUP = 1;
1250     SIGINT = 2;
1251     SIGQUIT = 3;
1252     SIGILL = 4;
1253     SIGTRAP = 5;
1254     SIGABRT = 6;
1255     SIGFPE = 8;
1256     SIGKILL = 9;
1257     SIGSEGV = 11;
1258     SIGPIPE = 13;
1259     SIGALRM = 14;
1260     SIGTERM = 15;
1261   }
1262
1263.. code-block:: c++
1264
1265   enum class PosixSignal : uint32_t {
1266     NONE = 0,
1267     SIGHUP = 1,
1268     SIGINT_ = 2,
1269     SIGQUIT = 3,
1270     SIGILL_ = 4,
1271     SIGTRAP = 5,
1272     SIGABRT_ = 6,
1273     SIGFPE_ = 8,
1274     SIGKILL = 9,
1275     SIGSEGV_ = 11,
1276     SIGPIPE = 13,
1277     SIGALRM = 14,
1278     SIGTERM_ = 15,
1279
1280     kNone = NONE,
1281     kSighup = SIGHUP,
1282     kSigint = SIGINT_,
1283     kSigquit = SIGQUIT,
1284     kSigill = SIGILL_,
1285     kSigtrap = SIGTRAP,
1286     kSigabrt = SIGABRT_,
1287     kSigfpe = SIGFPE_,
1288     kSigkill = SIGKILL,
1289     kSigsegv = SIGSEGV_,
1290     kSigpipe = SIGPIPE,
1291     kSigalrm = SIGALRM,
1292     kSigterm = SIGTERM_,
1293   };
1294
1295Much like reserved words and macros, the names ``Message`` and ``Fields`` are
1296suffixed with underscores in generated C++ code. This is to prevent name
1297conflicts with the codegen internals if they're used in a nested context as in
1298the example below.
1299
1300.. code-block:: protobuf
1301
1302   message Function {
1303     message Message {
1304       string content = 1;
1305     }
1306
1307     enum Fields {
1308       NONE = 0;
1309       COMPLEX_NUMBERS = 1;
1310       INTEGERS_MOD_5 = 2;
1311       MEROMORPHIC_FUNCTIONS_ON_COMPLEX_PLANE = 3;
1312       OTHER = 4;
1313     }
1314
1315     Message description = 1;
1316     Fields domain = 2;
1317     Fields codomain = 3;
1318   }
1319
1320.. code-block::
1321
1322   Function.Message.content max_size:128
1323
1324.. code-block:: c++
1325
1326   struct Function::Message_::Message {
1327     pw::InlineString<128> content;
1328   };
1329
1330   enum class Function::Message_::Fields : uint32_t {
1331     CONTENT = 1,
1332   };
1333
   enum class Function::Fields_ : uint32_t {
1335     NONE = 0,
1336     COMPLEX_NUMBERS = 1,
1337     INTEGERS_MOD_5 = 2,
1338     MEROMORPHIC_FUNCTIONS_ON_COMPLEX_PLANE = 3,
1339     OTHER = 4,
1340
1341     kNone = NONE,
1342     kComplexNumbers = COMPLEX_NUMBERS,
1343     kIntegersMod5 = INTEGERS_MOD_5,
1344     kMeromorphicFunctionsOnComplexPlane =
1345         MEROMORPHIC_FUNCTIONS_ON_COMPLEX_PLANE,
1346     kOther = OTHER,
1347   };
1348
1349   struct Function::Message {
1350     Function::Message_::Message description;
1351     Function::Fields_ domain;
1352     Function::Fields_ codomain;
1353   };
1354
1355   enum class Function::Fields : uint32_t {
1356     DESCRIPTION = 1,
1357     DOMAIN = 2,
1358     CODOMAIN = 3,
1359   };
1360
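The suffixing rule used in the examples above can be sketched as a small
lookup. The function ``EscapeIdentifier`` and the list ``kReservedNames`` are
hypothetical and contain only a handful of the names mentioned in this
section; the list used by the real code generator is assumed to be much
longer.

.. code-block:: c++

   #include <array>
   #include <cassert>
   #include <string>
   #include <string_view>

   // Illustrative subset only: one keyword, two <csignal> macros, and the two
   // names reserved for codegen internals.
   constexpr std::array<std::string_view, 5> kReservedNames = {
       "operator", "SIGINT", "SIGTERM", "Message", "Fields"};

   // Hypothetical helper: appends an underscore to names that would conflict.
   std::string EscapeIdentifier(std::string_view name) {
     std::string result(name);
     for (std::string_view reserved : kReservedNames) {
       if (name == reserved) {
         result += '_';
         break;
       }
     }
     return result;
   }

Names outside the list, such as ``SIGHUP`` or ``bitrate``, pass through
unchanged in this sketch.
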
1361.. warning::
1362   Note that the C++ spec also reserves two categories of identifiers for the
1363   compiler to use in ways that may conflict with generated code:
1364
1365   * Any identifier that contains two consecutive underscores anywhere in it.
1366   * Any identifier that starts with an underscore followed by a capital letter.
1367
1368   Appending underscores to symbols in these categories wouldn't change the fact
1369   that they match patterns reserved for the compiler, so the codegen does not
1370   currently attempt to fix them. Such names will therefore result in
1371   non-portable code that may or may not work depending on the compiler. These
1372   naming patterns are of course strongly discouraged in any protobufs that will
1373   be used with ``pw_protobuf`` codegen.
1374
1375Overhead
1376========
1377A single encoder and decoder is used for these structures, with a one-time code
1378cost. When the code generator creates the ``struct Message``, it also creates
1379a description of this structure that the shared encoder and decoder use.
1380
The cost of this description is a shared table for each protobuf message
definition used, with four words per field within the protobuf message, and an
additional word to store the size of the table.
1384
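As a rough worked example of the figures above, the following sketch estimates
the table size for a message. ``DescriptionTableBytes`` is a hypothetical
helper, not part of ``pw_protobuf``, and the word size is passed in explicitly
because it depends on the target.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>

   // Hypothetical helper: table size from the per-field and per-table costs
   // quoted above.
   constexpr std::size_t DescriptionTableBytes(std::size_t field_count,
                                               std::size_t word_bytes) {
     constexpr std::size_t kWordsPerField = 4;  // Four words per field.
     constexpr std::size_t kTableSizeWords = 1;  // One word for the table size.
     return (field_count * kWordsPerField + kTableSizeWords) * word_bytes;
   }

   // The three-field Customer message on a target with 4-byte words.
   static_assert(DescriptionTableBytes(3, 4) == 52);

Under these assumptions the three-field ``Customer`` message costs
(3 × 4 + 1) × 4 = 52 bytes of table data on a 32-bit target.
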
1385--------
1386Encoding
1387--------
The simplest way to encode a proto is to write its code generated ``Message``
structure into an in-memory buffer using ``MemoryEncoder``.
1390
1391.. code-block:: c++
1392
1393   #include "my_protos/my_proto.pwpb.h"
1394   #include "pw_bytes/span.h"
1395   #include "pw_protobuf/encoder.h"
1396   #include "pw_status/status_with_size.h"
1397
1398   // Writes a proto response to the provided buffer, returning the encode
1399   // status and number of bytes written.
1400   pw::StatusWithSize WriteProtoResponse(pw::ByteSpan response) {
     MyProto::Message message{};
1402     message.magic_number = 0x1a1a2b2b;
1403     message.favorite_food = "cookies";
1404     message.calories = 600;
1405
1406     // All proto writes are directly written to the `response` buffer.
1407     MyProto::MemoryEncoder encoder(response);
1408     encoder.Write(message);
1409
1410     return pw::StatusWithSize(encoder.status(), encoder.size());
1411   }
1412
1413All fields of a message are written, including those initialized to their
1414default values.
1415
Alternatively, fields can be written one at a time through the code generated
or lower-level APIs, for example when only a subset of fields needs to be
encoded. This can be more convenient when finer-grained control or other custom
handling is required.
1420
1421.. code-block:: c++
1422
1423   #include "my_protos/my_proto.pwpb.h"
1424   #include "pw_bytes/span.h"
1425   #include "pw_protobuf/encoder.h"
1426   #include "pw_status/status_with_size.h"
1427
1428   // Writes a proto response to the provided buffer, returning the encode
1429   // status and number of bytes written.
1430   pw::StatusWithSize WriteProtoResponse(pw::ByteSpan response) {
1431     // All proto writes are directly written to the `response` buffer.
1432     MyProto::MemoryEncoder encoder(response);
1433     encoder.WriteMagicNumber(0x1a1a2b2b);
1434     encoder.WriteFavoriteFood("cookies");
     // Only conditionally write calories. `on_diet` is assumed to be a flag
     // defined elsewhere.
1436     if (on_diet) {
1437       encoder.WriteCalories(600);
1438     }
1439     return pw::StatusWithSize(encoder.status(), encoder.size());
1440   }
1441
1442StreamEncoder
1443=============
1444``StreamEncoder`` is constructed with the destination stream, and a scratch
1445buffer used to handle nested submessages.
1446
1447.. code-block:: c++
1448
1449   #include "my_protos/my_proto.pwpb.h"
   #include "pw_bytes/span.h"
   #include "pw_log/log.h"
1451   #include "pw_protobuf/encoder.h"
1452   #include "pw_stream/sys_io_stream.h"
1453
1454   pw::stream::SysIoWriter sys_io_writer;
1455   MyProto::StreamEncoder encoder(sys_io_writer, pw::ByteSpan());
1456
1457   // Once this line returns, the field has been written to the Writer.
1458   encoder.WriteTimestamp(system::GetUnixEpoch());
1459
1460   // There's no intermediate buffering when writing a string directly to a
1461   // StreamEncoder.
1462   encoder.WriteWelcomeMessage("Welcome to Pigweed!");
1463
1464   if (!encoder.status().ok()) {
1465     PW_LOG_INFO("Failed to encode proto; %s", encoder.status().str());
1466   }
1467
1468Callbacks
1469=========
1470When using the ``Write()`` method with a ``struct Message``, certain fields may
1471require a callback function be set to encode the values for those fields.
1472Otherwise the values will be treated as an empty repeated field and not encoded.
1473
1474The callback is called with the cursor at the field in question, and passed
1475a reference to the typed encoder that can write the required values to the
1476stream or buffer.
1477
1478Callback implementations may use any level of API. For example a callback for a
1479nested submessage (with a dependency cycle, or repeated) can be implemented by
1480calling ``Write()`` on a nested encoder.
1481
1482.. code-block:: c++
1483
1484   Store::Message store{};
1485   store.employees.SetEncoder([](Store::StreamEncoder& encoder) {
1486     Employee::Message employee{};
1487     // Populate `employee`.
1488     return encoder.GetEmployeesEncoder().Write(employee);
   });
1490
1491Nested submessages
1492==================
1493Code generated ``GetFieldEncoder`` methods are provided that return a correctly
1494typed ``StreamEncoder`` or ``MemoryEncoder`` for the message.
1495
1496.. code-block:: protobuf
1497
1498   message Owner {
1499     Animal pet = 1;
1500   }
1501
1502Note that the accessor method is named for the field, while the returned encoder
1503is named for the message type.
1504
1505.. cpp:function:: Animal::StreamEncoder Owner::StreamEncoder::GetPetEncoder()
1506
1507A lower-level API method returns an untyped encoder, which only provides the
1508lower-level API methods. This can be cast to a typed encoder if needed.
1509
1510.. cpp:function:: pw::protobuf::StreamEncoder pw::protobuf::StreamEncoder::GetNestedEncoder(uint32_t field_number, EmptyEncoderBehavior empty_encoder_behavior = EmptyEncoderBehavior::kWriteFieldNumber)
1511
(The optional `empty_encoder_behavior` parameter allows the user to disable
writing the tag number for the nested encoder if no data was written to
that nested encoder.)
1515
1516.. warning::
1517   When a nested submessage is created, any use of the parent encoder that
1518   created the nested encoder will trigger a crash. To resume using the parent
1519   encoder, destroy the submessage encoder first.
1520
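The locking rule in the warning above follows a common RAII pattern. The
sketch below is a simplified, standalone illustration with hypothetical
``Parent`` and ``Nested`` classes. Unlike the real encoders, which crash on
misuse, the sketch rejects the write and returns ``false`` so that the
behavior is easy to observe.

.. code-block:: c++

   #include <cassert>

   class Parent;

   // Hypothetical nested encoder: locks its parent for as long as it lives.
   class Nested {
    public:
     explicit Nested(Parent& parent);
     Nested(const Nested&) = delete;
     Nested& operator=(const Nested&) = delete;
     ~Nested();

    private:
     Parent& parent_;
   };

   // Hypothetical parent encoder.
   class Parent {
    public:
     // Returned by value; C++17 guaranteed copy elision makes this legal even
     // though Nested cannot be copied or moved.
     Nested GetNested() { return Nested(*this); }

     // Rejects writes while a nested encoder is alive.
     bool Write(int /*value*/) {
       if (locked_) {
         return false;
       }
       ++writes_;
       return true;
     }

     int writes() const { return writes_; }

    private:
     friend class Nested;
     bool locked_ = false;
     int writes_ = 0;
   };

   inline Nested::Nested(Parent& parent) : parent_(parent) {
     parent_.locked_ = true;
   }

   // Destroying the nested encoder unlocks the parent again.
   inline Nested::~Nested() { parent_.locked_ = false; }

Scoping the nested encoder in its own block, as the examples in this document
do, is the simplest way to make sure it is destroyed before the parent is used
again.
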
1521Buffering
1522---------
1523Writing proto messages with nested submessages requires buffering due to
1524limitations of the proto format. Every proto submessage must know the size of
1525the submessage before its final serialization can begin. A streaming encoder can
1526be passed a scratch buffer to use when constructing nested messages. All
1527submessage data is buffered to this scratch buffer until the submessage is
finalized. Note that the contents of this scratch buffer are not necessarily
valid proto data, so don't try to use it directly.
1530
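The reason for the buffering can be seen in the wire format itself: a nested
message is a length-delimited field, so its byte length precedes its payload.
The following standalone sketch, using hypothetical helpers ``AppendVarint``
and ``AppendSubmessage`` rather than any ``pw_protobuf`` API, stages the
payload in a scratch vector and only then emits the key, the length, and the
payload.

.. code-block:: c++

   #include <cassert>
   #include <cstdint>
   #include <vector>

   using Bytes = std::vector<uint8_t>;

   // Appends `value` as a base-128 varint, least significant group first.
   void AppendVarint(Bytes& out, uint64_t value) {
     while (value >= 0x80) {
       out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
       value >>= 7;
     }
     out.push_back(static_cast<uint8_t>(value));
   }

   // Writes an already-encoded submessage as a length-delimited field (wire
   // type 2). The length prefix cannot be written until `scratch` is complete.
   void AppendSubmessage(Bytes& out,
                         uint32_t field_number,
                         const Bytes& scratch) {
     constexpr uint64_t kWireTypeDelimited = 2;
     AppendVarint(out,
                  (static_cast<uint64_t>(field_number) << 3) |
                      kWireTypeDelimited);
     AppendVarint(out, scratch.size());
     out.insert(out.end(), scratch.begin(), scratch.end());
   }

For example, if the scratch vector holds the six bytes of a nested message
whose field 1 is the string ``"Spot"`` (the field number is an assumption),
writing it as field 1 of the parent yields ``0A 06`` followed by those six
bytes.
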
1531The code generation includes a ``kScratchBufferSizeBytes`` constant that
1532represents the size of the largest submessage and all necessary overhead,
1533excluding the contents of any field values which require a callback.
1534
1535If a submessage field requires a callback, due to a dependency cycle, or a
1536repeated field of unknown length, the size of the submessage cannot be included
1537in the ``kScratchBufferSizeBytes`` constant. If you encode a submessage of this
1538type (which you'll know you're doing because you set an encoder callback for it)
1539simply add the appropriate structure's ``kMaxEncodedSizeBytes`` constant to the
1540scratch buffer size to guarantee enough space.
1541
When calculating the size yourself, the ``MaxScratchBufferSize()`` helper
function can also be useful in estimating how much space to allocate to account
for nested submessage encoding overhead.
1545
1546.. code-block:: c++
1547
1548   #include "my_protos/pets.pwpb.h"
   #include "pw_bytes/span.h"
   #include "pw_log/log.h"
1550   #include "pw_protobuf/encoder.h"
1551   #include "pw_stream/sys_io_stream.h"
1552
1553   pw::stream::SysIoWriter sys_io_writer;
1554   // The scratch buffer should be at least as big as the largest nested
1555   // submessage. It's a good idea to be a little generous.
1556   std::byte submessage_scratch_buffer[Owner::kScratchBufferSizeBytes];
1557
1558   // Provide the scratch buffer to the proto encoder. The buffer's lifetime must
1559   // match the lifetime of the encoder.
1560   Owner::StreamEncoder owner_encoder(sys_io_writer, submessage_scratch_buffer);
1561
1562   {
1563     // Note that the parent encoder, owner_encoder, cannot be used until the
1564     // nested encoder, pet_encoder, has been destroyed.
1565     Animal::StreamEncoder pet_encoder = owner_encoder.GetPetEncoder();
1566
1567     // There's intermediate buffering when writing to a nested encoder.
1568     pet_encoder.WriteName("Spot");
1569     pet_encoder.WriteType(Pet::Type::DOG);
1570
1571     // When this scope ends, the nested encoder is serialized to the Writer.
1572     // In addition, the parent encoder, owner_encoder, can be used again.
1573   }
1574
1575   // If an encode error occurs when encoding the nested messages, it will be
1576   // reflected at the root encoder.
1577   if (!owner_encoder.status().ok()) {
1578     PW_LOG_INFO("Failed to encode proto; %s", owner_encoder.status().str());
1579   }
1580
``MemoryEncoder`` objects use the final destination buffer rather than relying
on a scratch buffer. The ``kMaxEncodedSizeBytes`` constant takes into account the
1583overhead required for nesting submessages. If you calculate the buffer size
1584yourself, your destination buffer might need additional space.
1585
1586.. warning::
1587   If the scratch buffer size is not sufficient, the encoding will fail with
1588   ``Status::ResourceExhausted()``. Always check the results of ``Write`` calls
1589   or the encoder status to ensure success, as otherwise the encoded data will
1590   be invalid.
1591
1592Scalar Fields
1593=============
As shown, scalar fields are written using code generated ``WriteFoo``
methods that accept the appropriate type and automatically write the correct
field number.
1597
1598.. cpp:function:: Status MyProto::StreamEncoder::WriteFoo(T)
1599
1600These can be freely intermixed with the lower-level API that provides a method
1601per field type, requiring that the field number be passed in. The code
1602generation includes a ``Fields`` enum to provide the field number values.
1603
1604.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteUint64(uint32_t field_number, uint64_t)
1605.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteSint64(uint32_t field_number, int64_t)
1606.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteInt64(uint32_t field_number, int64_t)
1607.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteUint32(uint32_t field_number, uint32_t)
1608.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteSint32(uint32_t field_number, int32_t)
1609.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteInt32(uint32_t field_number, int32_t)
1610.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteFixed64(uint32_t field_number, uint64_t)
.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteFixed32(uint32_t field_number, uint32_t)
1612.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteDouble(uint32_t field_number, double)
1613.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteFloat(uint32_t field_number, float)
1614.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteBool(uint32_t field_number, bool)
1615
The following two method calls are equivalent: the first uses the code
generated API, and the second uses the lower-level API directly.
1618
1619.. code-block:: c++
1620
1621   my_proto_encoder.WriteAge(42);
1622   my_proto_encoder.WriteInt32(static_cast<uint32_t>(MyProto::Fields::kAge), 42);
1623
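At the wire level, a varint-typed scalar field consists of a key, computed as
``(field_number << 3) | wire_type``, followed by the value. The standalone
sketch below shows this with two hypothetical helpers, ``AppendVarint`` and
``AppendVarintField``; they illustrate the protobuf wire format and are not
the ``pw_protobuf`` implementation. The example assumes that ``age`` is field
number 1.

.. code-block:: c++

   #include <cassert>
   #include <cstdint>
   #include <vector>

   // Appends `value` as a base-128 varint: seven bits per byte, least
   // significant group first, high bit set on every byte except the last.
   void AppendVarint(std::vector<uint8_t>& out, uint64_t value) {
     while (value >= 0x80) {
       out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
       value >>= 7;
     }
     out.push_back(static_cast<uint8_t>(value));
   }

   // Appends a varint field (wire type 0): the key, then the value.
   void AppendVarintField(std::vector<uint8_t>& out,
                          uint32_t field_number,
                          uint64_t value) {
     constexpr uint64_t kWireTypeVarint = 0;
     AppendVarint(out,
                  (static_cast<uint64_t>(field_number) << 3) | kWireTypeVarint);
     AppendVarint(out, value);
   }

Writing the value 42 to field 1 this way produces the two bytes ``08 2A``.
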
1624Repeated Fields
1625---------------
1626For repeated scalar fields, multiple code generated ``WriteFoos`` methods
1627are provided.
1628
1629.. cpp:function:: Status MyProto::StreamEncoder::WriteFoos(T)
1630
1631   This writes a single unpacked value.
1632
1633.. cpp:function:: Status MyProto::StreamEncoder::WriteFoos(pw::span<const T>)
1634.. cpp:function:: Status MyProto::StreamEncoder::WriteFoos(const pw::Vector<T>&)
1635
1636   These write a packed field containing all of the values in the provided span
1637   or vector.
1638
1639These too can be freely intermixed with the lower-level API methods, both to
1640write a single value, or to write packed values from either a ``pw::span`` or
1641``pw::Vector`` source.
1642
1643.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedUint64(uint32_t field_number, pw::span<const uint64_t>)
1644.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedUint64(uint32_t field_number, const pw::Vector<uint64_t>&)
1645.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedSint64(uint32_t field_number, pw::span<const int64_t>)
1646.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedSint64(uint32_t field_number, const pw::Vector<int64_t>&)
1647.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedInt64(uint32_t field_number, pw::span<const int64_t>)
1648.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedInt64(uint32_t field_number, const pw::Vector<int64_t>&)
1649.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedUint32(uint32_t field_number, pw::span<const uint32_t>)
1650.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedUint32(uint32_t field_number, const pw::Vector<uint32_t>&)
1651.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedSint32(uint32_t field_number, pw::span<const int32_t>)
1652.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedSint32(uint32_t field_number, const pw::Vector<int32_t>&)
1653.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedInt32(uint32_t field_number, pw::span<const int32_t>)
1654.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedInt32(uint32_t field_number, const pw::Vector<int32_t>&)
1655.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedFixed64(uint32_t field_number, pw::span<const uint64_t>)
1656.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedFixed64(uint32_t field_number, const pw::Vector<uint64_t>&)
.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedFixed32(uint32_t field_number, pw::span<const uint32_t>)
.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedFixed32(uint32_t field_number, const pw::Vector<uint32_t>&)
1659.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedDouble(uint32_t field_number, pw::span<const double>)
1660.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedDouble(uint32_t field_number, const pw::Vector<double>&)
1661.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedFloat(uint32_t field_number, pw::span<const float>)
1662.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedFloat(uint32_t field_number, const pw::Vector<float>&)
1663.. cpp:function:: Status pw::protobuf::StreamEncoder::WritePackedBool(uint32_t field_number, pw::span<const bool>)
1664.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteRepeatedBool(uint32_t field_number, const pw::Vector<bool>&)
1665
The following two method calls are equivalent: the first uses the code
generated API, and the second uses the lower-level API directly.
1668
1669.. code-block:: c++
1670
   constexpr std::array<int32_t, 6> numbers = { 4, 8, 15, 16, 23, 42 };
1672
1673   my_proto_encoder.WriteNumbers(numbers);
1674   my_proto_encoder.WritePackedInt32(
1675       static_cast<uint32_t>(MyProto::Fields::kNumbers),
1676       numbers);
1677
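The difference between writing values one at a time and writing a packed field
shows up in the encoded bytes. The standalone sketch below uses hypothetical
helpers (``EncodeUnpacked``, ``EncodePacked``) to build both forms for an
``int32`` field according to the protobuf wire format; it does not use or
reproduce the ``pw_protobuf`` implementation, and field number 1 is an
assumption.

.. code-block:: c++

   #include <cassert>
   #include <cstdint>
   #include <vector>

   using Bytes = std::vector<uint8_t>;

   void AppendVarint(Bytes& out, uint64_t value) {
     while (value >= 0x80) {
       out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
       value >>= 7;
     }
     out.push_back(static_cast<uint8_t>(value));
   }

   // int32 values are sign-extended to 64 bits before varint encoding.
   uint64_t Int32ToVarintValue(int32_t value) {
     return static_cast<uint64_t>(static_cast<int64_t>(value));
   }

   // Unpacked: every element carries its own key (wire type 0).
   Bytes EncodeUnpacked(uint32_t field_number,
                        const std::vector<int32_t>& values) {
     Bytes out;
     for (int32_t value : values) {
       AppendVarint(out, static_cast<uint64_t>(field_number) << 3);
       AppendVarint(out, Int32ToVarintValue(value));
     }
     return out;
   }

   // Packed: one key (wire type 2), the payload length, then all values.
   Bytes EncodePacked(uint32_t field_number,
                      const std::vector<int32_t>& values) {
     Bytes payload;
     for (int32_t value : values) {
       AppendVarint(payload, Int32ToVarintValue(value));
     }
     Bytes out;
     AppendVarint(out, (static_cast<uint64_t>(field_number) << 3) | 2);
     AppendVarint(out, payload.size());
     out.insert(out.end(), payload.begin(), payload.end());
     return out;
   }

For the six values used above, the unpacked form takes 12 bytes and the packed
form takes 8 bytes.
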
1678Enumerations
1679============
Enumerations are written using code generated ``WriteEnum`` methods that
accept the code generated enumeration as the appropriate type and automatically
write both the correct field number and corresponding value.
1683
1684.. cpp:function:: Status MyProto::StreamEncoder::WriteEnum(MyProto::Enum)
1685
1686To write enumerations with the lower-level API, you would need to cast both
1687the field number and value to the ``uint32_t`` type.
1688
The following two method calls are equivalent: the first uses the code
generated API, and the second uses the lower-level API directly.
1691
1692.. code-block:: c++
1693
1694   my_proto_encoder.WriteAward(MyProto::Award::SILVER);
1695   my_proto_encoder.WriteUint32(
1696       static_cast<uint32_t>(MyProto::Fields::kAward),
1697       static_cast<uint32_t>(MyProto::Award::SILVER));
1698
1699Repeated Fields
1700---------------
1701For repeated enum fields, multiple code generated ``WriteEnums`` methods
1702are provided.
1703
1704.. cpp:function:: Status MyProto::StreamEncoder::WriteEnums(MyProto::Enums)
1705
1706   This writes a single unpacked value.
1707
1708.. cpp:function:: Status MyProto::StreamEncoder::WriteEnums(pw::span<const MyProto::Enums>)
1709.. cpp:function:: Status MyProto::StreamEncoder::WriteEnums(const pw::Vector<MyProto::Enums>&)
1710
1711   These write a packed field containing all of the values in the provided span
1712   or vector.
1713
They are used in the same way as the methods for repeated scalar fields.
1715
1716Strings
1717=======
String fields have multiple code generated methods.
1719
1720.. cpp:function:: Status MyProto::StreamEncoder::WriteName(std::string_view)
1721.. cpp:function:: Status MyProto::StreamEncoder::WriteName(const char*, size_t)
1722
1723These can be freely intermixed with the lower-level API methods.
1724
1725.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteString(uint32_t field_number, std::string_view)
1726.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteString(uint32_t field_number, const char*, size_t)
1727
1728A lower level API method is provided that can write a string from another
1729stream.
1730
1731.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteStringFromStream(uint32_t field_number, stream::Reader& bytes_reader, size_t num_bytes, ByteSpan stream_pipe_buffer)
1732
1733   The payload for the value is provided through the stream::Reader
1734   ``bytes_reader``. The method reads a chunk of the data from the reader using
1735   the ``stream_pipe_buffer`` and writes it to the encoder.
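
The chunked copy described above can be sketched with standard streams standing
in for ``pw::stream::Reader`` and the encoder. ``PipeBytes`` is a hypothetical
helper written for this illustration; it only shows how a small pipe buffer
bounds the memory needed to move a payload of any size.

.. code-block:: c++

   #include <algorithm>
   #include <cstddef>
   #include <istream>
   #include <ostream>

   // Copies exactly `num_bytes` from `in` to `out`, holding at most
   // `buffer_size` bytes in memory at a time. Returns false if `in` runs out
   // of data early or if no buffer space was provided.
   inline bool PipeBytes(std::istream& in, std::ostream& out,
                         std::size_t num_bytes, char* buffer,
                         std::size_t buffer_size) {
     if (buffer_size == 0 && num_bytes > 0) {
       return false;
     }
     while (num_bytes > 0) {
       const std::size_t chunk = std::min(num_bytes, buffer_size);
       in.read(buffer, static_cast<std::streamsize>(chunk));
       if (static_cast<std::size_t>(in.gcount()) != chunk) {
         return false;  // The source ended before num_bytes were available.
       }
       out.write(buffer, static_cast<std::streamsize>(chunk));
       num_bytes -= chunk;
     }
     return true;
   }

A larger buffer means fewer read and write calls; a smaller one saves stack or
static memory at the cost of more iterations.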
1736
1737Bytes
1738=====
1739Bytes fields provide the ``WriteData`` code generated method.
1740
1741.. cpp:function:: Status MyProto::StreamEncoder::WriteData(ConstByteSpan)
1742
1743This can be freely intermixed with the lower-level API method.
1744
1745.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteBytes(uint32_t field_number, ConstByteSpan)
1746
It can also be intermixed with the API method that writes bytes from another
stream.
1748
1749.. cpp:function:: Status pw::protobuf::StreamEncoder::WriteBytesFromStream(uint32_t field_number, stream::Reader& bytes_reader, size_t num_bytes, ByteSpan stream_pipe_buffer)
1750
1751   The payload for the value is provided through the stream::Reader
1752   ``bytes_reader``. The method reads a chunk of the data from the reader using
1753   the ``stream_pipe_buffer`` and writes it to the encoder.
1754
1755Error Handling
1756==============
1757While individual write calls on a proto encoder return ``pw::Status`` objects,
1758the encoder tracks all status returns and "latches" onto the first error
1759encountered. This status can be accessed via ``StreamEncoder::status()``.
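
The latching idea can be illustrated with a toy writer. ``LatchingWriter`` and
``Code`` are hypothetical names for this sketch and do not mirror the
``pw_protobuf`` implementation. In this toy, once an error is latched every
later write is skipped and returns that same error; that detail is an
assumption made for the illustration.

.. code-block:: c++

   #include <cstddef>

   enum class Code { kOk, kResourceExhausted };

   // Toy writer that remembers ("latches") the first error it encounters.
   class LatchingWriter {
    public:
     explicit LatchingWriter(std::size_t capacity) : remaining_(capacity) {}

     // Accounts for `size` bytes of output. After the first failure, later
     // calls return the latched error without doing any work.
     Code Write(std::size_t size) {
       if (status_ != Code::kOk) {
         return status_;
       }
       if (size > remaining_) {
         status_ = Code::kResourceExhausted;
         return status_;
       }
       remaining_ -= size;
       return Code::kOk;
     }

     // The first error seen, or kOk if every write succeeded.
     Code status() const { return status_; }

    private:
     std::size_t remaining_;
     Code status_ = Code::kOk;
   };

This pattern lets a caller issue a series of writes and check the status once at
the end instead of after every call.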
1760
1761Proto map encoding utils
1762========================
1763Some additional helpers for encoding more complex but common protobuf
1764submessages (e.g. ``map<string, bytes>``) are provided in
1765``pw_protobuf/map_utils.h``.
1766
1767.. note::
   The helper APIs are currently in development and may not remain stable.
1769
1770--------
1771Decoding
1772--------
1773The simplest way to use ``StreamDecoder`` is to decode a proto from the stream
1774into its code generated ``Message`` structure.
1775
1776.. code-block:: c++
1777
1778   #include "my_protos/my_proto.pwpb.h"
1779   #include "pw_protobuf/stream_decoder.h"
1780   #include "pw_status/status.h"
1781   #include "pw_stream/stream.h"
1782
1783   pw::Status DecodeProtoFromStream(pw::stream::Reader& reader) {
1784     MyProto::Message message{};
1785     MyProto::StreamDecoder decoder(reader);
1786     decoder.Read(message);
1787     return decoder.status();
1788   }
1789
1790In the case of errors, the decoding will stop and return with the cursor on the
1791field that caused the error. It is valid in some cases to inspect the error and
1792continue decoding by calling ``Read()`` again on the same structure, or fall
1793back to using the lower-level APIs.
1794
1795Unknown fields in the wire encoding are skipped.
1796
1797If finer-grained control is required, the ``StreamDecoder`` class provides an
1798iterator-style API for processing a message a field at a time where calling
1799``Next()`` advances the decoder to the next proto field.
1800
1801.. cpp:function:: Status pw::protobuf::StreamDecoder::Next()
1802
1803In the code generated classes the ``Field()`` method returns the current field
1804as a typed ``Fields`` enumeration member, while the lower-level API provides a
1805``FieldNumber()`` method that returns the number of the field.
1806
1807.. cpp:function:: Result<MyProto::Fields> MyProto::StreamDecoder::Field()
1808.. cpp:function:: Result<uint32_t> pw::protobuf::StreamDecoder::FieldNumber()
1809
.. code-block:: c++

   #include "my_protos/my_proto.pwpb.h"
   #include "pw_protobuf/stream_decoder.h"
   #include "pw_status/status.h"
   #include "pw_status/try.h"
   #include "pw_stream/stream.h"

   pw::Status DecodeProtoFromStream(pw::stream::Reader& reader) {
     MyProto::StreamDecoder decoder(reader);
     pw::Status status;

     uint32_t age;
     char name[16];

     // Iterate over the fields in the message. A return value of OK indicates
     // that a valid field has been found and can be read. When the decoder
     // reaches the end of the message, Next() will return OUT_OF_RANGE.
     // Other return values indicate an error trying to decode the message.
     while ((status = decoder.Next()).ok()) {
       // Field() returns a Result<Fields> as it may fail sometimes.
       // However, Field() is guaranteed to be valid after a call to Next()
       // that returns OK, so the value can be used directly here.
       switch (decoder.Field().value()) {
         case MyProto::Fields::kAge: {
           PW_TRY_ASSIGN(age, decoder.ReadAge());
           break;
         }
         case MyProto::Fields::kName:
           // The string field is copied into the provided buffer. If the buffer
           // is too small to fit the string, RESOURCE_EXHAUSTED is returned and
           // the decoder is not advanced, allowing the field to be re-read.
           PW_TRY(decoder.ReadName(name));
           break;
       }
     }

     // Do something with the fields...

     return status.IsOutOfRange() ? pw::OkStatus() : status;
   }
1851
1852Callbacks
1853=========
1854When using the ``Read()`` method with a ``struct Message``, certain fields may
1855require a callback function be set, otherwise a ``DataLoss`` error will be
1856returned should that field be encountered in the wire encoding.
1857
1858The callback is called with the cursor at the field in question, and passed
1859a reference to the typed decoder that can examine the field and be used to
1860decode it.
1861
1862Callback implementations may use any level of API. For example a callback for a
1863nested submessage (with a dependency cycle, or repeated) can be implemented by
1864calling ``Read()`` on a nested decoder.
1865
.. code-block:: c++

   Store::Message store{};
   store.employees.SetDecoder([](Store::StreamDecoder& decoder) {
     PW_ASSERT(decoder.Field().value() == Store::Fields::kEmployees);

     Employee::Message employee{};
     // Set any callbacks on `employee`.
     PW_TRY(decoder.GetEmployeesDecoder().Read(employee));
     // Do things with `employee`.
     return OkStatus();
   });
1878
1879Nested submessages
1880==================
1881Code generated ``GetFieldDecoder`` methods are provided that return a correctly
1882typed ``StreamDecoder`` for the message.
1883
1884.. code-block:: protobuf
1885
1886   message Owner {
1887     Animal pet = 1;
1888   }
1889
1890As with encoding, note that the accessor method is named for the field, while
1891the returned decoder is named for the message type.
1892
1893.. cpp:function:: Animal::StreamDecoder Owner::StreamDecoder::GetPetDecoder()
1894
1895A lower-level API method returns an untyped decoder, which only provides the
1896lower-level API methods. This can be moved to a typed decoder later.
1897
1898.. cpp:function:: pw::protobuf::StreamDecoder pw::protobuf::StreamDecoder::GetNestedDecoder()
1899
1900.. warning::
1901   When a nested submessage is being decoded, any use of the parent decoder that
1902   created the nested decoder will trigger a crash. To resume using the parent
1903   decoder, destroy the submessage decoder first.
1904
1905
1906.. code-block:: c++
1907
1908   case Owner::Fields::kPet: {
1909     // Note that the parent decoder, owner_decoder, cannot be used until the
1910     // nested decoder, pet_decoder, has been destroyed.
1911     Animal::StreamDecoder pet_decoder = owner_decoder.GetPetDecoder();
1912
1913     while ((status = pet_decoder.Next()).ok()) {
1914       switch (pet_decoder.Field().value()) {
1915         // Decode pet fields...
1916       }
1917     }
1918
1919     // When this scope ends, the nested decoder is destroyed and the
1920     // parent decoder, owner_decoder, can be used again.
1921     break;
1922   }
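
The parent/nested relationship can be modeled as a scoped lock. The toy classes
below are hypothetical and only mimic the rule stated in the warning: the parent
is unusable while a nested object is alive and becomes usable again when that
object is destroyed. The sketch assumes C++17, since it returns a non-copyable
object by value.

.. code-block:: c++

   #include <cassert>

   class Parent;

   // Toy nested handle: locks its parent on construction and unlocks it on
   // destruction.
   class Nested {
    public:
     explicit Nested(Parent& parent);
     ~Nested();
     Nested(const Nested&) = delete;
     Nested& operator=(const Nested&) = delete;

    private:
     Parent& parent_;
   };

   class Parent {
    public:
     // Relies on C++17 guaranteed copy elision.
     Nested GetNested() { return Nested(*this); }

     bool locked() const { return nested_open_; }

     int ReadValue() {
       // Stand-in for the crash described in the warning above.
       assert(!nested_open_);
       return 42;
     }

    private:
     friend class Nested;
     bool nested_open_ = false;
   };

   inline Nested::Nested(Parent& parent) : parent_(parent) {
     parent_.nested_open_ = true;
   }
   inline Nested::~Nested() { parent_.nested_open_ = false; }

Scoping the nested object in its own block, as the ``case`` body above does,
keeps the locked region short and makes the unlock point explicit.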
1923
1924Scalar Fields
1925=============
Scalar fields are read using code generated ``ReadFoo`` methods that return the
appropriate type and assert that the correct field number is being read.
1928
1929.. cpp:function:: Result<T> MyProto::StreamDecoder::ReadFoo()
1930
1931These can be freely intermixed with the lower-level API that provides a method
1932per field type, requiring that the caller first check the field number.
1933
1934.. cpp:function:: Result<uint64_t> pw::protobuf::StreamDecoder::ReadUint64()
1935.. cpp:function:: Result<int64_t> pw::protobuf::StreamDecoder::ReadSint64()
1936.. cpp:function:: Result<int64_t> pw::protobuf::StreamDecoder::ReadInt64()
1937.. cpp:function:: Result<uint32_t> pw::protobuf::StreamDecoder::ReadUint32()
1938.. cpp:function:: Result<int32_t> pw::protobuf::StreamDecoder::ReadSint32()
1939.. cpp:function:: Result<int32_t> pw::protobuf::StreamDecoder::ReadInt32()
1940.. cpp:function:: Result<uint64_t> pw::protobuf::StreamDecoder::ReadFixed64()
.. cpp:function:: Result<uint32_t> pw::protobuf::StreamDecoder::ReadFixed32()
1942.. cpp:function:: Result<double> pw::protobuf::StreamDecoder::ReadDouble()
1943.. cpp:function:: Result<float> pw::protobuf::StreamDecoder::ReadFloat()
1944.. cpp:function:: Result<bool> pw::protobuf::StreamDecoder::ReadBool()
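
The separate ``Sint`` and ``Int`` readers exist because the protobuf wire format
encodes ``sint32``/``sint64`` with ZigZag encoding, while ``int32``/``int64``
are written as plain varints. The following standalone sketch shows the 32-bit
ZigZag mapping; the function names are hypothetical and not part of
``pw_protobuf``.

.. code-block:: c++

   #include <cstdint>

   // ZigZag interleaves negative and positive values so that numbers with a
   // small magnitude get a small encoding: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3.
   constexpr uint32_t ZigZagEncode32(int32_t n) {
     return (static_cast<uint32_t>(n) << 1) ^ static_cast<uint32_t>(n >> 31);
   }

   // Inverse of ZigZagEncode32.
   constexpr int32_t ZigZagDecode32(uint32_t n) {
     return static_cast<int32_t>((n >> 1) ^ (0u - (n & 1u)));
   }

Reading a field with the reader for the wrong type still succeeds at the wire
level but yields a different number, so the reader should match the type
declared in the ``.proto`` file.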
1945
1946The following two code snippets are equivalent, where the first uses the code
1947generated API, and the second implemented by hand.
1948
1949.. code-block:: c++
1950
1951   pw::Result<int32_t> age = my_proto_decoder.ReadAge();
1952
.. code-block:: c++

   PW_ASSERT(my_proto_decoder.FieldNumber().value() ==
       static_cast<uint32_t>(MyProto::Fields::kAge));
   pw::Result<int32_t> age = my_proto_decoder.ReadInt32();
1958
1959Repeated Fields
1960---------------
1961For repeated scalar fields, multiple code generated ``ReadFoos`` methods
1962are provided.
1963
1964.. cpp:function:: Result<T> MyProto::StreamDecoder::ReadFoos()
1965
1966   This reads a single unpacked value.
1967
1968.. cpp:function:: StatusWithSize MyProto::StreamDecoder::ReadFoos(pw::span<T>)
1969
1970   This reads a packed field containing all of the values into the provided span.
1971
1972.. cpp:function:: Status MyProto::StreamDecoder::ReadFoos(pw::Vector<T>&)
1973
1974   Protobuf encoders are permitted to choose either repeating single unpacked
1975   values, or a packed field, including splitting repeated fields up into
1976   multiple packed fields.
1977
1978   This method supports either format, appending values to the provided
1979   ``pw::Vector``.
1980
These too can be freely intermixed with the lower-level API methods, which read
a single value, read a field of packed values into a ``pw::span``, or support
both formats by appending to a ``pw::Vector`` destination.
1984
1985.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedUint64(pw::span<uint64_t>)
1986.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedUint64(pw::Vector<uint64_t>&)
1987.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedSint64(pw::span<int64_t>)
1988.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedSint64(pw::Vector<int64_t>&)
1989.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedInt64(pw::span<int64_t>)
1990.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedInt64(pw::Vector<int64_t>&)
1991.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedUint32(pw::span<uint32_t>)
1992.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedUint32(pw::Vector<uint32_t>&)
1993.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedSint32(pw::span<int32_t>)
1994.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedSint32(pw::Vector<int32_t>&)
1995.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedInt32(pw::span<int32_t>)
1996.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedInt32(pw::Vector<int32_t>&)
1997.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedFixed64(pw::span<uint64_t>)
1998.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedFixed64(pw::Vector<uint64_t>&)
.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedFixed32(pw::span<uint32_t>)
.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedFixed32(pw::Vector<uint32_t>&)
2001.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedDouble(pw::span<double>)
2002.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedDouble(pw::Vector<double>&)
2003.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedFloat(pw::span<float>)
2004.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedFloat(pw::Vector<float>&)
2005.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadPackedBool(pw::span<bool>)
2006.. cpp:function:: Status pw::protobuf::StreamDecoder::ReadRepeatedBool(pw::Vector<bool>&)
2007
2008The following two code blocks are equivalent, where the first uses the code
2009generated API, and the second is implemented by hand.
2010
2011.. code-block:: c++
2012
2013   pw::Vector<int32_t, 8> numbers;
2014
2015   my_proto_decoder.ReadNumbers(numbers);
2016
2017.. code-block:: c++
2018
2019   pw::Vector<int32_t, 8> numbers;
2020
2021   PW_ASSERT(my_proto_decoder.FieldNumber().value() ==
2022       static_cast<uint32_t>(MyProto::Fields::kNumbers));
2023   my_proto_decoder.ReadRepeatedInt32(numbers);
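
To show why a single reader can accept both layouts, the sketch below decodes a
repeated ``int32`` field from raw wire bytes, appending values whether they
arrive packed or unpacked. It is a standalone illustration using the standard
library; ``ReadVarint`` and ``ReadRepeatedInt32`` are hypothetical free
functions, and the sketch assumes the message contains no other fields.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <optional>
   #include <vector>

   // Reads a varint at `pos`, advancing `pos`. Returns nullopt on truncation.
   inline std::optional<uint64_t> ReadVarint(const uint8_t* data,
                                             std::size_t size,
                                             std::size_t& pos) {
     uint64_t value = 0;
     for (int shift = 0; shift < 64 && pos < size; shift += 7) {
       const uint8_t byte = data[pos++];
       value |= static_cast<uint64_t>(byte & 0x7F) << shift;
       if ((byte & 0x80) == 0) {
         return value;
       }
     }
     return std::nullopt;
   }

   // Appends every value of repeated int32 field `field_number` to `out`,
   // accepting unpacked entries (wire type 0) and packed runs (wire type 2).
   inline bool ReadRepeatedInt32(const std::vector<uint8_t>& msg,
                                 uint32_t field_number,
                                 std::vector<int32_t>& out) {
     std::size_t pos = 0;
     while (pos < msg.size()) {
       const auto tag = ReadVarint(msg.data(), msg.size(), pos);
       if (!tag || (*tag >> 3) != field_number) return false;
       const auto wire_type = static_cast<uint32_t>(*tag & 0x7);
       if (wire_type == 0) {  // One unpacked value.
         const auto v = ReadVarint(msg.data(), msg.size(), pos);
         if (!v) return false;
         out.push_back(static_cast<int32_t>(*v));
       } else if (wire_type == 2) {  // A packed run of values.
         const auto len = ReadVarint(msg.data(), msg.size(), pos);
         if (!len || *len > msg.size() - pos) return false;
         const std::size_t end = pos + static_cast<std::size_t>(*len);
         while (pos < end) {
           const auto v = ReadVarint(msg.data(), end, pos);
           if (!v) return false;
           out.push_back(static_cast<int32_t>(*v));
         }
       } else {
         return false;
       }
     }
     return true;
   }

Because values are appended on every occurrence of the field, a message that
splits the field across several packed runs or mixes both layouts still yields
one combined list.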
2024
2025Enumerations
2026============
2027``pw_protobuf`` generates a few functions for working with enumerations.
2028Most importantly, enumerations are read using generated ``ReadEnum`` methods
2029that return the enumeration as the appropriate generated type.
2030
2031.. cpp:function:: Result<MyProto::Enum> MyProto::StreamDecoder::ReadEnum()
2032
2033   Decodes an enum from the stream.
2034
2035.. cpp:function:: constexpr bool MyProto::IsValidEnum(MyProto::Enum value)
2036
   Validates the value encoded in the wire format against the known set of
   enumerators.
2039
2040.. cpp:function:: constexpr const char* MyProto::EnumToString(MyProto::Enum value)
2041
2042   Returns the string representation of the enum value. For example,
2043   ``FooToString(Foo::kBarBaz)`` returns ``"BAR_BAZ"``. Returns the empty string
2044   if the value is not a valid value.
2045
To read enumerations with the lower-level API, you would need to cast the
returned value from ``uint32_t`` to the enumeration type.
2048
2049The following two code blocks are equivalent, where the first is using the code
2050generated API, and the second implemented by hand.
2051
.. code-block:: c++

   pw::Result<MyProto::Award> award = my_proto_decoder.ReadAward();
   if (award.ok() && !MyProto::IsValidAward(award.value())) {
     PW_LOG_DEBUG("Unknown award");
   }
2057   }
2058
.. code-block:: c++

   PW_ASSERT(my_proto_decoder.FieldNumber().value() ==
       static_cast<uint32_t>(MyProto::Fields::kAward));
   pw::Result<uint32_t> award_value = my_proto_decoder.ReadUint32();
   if (award_value.ok()) {
     MyProto::Award award = static_cast<MyProto::Award>(award_value.value());
   }
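
A cast from ``uint32_t`` accepts any number, so hand-written code usually pairs
it with a validity check. The sketch below is a hand-written approximation of
that pattern, not generated code: the ``Award`` enumerators, their values, and
``AwardFromWire`` are all hypothetical.

.. code-block:: c++

   #include <cstdint>
   #include <optional>

   // Hand-written stand-in for a generated enum (hypothetical values).
   enum class Award : uint32_t { kGold = 1, kSilver = 2, kBronze = 3 };

   constexpr bool IsValidAward(Award value) {
     switch (value) {
       case Award::kGold:
       case Award::kSilver:
       case Award::kBronze:
         return true;
     }
     return false;
   }

   constexpr const char* AwardToString(Award value) {
     switch (value) {
       case Award::kGold:
         return "GOLD";
       case Award::kSilver:
         return "SILVER";
       case Award::kBronze:
         return "BRONZE";
     }
     return "";  // Unknown values map to the empty string.
   }

   // Converts a raw wire value, rejecting numbers with no known enumerator.
   constexpr std::optional<Award> AwardFromWire(uint32_t raw) {
     const Award candidate = static_cast<Award>(raw);
     if (!IsValidAward(candidate)) {
       return std::nullopt;
     }
     return candidate;
   }

Whether an unknown value should be rejected or carried through unchanged depends
on how the product handles newer versions of the ``.proto`` definition.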
2067
2068Repeated Fields
2069---------------
2070For repeated enum fields, multiple code generated ``ReadEnums`` methods
2071are provided.
2072
2073.. cpp:function:: Result<MyProto::Enums> MyProto::StreamDecoder::ReadEnums()
2074
2075   This reads a single unpacked value.
2076
2077.. cpp:function:: StatusWithSize MyProto::StreamDecoder::ReadEnums(pw::span<MyProto::Enums>)
2078
2079   This reads a packed field containing all of the checked values into the
2080   provided span.
2081
2082.. cpp:function:: Status MyProto::StreamDecoder::ReadEnums(pw::Vector<MyProto::Enums>&)
2083
2084   This method supports either repeated unpacked or packed formats, appending
2085   checked values to the provided ``pw::Vector``.
2086
They are used in the same way as the methods for repeated scalar fields.
2088
2089Strings
2090=======
String fields provide a code generated method to read the string into the
provided span. Because the length of the string is reported through the returned
``StatusWithSize``, the string is not automatically null-terminated.
:ref:`module-pw_string` provides utility methods to copy string data from spans
into other targets.
2095
2096.. cpp:function:: StatusWithSize MyProto::StreamDecoder::ReadName(pw::span<char>)
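
Since the decoded characters are not null-terminated, code that needs a C string
has to add the terminator itself. The helper below is a plain standard-library
sketch of that step; ``CopyToCString`` is a hypothetical name, and projects
using Pigweed would more likely reach for the ``pw_string`` utilities.

.. code-block:: c++

   #include <algorithm>
   #include <cstddef>
   #include <cstring>

   // Copies `size` chars from `src` into `dest` and appends a terminator,
   // truncating if `dest` is too small. Returns the number of chars copied,
   // not counting the terminator.
   inline std::size_t CopyToCString(const char* src, std::size_t size,
                                    char* dest, std::size_t dest_size) {
     if (dest_size == 0) {
       return 0;
     }
     const std::size_t n = std::min(size, dest_size - 1);
     std::memcpy(dest, src, n);
     dest[n] = '\0';
     return n;
   }

Here ``size`` would come from the size reported by the read call, and ``src``
from the span that was passed to it.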
2097
2098An additional code generated method is provided to return a nested
2099``BytesReader`` to access the data as a stream. As with nested submessage
2100decoders, any use of the parent decoder that created the bytes reader will
2101trigger a crash. To resume using the parent decoder, destroy the bytes reader
2102first.
2103
2104.. cpp:function:: pw::protobuf::StreamDecoder::BytesReader MyProto::StreamDecoder::GetNameReader()
2105
2106These can be freely intermixed with the lower-level API method:
2107
2108.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadString(pw::span<char>)
2109
2110The lower-level ``GetBytesReader()`` method can also be used to read string data
2111as bytes.
2112
2113Bytes
2114=====
Bytes fields provide the ``ReadData`` code generated method to read the bytes
into the provided span.
2117
2118.. cpp:function:: StatusWithSize MyProto::StreamDecoder::ReadData(ByteSpan)
2119
2120An additional code generated method is provided to return a nested
2121``BytesReader`` to access the data as a stream. As with nested submessage
2122decoders, any use of the parent decoder that created the bytes reader will
2123trigger a crash. To resume using the parent decoder, destroy the bytes reader
2124first.
2125
2126.. cpp:function:: pw::protobuf::StreamDecoder::BytesReader MyProto::StreamDecoder::GetDataReader()
2127
2128These can be freely intermixed with the lower-level API methods.
2129
2130.. cpp:function:: StatusWithSize pw::protobuf::StreamDecoder::ReadBytes(ByteSpan)
2131.. cpp:function:: pw::protobuf::StreamDecoder::BytesReader pw::protobuf::StreamDecoder::GetBytesReader()
2132
2133The ``BytesReader`` supports seeking only if the ``StreamDecoder``'s reader
2134supports seeking.
2135
2136Error Handling
2137==============
2138While individual read calls on a proto decoder return ``pw::Result``,
2139``pw::StatusWithSize``, or ``pw::Status`` objects, the decoder tracks all status
2140returns and "latches" onto the first error encountered. This status can be
2141accessed via ``StreamDecoder::status()``.
2142
2143Length Limited Decoding
2144=======================
2145Where the length of the protobuf message is known in advance, the decoder can
2146be prevented from reading from the stream beyond the known bounds by specifying
2147the known length to the decoder:
2148
2149.. code-block:: c++
2150
2151   pw::protobuf::StreamDecoder decoder(reader, message_length);
2152
When a decoder constructed in this way goes out of scope, it consumes any bytes
remaining within ``message_length``. The next ``Read()`` on the stream therefore
returns the data that follows the protobuf, even if the message was not fully
parsed.
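
The consume-on-destruction behavior can be pictured with a small RAII wrapper
over a standard stream. ``BoundedReader`` is a hypothetical stand-in written for
this illustration and is not how ``StreamDecoder`` is implemented.

.. code-block:: c++

   #include <algorithm>
   #include <cstddef>
   #include <istream>

   // Toy length-limited reader: never reads past `length` bytes, and skips
   // whatever part of the message is still unread when it is destroyed.
   class BoundedReader {
    public:
     BoundedReader(std::istream& in, std::size_t length)
         : in_(in), remaining_(length) {}

     ~BoundedReader() { in_.ignore(static_cast<std::streamsize>(remaining_)); }

     // Reads up to `size` bytes, stopping at the message boundary. Returns
     // the number of bytes actually read.
     std::size_t Read(char* dest, std::size_t size) {
       const std::size_t n = std::min(size, remaining_);
       in_.read(dest, static_cast<std::streamsize>(n));
       const std::size_t got = static_cast<std::size_t>(in_.gcount());
       remaining_ -= got;
       return got;
     }

    private:
     std::istream& in_;
     std::size_t remaining_;
   };

This keeps a stream of back-to-back messages aligned: the code that handles the
next message does not need to know how much of the previous one was parsed.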
2156
2157-----------------
2158In-memory Decoder
2159-----------------
The separate ``Decoder`` class operates on a protobuf message located in a
buffer in memory. It is more efficient than the ``StreamDecoder`` in cases
where the complete protobuf data can be stored in memory. The tradeoff of this
efficiency is that no code generation is provided, so all decoding must be
performed by hand.

Like ``StreamDecoder``, it provides an iterator-style API for processing a
message. Calling ``Next()`` advances the decoder to the next proto field, which
can then be read by calling the appropriate ``Read*`` function for the field
number.
2170
2171When reading ``bytes`` and ``string`` fields, the decoder returns a view of that
2172field within the buffer; no data is copied out.
2173
.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <string_view>

   #include "pw_protobuf/decoder.h"
   #include "pw_span/span.h"
   #include "pw_status/status.h"
   #include "pw_status/try.h"

   pw::Status DecodeProtoFromBuffer(pw::span<const std::byte> buffer) {
     pw::protobuf::Decoder decoder(buffer);
     pw::Status status;

     uint32_t uint32_field;
     std::string_view string_field;

     // Iterate over the fields in the message. A return value of OK indicates
     // that a valid field has been found and can be read. When the decoder
     // reaches the end of the message, Next() will return OUT_OF_RANGE.
     // Other return values indicate an error trying to decode the message.
     while ((status = decoder.Next()).ok()) {
       switch (decoder.FieldNumber()) {
         case 1:
           PW_TRY(decoder.ReadUint32(&uint32_field));
           break;
         case 2:
           // The passed-in string_view will point to the contents of the string
           // field within the buffer.
           PW_TRY(decoder.ReadString(&string_field));
           break;
       }
     }

     // Do something with the fields...

     return status.IsOutOfRange() ? pw::OkStatus() : status;
   }
2207
2208---------------
2209Message Decoder
2210---------------
2211
2212.. note::
2213
2214   ``pw::protobuf::Message`` is unrelated to the codegen ``struct Message``
2215   used with ``StreamDecoder``.
2216
The module implements a message parsing helper class ``Message``, in
``pw_protobuf/message.h``, to facilitate proto message parsing and field access.
The class provides interfaces for searching fields in a proto message and
creating helper classes for them according to their interpreted field type, e.g.
uint32, bytes, string, map<>, repeated, etc. The class works on top of
``StreamDecoder`` and thus requires a ``pw::stream::SeekableReader`` for proto
message access. The following gives examples of using the class to process
different fields in a proto message:
2225
2226.. code-block:: c++
2227
2228   // Consider the proto messages defined as follows:
2229   //
2230   // message Nested {
2231   //   string nested_str = 1;
2232   //   bytes nested_bytes = 2;
2233   // }
2234   //
2235   // message {
2236   //   uint32 integer = 1;
2237   //   string str = 2;
2238   //   bytes bytes = 3;
2239   //   Nested nested = 4;
2240   //   repeated string rep_str = 5;
2241   //   repeated Nested rep_nested  = 6;
2242   //   map<string, bytes> str_to_bytes = 7;
2243   //   map<string, Nested> str_to_nested = 8;
2244   // }
2245
2246   // Given a seekable `reader` that reads the top-level proto message, and
2247   // a <proto_size> that gives the size of the proto message:
2248   Message message(reader, proto_size);
2249
   // Parse a proto integer field
   Uint32 integer = message.AsUint32(1);
   if (!integer.ok()) {
     // handle parsing error. i.e. return integer.status().
   }
   uint32_t integer_value = integer.value(); // obtained the value
2256
2257   // Parse a string field
2258   String str = message.AsString(2);
2259   if (!str.ok()) {
2260     // handle parsing error. i.e. return str.status();
2261   }
2262
2263   // check string equal
2264   Result<bool> str_check = str.Equal("foo");
2265
2266   // Parse a bytes field
2267   Bytes bytes = message.AsBytes(3);
2268   if (!bytes.ok()) {
2269     // handle parsing error. i.e. return bytes.status();
2270   }
2271
2272   // Get a reader to the bytes.
2273   stream::IntervalReader bytes_reader = bytes.GetBytesReader();
2274
   // Parse nested message `Nested nested = 4;`
   Message nested = message.AsMessage(4);
   // Get the fields in the nested message.
   String nested_str = nested.AsString(1);
   Bytes nested_bytes = nested.AsBytes(2);

   // Parse repeated field `repeated string rep_str = 5;`
   RepeatedStrings rep_str = message.AsRepeatedString(5);
   // Iterate through the entries. If the proto is malformed, the next
   // element (`element` in this case) will be invalid and the loop will end
   // in the following iteration.
   for (String element : rep_str) {
     // Check status
     if (!element.ok()) {
       // In the case of error, the loop will end in the next iteration if
       // it continues. This is the chance for code to catch the error.
     }
     // Process element
   }

   // Parse repeated field `repeated Nested rep_nested = 6;`
   RepeatedMessages rep_nested = message.AsRepeatedMessages(6);
   // Iterate through the entries.
   for (Message element : rep_nested) {
     // Check status
     if (!element.ok()) {
       // In the case of error, the loop will end in the next iteration if
       // it continues. This is the chance for code to catch the error.
     }
     // Process element
   }
2306
2307   // Parse map field `map<string, bytes> str_to_bytes = 7;`
2308   StringToBytesMap str_to_bytes = message.AsStringToBytesMap(7);
2309   // Access the entry by a given key value
2310   Bytes bytes_for_key = str_to_bytes["key"];
2311   // Or iterate through map entries
2312   for (StringToBytesMapEntry entry : str_to_bytes) {
2313     // Check status
2314     if (!entry.ok()) {
2315       // In the case of error, loop will end in the next iteration if
2316       // continues. This is the chance for code to catch the error.
2317     }
2318     String key = entry.Key();
2319     Bytes value = entry.Value();
2320     // process entry
2321   }
2322
   // Parse map field `map<string, Nested> str_to_nested = 8;`
   StringToMessageMap str_to_nested = message.AsStringToMessageMap(8);
2325   // Access the entry by a given key value
2326   Message nested_for_key = str_to_nested["key"];
2327   // Or iterate through map entries
2328   for (StringToMessageMapEntry entry : str_to_nested) {
2329     // Check status
2330     if (!entry.ok()) {
2331       // In the case of error, loop will end in the next iteration if
2332       // continues. This is the chance for code to catch the error.
2333       // However it is still recommended that the user breaks here.
2334       break;
2335     }
2336     String key = entry.Key();
2337     Message value = entry.Value();
2338     // process entry
2339   }
2340
The methods in ``Message`` for parsing a single field, i.e. every ``AsXXX()``
method except ``AsRepeatedXXX()`` and ``AsStringMapXXX()``, internally perform a
linear scan of the entire proto message to find the field with the given
field number. This can be expensive if performed multiple times, especially
on a slow reader. The same applies to the ``operator[]`` of the
``StringToXXXXMap`` helper classes. Therefore, for performance reasons, it is
recommended to use the following range-based ``for`` loop whenever possible to
iterate over and process single fields directly.
2349
2350
2351.. code-block:: c++
2352
2353   for (Message::Field field : message) {
2354     // Check status
2355     if (!field.ok()) {
2356       // In the case of error, loop will end in the next iteration if
2357       // continues. This is the chance for code to catch the error.
2358     }
2359     if (field.field_number() == 1) {
2360       Uint32 integer = field.As<Uint32>();
2361       ...
2362     } else if (field.field_number() == 2) {
2363       String str = field.As<String>();
2364       ...
2365     } else if (field.field_number() == 3) {
2366       Bytes bytes = field.As<Bytes>();
2367       ...
2368     } else if (field.field_number() == 4) {
2369       Message nested = field.As<Message>();
2370       ...
2371     }
2372   }
2373
2374
.. note::
   The helper APIs are currently in development and may not remain stable.
2377
2378-----------
2379Size report
2380-----------
2381
2382Full size report
2383================
2384
2385This report demonstrates the size of using the entire decoder with all of its
2386decode methods and a decode callback for a proto message containing each of the
2387protobuf field types.
2388
2389.. include:: size_report/decoder_partial
2390
2391
2392Incremental size report
2393=======================
2394
2395This report is generated using the full report as a base and adding some int32
2396fields to the decode callback to demonstrate the incremental cost of decoding
2397fields in a message.
2398
2399.. include:: size_report/decoder_incremental
2400
2401---------------------------
2402Serialized size calculation
2403---------------------------
2404``pw_protobuf/serialized_size.h`` provides a set of functions for calculating
2405how much memory serialized protocol buffer fields require. The
2406``kMaxSizeBytes*`` variables provide the maximum encoded sizes of each field
2407type. The ``SizeOfField*()`` functions calculate the encoded size of a field of
2408the specified type, given a particular key and, for variable length fields
2409(varint or delimited), a value. The ``SizeOf*Field`` functions calculate the
2410encoded size of fields with a particular wire format (delimited, varint).
2411
2412In the rare event that you need to know the serialized size of a field's tag
2413(field number and wire type), you can use ``TagSizeBytes()`` to calculate the
2414tag size for a given field number.
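
The arithmetic behind these helpers follows from the wire format, as the
standalone sketch below illustrates. The function names here are hypothetical
and deliberately differ from those in ``pw_protobuf/serialized_size.h``; the
sketch only shows how tag, varint, and delimited sizes combine.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Number of bytes needed to encode `value` as a varint (1 to 10).
   constexpr std::size_t VarintSize(uint64_t value) {
     std::size_t size = 1;
     while (value >= 0x80) {
       value >>= 7;
       ++size;
     }
     return size;
   }

   // A tag is the varint (field_number << 3) | wire_type. The wire type only
   // occupies the low three bits, so it does not change the encoded size.
   constexpr std::size_t TagSize(uint32_t field_number) {
     return VarintSize(static_cast<uint64_t>(field_number) << 3);
   }

   // Varint field: tag followed by the varint-encoded value.
   constexpr std::size_t VarintFieldSize(uint32_t field_number,
                                         uint64_t value) {
     return TagSize(field_number) + VarintSize(value);
   }

   // Delimited field: tag, varint length prefix, then the payload bytes.
   constexpr std::size_t DelimitedFieldSize(uint32_t field_number,
                                            std::size_t payload_size) {
     return TagSize(field_number) + VarintSize(payload_size) + payload_size;
   }

Field numbers 1 through 15 fit in a one-byte tag, which is why frequently used
fields are commonly assigned low numbers.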
2415
2416--------------------------
2417Available protobuf modules
2418--------------------------
2419There are a handful of messages ready to be used in Pigweed projects. These are
2420located in ``pw_protobuf/pw_protobuf_protos``.
2421
2422common.proto
2423============
Contains the ``Empty`` message proto, which is used in many RPC calls.
2425
2426
2427status.proto
2428============
2429Contains the enum for pw::Status.
2430
2431.. note::
2432   ``pw::protobuf::StatusCode`` values should not be used outside of a .proto
2433   file. Instead, the StatusCodes should be converted to the Status type in the
2434   language. In C++, this would be:
2435
   .. code-block:: c++

      // Reading from a proto
      pw::Status status = static_cast<pw::Status::Code>(proto.status_field);
      // Writing to a proto
      proto.status_field = static_cast<pw::protobuf::StatusCode>(status.code());
2442
2443----------------------------------------
2444Comparison with other protobuf libraries
2445----------------------------------------
2446
2447protobuf-lite
2448=============
2449protobuf-lite is the official reduced-size C++ implementation of protobuf. It
2450uses a restricted subset of the protobuf library's features to minimize code
size. However, it is still around 150K in size and requires dynamic memory
2452allocation, making it unsuitable for many embedded systems.
2453
2454nanopb
2455======
2456`Nanopb`_ is a commonly used embedded protobuf library with very small code size
2457and full code generation. It provides both encoding/decoding functionality and
2458in-memory C structs representing protobuf messages.
2459
2460nanopb works well for many embedded products; however, using its generated code
2461can run into RAM usage issues when processing nontrivial protobuf messages due
2462to the necessity of defining a struct capable of storing all configurations of
2463the message, which can grow incredibly large. In one project, Pigweed developers
2464encountered an 11K struct statically allocated for a single message---over twice
2465the size of the final encoded output! (This was what prompted the development of
2466``pw_protobuf``.)
2467
2468To avoid this issue, it is possible to use nanopb's low-level encode/decode
2469functions to process individual message fields directly, but this loses all of
2470the useful semantics of code generation. ``pw_protobuf`` is designed to optimize
2471for this use case; it allows for efficient operations on the wire format with an
2472intuitive user interface.
2473
2474Depending on the requirements of a project, either of these libraries could be
2475suitable.
2476
2477.. _Nanopb: https://jpa.kapsi.fi/nanopb/
2478