.. _module-pw_metric:

=========
pw_metric
=========

.. attention::
   This module is **not yet production ready**; ask us if you are interested in
   trying it out or have ideas about how to improve it.

--------
Overview
--------
Pigweed's metric module is a **lightweight manual instrumentation system** for
tracking system health metrics like counts or set values. For example,
``pw_metric`` could help with tracking the number of I2C bus writes, or the
number of times a buffer was filled before it could drain in time, or safely
incrementing counters from ISRs.

Key features of ``pw_metric``:

- **Tokenized names** - Names are tokenized using ``pw_tokenizer``, enabling
  long metric names that don't bloat your binary.

- **Tree structure** - Metrics can form a tree, enabling grouping of related
  metrics for clearer organization.

- **Per object collection** - Metrics and groups can live on object instances
  and be flexibly combined with metrics from other instances.

- **Global registration** - For legacy code bases or just because it's easier,
  ``pw_metric`` supports automatic aggregation of metrics. This is optional but
  convenient in many cases.

- **Simple design** - There are only two core data structures: ``Metric`` and
  ``Group``, which are both simple to understand and use. The only metric
  types supported are ``uint32_t`` and ``float``. This module does not support
  complicated aggregations like running average or min/max.

Example: Instrumenting a single object
--------------------------------------
The example below illustrates what instrumenting a class with a metric group
and metrics might look like. In this case, the object's
``MySubsystem::metrics()`` member is not globally registered; the user is on
their own for combining this subsystem's metrics with others.

.. code-block:: cpp

   #include "pw_metric/metric.h"

   class MySubsystem {
    public:
     void DoSomething() {
       attempts_.Increment();
       if (ActionSucceeds()) {
         successes_.Increment();
       }
     }
     pw::metric::Group& metrics() { return metrics_; }

    private:
     PW_METRIC_GROUP(metrics_, "my_subsystem");
     PW_METRIC(metrics_, attempts_, "attempts", 0u);
     PW_METRIC(metrics_, successes_, "successes", 0u);
   };

The metrics subsystem has no canonical output format at this time, but a JSON
dump might look something like this:

.. code-block:: json

   {
     "my_subsystem" : {
       "successes" : 1000,
       "attempts" : 1200
     }
   }

In this case, every instance of ``MySubsystem`` will have unique counters.

Example: Instrumenting a legacy codebase
----------------------------------------
A common situation in embedded development is **debugging legacy code** or code
that is hard to change, where it may be impossible to plumb metrics objects
around with dependency injection. The alternative to plumbing metrics is to
register the metrics through a global mechanism. ``pw_metric`` supports this
use case. For example:

**Before instrumenting:**

.. code-block:: cpp

   // This code was passed down from generations of developers before; no one
   // knows what it does or how it works. But it needs to be fixed!
   void OldCodeThatDoesntWorkButWeDontKnowWhy() {
     if (some_variable) {
       DoSomething();
     } else {
       DoSomethingElse();
     }
   }

**After instrumenting:**

.. code-block:: cpp

   #include "pw_metric/global.h"
   #include "pw_metric/metric.h"

   // PW_METRIC_GLOBAL takes an identifier, a name, and an initial value.
   PW_METRIC_GLOBAL(legacy_do_something, "legacy_do_something", 0u);
   PW_METRIC_GLOBAL(legacy_do_something_else, "legacy_do_something_else", 0u);

   // This code was passed down from generations of developers before; no one
   // knows what it does or how it works. But it needs to be fixed!
   void OldCodeThatDoesntWorkButWeDontKnowWhy() {
     if (some_variable) {
       legacy_do_something.Increment();
       DoSomething();
     } else {
       legacy_do_something_else.Increment();
       DoSomethingElse();
     }
   }

In this case, the developer merely had to add the metrics headers, define some
metrics, and then start incrementing them. These metrics will be available
globally through the ``pw::metric::global_metrics`` object defined in
``pw_metric/global.h``.
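
The idea behind this kind of registration can be pictured with a minimal
sketch. The ``Counter`` type and ``registry_head`` pointer are hypothetical
stand-ins, not the ``pw_metric`` implementation. They only illustrate how
objects defined in unrelated places might link themselves into one list from
their constructors, assuming the list head is constant-initialized before any
of those constructors run.

.. code-block:: cpp

   #include <cstdint>

   struct Counter;

   // Constant-initialized to nullptr, so it is valid before any dynamic
   // constructor below runs.
   Counter* registry_head = nullptr;

   // Hypothetical stand-in for a globally registered metric.
   struct Counter {
     explicit Counter(const char* counter_name)
         : name(counter_name), next(registry_head) {
       registry_head = this;  // Push this counter onto the front of the list.
     }

     const char* name;
     uint32_t value = 0;
     Counter* next;  // Intrusive link; no separate table is needed.
   };

   // Defined independently, with no central list to edit.
   Counter legacy_do_something("legacy_do_something");
   Counter legacy_do_something_else("legacy_do_something_else");

   // Walks the list, as an exporter might.
   int CountRegistered() {
     int count = 0;
     for (const Counter* c = registry_head; c != nullptr; c = c->next) {
       ++count;
     }
     return count;
   }

Because each counter pushes itself onto the list, adding a new one does not
require touching any shared table, which is the property the legacy example
relies on.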

Why not just use simple counter variables?
------------------------------------------
One might wonder what the point of leveraging a metric library is when it is
trivial to make some global variables and print them out. There are a few
reasons:

- **Metrics offload** - To make it easy to get metrics off-device by sharing
  the infrastructure for offloading.

- **Consistent format** - To get the metrics in a consistent format (e.g.
  protobuf or JSON) for analysis.

- **Uncoordinated collection** - To provide a simple and reliable way for
  developers on a team to all collect metrics for their subsystems, without
  having to coordinate to offload. This could extend to code in libraries
  written by other teams.

- **Pre-boot or interrupt visibility** - Some of the most challenging bugs come
  from early system boot when not all system facilities are up (e.g. logging or
  UART). In those cases, metrics provide a low-overhead approach to understand
  what is happening. During early boot, metrics can be incremented; after
  boot, dumping the metrics provides insight into what happened. While basic
  counter variables can work in these contexts too, one still has to deal with
  the offloading problem, which the library handles.

---------------------
Metrics API reference
---------------------

The metrics API consists of just a few components:

- The core data structures ``pw::metric::Metric`` and ``pw::metric::Group``
- The macros for scoped metrics and groups ``PW_METRIC`` and
  ``PW_METRIC_GROUP``
- The macros for globally registered metrics and groups
  ``PW_METRIC_GLOBAL`` and ``PW_METRIC_GROUP_GLOBAL``
- The global groups and metrics list: ``pw::metric::global_groups`` and
  ``pw::metric::global_metrics``.

Metric
------
The ``pw::metric::Metric`` provides:

- A 31-bit tokenized name
- A 1-bit discriminator for int or float
- A 32-bit payload (int or float)
- A 32-bit next pointer (intrusive list)

The metric object is 12 bytes on 32-bit platforms.
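
One way to picture this layout is the sketch below. ``PackedMetric`` is a
hypothetical type, not the real ``pw::metric::Metric``. In particular, which
bit value marks an int versus a float is an assumption made for illustration.

.. code-block:: cpp

   #include <cassert>
   #include <cstdint>

   // Hypothetical layout sketch: token and type flag share one 32-bit word.
   class PackedMetric {
    public:
     PackedMetric(uint32_t token, uint32_t value)
         : name_and_type_((token & kTokenMask) | kTypeInt),
           uint_value_(value) {}
     PackedMetric(uint32_t token, float value)
         : name_and_type_((token & kTokenMask) | kTypeFloat),
           float_value_(value) {}

     uint32_t name() const { return name_and_type_ & kTokenMask; }
     bool is_float() const { return (name_and_type_ & kTypeMask) == kTypeFloat; }

     uint32_t as_int() const {
       assert(!is_float());  // Reading the wrong union member is an error.
       return uint_value_;
     }
     float as_float() const {
       assert(is_float());
       return float_value_;
     }

    private:
     static constexpr uint32_t kTokenMask = 0x7fffffffu;  // Low 31 bits.
     static constexpr uint32_t kTypeMask = 0x80000000u;   // Top bit.
     static constexpr uint32_t kTypeInt = 0x80000000u;    // Assumed encoding.
     static constexpr uint32_t kTypeFloat = 0x00000000u;  // Assumed encoding.

     uint32_t name_and_type_;  // 31-bit token plus 1-bit discriminator.
     union {                   // 32-bit payload.
       uint32_t uint_value_;
       float float_value_;
     };
     PackedMetric* next_ = nullptr;  // Intrusive list link.
   };

With 4-byte pointers this sketch occupies 12 bytes, matching the size quoted
above; on a 64-bit host the pointer makes it larger.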

.. cpp:class:: pw::metric::Metric

   .. cpp:function:: Increment(uint32_t amount = 1)

      Increment the metric by the given amount. Results in undefined behaviour if
      the metric is not of type int.

   .. cpp:function:: Set(uint32_t value)

      Set the metric to the given value. Results in undefined behaviour if the
      metric is not of type int.

   .. cpp:function:: Set(float value)

      Set the metric to the given value. Results in undefined behaviour if the
      metric is not of type float.

.. _module-pw_metric-group:

Group
-----
The ``pw::metric::Group`` object is simply:

- A name for the group
- A list of child groups
- A list of leaf metrics
- A 32-bit next pointer (intrusive list)

The group object is 16 bytes on 32-bit platforms.

.. cpp:class:: pw::metric::Group

   .. cpp:function:: Dump(int indent_level = 0)

      Recursively dump a metrics group to ``pw_log``. Produces output like:

      .. code-block:: none

         "$6doqFw==": {
           "$05OCZw==": {
             "$VpPfzg==": 1,
             "$LGPMBQ==": 1.000000,
             "$+iJvUg==": 5,
           }
           "$9hPNxw==": 65,
           "$oK7HmA==": 13,
           "$FCM4qQ==": 0,
         }

      Note the metric names are tokenized with base64. Decoding requires using
      the Pigweed detokenizer. With a detokenizing-enabled logger, you could get
      something like:

      .. code-block:: none

         "i2c_1": {
           "gyro": {
             "num_samples": 1,
             "init_time_us": 1.000000,
             "initialized": 5,
           }
           "bus_errors": 65,
           "transactions": 13,
           "bytes_sent": 0,
         }

Macros
------
The **macros are the primary mechanism for creating metrics**, and should be
used instead of directly constructing metrics or groups. The macros handle
tokenizing the metric and group names.
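
To illustrate what "tokenizing the name" means, the sketch below hashes a
string at compile time and keeps 31 bits of the result. It uses FNV-1a purely
as a stand-in hash; ``pw_tokenizer`` defines its own hash, and
``SKETCH_METRIC``, ``SketchMetric``, and ``MetricToken`` are hypothetical names
that are not part of ``pw_metric``.

.. code-block:: cpp

   #include <cstdint>

   // FNV-1a, used only as a stand-in. pw_tokenizer defines its own hash.
   constexpr uint32_t HashName(const char* name) {
     uint32_t hash = 2166136261u;
     for (; *name != '\0'; ++name) {
       hash ^= static_cast<uint8_t>(*name);
       hash *= 16777619u;
     }
     return hash;
   }

   // Keep 31 bits, leaving one bit free for an int/float flag.
   constexpr uint32_t MetricToken(const char* name) {
     return HashName(name) & 0x7fffffffu;
   }

   // Hypothetical metric that stores a token instead of a string.
   struct SketchMetric {
     uint32_t token;
     uint32_t value;
   };

   // Hypothetical macro in the spirit of PW_METRIC; not the real macro.
   #define SKETCH_METRIC(identifier, name, value) \
     SketchMetric identifier = {MetricToken(name), value}

   SKETCH_METRIC(i2c_transactions, "i2c_transactions", 0u);

   // The token is a constant expression, so it can be checked at compile time.
   static_assert((MetricToken("i2c_transactions") & 0x80000000u) == 0u,
                 "Token must fit in 31 bits");

Since only the 32-bit token is stored in the object, the length of the name
does not affect the object's size; mapping tokens back to names is left to
host-side tooling.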

.. cpp:function:: PW_METRIC(identifier, name, value)
.. cpp:function:: PW_METRIC(group, identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(group, identifier, name, value)

   Declare a metric, optionally adding it to a group.

   - **identifier** - An identifier name for the created variable or member.
     For example: ``i2c_transactions`` might be used as a local or global
     metric; inside a class, it could be named according to members
     (``i2c_transactions_`` for Google's C++ style).
   - **name** - The string name for the metric. This will be tokenized. There
     are no restrictions on the contents of the name; however, consider
     restricting these to be valid C++ identifiers to ease integration with
     other systems.
   - **value** - The initial value for the metric. Must be either a floating
     point value (e.g. ``3.2f``) or unsigned int (e.g. ``21u``).
   - **group** - A ``pw::metric::Group`` instance. If provided, the metric is
     added to the given group.

   The macro declares a variable or member named ``identifier`` with type
   ``pw::metric::Metric``, and works in three contexts: global, local, and
   member.

   If the ``_STATIC`` variant is used, the macro declares a variable with static
   storage. These can be used in function scopes, but not in classes.

   1. At global scope:

      .. code-block:: cpp

         // Increment() requires an int metric, so start from an unsigned value.
         PW_METRIC(foo, "foo", 0u);

         void MyFunc() {
           foo.Increment();
         }

   2. At local function or member function scope:

      .. code-block:: cpp

         void MyFunc() {
           PW_METRIC(foo, "foo", 0u);
           foo.Increment();
           // foo goes out of scope here; be careful!
         }

   3. At member level inside a class or struct:

      .. code-block:: cpp

         struct MyStructy {
           void DoSomething() {
             somethings.Increment();
           }
           // Every instance of MyStructy will have a separate somethings counter.
           PW_METRIC(somethings, "somethings", 0u);
         };

   You can also put a metric into a group with the macro. A metric can belong
   to at most one group; adding it to a second group fails an assertion.
   Example:

   .. code-block:: cpp

      PW_METRIC_GROUP(my_group, "my_group");
      PW_METRIC(my_group, foo, "foo", 0.2f);
      PW_METRIC(my_group, bar, "bar", 44000u);
      PW_METRIC(my_group, zap, "zap", 3.14f);

   .. tip::
      If you want a globally registered metric, see ``pw_metric/global.h``; in
      that context, metrics are globally registered without the need to
      centrally register in a single place.

.. cpp:function:: PW_METRIC_GROUP(identifier, name)
.. cpp:function:: PW_METRIC_GROUP(parent_group, identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(parent_group, identifier, name)

   Declares a ``pw::metric::Group`` with name ``name``; the name is tokenized.
   Works similarly to ``PW_METRIC`` and can be used in the same contexts
   (global, local, and member). Optionally, the group can be added to a parent
   group.

   If the ``_STATIC`` variant is used, the macro declares a variable with static
   storage. These can be used in function scopes, but not in classes.

   Example:

   .. code-block:: cpp

      PW_METRIC_GROUP(my_group, "my_group");
      PW_METRIC(my_group, foo, "foo", 0.2f);
      PW_METRIC(my_group, bar, "bar", 44000u);
      PW_METRIC(my_group, zap, "zap", 3.14f);

.. cpp:function:: PW_METRIC_GLOBAL(identifier, name, value)

   Declare a ``pw::metric::Metric`` with name ``name``, and register it in the
   global metrics list ``pw::metric::global_metrics``.

   Example:

   .. code-block:: cpp

      #include "pw_metric/metric.h"
      #include "pw_metric/global.h"

      // No need to coordinate collection of foo and bar; they're autoregistered.
      PW_METRIC_GLOBAL(foo, "foo", 0.2f);
      PW_METRIC_GLOBAL(bar, "bar", 44000u);

   Note that metrics defined with ``PW_METRIC_GLOBAL`` should never be added to
   groups defined with ``PW_METRIC_GROUP_GLOBAL``. Each metric can only belong
   to one group, and metrics defined with ``PW_METRIC_GLOBAL`` are
   pre-registered with the global metrics list.

   .. attention::
      Do not create ``PW_METRIC_GLOBAL`` instances anywhere other than global
      scope. Putting these on an instance (member context) would lead to dangling
      pointers and misery. Metrics are never deleted or unregistered!

.. cpp:function:: PW_METRIC_GROUP_GLOBAL(identifier, name)

   Declare a ``pw::metric::Group`` with name ``name``, and register it in the
   global metric groups list ``pw::metric::global_groups``.

   Note that metrics created with ``PW_METRIC_GLOBAL`` should never be added to
   groups! Instead, just create a freestanding metric and register it into the
   global group (like in the example below).

   Example:

   .. code-block:: cpp

      #include "pw_metric/metric.h"
      #include "pw_metric/global.h"

      // No need to coordinate collection of this group; it's globally registered.
      PW_METRIC_GROUP_GLOBAL(legacy_system, "legacy_system");
      PW_METRIC(legacy_system, foo, "foo", 0.2f);
      PW_METRIC(legacy_system, bar, "bar", 44000u);

   .. attention::
      Do not create ``PW_METRIC_GROUP_GLOBAL`` instances anywhere other than
      global scope. Putting these on an instance (member context) would lead to
      dangling pointers and misery. Metrics are never deleted or unregistered!

.. cpp:function:: PW_METRIC_TOKEN(name)

   Declare a ``pw::metric::Token`` (``pw::tokenizer::Token``) for a metric with
   name ``name``. This token matches the ``.name()`` of a metric with the same
   name.

   This is a wrapper around ``PW_TOKENIZE_STRING_MASK`` and carries the same
   semantics.

   This is unlikely to be used by most ``pw_metric`` consumers.

----------------------
Usage & Best Practices
----------------------
This library makes several tradeoffs to enable low memory use per-metric, and
one of those tradeoffs results in requiring care in constructing the metric
trees.

Use the Init() pattern for static objects with metrics
------------------------------------------------------
A common pattern in embedded systems is to allocate many objects globally, and
reduce reliance on dynamic allocation (or eschew malloc entirely). This leads
to a pattern where rich/large objects are statically constructed at global
scope, then interacted with via tasks or threads. For example, consider a
hypothetical global ``Uart`` object:

.. code-block:: cpp

   class Uart {
    public:
     Uart(pw::span<std::byte> rx_buffer, pw::span<std::byte> tx_buffer)
       : rx_buffer_(rx_buffer), tx_buffer_(tx_buffer) {}

     // Send/receive here...

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;
   };

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer, uart_tx_buffer);

Through the course of building a product, the team may want to add metrics to
the UART, for example to gain insight into which operations are triggering lots
of data transfer. When adding metrics to the above imaginary UART object, one
might consider the following approach:

.. code-block:: cpp

   class Uart {
    public:
     Uart(pw::span<std::byte> rx_buffer,
          pw::span<std::byte> tx_buffer,
          pw::metric::Group& parent_metrics)
       : rx_buffer_(rx_buffer),
         tx_buffer_(tx_buffer) {
       // PROBLEM! parent_metrics may not be constructed if it's a reference
       // to a static global.
       parent_metrics.Add(tx_bytes_);
       parent_metrics.Add(rx_bytes_);
     }

     // Send/receive here which increment tx/rx_bytes.

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;

     PW_METRIC(tx_bytes_, "tx_bytes", 0u);
     PW_METRIC(rx_bytes_, "rx_bytes", 0u);
   };

   PW_METRIC_GROUP(root_metrics, "/");
   PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer,
              uart_tx_buffer,
              uart1_metrics);

However, this **is incorrect**, since ``parent_metrics`` (referring to
``uart1_metrics`` in this case) may not be constructed at the point where
``uart1`` is constructed. Thankfully, in the case of ``pw_metric`` this will
result in an assertion failure (or it will work correctly if the constructors
are called in a favorable order), so the problem will not go unnoticed.
Instead, consider using the ``Init()`` pattern for static objects, where
references to dependencies may only be stored during construction, but no
methods on the dependencies are called.

The ``Init()`` approach separates global object construction into two phases:
the constructor, where references are stored, and an ``Init()`` function, which
is called after all static constructors have run. This approach works
correctly, even when the objects are allocated globally:

.. code-block:: cpp

   class Uart {
    public:
     // Note that metrics is not passed in here at all.
     Uart(pw::span<std::byte> rx_buffer,
          pw::span<std::byte> tx_buffer)
       : rx_buffer_(rx_buffer),
         tx_buffer_(tx_buffer) {}

     // Precondition: parent_metrics is already constructed.
     void Init(pw::metric::Group& parent_metrics) {
       parent_metrics.Add(tx_bytes_);
       parent_metrics.Add(rx_bytes_);
     }

     // Send/receive here which increment tx/rx_bytes.

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;

     PW_METRIC(tx_bytes_, "tx_bytes", 0u);
     PW_METRIC(rx_bytes_, "rx_bytes", 0u);
   };

   PW_METRIC_GROUP(root_metrics, "/");
   PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer,
              uart_tx_buffer);

   int main() {
     // uart1_metrics is guaranteed to be initialized by this point, so it is
     // safe to pass it to Init().
     uart1.Init(uart1_metrics);
   }

.. attention::
   Be extra careful about **static global metric registration**. Consider using
   the ``Init()`` pattern.

Metric member order matters in objects
--------------------------------------
The order of declaring in-class groups and metrics matters if the metrics are
within a group declared inside the class. For example, the following class will
work fine:

.. code-block:: cpp

   #include "pw_metric/metric.h"

   class PowerSubsystem {
    public:
     pw::metric::Group& metrics() { return metrics_; }
     const pw::metric::Group& metrics() const { return metrics_; }

    private:
     PW_METRIC_GROUP(metrics_, "power");  // Note metrics_ declared first.
     PW_METRIC(metrics_, foo, "foo", 0.2f);
     PW_METRIC(metrics_, bar, "bar", 44000u);
   };

but the following one will not, since the group is constructed after the
metrics (and will result in a compile error):

.. code-block:: cpp

   #include "pw_metric/metric.h"

   class PowerSubsystem {
    public:
     pw::metric::Group& metrics() { return metrics_; }
     const pw::metric::Group& metrics() const { return metrics_; }

    private:
     PW_METRIC(metrics_, foo, "foo", 0.2f);
     PW_METRIC(metrics_, bar, "bar", 44000u);
     PW_METRIC_GROUP(metrics_, "power");  // Error: metrics_ must be first.
   };

.. attention::

   Put **groups before metrics** when declaring metrics members inside classes.

Thread safety
-------------
``pw_metric`` has **no built-in synchronization for manipulating the tree**
structure. Users are expected to either rely on a shared global mutex when
constructing the metric tree, or do the metric construction in a single thread
(e.g. a boot/init thread). The same applies for destruction, though we do not
advise destructing metrics or groups.

Individual metrics have atomic ``Increment()``, ``Set()``, and the value
accessors ``as_float()`` and ``as_int()`` which don't require separate
synchronization, and can be used from ISRs.

.. attention::

   **You must synchronize changes to the metric tree**. ``pw_metric`` does not
   internally synchronize access during construction. Metric ``Set()`` and
   ``Increment()`` calls are safe without extra locking.
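
As a rough illustration of why per-metric updates need no lock, the sketch
below wraps a ``std::atomic<uint32_t>``. ``AtomicCounter`` is a hypothetical
stand-in, not the ``pw::metric::Metric`` implementation, and the relaxed memory
ordering is an assumption; it also assumes the target provides 32-bit atomic
read-modify-write operations.

.. code-block:: cpp

   #include <atomic>
   #include <cstdint>

   // Hypothetical stand-in for a single int metric.
   class AtomicCounter {
    public:
     // One indivisible read-modify-write: a thread or ISR that interrupts the
     // caller cannot observe or overwrite a half-finished update.
     void Increment(uint32_t amount = 1) {
       value_.fetch_add(amount, std::memory_order_relaxed);
     }

     void Set(uint32_t value) {
       value_.store(value, std::memory_order_relaxed);
     }

     uint32_t value() const { return value_.load(std::memory_order_relaxed); }

    private:
     std::atomic<uint32_t> value_{0};
   };

Note that this protects only the value of one metric. Linking metrics into
groups still changes shared list pointers, which is why tree construction needs
the external synchronization described above.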

Lifecycle
---------
Metric objects are not designed to be destructed, and are expected to live for
the lifetime of the program or application. If you need dynamic
creation/destruction of metrics, ``pw_metric`` does not attempt to cover that
use case. Instead, ``pw_metric`` covers the case of products with two execution
phases:

1. A boot phase where the metric tree is created.
2. A run phase where metrics are collected. The tree structure is fixed.

Technically, it is possible to destruct metrics provided care is taken to
remove the given metric (or group) from the list it's contained in. However,
there are no helper functions for this, so be careful.
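
To make the required care concrete, the sketch below unlinks a node from a
singly linked intrusive list before it would be destroyed. ``Node`` and
``Unlink()`` are hypothetical; ``pw_metric`` provides no such helper, and its
real list type may differ.

.. code-block:: cpp

   #include <cstdint>

   // Hypothetical intrusive list node standing in for a metric.
   struct Node {
     uint32_t value = 0;
     Node* next = nullptr;
   };

   // Removes target from the list starting at head. Returns true if it was
   // found. After a successful call nothing in the list points at target.
   bool Unlink(Node*& head, Node& target) {
     for (Node** link = &head; *link != nullptr; link = &(*link)->next) {
       if (*link == &target) {
         *link = target.next;    // Bypass the node being removed.
         target.next = nullptr;  // Leave the removed node detached.
         return true;
       }
     }
     return false;
   }

The search is linear because a singly linked list has no back pointer, and the
caller must still ensure no other thread walks the list during removal.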

Below is an example that **is incorrect**. Don't do what follows!

.. code-block:: cpp

   #include "pw_metric/metric.h"

   int main() {
     PW_METRIC_GROUP(root, "/");
     {
       // BAD! The metrics have a different lifetime than the group.
       PW_METRIC(root, temperature, "temperature_f", 72.3f);
       PW_METRIC(root, humidity, "humidity_relative_percent", 33.2f);
     }
     // OOPS! root now has a linked list that points to the destructed
     // "humidity" object.
   }

.. attention::
   **Don't destruct metrics**. Metrics are designed to be registered /
   structured upfront, then manipulated during a device's active phase. They do
   not support destruction.

.. _module-pw_metric-exporting:

-----------------
Exporting metrics
-----------------
Collecting metrics on a device is not useful without a mechanism to export
those metrics for analysis and debugging. ``pw_metric`` offers optional RPC
service libraries (``:metric_service_nanopb`` based on nanopb, and
``:metric_service_pwpb`` based on pw_protobuf) that enable exporting a
user-supplied set of on-device metrics via RPC. This facility is intended to
function from the early stages of device bringup through production in the
field.

The metrics are fetched by calling the ``MetricService.Get`` RPC method, which
streams all registered metrics to the caller in batches (server streaming RPC).
Batching the returned metrics avoids requiring a large buffer or large RPC MTU.
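
The batching idea can be sketched independently of RPC. In the hypothetical
helper below, ``Sample``, ``SendInBatches()``, and the ``sink`` callback are
illustrative names rather than part of ``pw_metric`` or ``pw_rpc``; the point is
only that a small fixed buffer is filled and flushed repeatedly instead of
sizing one buffer for every metric.

.. code-block:: cpp

   #include <array>
   #include <cstddef>
   #include <cstdint>

   // Hypothetical exported record: a tokenized name and a value.
   struct Sample {
     uint32_t token;
     uint32_t value;
   };

   // Copies samples into a fixed-size batch and hands each full (or final
   // partial) batch to sink(const Sample*, std::size_t). Returns the number of
   // batches sent.
   template <std::size_t kBatchSize, typename Sink>
   std::size_t SendInBatches(const Sample* samples,
                             std::size_t count,
                             Sink&& sink) {
     static_assert(kBatchSize > 0, "A batch must hold at least one sample");
     std::array<Sample, kBatchSize> batch{};
     std::size_t used = 0;
     std::size_t batches_sent = 0;
     for (std::size_t i = 0; i < count; ++i) {
       batch[used++] = samples[i];
       if (used == kBatchSize) {
         sink(batch.data(), used);
         used = 0;
         ++batches_sent;
       }
     }
     if (used > 0) {  // Flush the final partial batch.
       sink(batch.data(), used);
       ++batches_sent;
     }
     return batches_sent;
   }

With a batch size of 3, seven samples would be delivered as batches of 3, 3,
and 1, so the working buffer never holds more than three samples.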

The returned metric objects have flattened paths to the root. For example, the
returned metrics (post detokenization and jsonified) might look something like:

.. code-block:: json

   {
     "/i2c1/failed_txns": 17,
     "/i2c1/total_txns": 2013,
     "/i2c1/gyro/resets": 24,
     "/i2c1/gyro/hangs": 1,
     "/spi1/thermocouple/reads": 242,
     "/spi1/thermocouple/temp_celsius": 34.52
   }

Note that there is no nesting of the groups; the nesting is implied from the
path.
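
A minimal sketch of such flattening is shown below. ``GroupNode``,
``MetricNode``, and ``Flatten()`` are hypothetical host-side types that use
plain strings; the on-device service works with tokens, and its actual
traversal code may differ.

.. code-block:: cpp

   #include <string>
   #include <utility>
   #include <vector>

   // Hypothetical detokenized tree, as host tooling might hold it.
   struct MetricNode {
     std::string name;
     double value;
   };

   struct GroupNode {
     std::string name;
     std::vector<MetricNode> metrics;
     std::vector<GroupNode> children;
   };

   using FlatMetrics = std::vector<std::pair<std::string, double>>;

   // Appends one "/group/.../metric" entry per leaf metric, depth first.
   void Flatten(const GroupNode& group,
                const std::string& prefix,
                FlatMetrics& out) {
     const std::string path = prefix + "/" + group.name;
     for (const MetricNode& metric : group.metrics) {
       out.emplace_back(path + "/" + metric.name, metric.value);
     }
     for (const GroupNode& child : group.children) {
       Flatten(child, path, out);
     }
   }

Calling ``Flatten(i2c1, "", out)`` on a group ``i2c1`` containing a child group
``gyro`` would yield paths such as ``/i2c1/failed_txns`` and
``/i2c1/gyro/resets``.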

RPC service setup
-----------------
To expose a ``MetricService`` in your application, do the following:

1. Define metrics around the system, and put them in a group or list of
   metrics. Easy choices include the ``global_groups`` and ``global_metrics``
   variables, or create your own.
2. Create an instance of ``pw::metric::MetricService``.
3. Register the service with your RPC server.

For example:

.. code-block:: cpp

   #include "pw_rpc/server.h"
   #include "pw_metric/metric.h"
   #include "pw_metric/global.h"
   #include "pw_metric/metric_service_nanopb.h"

   // Note: You must customize the RPC server setup; see pw_rpc.
   Channel channels[] = {
     Channel::Create<1>(&uart_output),
   };
   Server server(channels);

   // Metric service instance, pointing to the global metric objects.
   // This could also point to custom per-product or application objects.
   pw::metric::MetricService metric_service(
       pw::metric::global_metrics,
       pw::metric::global_groups);

   void RegisterServices() {
     server.RegisterService(metric_service);
     // Register other services here.
   }

   int main() {
     // ... system initialization ...

     RegisterServices();

     // ... start your application ...
   }

.. attention::
   Take care when exporting metrics. Ensure **appropriate access control** is in
   place. In some cases it may make sense to entirely disable metrics export for
   production builds. Although reading metrics via RPC won't influence the
   device, in some cases the metrics could expose sensitive information if
   product owners are not careful.

.. attention::
   **MetricService::Get is a synchronous RPC method**

   Calls to ``MetricService::Get`` are blocking and will send all metrics
   immediately, even though it is a server-streaming RPC. This will work fine if
   the device doesn't have too many metrics, or doesn't have concurrent RPCs
   like logging, but could be a problem in some cases.

   We plan to offer an async version where the application is responsible for
   pumping the metrics into the streaming response. This gives flow control to
   the application.

-----------
Size report
-----------
The size report below shows the cost in code and memory for a few examples of
metrics. This does not include the RPC service.

.. include:: metric_size_report

.. attention::
   At time of writing, **the above sizes show an unexpectedly large flash
   impact**. We are investigating why GCC is inserting large global static
   constructors per group, when all the logic should be reused across objects.

-------------
Metric Parser
-------------
The ``metric_parser`` Python module requests the system metrics via RPC, then
parses the response while detokenizing the group and metric names, and returns
the metrics in a dictionary organized by group and value.

----------------
Design tradeoffs
----------------
There are many possible approaches to metrics collection and aggregation. We've
chosen some points on the tradeoff curve:

- **Atomic-sized metrics** - Using simple metric objects with just uint32/float
  enables atomic operations. While it might be nice to support larger types, it
  is more useful to have safe metrics increment from interrupt service
  routines.

- **No aggregate metrics (yet)** - Aggregate metrics (e.g. average, max, min,
  histograms) are not supported, and must be built on top of the simple base
  metrics. By taking this route, we can considerably simplify the core metrics
  system and have aggregation logic in separate modules. Those modules can then
  feed into the metrics system - for example by creating multiple metrics for a
  single underlying metric. For example: "foo", "foo_max", "foo_min" and so on.

  The other problem with automatic aggregation is that what period the
  aggregation happens over is often important, and it can be hard to design
  this cleanly into the API. Instead, this responsibility is pushed to the user
  who must take more care.

  Note that we will add helpers for aggregated metrics.

- **No virtual metrics** - An alternate approach to the concrete Metric class
  in the current module is to have a virtual interface for metrics, and then
  allow those metrics to have their own storage. This is attractive but can
  lead to many vtables and excess memory use in simple one-metric use cases.

- **Linked list registration** - Using linked lists for registration is a
  tradeoff, accepting some memory overhead in exchange for flexibility. Other
  alternatives include a global table of metrics, which has the disadvantage of
  requiring centralizing the metrics -- an impossibility for middleware like
  Pigweed.

- **Synchronization** - The only synchronization guarantee provided by
  ``pw_metric`` is that increment and set are atomic. Other than that, users
  are on their own to synchronize metric collection and updating.

- **No fast metric lookup** - The current design does not make it fast to
  look up a metric at runtime; instead, one must run a linear search of the
  tree to find the matching metric. In most non-dynamic use cases, this is fine
  in practice, and saves having a more involved hash table. Metric updates will
  be through direct member or variable accesses.

- **Relying on C++ static initialization** - In short, the convenience
  outweighs the cost and risk. Without static initializers, it would be
  impossible to automatically collect the metrics without post-processing the
  C++ code to find the metrics; a large undertaking of debatable value. We
  have carefully analyzed the static initializer behaviour of Pigweed's
  IntrusiveList and are confident it is correct.

- **Both local & global support** - Potentially just one approach (the local or
  global one) could be offered, making the module less complex. However, we
  feel the additional complexity is worthwhile since there are legitimate use
  cases for both e.g. ``PW_METRIC`` and ``PW_METRIC_GLOBAL``. We'd prefer to
  have a well-tested upstream solution for these use cases rather than have
  customers re-implement one of these.
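
As a sketch of the "build aggregation on top" approach from the list above, the
hypothetical helper below keeps three plain values that stand in for three
separate simple metrics (for example "foo", "foo_min", and "foo_max").
``MinMaxTracker`` is not part of ``pw_metric``, and the caller decides when an
aggregation period ends by calling ``ResetPeriod()``.

.. code-block:: cpp

   #include <algorithm>
   #include <cstdint>

   // Hypothetical aggregation helper. Each field would map to its own simple
   // metric in a real integration.
   struct MinMaxTracker {
     uint32_t last = 0;
     uint32_t min_value = UINT32_MAX;
     uint32_t max_value = 0;

     // Records one sample and updates the running extremes.
     void Record(uint32_t sample) {
       last = sample;
       min_value = std::min(min_value, sample);
       max_value = std::max(max_value, sample);
     }

     // Starts a new aggregation period; the period is the caller's choice.
     void ResetPeriod() {
       min_value = UINT32_MAX;
       max_value = 0;
     }
   };

Unlike a single atomic metric, the three fields here are not updated as one
atomic unit, so this sketch assumes samples are recorded from a single context.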

----------------
Roadmap & Status
----------------
- **String metric names** - ``pw_metric`` stores metric names as tokens. On one
  hand, this is great for production where having a compact binary is often a
  requirement to fit the application in the given part. However, in early
  development before flash is a constraint, string names are more convenient to
  work with since there is no need for host-side detokenization. We plan to add
  optional support for string names.

- **Aggregate metrics** - We plan to add support for aggregate metrics on top
  of the simple metric mechanism, either as another module or as additional
  functionality inside this one. Likely examples include min/max.

- **Selectively enable or disable metrics** - Currently the metrics are always
  enabled once included. In practice this is not ideal since many times only a
  few metrics are wanted in production, but having to strip all the metrics
  code is error prone. Instead, we will add support for controlling what
  metrics are enabled or disabled at compile time. This may rely on C++20's
  support for zero-sized members to fully remove the cost.

- **Async RPC** - The current RPC service exports the metrics by streaming
  them to the client in batches. However, the current solution streams all the
  metrics to completion; this may block the RPC thread. In the future we will
  have an async solution where the user is in control of flow priority.

- **Timer integration** - We would like to add a stopwatch type mechanism to
  time multiple in-flight events.

- **C support** - In practice it's often useful or necessary to instrument
  C-only code. While it will be impossible to support the global registration
  system that the C++ version supports, we will figure out a solution to make
  instrumenting C code relatively smooth.

- **Global counter** - We may add a global metric counter to help detect cases
  where post-initialization metrics manipulations are done.

- **Proto structure** - It may be possible to directly map metrics to a custom
  proto structure, where instead of a name or token field, a tag field is
  provided. This could result in elegant export to an easily machine parsable
  and compact representation on the host. We may investigate this in the
  future.

- **Safer data structures** - At a cost of 4B per metric and 4B per group, it
  may be possible to make metric structure instantiation safe even in static
  constructors, and also make it safe to remove metrics dynamically. We will
  consider whether this tradeoff is the right one, since a 4B cost per metric
  is substantial on projects with many metrics.