.. _module-pw_tokenizer:

============
pw_tokenizer
============
.. pigweed-module::
   :name: pw_tokenizer

Logging is critical, but developers are often forced to choose between
additional logging and saving crucial flash space. The ``pw_tokenizer`` module
enables **extensive logging with substantially less memory usage** by replacing
printf-style strings with binary tokens during compilation. It is designed to
integrate easily into existing logging systems.

Although the most common application of ``pw_tokenizer`` is binary logging,
**the tokenizer is general purpose and can be used to tokenize any strings**,
with or without printf-style arguments.

Why tokenize strings?

* **Dramatically reduce binary size** by removing string literals from binaries.
* **Reduce I/O traffic, RAM, and flash usage** by sending and storing compact tokens
  instead of strings. We've seen a reduction of over 50% in encoded log contents.
* **Reduce CPU usage** by replacing ``snprintf`` calls with simple tokenization code.
* **Remove potentially sensitive log, assert, and other strings** from binaries.

.. grid:: 1

   .. grid-item-card:: :octicon:`rocket` Get started
      :link: module-pw_tokenizer-get-started
      :link-type: ref
      :class-item: sales-pitch-cta-primary

      Integrate pw_tokenizer into your project.

.. grid:: 2

   .. grid-item-card:: :octicon:`code-square` Tokenization
      :link: module-pw_tokenizer-tokenization
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Convert strings and arguments to tokens.

   .. grid-item-card:: :octicon:`code-square` Token databases
      :link: module-pw_tokenizer-token-databases
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Store a mapping of tokens to the strings and arguments they represent.

.. grid:: 2

   .. grid-item-card:: :octicon:`code-square` Detokenization
      :link: module-pw_tokenizer-detokenization
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Expand tokens back to the strings and arguments they represent.

   .. grid-item-card:: :octicon:`info` API reference
      :link: module-pw_tokenizer-api
      :link-type: ref
      :class-item: sales-pitch-cta-secondary

      Detailed reference information about the pw_tokenizer API.


.. _module-pw_tokenizer-tokenized-logging-example:

---------------------------
Tokenized logging in action
---------------------------
Here's an example of how ``pw_tokenizer`` enables you to store
and send the same logging information using significantly fewer
resources:

.. mermaid::

   flowchart TD

     subgraph after["After: Tokenized Logs (37 bytes saved!)"]
       after_log["LOG(#quot;Battery Voltage: %d mV#quot;, voltage)"] -- 4 bytes stored on-device as... -->
       after_encoding["d9 28 47 8e"] -- 6 bytes sent over the wire as... -->
       after_transmission["d9 28 47 8e aa 3e"] -- Displayed in logs as... -->
       after_display["#quot;Battery Voltage: 3989 mV#quot;"]
     end

     subgraph before["Before: No Tokenization"]
       before_log["LOG(#quot;Battery Voltage: %d mV#quot;, voltage)"] -- 41 bytes stored on-device as... -->
       before_encoding["#quot;Battery Voltage: %d mV#quot;"] -- 43 bytes sent over the wire as... -->
       before_transmission["#quot;Battery Voltage: 3989 mV#quot;"] -- Displayed in logs as... -->
       before_display["#quot;Battery Voltage: 3989 mV#quot;"]
     end

     style after stroke:#00c852,stroke-width:3px
     style before stroke:#ff5252,stroke-width:3px

A quick overview of how the tokenized version works:

* You tokenize ``"Battery Voltage: %d mV"`` with a macro like
  :c:macro:`PW_TOKENIZE_STRING`. You can use :ref:`module-pw_log_tokenized`
  to handle the tokenization automatically.
* After tokenization, ``"Battery Voltage: %d mV"`` becomes ``d9 28 47 8e``.
* The first 4 bytes sent over the wire are the token for
  ``"Battery Voltage: %d mV"``. The last 2 bytes are the value of ``voltage``
  converted to a varint using :ref:`module-pw_varint`.
* The logs are converted back to the original, human-readable message
  via the :ref:`Detokenization API <module-pw_tokenizer-detokenization>`
  and a :ref:`token database <module-pw_tokenizer-token-databases>`.

.. toctree::
   :hidden:
   :maxdepth: 1

   Get started <get_started>
   tokenization
   token_databases
   detokenization
   API reference <api>
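
To make the byte counts in the tokenized logging example concrete, here is a
minimal Python sketch that rebuilds the 6-byte message and expands it again.
It does not use the ``pw_tokenizer`` Python API. The function names and the
one-entry ``DATABASE`` dictionary are hypothetical stand-ins, and the sketch
assumes the argument is ZigZag-encoded before being written as a varint, which
is consistent with the ``aa 3e`` bytes in the example.

.. code-block:: python

   # Token bytes and format string are taken from the example above.
   # DATABASE is a hypothetical stand-in for a real token database.
   TOKEN = bytes.fromhex("d928478e")
   DATABASE = {TOKEN: "Battery Voltage: %d mV"}


   def zigzag_varint(value: int) -> bytes:
       """Encodes a signed 64-bit integer as a ZigZag varint."""
       zigzag = (value << 1) ^ (value >> 63)  # Map signed to unsigned.
       out = bytearray()
       while True:
           byte = zigzag & 0x7F  # Take the low 7 bits.
           zigzag >>= 7
           if zigzag:
               out.append(byte | 0x80)  # High bit set: more bytes follow.
           else:
               out.append(byte)
               return bytes(out)


   def decode_zigzag_varint(data: bytes) -> int:
       """Decodes a single ZigZag varint from the start of data."""
       zigzag = 0
       for index, byte in enumerate(data):
           zigzag |= (byte & 0x7F) << (7 * index)
           if not byte & 0x80:
               break
       return (zigzag >> 1) ^ -(zigzag & 1)  # Undo the ZigZag mapping.


   def encode(token: bytes, voltage: int) -> bytes:
       """Builds the wire message: 4 token bytes, then the varint argument."""
       return token + zigzag_varint(voltage)


   def detokenize(message: bytes) -> str:
       """Looks up the format string and fills in the decoded argument."""
       return DATABASE[message[:4]] % decode_zigzag_varint(message[4:])


   message = encode(TOKEN, 3989)
   print(message.hex(" "))     # d9 28 47 8e aa 3e
   print(detokenize(message))  # Battery Voltage: 3989 mV

The device only ever handles the 4-byte token and the 2-byte argument. The
format string lives in the database on the host, which is why it can be
removed from the binary.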