---
title: 'Shaped Text'
linkTitle: 'Shaped Text'
---

A series of object models for describing a low-level builder for multi-line formatted text, and the resulting objects that expose the results of shaping that text. These are done outside of DOM Text nodes, and outside of any particular rendering model (e.g. canvas2d or webgl).

A related explainer focuses on suggested [extensions to canvas2d](/docs/dev/design/text_c2d) to allow it to efficiently render the shaped results, and to offer helper objects for inspecting useful properties from a typeface.

[Overview document](/docs/dev/design/text_overview)

## Target audience

We want to target web apps that have already chosen to render their content either in canvas2d,
or webgl, or in some other fashion, but still want access to the powerful international text shaping
and layout services inherent in the browser. In the case of canvas2d, which already has some facilities
for text, we want to address the missing services and low-level results needed for creating interactive
text editing or high-performance rendering and animations.

Rather than 'extend' the existing canvas2d fillText() method, we propose an explicit 2-step model:
process the 'rich' text input into shaped results, and then expose those results to the client, allowing
them to draw or edit or consume the results as they choose.

JavaScript frameworks are another target audience. This proposal is heavily influenced by successful
APIs on native platforms (desktop and mobile) and seeks to deliver similar control and performance.
Thus it may be quite natural that sophisticated frameworks build upon these interfaces, providing more
'friendly', constrained versions of the features. This is expected, since multiple 'high level' models
for text are valid, each with its own opinions and tradeoffs.
The goal of this API is to expose the
core services and results, and leave the opinionated layers to the JavaScript community.

### Principles
* An imperative JavaScript-friendly text representation.
* Restrict input to only what is needed for shaping and metrics.
* Decorations (e.g. colors, underlines, shadows, effects) are explicitly not specified, as those can
vary widely with rendering technologies (and client imagination).

## Sequence of calls

For maximum re-use and efficiency, the process of going from rich-text description to final shaped
and formatted results is broken into stages. Each 'stage' carries out specific processing, and in turn
becomes a factory that returns an instance of the next stage.

`ShapedTextBuilder`, `ShapedText` and `FormattedText` objects are used in sequence:

```js
// Pseudocode: the inline comments show the expected argument types.
const builder = new ShapedTextBuilder(direction, fontFallbackChain);
const shaped = builder.shape(text /* DOMString */, blocks /* sequence<TextBlock> */);
const formatted = shaped.format(width /* double */, height /* double */, alignment);
```

A `TextBlock` is a descriptor for a run of text. Currently there are two specializations
(`FontBlock` and `PlaceholderBlock`), but others may be added without breaking the design.

```WebIDL
interface Typeface {
    // Number or opaque object: Whatever is needed for the client to know exactly
    // what font-resource (e.g. file, byte-array, etc.) is being used.
    // Without this, the glyph IDs would be meaningless.
    //
    // This interface is really an “instance” of the font-resource. It includes
    // any font-wide modifiers that the client (or the shaper) may have requested:
    // e.g. variations, synthetic-bold, …
    //
    // Factories to create Typeface can be described elsewhere. The point here
    // is that such a unique identifier exists for each font-asset-instance,
    // and that they can be passed around (in/out of the browser), and compared
    // to each other.
};

interface TextBlock {
    attribute unsigned long length;  // number of codepoints in this block
};

interface InFont {
    attribute sequence<Typeface> typefaces;  // for preferred fallback faces
    attribute double size;
    attribute double? scaleX;  // 1.0 if not specified
    attribute double? skewX;   // 0.0 if not specified (for oblique)

    attribute sequence<FontFeature>? features;
    // additional attributes for letter spacing, etc.
};

interface FontBlock : TextBlock {
    attribute InFont font;
};

interface PlaceholderBlock : TextBlock {
    attribute double width;
    attribute double height;
    attribute double offsetFromBaseline;
};

interface ShapedTextBuilder {
    constructor(TextDirection,        // default direction (e.g. R2L, L2R)
                sequence<Typeface>?,  // optional shared fallback sequence (after TextBlock's)
                ...);

    ShapedText shape(DOMString text, sequence<TextBlock> blocks);
};
```

Here is a simple example, specifying 3 blocks for the text.

```js
// `builder` is the ShapedTextBuilder created in the previous snippet.
const fontA = new Font({"family-name": "Helvetica", size: 14});
const fontB = new Font({"family-name": "Times", size: 18});
const blocks = [
  { length: 6, font: fontA },
  { length: 5, font: fontB },
  { length: 6, font: fontA },
];

const shaped = builder.shape("Hello text world.", blocks);

// Now we can format the shaped text to get access to glyphs and positions.

const formatted = shaped.format({width: 50, alignment: CENTER});
```

This is explicitly intended to be efficient, both for the browser to digest, and for the client to
be able to reuse compound objects as they choose (e.g. reusing fontA in this example).

If there is a mismatch between the length of the text string and the sum of the blocks' lengths,
then an exception is raised.
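
A client may want to catch such a mismatch before calling `shape()`. The sketch below is one way to do that, not part of the proposal: `countCodePoints` and `checkBlocks` are hypothetical helper names. It assumes block lengths are measured in codepoints, as the `TextBlock` comment states, so it counts codepoints rather than relying on the string's `length`, which counts UTF-16 code units.

```js
// Hypothetical helper (not part of the proposal): counts codepoints, not
// UTF-16 code units, because TextBlock.length is specified in codepoints.
function countCodePoints(text) {
  let count = 0;
  for (const _ of text) count++;  // string iteration yields whole codepoints
  return count;
}

// Hypothetical helper: throws if the block lengths do not add up to the
// codepoint length of the text; otherwise returns the total.
function checkBlocks(text, blocks) {
  const total = blocks.reduce((sum, block) => sum + block.length, 0);
  const expected = countCodePoints(text);
  if (total !== expected) {
    throw new RangeError(`blocks cover ${total} codepoints, text has ${expected}`);
  }
  return total;
}

// The three blocks from the example above: 6 + 5 + 6 matches the 17
// codepoints of the text, so no exception is thrown.
checkBlocks("Hello text world.", [{ length: 6 }, { length: 5 }, { length: 6 }]);
```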

## Access the results of shaping and formatting
`FormattedText` exposes interaction methods and the raw data results:

```WebIDL
typedef unsigned long TextIndex;

interface TextPosition {
    readonly attribute TextIndex textIndex;
    readonly attribute unsigned long lineIndex;
    readonly attribute unsigned long runIndex;
    readonly attribute unsigned long glyphIndex;
};

interface FormattedText {
    // Interaction methods

    // Given a valid index into the text, adjust it for proper grapheme
    // boundaries, and return the TextPosition.
    TextPosition indexToPosition(TextIndex index);

    // Given an x,y position, return the TextPosition
    // (adjusted for proper grapheme boundaries).
    TextPosition hitTextToPosition(double x, double y);

    // Given two logical text indices (e.g. the start and end of a selection range),
    // return the corresponding visual set of ranges (e.g. for highlighting).
    sequence<TextPosition> indicesToVisualSelection(TextIndex t0, TextIndex t1);

    // Raw data

    readonly attribute Rect bounds;

    readonly attribute sequence<TextLine> lines;
};
```

The sequence of TextLines is really an array of arrays: each line contains an
array of Runs (either Glyphs or Placeholders for now).

```WebIDL
// Shared by all output runs, specifying the range of code-points that produced
// this run. Known subclasses: GlyphRun, PlaceholderRun.
interface TextRun {
    readonly attribute TextIndex startIndex;
    readonly attribute TextIndex endIndex;
};

interface GlyphRunFont {
    // Information to know which font-resource (typeface) to use,
    // and at what transformation (size, etc.) to use it.
    //
    readonly attribute Typeface typeface;
    readonly attribute double size;
    readonly attribute double? scaleX;  // 1.0 if not specified
    readonly attribute double? skewX;   // 0.0 if not specified (could be a bool)
};

interface GlyphRun : TextRun {
    readonly attribute GlyphRunFont font;

    // Information to know what positioned glyphs are in the run,
    // and what the corresponding text offsets are for those glyphs.
    // These “offsets” are not needed to correctly draw the glyphs, but are needed
    // during selections and editing, to know the mapping back to the original text.
    //
    readonly attribute sequence<unsigned short> glyphs;  // N glyphs
    readonly attribute sequence<float> positions;        // N+1 x,y pairs
    readonly attribute sequence<TextIndex> indices;      // N+1 indices
};

interface PlaceholderRun : TextRun {
    readonly attribute Rect bounds;
};

interface TextLine {
    readonly attribute TextIndex startIndex;
    readonly attribute TextIndex endIndex;

    readonly attribute double top;
    readonly attribute double bottom;
    readonly attribute double baselineY;

    readonly attribute sequence<TextRun> runs;
};
```

With these data results (specifically glyphs and positions for specific Typeface objects)
callers will have all they need to draw the results in any fashion they wish. The corresponding
start/end text indices allow them to map each run back to the original text.

This last point is fundamental to the design. It is recognized that a client creating richly
annotated text will associate both shaping (e.g. Font) information and arbitrary decoration
and other annotations with each block of text. Returning in each Run the corresponding text range
allows the client to "look up" all of their bespoke additional information for that run (e.g.
colors, shadows, underlines, placeholders, etc.).
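
The look-up described above might be sketched as follows. This is an illustration under stated assumptions, not part of the proposal: the `decorations` table and the `decorationsForRun` helper are hypothetical client-side code, `mockLines` is a hand-written stand-in for the shape of `FormattedText.lines`, and `endIndex` is assumed to be exclusive, which the interfaces above do not specify.

```js
// Hypothetical client-side decoration table, keyed by text range [start, end).
// The browser never sees this data.
const decorations = [
  { start: 0, end: 6, color: "black" },
  { start: 6, end: 11, color: "red", underline: true },
  { start: 11, end: 17, color: "black" },
];

// Returns every decoration whose range overlaps the run's text range.
// Assumes endIndex is exclusive (an assumption, not stated by the proposal).
function decorationsForRun(run, table) {
  return table.filter((d) => d.start < run.endIndex && run.startIndex < d.end);
}

// Hand-written mock with the same shape as FormattedText.lines, standing in
// for real shaping output.
const mockLines = [
  { runs: [{ startIndex: 0, endIndex: 6 }, { startIndex: 6, endIndex: 9 }] },
  { runs: [{ startIndex: 9, endIndex: 11 }, { startIndex: 11, endIndex: 17 }] },
];

// Walk every run and report which client decorations apply to it.
for (const line of mockLines) {
  for (const run of line.runs) {
    const colors = decorationsForRun(run, decorations).map((d) => d.color);
    console.log(`run [${run.startIndex}, ${run.endIndex}): ${colors.join(", ")}`);
  }
}
```
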
This frees the browser from having to support
or even understand the union of all possible decorations (obviously impossible).


## Alternatives and Prior Art

This model is designed to be low-level, to appeal to performance-sensitive applications
(e.g. animations) and sophisticated text handling (e.g. editors). It is also intended to feel
'natural' to a developer coming from a native app environment (desktop or mobile).

We recognize that many (more casual) users may also want access to some of these services. That is
appropriate, but we posit that with the right primitives and data exposed, such higher-level models
can be constructed by the JavaScript community itself, either as formal frameworks or as refined
sample / example code.

One excellent example of a higher-level data model is [Formatted Text](https://github.com/WICG/canvas-formatted-text/blob/main/explainer-datamodel.md), and we hope to explore ways to layer these
two proposals, allowing high-level clients to utilize their data model, but still have the option
to access our lower-level accessors (as they wish).

## Rendering in Canvas2D
The [next explainer](/docs/dev/design/text_c2d) describes how to take these results and render them
into an (extended) Canvas context.

## Contributors:
 [mikerreed](https://github.com/mikerreed)