xref: /aosp_15_r20/external/executorch/docs/source/extension-tensor.md (revision 523fa7a60841cd1ecfb9cc4201f1ca8b03ed023a)
# Managing Tensor Memory in C++

**Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)

Tensors are fundamental data structures in ExecuTorch, representing multi-dimensional arrays used in computations for neural networks and other numerical algorithms. In ExecuTorch, the [Tensor](https://github.com/pytorch/executorch/blob/main/runtime/core/portable_type/tensor.h) class doesn’t own its metadata (sizes, strides, dim_order) or data, keeping the runtime lightweight. Users are responsible for supplying all these memory buffers and ensuring that the metadata and data outlive the `Tensor` instance. While this design is lightweight and flexible, especially for tiny embedded systems, it places a significant burden on the user. If your environment requires minimal dynamic allocations, a small binary footprint, or limited C++ standard library support, you’ll need to accept that trade-off and stick with the regular `Tensor` type.

Imagine you’re working with a [`Module`](extension-module.md) interface, and you need to pass a `Tensor` to the `forward()` method. You would need to declare and maintain at least the sizes array and data separately, sometimes the strides too, often leading to the following pattern:

```cpp
#include <executorch/extension/module/module.h>

using namespace executorch::aten;
using namespace executorch::extension;

SizesType sizes[] = {2, 3};
DimOrderType dim_order[] = {0, 1};
StridesType strides[] = {3, 1};
float data[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
TensorImpl tensor_impl(
    ScalarType::Float,
    std::size(sizes),
    sizes,
    data,
    dim_order,
    strides);
// ...
module.forward(Tensor(&tensor_impl));
```

You must ensure `sizes`, `dim_order`, `strides`, and `data` stay valid. This makes code maintenance difficult and error-prone. Users have struggled to manage lifetimes, and many have created their own ad-hoc managed tensor abstractions to hold all the pieces together, leading to a fragmented and inconsistent ecosystem.

## Introducing TensorPtr

To alleviate these issues, ExecuTorch provides `TensorPtr`, a smart pointer that manages the lifecycle of both the tensor's data and its dynamic metadata.

With `TensorPtr`, users no longer need to worry about metadata lifetimes separately. Data ownership is determined based on whether it is passed by pointer or moved into the `TensorPtr` as a `std::vector`. Everything is bundled in one place and managed automatically, enabling you to focus on actual computations.

Here’s how you can use it:

```cpp
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using namespace executorch::extension;

auto tensor = make_tensor_ptr(
    {2, 3},                                // sizes
    {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f}); // data
// ...
module.forward(tensor);
```

The data is now owned by the tensor instance because it is provided as a vector. To create a non-owning `TensorPtr`, pass the data by pointer instead. The scalar type is deduced automatically from the data vector (`float`). If not specified explicitly as additional arguments, `strides` and `dim_order` are computed to default values based on the `sizes`.

`EValue` in `Module::forward()` accepts `TensorPtr` directly, ensuring seamless integration. `EValue` can now be constructed implicitly with a smart pointer to any type that it can hold. This allows `TensorPtr` to be dereferenced implicitly when passed to `forward()`, and `EValue` will hold the `Tensor` that the `TensorPtr` points to.

## API Overview

`TensorPtr` is literally an alias for `std::shared_ptr<Tensor>`, so it can be copied and passed around cheaply without duplicating the data and metadata. Each `Tensor` instance may either own its data or reference external data.

### Creating Tensors

There are several ways to create a `TensorPtr`.

#### Creating Scalar Tensors

You can create a scalar tensor, i.e., a tensor with zero dimensions or with one of its sizes being zero.

*Providing A Single Data Value*

```cpp
auto tensor = make_tensor_ptr(3.14);
```

The resulting tensor will contain a single value `3.14` of type `double`, which is deduced automatically.

*Providing A Single Data Value with a Type*

```cpp
auto tensor = make_tensor_ptr(42, ScalarType::Float);
```

Now the integer `42` is cast to `float`, and the tensor contains a single value `42` of type `float`.

#### Owning Data from a Vector

When you provide sizes and data vectors, `TensorPtr` takes ownership of both the data and the sizes.

*Providing Data Vector*

```cpp
auto tensor = make_tensor_ptr(
    {2, 3},                                 // sizes
    {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f});  // data (float)
```

The type is deduced automatically as `ScalarType::Float` from the data vector.

*Providing Data Vector with a Type*

If you provide data of one type but specify a different scalar type, the data will be cast to the given type.

```cpp
auto tensor = make_tensor_ptr(
    {1, 2, 3, 4, 5, 6},          // data (int)
    ScalarType::Double);         // double scalar type
```

In this example, even though the data vector contains integers, we specify the scalar type as `Double`. The integers are cast to double, and the new data vector is owned by the `TensorPtr`. Since the `sizes` argument is skipped in this example, the tensor is one-dimensional with a size equal to the length of the data vector. Note that the reverse cast, from a floating-point type to an integral type, is not allowed because it would lose precision. Similarly, casting other types to `Bool` is disallowed.

*Providing Data Vector as `std::vector<uint8_t>`*

You can also provide raw data in the form of a `std::vector<uint8_t>`, specifying the sizes and scalar type. The data will be reinterpreted according to the provided type.

```cpp
std::vector<uint8_t> data = /* raw data */;
auto tensor = make_tensor_ptr(
    {2, 3},                 // sizes
    std::move(data),        // data as uint8_t vector
    ScalarType::Int);       // int scalar type
```

The `data` vector must be large enough to accommodate all the elements according to the provided sizes and scalar type.

#### Non-Owning Data from Raw Pointer

You can create a `TensorPtr` that references existing data without taking ownership.

*Providing Raw Data*

```cpp
float data[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
auto tensor = make_tensor_ptr(
    {2, 3},              // sizes
    data,                // raw data pointer
    ScalarType::Float);  // float scalar type
```

The `TensorPtr` does not own the data, so you must ensure the `data` remains valid for the tensor's lifetime.

*Providing Raw Data with Custom Deleter*

If you want the `TensorPtr` to manage the lifetime of the data, you can provide a custom deleter.

```cpp
auto* data = new double[6]{1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
auto tensor = make_tensor_ptr(
    {2, 3},                               // sizes
    data,                                 // data pointer
    ScalarType::Double,                   // double scalar type
    TensorShapeDynamism::DYNAMIC_BOUND,   // default dynamism
    [](void* ptr) { delete[] static_cast<double*>(ptr); });
```

The `TensorPtr` will call the custom deleter when it is destroyed, i.e., when the smart pointer is reset and no more references to the underlying `Tensor` exist.

#### Sharing Existing Tensor

Since `TensorPtr` is a `std::shared_ptr<Tensor>`, you can easily create a `TensorPtr` that shares an existing `Tensor`. Any changes made to the shared data are reflected across all instances sharing the same data.

*Sharing Existing TensorPtr*

```cpp
auto tensor = make_tensor_ptr({2, 3}, {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f});
auto tensor_copy = tensor;
```

Now `tensor` and `tensor_copy` point to the same data and metadata.

#### Viewing Existing Tensor

You can create a `TensorPtr` from an existing `Tensor`, copying its properties and referencing the same data.

*Viewing Existing Tensor*

```cpp
Tensor original_tensor = /* some existing tensor */;
auto tensor = make_tensor_ptr(original_tensor);
```

Now the newly created `TensorPtr` references the same data as the original tensor, but has its own copy of the metadata, so it can interpret or "view" the data differently. Any modifications to the data, however, are reflected in the original `Tensor` as well.

### Cloning Tensors

To create a new `TensorPtr` that owns a copy of the data from an existing tensor:

```cpp
Tensor original_tensor = /* some existing tensor */;
auto tensor = clone_tensor_ptr(original_tensor);
```

The newly created `TensorPtr` has its own copy of the data, so it can modify and manage it independently.
Likewise, you can create a clone of an existing `TensorPtr`.

```cpp
auto original_tensor = make_tensor_ptr(/* ... */);
auto tensor = clone_tensor_ptr(original_tensor);
```

Note that, regardless of whether the original `TensorPtr` owns the data or not, the newly created `TensorPtr` will own a copy of the data.

### Resizing Tensors

The `TensorShapeDynamism` enum specifies the mutability of a tensor's shape:

- `STATIC`: The tensor's shape cannot be changed.
- `DYNAMIC_BOUND`: The tensor's shape can be changed, but it cannot contain more elements than it had at creation, as determined by the initial sizes.
- `DYNAMIC`: The tensor's shape can be changed arbitrarily. Currently, `DYNAMIC` is an alias for `DYNAMIC_BOUND`.

When resizing a tensor, you must respect its dynamism setting. Resizing is only allowed for tensors with `DYNAMIC` or `DYNAMIC_BOUND` shapes, and you cannot resize `DYNAMIC_BOUND` tensors to contain more elements than they had initially.

```cpp
auto tensor = make_tensor_ptr(
    {2, 3},                      // sizes
    {1, 2, 3, 4, 5, 6},          // data
    ScalarType::Int,
    TensorShapeDynamism::DYNAMIC_BOUND);
// Initial sizes: {2, 3}
// Number of elements: 6

resize_tensor_ptr(tensor, {2, 2});
// The tensor sizes are now {2, 2}
// Number of elements is 4 < initial 6

resize_tensor_ptr(tensor, {1, 3});
// The tensor sizes are now {1, 3}
// Number of elements is 3 < initial 6

resize_tensor_ptr(tensor, {3, 2});
// The tensor sizes are now {3, 2}
// Number of elements is 6 == initial 6

resize_tensor_ptr(tensor, {6, 1});
// The tensor sizes are now {6, 1}
// Number of elements is 6 == initial 6
```

## Convenience Helpers

ExecuTorch provides several helper functions to create tensors conveniently.

### Creating Non-Owning Tensors with `for_blob` and `from_blob`

These helpers allow you to create tensors that do not own the data.

*Using `from_blob()`*

```cpp
float data[] = {1.0f, 2.0f, 3.0f};
auto tensor = from_blob(
    data,                // data pointer
    {3},                 // sizes
    ScalarType::Float);  // float scalar type
```

*Using `for_blob()` with Fluent Syntax*

```cpp
double data[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
auto tensor = for_blob(data, {2, 3}, ScalarType::Double)
                  .strides({3, 1})
                  .dynamism(TensorShapeDynamism::STATIC)
                  .make_tensor_ptr();
```

*Using Custom Deleter with `from_blob()`*

```cpp
int* data = new int[3]{1, 2, 3};
auto tensor = from_blob(
    data,             // data pointer
    {3},              // sizes
    ScalarType::Int,  // int scalar type
    [](void* ptr) { delete[] static_cast<int*>(ptr); });
```

The `TensorPtr` will call the custom deleter when it is destroyed.

### Creating Empty Tensors

`empty()` creates an uninitialized tensor with the specified sizes.

```cpp
auto tensor = empty({2, 3});
```

`empty_like()` creates an uninitialized tensor with the same sizes as an existing `TensorPtr`.

```cpp
TensorPtr original_tensor = /* some existing tensor */;
auto tensor = empty_like(original_tensor);
```

And `empty_strided()` creates an uninitialized tensor with the specified sizes and strides.

```cpp
auto tensor = empty_strided({2, 3}, {3, 1});
```

### Creating Tensors Filled with Specific Values

`full()`, `zeros()`, and `ones()` create a tensor filled with a provided value, zeros, or ones, respectively.

```cpp
auto tensor_full = full({2, 3}, 42.0f);
auto tensor_zeros = zeros({2, 3});
auto tensor_ones = ones({3, 4});
```

As with `empty()`, there are additional helper functions `full_like()`, `full_strided()`, `zeros_like()`, `zeros_strided()`, `ones_like()`, and `ones_strided()` to create filled tensors with the same properties as an existing `TensorPtr` or with custom strides.

### Creating Random Tensors

`rand()` creates a tensor filled with random values between 0 and 1.

```cpp
auto tensor_rand = rand({2, 3});
```

`randn()` creates a tensor filled with random values from a normal distribution.

```cpp
auto tensor_randn = randn({2, 3});
```

`randint()` creates a tensor filled with random integers between the specified min (inclusive) and max (exclusive).

```cpp
auto tensor_randint = randint(0, 10, {2, 3});
```

### Creating Scalar Tensors

In addition to `make_tensor_ptr()` with a single data value, you can create a scalar tensor with `scalar_tensor()`.

```cpp
auto tensor = scalar_tensor(3.14f);
```

Note that the `scalar_tensor()` function expects a value of type `Scalar`. In ExecuTorch, `Scalar` can represent `bool`, `int`, or floating-point types, but not types like `Half` or `BFloat16`; for those, use `make_tensor_ptr()`, which bypasses `Scalar`.

## Notes on EValue and Lifetime Management

The [`Module`](extension-module.md) interface expects data in the form of `EValue`, a variant type that can hold a `Tensor` or other scalar types. When you pass a `TensorPtr` to a function expecting an `EValue`, the `TensorPtr` is dereferenced implicitly and the `EValue` holds the underlying `Tensor`.

```cpp
TensorPtr tensor = /* create a TensorPtr */;
// ...
module.forward(tensor);
```

You can even pass a vector of `EValue`s for multiple parameters.

```cpp
TensorPtr tensor = /* create a TensorPtr */;
TensorPtr tensor2 = /* create another TensorPtr */;
// ...
module.forward({tensor, tensor2});
```

However, be cautious: `EValue` will not hold onto the dynamic data and metadata from the `TensorPtr`. It merely holds a regular `Tensor`, which does not own the data or metadata but refers to them using raw pointers. You need to ensure that the `TensorPtr` remains valid for as long as the `EValue` is in use.

This also applies when using functions like `set_input()` or `set_output()` that expect `EValue`.

## Interoperability with ATen

If your code is compiled with the preprocessor flag `USE_ATEN_LIB` enabled, all the `TensorPtr` APIs use `at::` APIs under the hood, e.g., `TensorPtr` becomes a `std::shared_ptr<at::Tensor>`. This allows for seamless integration with the [PyTorch ATen](https://pytorch.org/cppdocs) library.

### API Equivalence Table

Here's a table matching `TensorPtr` creation functions with their corresponding ATen APIs:

| ATen                                        | ExecuTorch                                  |
|---------------------------------------------|---------------------------------------------|
| `at::tensor(data, type)`                    | `make_tensor_ptr(data, type)`               |
| `at::tensor(data, type).reshape(sizes)`     | `make_tensor_ptr(sizes, data, type)`        |
| `tensor.clone()`                            | `clone_tensor_ptr(tensor)`                  |
| `tensor.resize_(new_sizes)`                 | `resize_tensor_ptr(tensor, new_sizes)`      |
| `at::scalar_tensor(value)`                  | `scalar_tensor(value)`                      |
| `at::from_blob(data, sizes, type)`          | `from_blob(data, sizes, type)`              |
| `at::empty(sizes)`                          | `empty(sizes)`                              |
| `at::empty_like(tensor)`                    | `empty_like(tensor)`                        |
| `at::empty_strided(sizes, strides)`         | `empty_strided(sizes, strides)`             |
| `at::full(sizes, value)`                    | `full(sizes, value)`                        |
| `at::full_like(tensor, value)`              | `full_like(tensor, value)`                  |
| `at::full_strided(sizes, strides, value)`   | `full_strided(sizes, strides, value)`       |
| `at::zeros(sizes)`                          | `zeros(sizes)`                              |
| `at::zeros_like(tensor)`                    | `zeros_like(tensor)`                        |
| `at::zeros_strided(sizes, strides)`         | `zeros_strided(sizes, strides)`             |
| `at::ones(sizes)`                           | `ones(sizes)`                               |
| `at::ones_like(tensor)`                     | `ones_like(tensor)`                         |
| `at::ones_strided(sizes, strides)`          | `ones_strided(sizes, strides)`              |
| `at::rand(sizes)`                           | `rand(sizes)`                               |
| `at::rand_like(tensor)`                     | `rand_like(tensor)`                         |
| `at::randn(sizes)`                          | `randn(sizes)`                              |
| `at::randn_like(tensor)`                    | `randn_like(tensor)`                        |
| `at::randint(low, high, sizes)`             | `randint(low, high, sizes)`                 |
| `at::randint_like(tensor, low, high)`       | `randint_like(tensor, low, high)`           |

## Best Practices

- *Manage Lifetimes Carefully*: Even though `TensorPtr` handles memory management, ensure that any non-owned data (e.g., when using `from_blob()`) remains valid while the tensor is in use.
- *Use Convenience Functions*: Utilize helper functions for common tensor creation patterns to write cleaner and more readable code.
- *Be Aware of Data Ownership*: Know whether your tensor owns its data or references external data to avoid unintended side effects or memory leaks.
- *Ensure `TensorPtr` Outlives `EValue`*: When passing tensors to modules that expect `EValue`, ensure that the `TensorPtr` remains valid as long as the `EValue` is in use.

## Conclusion

The `TensorPtr` in ExecuTorch simplifies tensor memory management by bundling the data and dynamic metadata into a smart pointer. This design eliminates the need for users to manage multiple pieces of data and ensures safer and more maintainable code.

By providing interfaces similar to PyTorch's ATen library, ExecuTorch simplifies the adoption of the new API, allowing developers to transition without a steep learning curve.