# ExecuTorch Vulkan Delegate

The ExecuTorch Vulkan delegate is a native GPU delegate for ExecuTorch that is
built on top of the cross-platform Vulkan GPU API standard. It is primarily
designed to leverage the GPU to accelerate model inference on Android devices,
but can be used on any platform that supports an implementation of Vulkan:
laptops, servers, and edge devices.

::::{note}
The Vulkan delegate is currently under active development, and its components
are subject to change.
::::

## What is Vulkan?

Vulkan is a low-level GPU API specification developed as a successor to OpenGL.
It is designed to offer developers more explicit control over GPUs compared to
previous specifications in order to reduce overhead and maximize the
capabilities of modern graphics hardware.

Vulkan has been widely adopted among GPU vendors, and most modern GPUs (both
desktop and mobile) on the market support Vulkan. Vulkan is also included in
Android from Android 7.0 onwards.

**Note that Vulkan is a GPU API, not a GPU Math Library**. That is to say, it
provides a way to execute compute and graphics operations on a GPU, but does not
come with a built-in library of performant compute kernels.

## The Vulkan Compute Library

The ExecuTorch Vulkan Delegate is a wrapper around a standalone runtime known as
the **Vulkan Compute Library**. The aim of the Vulkan Compute Library is to
provide GPU implementations for PyTorch operators via GLSL compute shaders.

The Vulkan Compute Library is a fork/iteration of the [PyTorch Vulkan Backend](https://pytorch.org/tutorials/prototype/vulkan_workflow.html).
The core components of the PyTorch Vulkan backend were forked into ExecuTorch
and adapted for an AOT graph-mode style of model inference (as opposed to
PyTorch, which adopted an eager execution style of model inference).

The components of the Vulkan Compute Library are contained in the
`executorch/backends/vulkan/runtime/` directory.
The core components are listed and described below:

```
runtime/
├── api/ .................... Wrapper API around Vulkan to manage Vulkan objects
└── graph/ .................. ComputeGraph class which implements graph mode inference
    └── ops/ ................ Base directory for operator implementations
        ├── glsl/ ........... GLSL compute shaders
        │   ├── *.glsl
        │   └── conv2d.glsl
        └── impl/ ........... C++ code to dispatch GPU compute shaders
            ├── *.cpp
            └── Conv2d.cpp
```

## Features

The Vulkan delegate currently supports the following features:

* **Memory Planning**
  * Intermediate tensors whose lifetimes do not overlap will share memory allocations. This reduces the peak memory usage of model inference.
* **Capability Based Partitioning**
  * A graph can be partially lowered to the Vulkan delegate via a partitioner, which will identify nodes (i.e. operators) that are supported by the Vulkan delegate and lower only supported subgraphs.
* **Support for upper-bound dynamic shapes**
  * Tensors can change shape between inferences as long as their current shape is smaller than the bounds specified during lowering.

In addition to increasing operator coverage, the following features are
currently in development:

* **Quantization Support**
  * We are currently working on support for 8-bit dynamic quantization, with plans to extend to other quantization schemes in the future.
* **Memory Layout Management**
  * Memory layout is an important factor in optimizing performance. We plan to add graph passes that insert memory layout transitions throughout a graph to optimize memory-layout sensitive operators such as Convolution and Matrix Multiplication.
* **Selective Build**
  * We plan to make it possible to control build size by selecting which operators/shaders you want to build with.

## End to End Example

To further understand the features of the Vulkan Delegate and how to use it,
consider the following end to end example with a simple single operator model.

### Compile and lower a model to the Vulkan Delegate

Once ExecuTorch has been set up and installed, the following script can be used
to generate a simple model and lower it to the Vulkan delegate. The lowered
program is saved as `vk_add.pte`.

```python
# Note: this script is the same as the script from the "Setting up ExecuTorch"
# page, with one minor addition to lower to the Vulkan backend.
import torch
from torch.export import export
from executorch.exir import to_edge

from executorch.backends.vulkan.partitioner.vulkan_partitioner import VulkanPartitioner

# Start with a PyTorch model that adds two input tensors (matrices)
class Add(torch.nn.Module):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, x: torch.Tensor, y: torch.Tensor):
        return x + y

# 1. torch.export: Defines the program with the ATen operator set.
aten_dialect = export(Add(), (torch.ones(1), torch.ones(1)))

# 2. to_edge: Make optimizations for Edge devices
edge_program = to_edge(aten_dialect)
# 2.1 Lower to the Vulkan backend
edge_program = edge_program.to_backend(VulkanPartitioner())

# 3. to_executorch: Convert the graph to an ExecuTorch program
executorch_program = edge_program.to_executorch()

# 4. Save the compiled .pte program
with open("vk_add.pte", "wb") as file:
    file.write(executorch_program.buffer)
```

Like other ExecuTorch delegates, a model can be lowered to the Vulkan Delegate
using the `to_backend()` API. The Vulkan Delegate implements the
`VulkanPartitioner` class, which identifies nodes (i.e. operators) in the graph
that are supported by the Vulkan delegate, and separates compatible sections of
the model to be executed on the GPU.

This means that a model can be lowered to the Vulkan delegate even if it contains
some unsupported operators. In that case, only parts of the graph will be
executed on the GPU.

::::{note}
The [supported ops list](https://github.com/pytorch/executorch/blob/main/backends/vulkan/partitioner/supported_ops.py)
in the Vulkan partitioner code can be inspected to see which ops are currently
implemented in the Vulkan delegate.
::::

### Build Vulkan Delegate libraries

The easiest way to build and test the Vulkan Delegate is to build for Android
and test on a local Android device. Android devices have built-in support for
Vulkan, and the Android NDK ships with a GLSL compiler, which is needed to
compile the Vulkan Compute Library's GLSL compute shaders.

The Vulkan Delegate libraries can be built by setting `-DEXECUTORCH_BUILD_VULKAN=ON`
when building with CMake.

First, make sure that you have the Android NDK installed; any NDK version past
NDK r19c should work. Note that the examples in this doc have been validated with
NDK r27b. The Android SDK should also be installed so that you have access to `adb`.

The instructions on this page assume that the following environment variables
are set.

```shell
export ANDROID_NDK=<path_to_ndk>
# Select the appropriate Android ABI for your device
export ANDROID_ABI=arm64-v8a
# All subsequent commands should be performed from ExecuTorch repo root
cd <path_to_executorch_root>
# Make sure adb works
adb --version
```

To build and install ExecuTorch libraries (for Android) with the Vulkan
Delegate:

```shell
# From executorch root directory
(rm -rf cmake-android-out && \
  cmake . -DCMAKE_INSTALL_PREFIX=cmake-android-out \
    -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI=$ANDROID_ABI \
    -DEXECUTORCH_BUILD_VULKAN=ON \
    -DPYTHON_EXECUTABLE=python \
    -Bcmake-android-out && \
  cmake --build cmake-android-out -j16 --target install)
```

### Run the Vulkan model on device

::::{note}
Since operator support is currently limited, only binary arithmetic operators
will run on the GPU. Expect inference to be slow as the majority of operators
are being executed via Portable operators.
::::

Now, the partially delegated model can be executed (partially) on your device's
GPU!

```shell
# Build a model runner binary linked with the Vulkan delegate libs
cmake --build cmake-android-out --target vulkan_executor_runner -j32

# Push model to device
adb push vk_add.pte /data/local/tmp/vk_add.pte
# Push binary to device
adb push cmake-android-out/backends/vulkan/vulkan_executor_runner /data/local/tmp/runner_bin

# Run the model
adb shell /data/local/tmp/runner_bin --model_path /data/local/tmp/vk_add.pte
```
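
To make the idea of a partially delegated model more concrete, the following is a minimal, self-contained sketch of capability-based partitioning. It is not the `VulkanPartitioner` implementation: the `SUPPORTED_OPS` set, the `partition` function, and the operator names are hypothetical, and the model is treated as a linear sequence of operators rather than a real graph. It only illustrates how consecutive supported operators could be grouped into delegated segments while everything else falls back to Portable operators.

```python
from itertools import groupby

# Hypothetical set of operators a delegate can run.
# This is NOT the real Vulkan supported ops list.
SUPPORTED_OPS = {"add", "sub", "mul", "div"}


def partition(ops):
    """Split a linear operator sequence into (backend, [ops]) segments.

    Consecutive supported operators are grouped into one "delegate"
    segment; consecutive unsupported operators form a "portable" segment.
    """
    segments = []
    for supported, group in groupby(ops, key=lambda op: op in SUPPORTED_OPS):
        backend = "delegate" if supported else "portable"
        segments.append((backend, list(group)))
    return segments


# Hypothetical operator sequence for illustration only.
model_ops = ["conv2d", "add", "mul", "relu", "softmax", "sub"]
for backend, group in partition(model_ops):
    print(f"{backend}: {group}")
```

In this sketch, a model made up only of supported operators (such as the `Add` model above) would end up as a single delegated segment, while a model with mixed operators alternates between delegated and Portable segments.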