# ExecuTorch QNN Backend examples

This directory contains examples for some AI models.

We have separated the example scripts into the following subfolders. Please refer to [README.md](../../backends/qualcomm/README.md) for the example scripts' directory structure:

1. executor_runner: This folder contains a general executor runner capable of running most of the models. As a rule of thumb, if a model does not have its own customized runner, execute the model using [executor_runner](./executor_runner/qnn_executor_runner.cpp); see the sketch after this list. On the other hand, if a model has its own runner, such as [llama2](./oss_scripts/llama2/qnn_llama_runner.cpp), use the customized runner to execute the model. A customized runner should be located in the same folder as the model's python script.

2. oss_scripts: OSS stands for Open Source Software. This folder contains python scripts for open source models. Some models under this folder might also have their own customized runner.
   For example, [llama2](./oss_scripts/llama2/qnn_llama_runner.cpp) contains not only the python scripts to prepare the model but also a customized runner for executing the model.

3. qaihub_scripts: QAIHub stands for [Qualcomm AI Hub](https://aihub.qualcomm.com/). On QAIHub, users can find pre-compiled context binaries, a format used by QNN to save its models. This provides users with a new option for model deployment. Unlike oss_scripts & scripts, whose example scripts convert a model from nn.Module to ExecuTorch .pte files, qaihub_scripts provides example scripts for converting pre-compiled context binaries to ExecuTorch .pte files. Additionally, users can find customized example runners specific to the QAIHub models for execution. For example, [qaihub_llama2_7b](./qaihub_scripts/llama2/qaihub_llama2_7b.py) is a script converting context binaries to ExecuTorch .pte files, and [qaihub_llama2_7b_runner](./qaihub_scripts/llama2/qaihub_llama2_7b_runner.cpp) is a customized example runner to execute llama2 .pte files. Please be aware that context binaries downloaded from QAIHub are tied to a specific QNN SDK version.
   Before executing the scripts and runner, please ensure that you are using a QNN SDK version that matches the context binary. The tutorial below also covers how to check the QNN version of a context binary.

4. scripts: This folder contains scripts to build models provided by executorch.
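
As a concrete illustration of item 1, below is a rough sketch of running a lowered .pte file with the general runner on device. This is a minimal sketch only: the device directory, the build output location, and the `--model_path` flag are assumptions that may differ in your setup, and the python scripts below normally take care of these steps (including pushing the required QNN libraries) for you.

```bash
# Hypothetical manual run; paths and the flag name are assumptions.
DEVICE_DIR=/data/local/tmp/executorch_qnn
adb shell mkdir -p ${DEVICE_DIR}
adb push model.pte ${DEVICE_DIR}/
adb push build-android/examples/qualcomm/executor_runner/qnn_executor_runner ${DEVICE_DIR}/
# The QNN runtime libraries (e.g. libQnnHtp.so) must also be on the device and
# discoverable by the dynamic linker for the backend to load.
adb shell "cd ${DEVICE_DIR} && LD_LIBRARY_PATH=${DEVICE_DIR} ./qnn_executor_runner --model_path model.pte"
```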

Please check the `--help` output of each example script for detailed arguments.

Here is some general information along with known limitations.

## Prerequisites

Please finish the [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) tutorial.

Please finish the [QNN backend setup](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md).

## Environment

Please set the `QNN_SDK_ROOT` environment variable.
Note that this version should be exactly the same as the one used to build the QNN backend.
Please check the [setup guide](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md).

Please set `LD_LIBRARY_PATH` to `$QNN_SDK_ROOT/lib/x86_64-linux-clang`.
Alternatively, you could put the QNN libraries in the default search path of the dynamic linker.
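
For example (the install location below is hypothetical; point it at wherever your QNN SDK actually lives):

```bash
# Hypothetical install location; use the same SDK version the backend was built with.
export QNN_SDK_ROOT=/opt/qcom/aistack/qnn/2.24.0
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang:$LD_LIBRARY_PATH
```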

## Device

Please connect an Android phone to the workstation. We use `adb` to communicate with the device.

If the device is on a remote host, you might want to add `-H` to the `adb`
commands in the `SimpleADB` class inside [utils.py](utils.py).
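
For example, to confirm the workstation can see the phone, locally or through a remote ADB server (the host name below is hypothetical):

```bash
adb devices                    # local ADB server
adb -H my-remote-host devices  # hypothetical remote ADB server
```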

## Run the examples

Please use `python xxx.py --help` for information on each example.

Here are some CLI examples. Please adjust them according to your environment:

#### First, switch to the following folder
```bash
cd $EXECUTORCH_ROOT/examples/qualcomm/scripts
```

#### For MobileNet_v2
```bash
python mobilenet_v2.py -s <device_serial> -m "SM8550" -b path/to/build-android/ -d /path/to/imagenet-mini/val
```

#### For DeepLab_v3
```bash
python deeplab_v3.py -s <device_serial> -m "SM8550" -b path/to/build-android/ --download
```

#### Check context binary version
```bash
cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang
./qnn-context-binary-utility --context_binary ${PATH_TO_CONTEXT_BINARY} --json_file ${OUTPUT_JSON_NAME}
```
After retrieving the json file, search it for the field `"buildId"` and ensure it matches the version of the `${QNN_SDK_ROOT}` you are using for the environment variable.
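
For instance, a quick `grep` pulls the field out of the report (a minimal sketch; `${OUTPUT_JSON_NAME}` is the file produced by the command above):

```bash
grep '"buildId"' ${OUTPUT_JSON_NAME}
```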

If you run into the following error, the `${QNN_SDK_ROOT}` you are using is older than the QNN SDK version used to create the context binary. In this case, please download a newer QNN SDK version.
```
Error: Failed to get context binary info.
```

## Additional Dependencies

The mobilebert multi-class text classification example requires `pandas` and `sklearn`.
Please install them with something like:

```bash
pip install scikit-learn pandas
```

## Limitations

1. QNN 2.24 is used for all examples. Newer or older QNN versions might work,
but the performance and accuracy numbers can differ.

2. The mobilebert example runs on QNN HTP fp16, which is only supported by a limited
set of SoCs. Please check the QNN documentation for details.

3. The mobilebert example needs to train the last classifier layer a bit, so it takes
time to run.

4. [**Important**] Due to the numerical limits of FP16, other use cases leveraging mobilebert
are not guaranteed to work.