# Summary
This example demonstrates how to run a [Phi-3-mini](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) 3.8B model via ExecuTorch. We use XNNPACK to accelerate inference, together with XNNPACK symmetric per-channel quantization.

# Instructions
## Step 1: Setup
1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch. For installation, run `./install_requirements.sh --pybind xnnpack`.
2. Currently, we support transformers v4.44.2. Install transformers with the following command:
```
pip uninstall -y transformers ; pip install transformers==4.44.2
```
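Because the instructions above pin a specific transformers release, it can help to confirm which version ended up installed before moving on. The following is a minimal sketch, not part of the ExecuTorch scripts: `check_transformers_version` and `REQUIRED_TRANSFORMERS` are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper (not part of ExecuTorch): compare a version string
# against the transformers version pinned in this README.
REQUIRED_TRANSFORMERS="4.44.2"

check_transformers_version() {
    # $1: the installed version string to check
    if [ "$1" = "$REQUIRED_TRANSFORMERS" ]; then
        echo "transformers $1: OK"
    else
        echo "transformers $1: expected $REQUIRED_TRANSFORMERS" >&2
        return 1
    fi
}
```

One way to use it, assuming `python` resolves to the environment where transformers was installed, is `check_transformers_version "$(python -c 'import transformers; print(transformers.__version__)')"`.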
## Step 2: Prepare and run the model
1. Download the `tokenizer.model` from HuggingFace and create `tokenizer.bin`.
```
cd executorch
wget -O tokenizer.model "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/resolve/main/tokenizer.model?download=true"
python -m extension.llm.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
```
2. Export the model. This step will take a few minutes to finish.
```
python -m examples.models.phi-3-mini.export_phi-3-mini -c "4k" -s 128 -o phi-3-mini.pte
```
3. Build and run the model.
- Build ExecuTorch with optimized CPU performance as follows. The available build options are listed [here](https://github.com/pytorch/executorch/blob/main/CMakeLists.txt#L59).
```
cmake -DPYTHON_EXECUTABLE=python \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DEXECUTORCH_ENABLE_LOGGING=1 \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -Bcmake-out .

cmake --build cmake-out -j16 --target install --config Release
```
- Build Phi-3-mini runner.
```
cmake -DPYTHON_EXECUTABLE=python \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -Bcmake-out/examples/models/phi-3-mini \
    examples/models/phi-3-mini

cmake --build cmake-out/examples/models/phi-3-mini -j16 --config Release
```
- Run the model. The available options are listed [here](https://github.com/pytorch/executorch/blob/main/examples/models/phi-3-mini/main.cpp#L13-L30).
```
cmake-out/examples/models/phi-3-mini/phi_3_mini_runner \
    --model_path=phi-3-mini.pte \
    --tokenizer_path=tokenizer.bin \
    --seq_len=128 \
    --temperature=0 \
    --prompt="<|system|>
You are a helpful assistant.<|end|>
<|user|>
What is the capital of France?<|end|>
<|assistant|>"
```
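The multi-line `--prompt` value above follows a fixed layout: a system turn, a user turn, then an open assistant turn. To try other questions without retyping the special tokens, the layout can be wrapped in a small shell function. This is only a sketch: `build_phi3_prompt` is a hypothetical helper that is not part of ExecuTorch, and it assumes exactly one system message and one user message.

```shell
# Hypothetical helper (not part of ExecuTorch): print a prompt in the same
# layout as the example above, from one system and one user message.
build_phi3_prompt() {
    # $1: system message, $2: user message
    # The messages are passed as printf arguments, so any '%' characters
    # they contain are printed literally.
    printf '<|system|>\n%s<|end|>\n<|user|>\n%s<|end|>\n<|assistant|>' "$1" "$2"
}

# Reproduce the prompt used in the run command above.
PROMPT="$(build_phi3_prompt "You are a helpful assistant." "What is the capital of France?")"
```

The resulting value can then be passed to the runner as `--prompt="$PROMPT"`, with the other flags unchanged.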