# Summary
This example demonstrates how to run a [Phi-3-mini](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) 3.8B model via ExecuTorch. We use XNNPACK to accelerate performance and apply XNNPACK symmetric per-channel quantization.

# Instructions
## Step 1: Setup
1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch. For installation, run `./install_requirements.sh --pybind xnnpack`.
2. Currently, we support transformers v4.44.2. Install transformers with the following command:
```
pip uninstall -y transformers ; pip install transformers==4.44.2
```
## Step 2: Prepare and run the model
1. Download the `tokenizer.model` from HuggingFace and create `tokenizer.bin`.
```
cd executorch
wget -O tokenizer.model "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/resolve/main/tokenizer.model?download=true"
python -m extension.llm.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
```
2. Export the model. This step will take a few minutes to finish.
```
python -m examples.models.phi-3-mini.export_phi-3-mini -c "4k" -s 128 -o phi-3-mini.pte
```
3. Build and run the model.
- Build executorch with optimized CPU performance as follows. Build options available [here](https://github.com/pytorch/executorch/blob/main/CMakeLists.txt#L59).
  ```
  cmake -DPYTHON_EXECUTABLE=python \
      -DCMAKE_INSTALL_PREFIX=cmake-out \
      -DEXECUTORCH_ENABLE_LOGGING=1 \
      -DCMAKE_BUILD_TYPE=Release \
      -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
      -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
      -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
      -DEXECUTORCH_BUILD_XNNPACK=ON \
      -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
      -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
      -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
      -Bcmake-out .

  cmake --build cmake-out -j16 --target install --config Release
  ```
- Build Phi-3-mini runner.
```
cmake -DPYTHON_EXECUTABLE=python \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -Bcmake-out/examples/models/phi-3-mini \
    examples/models/phi-3-mini

cmake --build cmake-out/examples/models/phi-3-mini -j16 --config Release
```
- Run model. Options available [here](https://github.com/pytorch/executorch/blob/main/examples/models/phi-3-mini/main.cpp#L13-L30).
```
cmake-out/examples/models/phi-3-mini/phi_3_mini_runner \
    --model_path=phi-3-mini.pte \
    --tokenizer_path=tokenizer.bin \
    --seq_len=128 \
    --temperature=0 \
    --prompt="<|system|>
You are a helpful assistant.<|end|>
<|user|>
What is the capital of France?<|end|>
<|assistant|>"
```
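
The multi-line `--prompt` value in the run command is easy to get wrong when typed by hand. The following is a minimal sketch, not part of the example itself, that assembles the same prompt layout and a shell-safe command line in Python. The helper names `build_phi3_prompt` and `build_runner_command` are hypothetical and introduced only for illustration; the runner path, flags, and default values are copied from the command above. The script only prints the command and does not execute the runner.

```python
import shlex


def build_phi3_prompt(user_message, system_message="You are a helpful assistant."):
    """Assemble a prompt in the same layout as the run command in this README.

    Hypothetical helper: the <|system|>, <|user|>, <|assistant|> and <|end|>
    markers and the newline placement mirror the --prompt value shown above.
    """
    return (
        f"<|system|>\n{system_message}<|end|>\n"
        f"<|user|>\n{user_message}<|end|>\n"
        "<|assistant|>"
    )


def build_runner_command(
    prompt,
    model_path="phi-3-mini.pte",
    tokenizer_path="tokenizer.bin",
    seq_len=128,
    temperature=0,
):
    """Return the runner invocation as an argument list (hypothetical helper).

    Only the flags that appear in the run command above are used here.
    """
    return [
        "cmake-out/examples/models/phi-3-mini/phi_3_mini_runner",
        f"--model_path={model_path}",
        f"--tokenizer_path={tokenizer_path}",
        f"--seq_len={seq_len}",
        f"--temperature={temperature}",
        f"--prompt={prompt}",
    ]


prompt = build_phi3_prompt("What is the capital of France?")
command = build_runner_command(prompt)

# shlex.join quotes the multi-line prompt so the line can be pasted into a shell.
print(shlex.join(command))
```

To launch the runner directly instead of printing the command, the list could be passed to `subprocess.run(command, check=True)` from the `executorch` directory, assuming the build steps above have completed.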