Name | Date | Size | #Lines | LOC
---|---|---|---|---
CMakeLists.txt | 25-Apr-2025 | 1.5 KiB | 54 | 47
README.md | 25-Apr-2025 | 2.7 KiB | 67 | 63
__init__.py | 25-Apr-2025 | 271 B | 12 | 4
eager.py | 25-Apr-2025 | 3.2 KiB | 119 | 85
export_phi-3-mini.py | 25-Apr-2025 | 3.7 KiB | 121 | 93
install_requirements.sh | 25-Apr-2025 | 299 B | 15 | 4
main.cpp | 25-Apr-2025 | 1.2 KiB | 51 | 27
phi_3_mini.py | 25-Apr-2025 | 1.4 KiB | 42 | 24
runner.cpp | 25-Apr-2025 | 3.1 KiB | 102 | 76
runner.h | 25-Apr-2025 | 1.5 KiB | 51 | 24
static_cache.py | 25-Apr-2025 | 1.3 KiB | 44 | 29
README.md
# Summary
This example demonstrates how to run the 3.8B [Phi-3-mini](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) model via ExecuTorch. We use the XNNPACK backend to accelerate performance, together with XNNPACK symmetric per-channel quantization.

# Instructions
## Step 1: Setup
1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch. For installation, run `./install_requirements.sh --pybind xnnpack`.
2. Currently, we support transformers v4.44.2. Install it with the following command:
```
pip uninstall -y transformers ; pip install transformers==4.44.2
```
## Step 2: Prepare and run the model
1. Download `tokenizer.model` from Hugging Face and convert it to `tokenizer.bin`.
```
cd executorch
wget -O tokenizer.model "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/resolve/main/tokenizer.model?download=true"
python -m extension.llm.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
```
2. Export the model. This step will take a few minutes to finish.
```
python -m examples.models.phi-3-mini.export_phi-3-mini -c "4k" -s 128 -o phi-3-mini.pte
```
3. Build and run the model.
- Build ExecuTorch with optimized CPU performance as follows. The available build options are listed [here](https://github.com/pytorch/executorch/blob/main/CMakeLists.txt#L59).
  ```
  cmake -DPYTHON_EXECUTABLE=python \
      -DCMAKE_INSTALL_PREFIX=cmake-out \
      -DEXECUTORCH_ENABLE_LOGGING=1 \
      -DCMAKE_BUILD_TYPE=Release \
      -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
      -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
      -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
      -DEXECUTORCH_BUILD_XNNPACK=ON \
      -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
      -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
      -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
      -Bcmake-out .

  cmake --build cmake-out -j16 --target install --config Release
  ```
- Build the Phi-3-mini runner.
```
cmake -DPYTHON_EXECUTABLE=python \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -Bcmake-out/examples/models/phi-3-mini \
    examples/models/phi-3-mini

cmake --build cmake-out/examples/models/phi-3-mini -j16 --config Release
```
- Run the model. The available options are listed [here](https://github.com/pytorch/executorch/blob/main/examples/models/phi-3-mini/main.cpp#L13-L30).
```
cmake-out/examples/models/phi-3-mini/phi_3_mini_runner \
    --model_path=phi-3-mini.pte \
    --tokenizer_path=tokenizer.bin \
    --seq_len=128 \
    --temperature=0 \
    --prompt="<|system|>
You are a helpful assistant.<|end|>
<|user|>
What is the capital of France?<|end|>
<|assistant|>"
```
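
The `--prompt` string above follows Phi-3's chat template: each turn is tagged with `<|system|>`, `<|user|>`, or `<|assistant|>` and terminated with `<|end|>`. A minimal sketch for assembling such a prompt from plain strings (the `build_phi3_prompt` helper is hypothetical, not part of this example's sources):
```python
# Hypothetical helper: assembles a prompt in the Phi-3 chat format that the
# runner's --prompt flag expects. Not part of this example's sources.
def build_phi3_prompt(system: str, user: str) -> str:
    return (
        f"<|system|>\n{system}<|end|>\n"
        f"<|user|>\n{user}<|end|>\n"
        "<|assistant|>"
    )


if __name__ == "__main__":
    # Prints the same prompt used in the run command above.
    print(build_phi3_prompt("You are a helpful assistant.",
                            "What is the capital of France?"))
```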
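
Separately, the `phi-3-mini.pte` produced in Step 2 can be sanity-checked from Python before building the C++ runner. This is a sketch under the assumption that the ExecuTorch pybindings were installed via `./install_requirements.sh --pybind xnnpack` (which bundles the XNNPACK backend); token generation itself is handled by the `phi_3_mini_runner` binary.
```python
# Optional sanity check (a sketch, not part of this example): load the
# exported program through the ExecuTorch pybindings. Loading verifies that
# the .pte file is well-formed and that its delegated segments can be set up.
from executorch.extension.pybindings.portable_lib import _load_for_executorch

module = _load_for_executorch("phi-3-mini.pte")
print("phi-3-mini.pte loaded successfully:", module)
```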