README.md

If you just want to re-generate existing heuristics with already collected data for mm for A100/H100, run the following scripts:

`bash get_mm_dataset.sh # Downloads A100 and H100 datasets`
`bash gen_heuristic_a100.sh # Generates A100 heuristic`
`bash gen_heuristic_h100.sh # Generates H100 heuristic`

If you want to collect new data, or generate a heuristic for another GPU, use the `generate_heuristic_mm.sh` script.
First, open `generate_heuristic_mm.sh` and modify the variables according to the comments in the script. Then run the script to perform benchmarks and collect training data:

`bash generate_heuristic.sh collect`

This will collect training data on random inputs. Depending on how many GPUs you are using, this might take a day.
If you use multiple GPUs, you will get one file per GPU, e.g. `data_6.txt` and `data_7.txt` if you used the GPUs with IDs 6 and 7.
To merge these into a single file, run:
`python torchgen/_autoheuristic/merge_data.py mm_train.txt data_6.txt data_7.txt`
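
When many GPUs are used, the list of per-GPU files can be derived from the GPU IDs instead of being typed by hand. The following is a minimal sketch, not part of the repository: `build_merge_command` is a hypothetical helper, and it assumes the `data_<id>.txt` naming from the example above and that `merge_data.py` takes the output file first, followed by the input files, as in the command shown.

```python
def build_merge_command(merge_script, output_file, gpu_ids):
    """Return the argv list for merging per-GPU data files.

    Hypothetical helper: assumes one file named data_<id>.txt per GPU ID
    and the argument order "output file, then input files".
    """
    if not gpu_ids:
        raise ValueError("at least one GPU ID is required")
    input_files = [f"data_{gpu_id}.txt" for gpu_id in gpu_ids]
    return ["python", merge_script, output_file, *input_files]


# Example: GPUs 6 and 7 were used for data collection.
cmd = build_merge_command(
    "torchgen/_autoheuristic/merge_data.py", "mm_train.txt", [6, 7]
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, or the printed line can be pasted into a shell.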

For mm, we also want to incorporate data from Hugging Face and TIMM models into the training data.

To collect data for Hugging Face models, run the following command:

```
TORCHINDUCTOR_AUTOHEURISTIC_USE="" TORCHINDUCTOR_AUTOHEURISTIC_COLLECT="mm" TORCHINDUCTOR_AUTOHEURISTIC_LOG_PATH="hf_train_mm.txt" TORCHINDUCTOR_MAX_AUTOTUNE=1 time python ../../../benchmarks/dynamo/huggingface.py --ci --performance --timing --explain --inductor --device cuda --train --amp
```

To collect data for TIMM models, run the following command:

```
TORCHINDUCTOR_AUTOHEURISTIC_USE="" TORCHINDUCTOR_AUTOHEURISTIC_COLLECT="mm" TORCHINDUCTOR_AUTOHEURISTIC_LOG_PATH="timm_train_mm.txt" TORCHINDUCTOR_MAX_AUTOTUNE=1 time python ../../../benchmarks/dynamo/timm_models.py --ci --performance --timing --explain --inductor --device cuda --train --amp
```
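
The two commands above differ only in the benchmark script and the log path, so the shared settings can be factored out. The following is a minimal sketch under that assumption: `collection_env` and `collection_command` are hypothetical helpers, not part of PyTorch. The environment variables and flags are copied from the commands above, and the `time` prefix is omitted.

```python
import os

# Flags shared by both benchmark invocations, copied from the commands above.
BENCHMARK_FLAGS = [
    "--ci", "--performance", "--timing", "--explain",
    "--inductor", "--device", "cuda", "--train", "--amp",
]


def collection_env(log_path, base_env=None):
    """Return a copy of the environment with the data-collection variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "TORCHINDUCTOR_AUTOHEURISTIC_USE": "",
        "TORCHINDUCTOR_AUTOHEURISTIC_COLLECT": "mm",
        "TORCHINDUCTOR_AUTOHEURISTIC_LOG_PATH": log_path,
        "TORCHINDUCTOR_MAX_AUTOTUNE": "1",
    })
    return env


def collection_command(benchmark_script):
    """Return the argv list for one benchmark suite."""
    return ["python", benchmark_script, *BENCHMARK_FLAGS]


# Script and log-file pairs taken from the two commands above.
SUITES = {
    "../../../benchmarks/dynamo/huggingface.py": "hf_train_mm.txt",
    "../../../benchmarks/dynamo/timm_models.py": "timm_train_mm.txt",
}

# To run both suites one after the other (requires a CUDA machine):
#   import subprocess
#   for script, log_path in SUITES.items():
#       subprocess.run(collection_command(script),
#                      env=collection_env(log_path), check=True)
```

Keeping the environment in a copied dictionary leaves the calling process's own environment unchanged, which matters if the same session is later used to run with the heuristic enabled.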

Afterwards, run the script to learn the heuristic:

`bash generate_heuristic_mm.sh generate`