Name | Date | Size | #Lines | LOC
---|---|---|---|---
data/ | 25-Apr-2025 | - | 502 | 369
engine/ | 25-Apr-2025 | - | 204 | 102
losses/ | 25-Apr-2025 | - | 277 | 202
models/ | 25-Apr-2025 | - | 2,578 | 1,854
stndrd/ | 25-Apr-2025 | - | 2,875 | 2,348
utils/ | 25-Apr-2025 | - | 2,513 | 1,851
README.md | 25-Apr-2025 | 2.5 KiB | 65 | 35
adv_train_model.py | 25-Apr-2025 | 15.7 KiB | 463 | 317
adv_train_vocoder.py | 25-Apr-2025 | 15 KiB | 452 | 304
create_testvectors.py | 25-Apr-2025 | 6.5 KiB | 166 | 112
export_model_weights.py | 25-Apr-2025 | 7.3 KiB | 175 | 137
make_default_setup.py | 25-Apr-2025 | 3.6 KiB | 93 | 74
requirements.txt | 25-Apr-2025 | 134 B | 10 | 9
test_model.py | 25-Apr-2025 | 3.1 KiB | 97 | 68
test_vocoder.py | 25-Apr-2025 | 3.2 KiB | 104 | 74
train_model.py | 25-Apr-2025 | 9.6 KiB | 308 | 206
train_vocoder.py | 25-Apr-2025 | 9 KiB | 288 | 192
README.md
# Opus Speech Coding Enhancement

This folder hosts models for enhancing Opus SILK.

## Environment setup
The code is tested with Python 3.11. Conda setup is done via

`conda create -n osce python=3.11`

`conda activate osce`

`python -m pip install -r requirements.txt`

## Generating training data
The first step is to convert all training items to 16 kHz, 16-bit PCM and concatenate them. A convenient way to do this is to create a file list and then run

`python scripts/concatenator.py filelist 16000 dataset/clean.s16 --db_min -40 --db_max 0`

which additionally applies some random scaling.

The second step is to run a patched version of opus_demo in the dataset folder, which produces the coded output and adds feature files. To build the patched opus_demo binary, check out the exp-neural-silk-enhancement branch and build opus_demo the usual way. Then run

`cd dataset && <path_to_patched_opus_demo>/opus_demo voip 16000 1 9000 -silk_random_switching 249 clean.s16 coded.s16`

The argument to -silk_random_switching specifies the number of frames after which parameters are switched randomly.

## Regression-loss-based training
Create a default setup for LACE or NoLACE via

`python make_default_setup.py model.yml --model lace/nolace --path2dataset <path2dataset>`

Then run

`python train_model.py model.yml <output folder> --no-redirect`

to run the training script in the foreground, or

`nohup python train_model.py model.yml <output folder> &`

to run it in the background. In the latter case the output is written to `<output folder>/out.txt`.

## Adversarial training (NoLACE only)
Create a default setup for NoLACE via

`python make_default_setup.py nolace_adv.yml --model nolace --adversarial --path2dataset <path2dataset>`

Then run

`python adv_train_model.py nolace_adv.yml <output folder> --no-redirect`

to run the training script in the foreground, or

`nohup python adv_train_model.py nolace_adv.yml <output folder> &`

to run it in the background. In the latter case the output is written to `<output folder>/out.txt`.

## Inference
Generating inference data is analogous to generating training data. Given an item 'item1.wav', run

`mkdir item1.se && sox item1.wav -r 16000 -e signed-integer -b 16 item1.raw && cd item1.se && <path_to_patched_opus_demo>/opus_demo voip 16000 1 <bitrate> ../item1.raw noisy.s16`

The folder item1.se then serves as input to the test_model.py script or as the --testdata argument of train_model.py and adv_train_model.py, respectively.

Checkpoints of pre-trained models are located here: https://media.xiph.org/lpcnet/models/lace-20231019.tar.gz
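As a worked example, the steps above for a single test item can be chained into one shell session. The sketch below uses the same commands as the sections above; the 9000 bit/s bitrate and all file and folder names are illustrative, and `<path_to_patched_opus_demo>` must point to the opus_demo binary built from the exp-neural-silk-enhancement branch:

```sh
# Convert the test item to 16 kHz, 16-bit raw PCM.
sox item1.wav -r 16000 -e signed-integer -b 16 item1.raw

# Decode with the patched opus_demo; item1.se will contain noisy.s16 plus the feature files.
mkdir item1.se && cd item1.se
<path_to_patched_opus_demo>/opus_demo voip 16000 1 9000 ../item1.raw noisy.s16
cd ..

# Fetch the pre-trained checkpoints linked above.
wget https://media.xiph.org/lpcnet/models/lace-20231019.tar.gz
tar xzf lace-20231019.tar.gz

# item1.se can now be passed to test_model.py, or via --testdata to
# train_model.py / adv_train_model.py.
```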