## Neural Pitch Estimation

- Dataset Installation
    1. Download and unzip the PTDB dataset:
        wget https://www2.spsc.tugraz.at/databases/PTDB-TUG/SPEECH_DATA_ZIPPED.zip
        unzip SPEECH_DATA_ZIPPED.zip

    2. Inside the extracted "SPEECH DATA" directory, run ptdb_process.sh to combine the male and female recordings

    3. To download and combine the DEMAND dataset, run download_demand.sh

- LPCNet preparation
    1. To extract xcorr, add lpcnet_extractor.c, add the relevant functions to lpcnet_enc.c, add the sources for the header/C files and Makefile.am, then compile to generate the ./lpcnet_xcorr_extractor object

- Dataset augmentation and training (check the arguments of each script below)
    1. Run data_augmentation.py
    2. Run training.py using the augmented data
    3. Run experiments.py
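
The dataset and training steps above can be sketched as one shell script. This is a minimal sketch, not part of the repository: the `run` helper, the `DRY_RUN` switch, and the `SCRIPT_DIR` variable are hypothetical names introduced here, and it assumes the shell scripts can be run with `sh` and that the Python scripts accept `--help`, neither of which this README confirms. The LPCNet preparation step is left out because it requires manual source changes. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
# SCRIPT_DIR (hypothetical) should be the absolute path to this README's directory.
SCRIPT_DIR="${SCRIPT_DIR:-$(pwd)}"
DRY_RUN="${DRY_RUN:-1}"

# run: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Dataset installation, steps 1-3.
run wget https://www2.spsc.tugraz.at/databases/PTDB-TUG/SPEECH_DATA_ZIPPED.zip
run unzip SPEECH_DATA_ZIPPED.zip
# Subshell so the directory change does not leak into later steps.
( run cd "SPEECH DATA" && run sh "$SCRIPT_DIR/ptdb_process.sh" )
run sh "$SCRIPT_DIR/download_demand.sh"

# Augmentation and training: list each script's arguments before running it.
# Assumes the scripts accept --help, which the README does not confirm.
for script in data_augmentation.py training.py experiments.py; do
    run python3 "$SCRIPT_DIR/$script" --help
done
```

Once the printed commands look right, setting `DRY_RUN=0` executes them instead; replace `--help` with the arguments each script actually reports.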