| Name | Date | Size | #Lines | LOC |
|------|------|------|------|------|
| .github/workflows/ | 25-Apr-2025 | - | 52 | 44 |
| cmake/ | 25-Apr-2025 | - | 186 | 155 |
| common/ | 25-Apr-2025 | - | 8,823 | 7,823 |
| decoder/ | 25-Apr-2025 | - | 175,803 | 140,829 |
| docs/ | 25-Apr-2025 | - | | |
| encoder/ | 25-Apr-2025 | - | 110,596 | 91,532 |
| fuzzer/ | 25-Apr-2025 | - | 1,809 | 1,374 |
| test/ | 25-Apr-2025 | - | 6,234 | 4,569 |
| Android.bp | 25-Apr-2025 | 19.8 KiB | 522 | 494 |
| CMakeLists.txt | 25-Apr-2025 | 810 | 30 | 21 |
| LICENSE | 25-Apr-2025 | 10.5 KiB | 192 | 160 |
| METADATA | 25-Apr-2025 | 522 | 20 | 18 |
| MODULE_LICENSE_APACHE2 | 25-Apr-2025 | 0 | | |
| NOTICE | 25-Apr-2025 | 932 | 20 | 19 |
| OWNERS | 25-Apr-2025 | 108 | 4 | 3 |
| PREUPLOAD.cfg | 25-Apr-2025 | 89 | 3 | 2 |
| README.experimental | 25-Apr-2025 | 210 | 6 | 4 |
| README.md | 25-Apr-2025 | 6.8 KiB | 103 | 80 |
| README_dec.md | 25-Apr-2025 | 19.1 KiB | 242 | 219 |
| README_enc.md | 25-Apr-2025 | 5.8 KiB | 123 | 111 |
README.experimental
This xaac codec (external/xaac) is experimental; it is not yet intended
to be used on production devices.

This codec should not be configured into any production Android device
that is intended to be shipped.

README.md
# Introduction to libxaac

Extended HE-AAC, the latest member of the MPEG AAC codec family, is well suited to adaptive bit rate streaming and digital radio applications. It bridges the gap between speech and audio coding and delivers consistent high-quality audio for all signal types, including speech, music, and mixed material. It is the required audio codec for DRM (Digital Radio Mondiale). The codec is highly efficient, producing high-quality audio for music and speech at bitrates as low as 6 kbit/s for mono and 12 kbit/s for stereo services. By switching to extremely low bitrate streams, Extended HE-AAC streaming apps and streaming radio players can provide uninterrupted playback even under heavily congested network conditions.

Because the Extended High Efficiency AAC profile is a logical evolution of MPEG Audio's popular AAC family profiles, the codec also supports AAC-LC, HE-AACv1 (AAC+), and HE-AACv2 (eAAC+) audio object type encoding. The bitrate saved by the AAC family tools can be used to enhance video quality. These properties make Extended HE-AAC a common choice for applications that need high-quality audio at low bitrates.

One of the key features of libxaac is its support for AAC-LD (Low Delay), AAC-ELD (Enhanced Low Delay), and AAC-ELDv2 (Enhanced Low Delay version 2) modes. AAC-LD provides low-latency encoding, making it suitable for applications such as interactive communication and live audio streaming; reducing the delay in the encoding process improves the real-time performance of the system. AAC-ELD improves on the low-delay performance of HE-AAC by reducing the coding delay while maintaining high audio quality. The minimum delay observed is 15 ms. To achieve both low delay and low bitrate, it uses the Low Delay SBR tool. AAC-ELDv2 is the most advanced version of AAC-based low-delay coding. It is an enhanced version of AAC-ELD that provides even lower coding delay and higher audio quality.

MPEG-D USAC (Unified Speech and Audio Coding) is designed to provide high-quality audio coding at low bit rates. It combines advanced audio coding techniques with state-of-the-art speech coding algorithms to achieve significant compression gains while maintaining perceptual audio quality. The standard supports a wide range of audio content, including music, speech, and mixed audio, making it versatile across use cases. By delivering high-fidelity audio at reduced bit rates, MPEG-D USAC helps optimize bandwidth usage and improve the user experience.

Overall, libxaac, with support for AAC-LD, AAC-ELD, and AAC-ELDv2 modes, is a versatile audio coding library that can be used for a wide range of applications, such as broadcasting, streaming, and teleconferencing, that require high-quality audio compression with minimal delay.

libxaac also supports MPEG-D DRC (Dynamic Range Control) for the Extended HE-AAC profile in both encoder and decoder. MPEG-D DRC offers a bitrate-efficient representation of dynamically compressed versions of an audio signal. This is achieved by adding a low-bitrate DRC metadata stream to the audio signal. DRC includes dedicated sections for metadata-based loudness leveling, clipping prevention, ducking, and for generating a fade-in and fade-out to supplement the main dynamic range compression functionality. The DRC effects available at the DRC decoder are generated at the DRC encoder side. At the DRC decoder side, the audio signal may be played back without applying DRC, or an appropriate DRC effect is selected and applied based on the given playback scenario. This offers flexible support for loudness normalization and dynamic range compression across various playback scenarios.

#### Note
* The operating points for MPEG-D USAC (along with MPEG-D DRC) in the libxaac encoder are currently restricted to 64 kbps and 96 kbps. It is recommended to use the encoder at these operating points only. Support will be extended to other operating points soon.
* Further quality enhancements for AAC-ELD and AAC-ELDv2 modes may be pushed, as quality assessment is in progress.

# Building the libxaac decoder and encoder

## Building for AOSP
* The makefile for building the libxaac decoder and encoder libraries is provided in the root (`libxaac/`) folder.
* The makefile for building the libxaac decoder and encoder testbenches is provided in the `test` folder.
* Build the library, followed by the application, by running the command below from the root directory:
```
$ mm
```

## Using CMake
Users can also use CMake to build for `x86`, `x86_64`, `armv7`, `armv8` and Windows (MSVS project) platforms.

### Creating MSVS project files
To create MSVS project files for the libxaac decoder and encoder with CMake, go to the root directory (`libxaac/`), create a new folder there, move into it, and run CMake:
```
$ cd <path to libxaac>
$ mkdir bin
$ cd bin
$ cmake -G "Visual Studio 15 2017" ..
```

The above command creates the Win32 version of the MSVS workspace.
To create MSVS project files for the Win64 version, run the following commands:
```
$ mkdir cmake_build
$ cd cmake_build
$ cmake -G "Visual Studio 15 2017 Win64" ..
```
The above commands create MSVS 2017 project files. For a different MSVS version, modify the generator name accordingly.

### Building for native platforms
To build the libxaac decoder and encoder for the native platform, go to the root directory (`libxaac/`), create a new folder there, move into it, and run the build:
```
$ cd <path to libxaac>
$ mkdir bin
$ cd bin
$ cmake ..
$ make
```

### Cross-compiler based builds
Edit the relevant `cmake/toolchains/*_toolchain.cmake` file to set the correct host paths for the target platform.

### Building for x86_32 on a x86_64 Linux machine
```
$ cd <path to libxaac>
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/x86_toolchain.cmake
$ make
```

### Building for aarch32/aarch64
Update `CMAKE_C_COMPILER`, `CMAKE_CXX_COMPILER`, `CMAKE_C_COMPILER_AR`, and `CMAKE_CXX_COMPILER_AR` in the toolchain file passed as `CMAKE_TOOLCHAIN_FILE` below.
```
$ cd <path to libxaac>
$ mkdir build
$ cd build
```

### For aarch64
```
$ cmake .. -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/aarch64_toolchain.cmake
$ make
```

### For aarch32
```
$ cmake .. -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/aarch32_toolchain.cmake
$ make
```
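
The cross-compile configurations above differ only in the toolchain file passed to CMake. As a minimal sketch, a build script could select the file from a target name. The helper `cmake_configure_args` and the target keys are illustrative assumptions and are not part of libxaac; only the toolchain file names come from this README.

```python
# Hypothetical helper: maps a target name to the toolchain files named in this README.
TOOLCHAIN_FILES = {
    "x86": "x86_toolchain.cmake",
    "aarch64": "aarch64_toolchain.cmake",
    "aarch32": "aarch32_toolchain.cmake",
}


def cmake_configure_args(target=None, source_dir=".."):
    """Return the cmake configure command as an argument list.

    target=None means a native build, so no toolchain file is passed.
    """
    args = ["cmake", source_dir]
    if target is not None:
        if target not in TOOLCHAIN_FILES:
            raise ValueError(f"unknown target: {target!r}")
        toolchain = f"{source_dir}/cmake/toolchains/{TOOLCHAIN_FILES[target]}"
        args.append(f"-DCMAKE_TOOLCHAIN_FILE={toolchain}")
    return args


if __name__ == "__main__":
    # Print the configure command for an aarch64 cross build.
    print(" ".join(cmake_configure_args("aarch64")))
```

The returned list can be handed to `subprocess.run` from inside the build folder, followed by a separate `make` invocation.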

### For API and testbench usage of the decoder, please refer to [`README_dec.md`](README_dec.md)
### For API and testbench usage of the encoder, please refer to [`README_enc.md`](README_enc.md)

README_dec.md
# Introduction to libxaac decoder APIs

## Decoder APIs

A single API is used to get and set configurations and to execute the decode thread, based on the command index passed.
* ia_xheaacd_dec_api

| **API Command** | **API Sub Command** | **Description** |
|------|------|------|
|IA_API_CMD_GET_LIB_ID_STRINGS | IA_CMD_TYPE_LIB_NAME | Gets the decoder library name |
|IA_API_CMD_GET_LIB_ID_STRINGS | IA_CMD_TYPE_LIB_VERSION | Gets the decoder version |
|IA_API_CMD_GET_API_SIZE | 0 | Gets the memory requirements size of the API |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_API_PRE_CONFIG_PARAMS | Sets the configuration parameters of the libxaac decoder to default values |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_API_POST_CONFIG_PARAMS | Sets the attributes (size, priority, alignment) of all memory types required by the application onto the memory structure |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_PROCESS | Searches for a valid header, decodes it to get the parameters, and initializes the state and configuration structures |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_DONE_QUERY | Checks if the initialization process has completed |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_PCM_WDSZ | Sets the bit width of the output PCM samples. The value has to be 16 |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_SAMP_FREQ | Sets the core AAC sampling frequency for RAW header decoding |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_EFFECT_TYPE | Sets the value of DRC effect type |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_TARGET_LOUDNESS | Sets the value of DRC target loudness |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DOWNMIX | Sets whether the output is downmixed to mono (1) or not (0) |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_TOSTEREO | Sets the flag to disable interleaving mono to stereo |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DSAMPLE | Sets whether the output is downsampled (1) or not (0) |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_ISMP4 | Sets the flag to 0 or 1 to indicate whether the given test vector is an mp4 file or not |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_MAX_CHANNEL | Sets the maximum number of channels present. Its maximum value is 2 for the stereo library and 8 for the multichannel library |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_COUP_CHANNEL | Sets the number of coupling channels to be used for coupling. It can take values from 0 to 16. This command is supported only if the library has multichannel support |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DOWNMIX_STEREO | Sets the flag for downmixing n channels to stereo. Can be 0 or 1. This command is supported only if the library has multichannel support |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_DISABLE_SYNC | Sets the flag for ADTS syncing or not ADTS syncing as 0 or 1 |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_AUTO_SBR_UPSAMPLE | Sets the parameter auto SBR upsample to 0 or 1. Used when the stream changes from SBR present to SBR not present |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_CUT | Sets the value of DRC cut factor |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_BOOST | Sets the value of DRC boost factor |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_HEAVY_COMP | Sets the parameter to enable/disable DRC heavy compression |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_FRAMESIZE | Sets whether the decoder should decode for frame length 480 or 512 |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_LD_TESTING | Sets the flag to enable LD testing in the decoder |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_HQ_ESBR | Sets the flag to enable/disable high quality eSBR |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_PS_ENABLE | Sets the flag to indicate the presence of PS data in the eSBR bit stream |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_PEAK_LIMITER | Sets the flag to disable/enable the peak limiter |
|IA_API_CMD_SET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_FRAMELENGTH_FLAG | Sets the flag to indicate whether the decoder should decode for frame length 960 or 1024 |
|IA_API_CMD_SET_CONFIG_PARAM | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_TARGET_LEVEL | Sets the value of DRC target level |
|IA_API_CMD_SET_CONFIG_PARAM | IA_XHEAAC_DEC_CONFIG_ERROR_CONCEALMENT | Sets the flag to disable/enable error concealment |
|IA_API_CMD_SET_CONFIG_PARAM | IA_XHEAAC_DEC_CONFIG_PARAM_ESBR | Sets the flag to disable/enable eSBR |
|IA_API_CMD_GET_N_MEMTABS | 0 | Gets the number of memory types |
|IA_API_CMD_GET_N_TABLES | 0 | Gets the number of tables |
|IA_API_CMD_GET_MEM_INFO_SIZE | 0 | Gets the size of the memory type being referred to by the index |
|IA_API_CMD_GET_MEM_INFO_ALIGNMENT | 0 | Gets the alignment information of the memory type being referred to by the index |
|IA_API_CMD_GET_MEM_INFO_TYPE | 0 | Gets the type of memory being referred to by the index |
|IA_API_CMD_SET_MEM_PTR | 0 | Sets the pointer to the memory being referred to by the index to the input value |
|IA_API_CMD_GET_TABLE_INFO_SIZE | 0 | Gets the size of the memory type being referred to by the index |
|IA_API_CMD_GET_TABLE_INFO_ALIGNMENT | 0 | Gets the alignment information of the memory type being referred to by the index |
|IA_API_CMD_GET_TABLE_PTR | 0 | Gets the address of the current location of the table |
|IA_API_CMD_SET_TABLE_PTR | 0 | Sets the relocated table address |
|IA_API_CMD_GET_MEMTABS_SIZE | 0 | Gets the size of the memory structures |
|IA_API_CMD_SET_MEMTABS_PTR | 0 | Sets the memory structure pointer in the library to the allocated value |
|IA_API_CMD_INPUT_OVER | 0 | Signals the end of bit-stream to the library |
|IA_API_CMD_SET_INPUT_BYTES | 0 | Sets the number of bytes available in the input buffer for initialization |
|IA_API_CMD_GET_CURIDX_INPUT_BUF | 0 | Gets the number of input buffer bytes consumed by the last initialization |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_PCM_WDSZ | Gets the output PCM word size |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_SAMP_FREQ | Gets the sampling frequency |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_NUM_CHANNELS | Gets the output number of channels |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_CHANNEL_MASK | Gets the channel mask |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_CHANNEL_MODE | Gets the channel mode (Mono or PS/Stereo/Dual-mono) |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_SBR_MODE | Gets the SBR mode (Present/Not Present) |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_EFFECT_TYPE | Gets the value of DRC effect type |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_TARGET_LOUDNESS | Gets the value of DRC target loudness |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_LOUD_NORM | Gets the value of DRC loudness normalization level |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_PARAM_AOT | Gets the value of audio object type |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_EXT_ELE_PTR | Gets the extension element pointer |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_EXT_ELE_BUF_SIZES | Gets the extension element buffer sizes |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_NUM_ELE | Gets the number of configuration elements |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_NUM_CONFIG_EXT | Gets the number of extension elements |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_GAIN_PAYLOAD_LEN | Gets the gain payload length |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_GAIN_PAYLOAD_BUF | Gets the gain payload buffer |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_CONFIG_GET_NUM_PRE_ROLL_FRAMES | Gets the number of preroll frames |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_DRC_IS_CONFIG_CHANGED | Gets the value of DRC config change flag |
|IA_API_CMD_GET_CONFIG_PARAM | IA_ENHAACPLUS_DEC_DRC_APPLY_CROSSFADE | Gets the value of DRC crossfade flag |
|IA_API_CMD_EXECUTE | IA_CMD_TYPE_DO_EXECUTE | Executes the decode thread |
|IA_API_CMD_EXECUTE | IA_CMD_TYPE_DONE_QUERY | Checks if the end of decode has been reached |
|IA_API_CMD_GET_OUTPUT_BYTES | 0 | Gets the number of bytes output by the decoder in the last frame |

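The table above is organized around one entry point that dispatches on a command and a sub-command. The toy model below illustrates that pattern only. It is not the libxaac API: the `ToyDecoder` class, its string-based commands, its return values, and its default configuration are assumptions made for illustration, and only the command names are copied from the table.

```python
# Toy model of a single-entry-point API. This is NOT the libxaac API:
# command names are copied from the table above, but the dispatch logic,
# return values, and defaults are illustrative assumptions.


class ToyDecoder:
    def __init__(self):
        self.config = {}
        self.initialized = False

    def api(self, cmd, sub_cmd=0, value=None):
        """Dispatch on (command, sub-command), mirroring the table layout."""
        if cmd == "IA_API_CMD_SET_CONFIG_PARAM":
            self.config[sub_cmd] = value
            return None
        if cmd == "IA_API_CMD_GET_CONFIG_PARAM":
            return self.config[sub_cmd]
        if cmd == "IA_API_CMD_INIT":
            if sub_cmd == "IA_CMD_TYPE_INIT_API_PRE_CONFIG_PARAMS":
                # Reset the configuration to (assumed) default values.
                self.config = {"IA_ENHAACPLUS_DEC_CONFIG_PARAM_PCM_WDSZ": 16}
                return None
            if sub_cmd == "IA_CMD_TYPE_INIT_PROCESS":
                self.initialized = True
                return None
            if sub_cmd == "IA_CMD_TYPE_INIT_DONE_QUERY":
                return self.initialized
        raise ValueError(f"unsupported command: {cmd}/{sub_cmd}")


if __name__ == "__main__":
    dec = ToyDecoder()
    dec.api("IA_API_CMD_INIT", "IA_CMD_TYPE_INIT_API_PRE_CONFIG_PARAMS")
    dec.api("IA_API_CMD_SET_CONFIG_PARAM", "IA_ENHAACPLUS_DEC_CONFIG_PARAM_DOWNMIX", 1)
    dec.api("IA_API_CMD_INIT", "IA_CMD_TYPE_INIT_PROCESS")
    print(dec.api("IA_API_CMD_INIT", "IA_CMD_TYPE_INIT_DONE_QUERY"))
```

The design point is that every operation, from configuration to execution, goes through the same function, so adding a capability means adding a command value rather than a new exported symbol.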
### Aliases for some of the macros exist as XHEAAC

| **Macro** | **Alias** |
|------|------|
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_PCM_WDSZ | IA_XHEAAC_DEC_CONFIG_PARAM_PCM_WDSZ |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_SAMP_FREQ | IA_XHEAAC_DEC_CONFIG_PARAM_SAMP_FREQ |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_NUM_CHANNELS | IA_XHEAAC_DEC_CONFIG_PARAM_NUM_CHANNELS |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_CHANNEL_MASK | IA_XHEAAC_DEC_CONFIG_PARAM_CHANNEL_MASK |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_CHANNEL_MODE | IA_XHEAAC_DEC_CONFIG_PARAM_CHANNEL_MODE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_SBR_MODE | IA_XHEAAC_DEC_CONFIG_PARAM_SBR_MODE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_EFFECT_TYPE | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_EFFECT_TYPE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_TARGET_LOUDNESS | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_TARGET_LOUDNESS |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_LOUD_NORM | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_LOUD_NORM |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DOWNMIX | IA_XHEAAC_DEC_CONFIG_PARAM_DOWNMIX |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_TOSTEREO | IA_XHEAAC_DEC_CONFIG_PARAM_TOSTEREO |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DSAMPLE | IA_XHEAAC_DEC_CONFIG_PARAM_DSAMPLE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_ISMP4 | IA_XHEAAC_DEC_CONFIG_PARAM_MP4FLAG |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_MAX_CHANNEL | IA_XHEAAC_DEC_CONFIG_PARAM_MAX_CHANNEL |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_COUP_CHANNEL | IA_XHEAAC_DEC_CONFIG_PARAM_COUP_CHANNEL |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DOWNMIX_STEREO | IA_XHEAAC_DEC_CONFIG_PARAM_DOWNMIX_STEREO |
|IA_ENHAACPLUS_DEC_CONFIG_DISABLE_SYNC | IA_XHEAAC_DEC_CONFIG_DISABLE_SYNC |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_AUTO_SBR_UPSAMPLE | IA_XHEAAC_DEC_CONFIG_PARAM_AUTO_SBR_UPSAMPLE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_CUT | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_CUT |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_BOOST | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_BOOST |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_DRC_HEAVY_COMP | IA_XHEAAC_DEC_CONFIG_PARAM_DRC_HEAVY_COMP |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_FRAMESIZE | IA_XHEAAC_DEC_CONFIG_PARAM_FRAMESIZE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_LD_TESTING | IA_XHEAAC_DEC_CONFIG_PARAM_LD_TESTING |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_HQ_ESBR | IA_XHEAAC_DEC_CONFIG_PARAM_HQ_ESBR |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_PS_ENABLE | IA_XHEAAC_DEC_CONFIG_PARAM_PS_ENABLE |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_AOT | IA_XHEAAC_DEC_CONFIG_PARAM_AOT |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_PEAK_LIMITER | IA_XHEAAC_DEC_CONFIG_PARAM_PEAK_LIMITER |
|IA_ENHAACPLUS_DEC_CONFIG_PARAM_FRAMELENGTH_FLAG | IA_XHEAAC_DEC_CONFIG_PARAM_FRAMELENGTH_FLAG |

## DRC APIs

A single API is used to get and set configurations and to execute the decode thread, based on the command index passed.
* ia_drc_dec_api

| **API Command** | **API Sub Command** | **Description** |
|------|------|------|
|IA_API_CMD_GET_API_SIZE | 0 | Gets the memory requirements size of the API |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_API_PRE_CONFIG_PARAMS | Sets the configuration parameters of the libxaac decoder to default values |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_API_POST_CONFIG_PARAMS | Sets the attributes (size, priority, alignment) of all memory types required by the application onto the memory structure |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_PROCESS | Searches for a valid header, decodes it to get the parameters, and initializes the state and configuration structures |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_CPY_BSF_BUFF | Sets the bitstream split format buffer |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_CPY_IC_BSF_BUFF | Sets the configuration bitstream split format buffer |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_CPY_IL_BSF_BUFF | Sets the loudness bitstream split format buffer |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_CPY_IN_BSF_BUFF | Sets the interface bitstream split format buffer |
|IA_API_CMD_INIT | IA_CMD_TYPE_INIT_SET_BUFF_PTR | Sets the input buffer pointer |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_SAMP_FREQ | Sets the sampling frequency of the input stream/data |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_NUM_CHANNELS | Sets the number of channels in the input stream/data |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_PCM_WDSZ | Sets the PCM word size of the input data |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_BITS_FORMAT | Sets the bit stream format |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_INT_PRESENT | Sets the DRC decoder's interface present flag to 1 or 0 |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_FRAME_SIZE | Sets the frame size |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_DRC_EFFECT_TYPE | Sets the value of DRC effect type |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_DRC_TARGET_LOUDNESS | Sets the value of DRC target loudness |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_DRC_LOUD_NORM | Sets the value of DRC loudness normalization level |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_APPLY_CROSSFADE | Sets the value of DRC crossfade flag |
|IA_API_CMD_SET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_CONFIG_CHANGED | Sets the value of DRC config change flag |
|IA_API_CMD_GET_MEM_INFO_SIZE | 0 | Gets the size of the memory type being referred to by the index |
|IA_API_CMD_GET_MEMTABS_SIZE | 0 | Gets the size of the memory structures |
|IA_API_CMD_SET_MEMTABS_PTR | 0 | Sets the memory structure pointer in the library to the allocated value |
|IA_API_CMD_GET_N_MEMTABS | 0 | Gets the number of memory types |
|IA_API_CMD_SET_INPUT_BYTES | 0 | Sets the number of bytes available in the input buffer for initialization |
|IA_API_CMD_GET_MEM_INFO_ALIGNMENT | 0 | Gets the alignment information of the memory type being referred to by the index |
|IA_API_CMD_GET_MEM_INFO_TYPE | 0 | Gets the type of memory being referred to by the index |
|IA_API_CMD_SET_MEM_PTR | 0 | Sets the pointer to the memory being referred to by the index to the input value |
|IA_API_CMD_SET_INPUT_BYTES_BS | 0 | Sets the number of bytes to be processed in bitstream split format |
|IA_API_CMD_SET_INPUT_BYTES_IC_BS | 0 | Sets the number of bytes to be processed in configuration bitstream split format |
|IA_API_CMD_SET_INPUT_BYTES_IL_BS | 0 | Sets the number of bytes to be processed in loudness bitstream split format |
|IA_API_CMD_GET_CONFIG_PARAM | IA_DRC_DEC_CONFIG_PARAM_NUM_CHANNELS | Gets the output number of channels |
|IA_API_CMD_EXECUTE | IA_CMD_TYPE_DO_EXECUTE | Executes the decode thread |

## Flowchart of calling sequence

# Running the libxaac decoder

The libxaac decoder can be run by providing command-line parameters (CLI options) directly or by providing a parameter file as a command-line argument.

Command line usage:
```
<executable> -ifile:<input_file> -imeta:<meta_data_file> -ofile:<output_file> [options]

[options] can be,
[-mp4:<mp4_flag>]
[-pcmsz:<pcmwordsize>]
[-dmix:<down_mix>]
[-esbr_hq:<esbr_hq_flag>]
[-esbr_ps:<esbr_ps_flag>]
[-tostereo:<interleave_to_stereo>]
[-dsample:<down_sample_sbr>]
[-drc_cut_fac:<drc_cut_factor>]
[-drc_boost_fac:<drc_boost_factor>]
[-drc_target_level:<drc_target_level>]
[-drc_heavy_comp:<drc_heavy_compression>]
[-effect:<effect_type>]
[-target_loudness:<target_loudness>]
[-nosync:<disable_sync>]
[-sbrup:<auto_sbr_upsample>]
[-flflag:<framelength_flag>]
[-fs:<RAW_sample_rate>]
[-maxchannel:<maximum_num_channels>]
[-coupchannel:<coupling_channel>]
[-downmix:<down_mix_stereo>]
[-fs480:<ld_frame_size>]
[-ld_testing:<ld_testing_flag>]
[-peak_limiter_off:<peak_limiter_off_flag>]
[-err_conceal:<error_concealment_flag>]
[-esbr:<esbr_flag>]

where,
 <input_file> is the input AAC-LC/HE-AACv1/HE-AACv2/AAC-LD/AAC-ELD/AAC-ELDv2/USAC file name.
 <meta_data_file> is a text file which contains metadata. To be given when -mp4:1 is enabled.
 <output_file> is the output file name.
 <mp4_flag> is a flag that should be set to 1 when passing a raw stream along with a metadata text file.
 <pcmwordsize> is the bits per sample info. Value can be 16 or 24.
 <down_mix> is to enable/disable always mono output. Default 1.
 <esbr_hq_flag> is to enable/disable high quality eSBR. Default 0.
 <esbr_ps_flag> is to indicate eSBR with PS. Default 0.
 <interleave_to_stereo> is to enable/disable always interleaved to stereo output. Default 1.
 <down_sample_sbr> is to enable/disable down-sampled SBR output. Default auto identification from header.
 <drc_cut_factor> is to set DRC cut factor value. Default value is 0.
 <drc_boost_factor> is to set DRC boost factor. Default value is 0.
 <drc_target_level> is to set DRC target reference level. Default value is 108.
 <drc_heavy_compression> is to enable/disable DRC heavy compression. Default value is 0.
 <effect_type> is to set DRC effect type. Default value is 0.
 <target_loudness> is to set target loudness level. Default value is -24.
 <disable_sync> is to disable the ADTS/ADIF sync search, i.e. when enabled the decoder expects the header to be at the start of the input buffer. Default 0.
 <auto_sbr_upsample> is to enable(1) or disable(0) auto SBR upsample in case of stream changing from SBR present to SBR not present. Default 1.
 <framelength_flag> is the flag for decoding a frame length of 1024 or 960. 1 to decode 960 frame length, 0 to decode 1024 frame length.
     Frame length value in the GA header will override this option. Default 0.
 <RAW_sample_rate> is to indicate the core AAC sample rate for a RAW stream. If this is specified no other file format headers are searched for.
 <maximum_num_channels> is the maximum number of channels the input may have. Default is 6 for multichannel libraries and 2 for stereo libraries.
 <coupling_channel> is the element instance tag of the independent coupling channel to be mixed. Default is 0.
 <down_mix_stereo> is the flag for downmix. Give 1 to get stereo (downmix) output. Default is 0.
 <ld_frame_size> is to indicate the LD frame size. 0 is for 512 frame length, 1 is for 480 frame length. Default value is 512 (0).
 <ld_testing_flag> is to enable/disable LD decoder testing. Default value is 0.
 <peak_limiter_off_flag> is to enable/disable peak limiter. Default value is 0.
 <error_concealment_flag> is to enable/disable error concealment. Default value is 0.
 <esbr_flag> is to enable/disable eSBR. Default value is 1.

```
Sample CLI:
```
<xaac_dec_exe> -ifile:in_file.aac -ofile:out_file.wav -pcmsz:16
```
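
Every decoder option in the usage text above follows the same `-name:value` form, so a wrapper script can assemble the argument list mechanically. The sketch below assumes a hypothetical helper `build_decoder_args` and a placeholder executable name `xaac_dec`; neither is part of libxaac. It does not validate option names against the usage text.

```python
def build_decoder_args(executable, ifile, ofile, imeta=None, **options):
    """Build an argument list in the `-name:value` form shown in the usage text.

    Option names are passed through unchecked; the caller is responsible
    for using names that the decoder testbench actually accepts.
    """
    args = [executable, f"-ifile:{ifile}"]
    if imeta is not None:
        # The metadata file is only needed when -mp4:1 is used.
        args.append(f"-imeta:{imeta}")
    args.append(f"-ofile:{ofile}")
    for name, value in options.items():
        args.append(f"-{name}:{value}")
    return args


if __name__ == "__main__":
    # "xaac_dec" is a placeholder for the real testbench binary.
    print(" ".join(build_decoder_args("xaac_dec", "in_file.aac", "out_file.wav", pcmsz=16)))
```

Returning a list rather than a single string keeps the result safe to pass to `subprocess.run` without shell quoting.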

# Validating the libxaac decoder

Conformance testing for AAC/HE-AACv1/HE-AACv2 mainly involves comparing
the output of the decoder under test with the ISO and 3GPP reference decoded output.

Testing for USAC is done using encoded streams generated by the ISO USAC
reference encoder. The output generated by the libxaac USAC decoder is
compared against the output generated by the ISO USAC decoder for 16-bit
conformance on the respective (ARMv7, ARMv8, X86_32, X86_64) platforms.

README_enc.md
# Introduction to libxaac encoder APIs

# Encoder APIs

| **API Call** | **Description** |
|------|------|
|ixheaace_get_lib_id_strings| Gets the encoder library name and version number details |
|ixheaace_create| Sets the encoder configuration parameters, gets the memory requirements and allocates required memory |
|ixheaace_process| Encodes the input frame data |
|ixheaace_delete| Frees the allocated memories for the encoder |

## Flowchart of calling sequence


# Audio Object Types(AOT)
| **AOT** | **Description** |
|------|------|
|2|AAC-LC|
|5|HE-AACv1|
|23|AAC-LD|
|29|HE-AACv2|
|39|AAC-ELD|
|42|USAC|

# Running the libxaac encoder
The libxaac encoder can be run by providing command-line parameters (CLI options) directly or by providing a parameter file as a command-line argument.
The reference parameter file (paramfilesimple.txt) is placed in the `encoder\test` directory.
The configuration file for DRC (impd_drc_config_params.txt) is placed in the `encoder\test` directory. Please make sure this file is placed in the same directory as the executable.

# Command line usage
```
<executable> -ifile:<input_file> -ofile:<out_file> [options]
(or)
<executable> -paramfile:<paramfile>
[options] can be,
[-br:<bitrate>]
[-mps:<use_mps>]
[-adts:<use_adts_flag>]
[-tns:<use_tns_flag>]
[-nf:<use_noise_filling>]
[-cmpx_pred:<use_complex_prediction>]
[-framesize:<framesize_to_be_used>]
[-aot:<audio_object_type>]
[-esbr:<esbr_flag>]
[-full_bandwidth:<enable_full_bandwidth>]
[-max_out_buffer_per_ch:<bitreservoir_size>]
[-tree_cfg:<tree_config>]
[-usac:<usac_encoding_mode>]
[-ccfl_idx:<corecoder_framelength_index>]
[-pvc_enc:<pvc_enc_flag>]
[-harmonic_sbr:<harmonic_sbr_flag>]
[-esbr_hq:<esbr_hq_flag>]
[-drc:<drc_flag>]
[-inter_tes_enc:<inter_tes_enc_flag>]
[-rap:<random access interval in ms>]
[-stream_id:<stream identifier>]

where,
 <paramfile> is the parameter file with multiple commands
 <input_file> is the input 16-bit WAV or PCM file name
 <out_file> is the output ADTS/ES file name
 <bitrate> is the bit-rate in bits per second. Default value is 48000.
 <use_mps> Valid values are 0 (disable MPS) and 1 (enable MPS). Default is 0.
 <use_adts_flag> Valid values are 0 (no ADTS header) and 1 (generate ADTS header). Default is 0.
 <use_tns_flag> Valid values are 0 (disable TNS) and 1 (enable TNS). Default is 1.
 <use_noise_filling> controls usage of noise filling in encoding. Default is 0.
 <use_complex_prediction> controls usage of complex prediction in encoding. Default is 0.
 <framesize_to_be_used> is the framesize to be used.
     For AOT 23, 39 (LD core coder profiles) valid values are 480 and 512. Default is 512.
     For AOT 2, 5, 29 (LC core coder profiles) valid values are 960 and 1024. Default is 1024.
     For AOT 42 (USAC profile) valid values are 768 and 1024. Default is 1024.
 <audio_object_type> is the Audio object type.
     2 for AAC-LC
     5 for HE-AACv1 (Legacy SBR)
     23 for AAC-LD
     29 for HE-AACv2
     39 for AAC-ELD
     42 for USAC
     Default is 2 for AAC-LC.
 <esbr_flag> Valid values are 0 (disable eSBR) and 1 (enable eSBR). Default is 0 for HE-AACv1 profile (legacy SBR) and 1 for USAC profile.
 <enable_full_bandwidth> Enable use of full bandwidth of input. Valid values are 0 (disable full bandwidth) and 1 (enable full bandwidth). Default is 0.
 <bitreservoir_size> is the maximum size of bit reservoir to be used. Valid values are from -1 to 6144. -1 will omit use of bit reservoir. Default is 384.
 <tree_config> MPS tree config
     0 for '212'
     1 for '5151'
     2 for '5152'
     3 for '525'
     Default 0 for stereo input and 1 for 6ch input.
 <usac_encoding_mode> USAC encoding mode to be chosen
     0 for 'usac_switched'
     1 for 'usac_fd'
     2 for 'usac_td'
     Default 'usac_fd'
 <corecoder_framelength_index> is the core coder framelength index for USAC encoder.
     Valid values are 0, 1, 2, 3, 4. eSBR enabling is implicit
     0 - Core coder framelength of USAC is 768 and eSBR is disabled
     1 - Core coder framelength of USAC is 1024 and eSBR is disabled
     2 - Core coder framelength of USAC is 768 and eSBR ratio 8:3
     3 - Core coder framelength of USAC is 1024 and eSBR ratio 2:1
     4 - Core coder framelength of USAC is 1024 and eSBR ratio 4:1
 <pvc_enc_flag> Valid values are 0 (disable PVC encoding) and 1 (enable PVC encoding). Default is 0.
 <harmonic_sbr_flag> Valid values are 0 (disable harmonic SBR) and 1 (enable harmonic SBR). Default is 0.
 <esbr_hq_flag> Valid values are 0 (disable high quality eSBR) and 1 (enable high quality eSBR). Default is 0.
 <drc_flag> Valid values are 0 (disable DRC encoding) and 1 (enable DRC encoding). Default is 0.
 <inter_tes_enc_flag> Valid values are 0 (disable inter-TES encoding) and 1 (enable inter-TES encoding). Default is 0.
 <random access interval in ms> is the time interval between audio preroll frames in ms. It is applicable only for AOT 42. Valid values are -1 (audio preroll sent only at beginning of file) and greater than 1000 ms. Default is -1.
 <stream identifier> is the stream id used to uniquely identify the configuration of a stream within a set of associated streams. It is applicable only for AOT 42. Valid values are 0 to 65535. Any value outside this range is type-cast to a value of unsigned short type. Default is 0.

```
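
The `-framesize` rules in the usage text above depend on the audio object type. As a minimal sketch, a wrapper could check the combination before launching the encoder. The helper name `resolve_framesize` and the table names are illustrative assumptions; the valid values and defaults are transcribed from the usage text.

```python
# Valid frame sizes per AOT, transcribed from the usage text above.
VALID_FRAME_SIZES = {
    2: (960, 1024),   # AAC-LC
    5: (960, 1024),   # HE-AACv1
    29: (960, 1024),  # HE-AACv2
    23: (480, 512),   # AAC-LD
    39: (480, 512),   # AAC-ELD
    42: (768, 1024),  # USAC
}

# Default frame size per AOT, also from the usage text.
DEFAULT_FRAME_SIZE = {2: 1024, 5: 1024, 29: 1024, 23: 512, 39: 512, 42: 1024}


def resolve_framesize(aot, framesize=None):
    """Return the frame size to use for an AOT, or raise ValueError if invalid."""
    if aot not in VALID_FRAME_SIZES:
        raise ValueError(f"unsupported AOT: {aot}")
    if framesize is None:
        return DEFAULT_FRAME_SIZE[aot]
    if framesize not in VALID_FRAME_SIZES[aot]:
        raise ValueError(
            f"framesize {framesize} is not valid for AOT {aot}; "
            f"expected one of {VALID_FRAME_SIZES[aot]}"
        )
    return framesize


if __name__ == "__main__":
    print(resolve_framesize(39, 480))
```

Checking the pair up front gives a clearer error than letting the encoder reject the configuration later.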
Sample CLI:
```
<executable> -ifile:input_file.wav -ofile:out_file.aac -br:<bit_rate> -aot:<audio profile>
```
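
Two of the USAC-only options described in the usage text lend themselves to a small lookup: the `-ccfl_idx` table and the wrap-around of out-of-range `-stream_id` values. The sketch below is illustrative only. The names `CCFL_IDX` and `wrap_stream_id` are assumptions, and the wrap-around assumes that unsigned short is 16 bits wide, which the README does not state.

```python
# -ccfl_idx value -> (core coder frame length, eSBR ratio or None if eSBR is disabled),
# transcribed from the usage text above.
CCFL_IDX = {
    0: (768, None),
    1: (1024, None),
    2: (768, "8:3"),
    3: (1024, "2:1"),
    4: (1024, "4:1"),
}


def wrap_stream_id(value):
    """Emulate a cast to a 16-bit unsigned integer (assumed width of unsigned short)."""
    return value & 0xFFFF


if __name__ == "__main__":
    frame_length, esbr_ratio = CCFL_IDX[3]
    print(frame_length, esbr_ratio, wrap_stream_id(70000))
```

Masking with `0xFFFF` reproduces modulo-65536 conversion, so in-range values from 0 to 65535 pass through unchanged.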


# Additional Documents
Brief description of the documents present in the `docs` folder:
* [`LIBXAAC-Enc-API.pdf`](docs/LIBXAAC-Enc-API.pdf) - Describes the Application Program Interface for the libxaac encoder.
* [`LIBXAAC-Enc-GSG.pdf`](docs/LIBXAAC-Enc-GSG.pdf) - Getting Started Guide for the libxaac encoder.

# Validating the libxaac encoder
Testing of the libxaac encoder is done using listening tests for the different AOTs (Audio Object Types).
