| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| BUILD.gn | 25-Apr-2025 | 2.1 KiB | 82 | 76 |
| OWNERS.webrtc | 25-Apr-2025 | 61 | 4 | 3 |
| README.md | 25-Apr-2025 | 2.3 KiB | 75 | 60 |
| config.cc | 25-Apr-2025 | 883 | 32 | 16 |
| config.h | 25-Apr-2025 | 1.3 KiB | 44 | 25 |
| generator.cc | 25-Apr-2025 | 3.1 KiB | 90 | 60 |
| generator_unittest.cc | 25-Apr-2025 | 24 KiB | 676 | 430 |
| mock_wavreader.cc | 25-Apr-2025 | 1.1 KiB | 35 | 19 |
| mock_wavreader.h | 25-Apr-2025 | 1.6 KiB | 49 | 28 |
| mock_wavreader_factory.cc | 25-Apr-2025 | 2.6 KiB | 67 | 43 |
| mock_wavreader_factory.h | 25-Apr-2025 | 2 KiB | 60 | 36 |
| multiend_call.cc | 25-Apr-2025 | 6.8 KiB | 194 | 132 |
| multiend_call.h | 25-Apr-2025 | 3.4 KiB | 105 | 73 |
| simulator.cc | 25-Apr-2025 | 9.1 KiB | 236 | 153 |
| simulator.h | 25-Apr-2025 | 1.5 KiB | 45 | 25 |
| timing.cc | 25-Apr-2025 | 2.1 KiB | 74 | 50 |
| timing.h | 25-Apr-2025 | 1.5 KiB | 52 | 31 |
| wavreader_abstract_factory.h | 25-Apr-2025 | 1.2 KiB | 35 | 18 |
| wavreader_factory.cc | 25-Apr-2025 | 1.9 KiB | 66 | 38 |
| wavreader_factory.h | 25-Apr-2025 | 1.2 KiB | 37 | 20 |
| wavreader_interface.h | 25-Apr-2025 | 1.2 KiB | 41 | 20 |

README.md

# Conversational Speech generator tool

Tool to generate multiple-end audio tracks to simulate conversational speech
with two or more participants.

The input to the tool is a directory containing a number of audio tracks and
a text file indicating how to time the sequence of speech turns (see the Example
section).

Since the timing of the speaking turns is specified by the user, the generated
tracks may not be suitable for testing scenarios in which there is unpredictable
network delay (e.g., end-to-end RTC assessment).

Instead, the generated tracks can be used when the delay is constant (including
the case in which there is no delay).
For instance, echo cancellation in the APM module can be evaluated using two-end
audio tracks as input and reverse input.

By indicating negative and positive time offsets, one can reproduce cross-talk
(aka double-talk) and silence, respectively, in the conversation.

### Example

For each end, there is a set of audio tracks, e.g., a1, a2, a3 and a4
(speaker A) and b1, b2 (speaker B).
The text file with the timing information may look like this:

```
A a1 0
B b1 0
A a2 100
B b2 -200
A a3 0
A a4 0
```

The first column indicates the speaker name, the second the audio track file
name, and the third the offset (in milliseconds) used to concatenate the
chunks. An optional fourth column contains a positive or negative integral gain
in dB that will be applied to the track. It's possible to specify the gain for
some turns but not for others. If the gain is left out, no gain is applied.
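
The column layout described above can be read with a few lines of code. The following is a minimal Python sketch, not part of the tool: `Turn`, `parse_timing` and `gain_factor` are hypothetical names, the fields are assumed to be separated by whitespace, and the dB gain is assumed to follow the usual amplitude convention (a factor of 10^(dB/20)), which this README does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    speaker: str             # first column, e.g. "A"
    track: str               # second column, e.g. "a1"
    offset_ms: int           # third column, may be negative
    gain_db: Optional[int]   # optional fourth column; None means no gain


def parse_timing(text: str) -> List[Turn]:
    """Parse whitespace-separated timing lines into Turn records."""
    turns = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skipping blank lines is an assumption of this sketch
        if len(fields) not in (3, 4):
            raise ValueError(
                f"line {line_no}: expected 3 or 4 fields, got {len(fields)}")
        gain_db = int(fields[3]) if len(fields) == 4 else None
        turns.append(Turn(fields[0], fields[1], int(fields[2]), gain_db))
    return turns


def gain_factor(gain_db: Optional[int]) -> float:
    """Convert a dB gain to a linear amplitude factor (1.0 when omitted)."""
    return 1.0 if gain_db is None else 10.0 ** (gain_db / 20.0)
```

For example, `parse_timing("A a2 100 -6")` yields one turn with a gain of -6 dB, while a three-column line leaves `gain_db` as `None`, so `gain_factor` returns 1.0 and the samples are left unchanged.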

Assume that all the audio tracks in the example above are 1000 ms long.
The tool will then generate two tracks (A and B) that look like this:

**Track A**
```
  a1 (1000 ms)
  silence (1100 ms)
  a2 (1000 ms)
  silence (800 ms)
  a3 (1000 ms)
  a4 (1000 ms)
```

**Track B**
```
  silence (1000 ms)
  b1 (1000 ms)
  silence (900 ms)
  b2 (1000 ms)
  silence (2000 ms)
```
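
The two listings above can be reproduced with simple arithmetic. The Python sketch below is illustrative only and not part of the tool. It assumes that each turn begins where the previous turn in the file ends, shifted by its offset; this rule is inferred from the example, not taken from the tool's sources. `place_turns` and `track_segments` are hypothetical names.

```python
# (speaker, track, offset_ms) rows from the example timing file.
TURNS = [
    ("A", "a1", 0),
    ("B", "b1", 0),
    ("A", "a2", 100),
    ("B", "b2", -200),
    ("A", "a3", 0),
    ("A", "a4", 0),
]


def place_turns(turns, duration_ms=1000):
    """Return (speaker, track, begin_ms, end_ms) for every turn."""
    placed = []
    previous_end = 0
    for speaker, track, offset_ms in turns:
        # Assumed rule: start at the end of the previous turn plus the offset.
        begin = previous_end + offset_ms
        if begin < 0:
            raise ValueError(f"{track} would start before time zero")
        end = begin + duration_ms
        placed.append((speaker, track, begin, end))
        previous_end = end
    return placed


def track_segments(placed, speaker):
    """List (label, length_ms) segments for one speaker, padded with silence."""
    total = max(end for _, _, _, end in placed)
    segments = []
    cursor = 0  # end of the last segment written for this speaker
    for who, track, begin, end in placed:
        if who != speaker:
            continue
        if begin < cursor:
            raise ValueError(f"{track} overlaps the same speaker's previous turn")
        if begin > cursor:
            segments.append(("silence", begin - cursor))
        segments.append((track, end - begin))
        cursor = end
    if cursor < total:
        segments.append(("silence", total - cursor))
    return segments
```

Calling `track_segments(place_turns(TURNS), "A")` gives the six segments listed for Track A, and the same call with `"B"` gives the five segments of Track B, including the 2000 ms of trailing silence that pads B to the length of A.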

The two tracks can also be visualized as follows (one character represents
100 ms, "." is silence and "*" is speech).

```
t: 0         1         2         3         4         5         6 (s)
A: **********...........**********........********************
B: ..........**********.........**********....................
                                ^ 200 ms cross-talk
        100 ms silence ^
```
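
The diagram above can be regenerated, and its two annotations checked, with a short script. This is a hedged Python sketch, not part of the tool: `render_track` and `shared_ms` are hypothetical names, the speech intervals are copied from the Track A and Track B listings, and all boundaries are assumed to be multiples of 100 ms.

```python
def render_track(intervals, total_ms, step_ms=100):
    """Draw one track: '*' inside any (begin_ms, end_ms) interval, '.' elsewhere."""
    cells = ["."] * (total_ms // step_ms)
    for begin, end in intervals:
        for i in range(begin // step_ms, end // step_ms):
            cells[i] = "*"
    return "".join(cells)


def shared_ms(row_a, row_b, symbol, step_ms=100):
    """Total time during which both rows show the same given symbol."""
    return step_ms * sum(1 for x, y in zip(row_a, row_b) if x == y == symbol)


# Speech intervals in ms, taken from the Track A / Track B listings.
SPEECH = {
    "A": [(0, 1000), (2100, 3100), (3900, 5900)],
    "B": [(1000, 2000), (2900, 3900)],
}

ROWS = {name: render_track(iv, total_ms=5900) for name, iv in SPEECH.items()}

for name, row in ROWS.items():
    print(f"{name}: {row}")
```

With these inputs, `shared_ms(ROWS["A"], ROWS["B"], "*")` is 200 (the cross-talk between a2 and b2) and `shared_ms(ROWS["A"], ROWS["B"], ".")` is 100 (the silence before a2), matching the two annotations in the diagram.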