Titanet-L Augmentation about nemo HOT 4 OPEN

CreativeSelf0 commented on June 12, 2024

Titanet-L Augmentation

from nemo.

Comments (4)

CreativeSelf0 commented on June 12, 2024

I want to add augmentation to the augmentor. To get the RIR data, I used the following snippet from the online augmentation tutorial:

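import os

# data_dir and NEMO_ROOT are assumed to be defined earlier in the notebook
rir_data_path = f'{data_dir}/dataset'
# Download and process the OpenSLR RIR data, then build its manifest
!python {NEMO_ROOT}/scripts/dataset_processing/get_openslr_rir_data.py --data_root {rir_data_path}
rir_manifest_path = os.path.join(rir_data_path, 'processed', 'rir.json')
!head -n 3 {rir_manifest_path}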

Then, to use the augmentation, I applied the following:

audio_augmentations = dict(
    speed = dict(
        sr=16000,
        prob=0.3,
        resample_type='kaiser_fast',
        min_speed_rate=0.95,
        max_speed_rate=1.05,
    ),
    noise = dict(
        manifest_path=rir_manifest_path,
        prob=0.5,
        min_snr_db=0,
        max_snr_db=15,
    ),
)
finetune_config.model.train_ds.augmentor = audio_augmentations

Is this correct? Thanks, @okuchaiev


nithinraok commented on June 12, 2024

Yes, the code looks fine to me. But for impulse responses you should use impulse perturbation, not noise perturbation.
A sample can be found here:

NeMo/examples/speaker_tasks/recognition/conf/titanet-small.yaml

Line 14 in 6442bb6

augmentor:
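A minimal sketch of the suggestion above, assuming the augmentor accepts an `impulse` section with the same `manifest_path` and `prob` keys as the `noise` section. The helper name `build_augmentations`, the `impulse_prob` default, the optional separate noise manifest, and the example path are illustrative assumptions, not taken from the NeMo configs; check the key names against titanet-small.yaml before relying on them.

```python
from typing import Optional


def build_augmentations(
    rir_manifest_path: str,
    noise_manifest_path: Optional[str] = None,
    impulse_prob: float = 0.5,  # assumed value, not taken from the NeMo configs
) -> dict:
    """Build an augmentor dict that routes the RIR manifest to `impulse`."""
    augmentations = dict(
        speed=dict(
            sr=16000,
            prob=0.3,
            resample_type='kaiser_fast',
            min_speed_rate=0.95,
            max_speed_rate=1.05,
        ),
        # Assumption: `impulse` is the augmentor key for impulse perturbation.
        impulse=dict(
            manifest_path=rir_manifest_path,
            prob=impulse_prob,
        ),
    )
    # Only add additive noise when a manifest of actual noise clips is given.
    if noise_manifest_path is not None:
        augmentations['noise'] = dict(
            manifest_path=noise_manifest_path,
            prob=0.5,
            min_snr_db=0,
            max_snr_db=15,
        )
    return augmentations


# Hypothetical path, standing in for rir_manifest_path from the tutorial snippet
audio_augmentations = build_augmentations('dataset/processed/rir.json')
print(sorted(audio_augmentations))
```

The resulting dict can be assigned to `finetune_config.model.train_ds.augmentor` in the same way as the original `audio_augmentations`.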


CreativeSelf0 commented on June 12, 2024

@nithinraok that's what I thought. However, the Titanet-Large config uses noise instead of impulse, while the paper says impulse perturbation was used. Does that mean the training mistakenly used the RIR corpora for noise perturbation instead of impulse perturbation?

NeMo/examples/speaker_tasks/recognition/conf/titanet-large.yaml

Lines 14 to 26 in 6442bb6

augmentor:
  noise:
    manifest_path: null
    prob: 0.5
    min_snr_db: 0
    max_snr_db: 15

  speed:
    prob: 0.3
    sr: *sample_rate
    resample_type: 'kaiser_fast'
    min_speed_rate: 0.95
    max_speed_rate: 1.05

The paper statement:

(just realized you are the first author x.x)
Thank you @nithinraok


nithinraok commented on June 12, 2024

I don't remember the details exactly, but as far as I remember, the RIR corpora also contain noise samples along with impulse responses. I did not add an impulse section to this config file, but one was added to the titanet-small config.
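One way to check what a generated manifest actually contains is to group its entries by parent directory. The sketch below assumes the manifest is a JSON-lines file whose entries carry an `audio_filepath` field, as NeMo manifests usually do; the helper name `count_manifest_dirs` is hypothetical, and the directory names that show up depend on how the corpus was unpacked.

```python
import json
from collections import Counter
from pathlib import Path


def count_manifest_dirs(manifest_path: str) -> Counter:
    """Count manifest entries by the name of each audio file's parent directory."""
    counts = Counter()
    with open(manifest_path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            # Assumption: every entry has an `audio_filepath` field.
            counts[Path(entry['audio_filepath']).parent.name] += 1
    return counts
```

Calling `count_manifest_dirs(rir_manifest_path)` on the manifest from the tutorial snippet would show whether noise recordings and impulse responses are mixed in the same file, which matters when deciding whether to feed it to noise or impulse perturbation.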
