Hi, thank you for making an excellent project. I have a question about training big

CUDA out of memory about nemo (CLOSED)

nvidia commented on May 9, 2024

CUDA out of memory

from nemo.

Comments (5)

Chen1399 commented on May 9, 2024

Should I reduce batch_size? But Jasper uses BN densely.

rinjac commented on May 9, 2024

What is the maximum duration of the audio files in your training set? Make a histogram of the durations and check for outliers (very long audio files). Try removing all audio files longer than X seconds (start with a large X, then lower it until the CUDA out of memory errors stop).
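
A minimal sketch of the duration check described above. It assumes the training set is listed in a JSON-lines manifest where each entry has a `duration` field in seconds; the file names, the 5-second bin width, and the helper names are made up for illustration and are not part of NeMo's API.

```python
import json
from collections import Counter


def load_durations(manifest_lines):
    """Return the duration (in seconds) of every non-empty manifest entry."""
    return [json.loads(line)["duration"] for line in manifest_lines if line.strip()]


def duration_histogram(durations, bin_width=5.0):
    """Count files per bin; each key is the lower edge of a bin in seconds."""
    counts = Counter(int(d // bin_width) * bin_width for d in durations)
    return dict(sorted(counts.items()))


def filter_by_max_duration(manifest_lines, max_seconds):
    """Keep only the entries whose duration does not exceed max_seconds."""
    kept = []
    for line in manifest_lines:
        if line.strip() and json.loads(line)["duration"] <= max_seconds:
            kept.append(line)
    return kept


# Hypothetical manifest entries (one JSON object per line).
example_manifest = [
    '{"audio_filepath": "a.wav", "duration": 3.2, "text": "hello world"}',
    '{"audio_filepath": "b.wav", "duration": 12.5, "text": "a longer sentence"}',
    '{"audio_filepath": "c.wav", "duration": 41.0, "text": "an outlier"}',
    '{"audio_filepath": "d.wav", "duration": 16.0, "text": "close to the cap"}',
]

histogram = duration_histogram(load_durations(example_manifest))
for lower_edge, count in histogram.items():
    print(f"{lower_edge:5.1f}s+ : {'#' * count}")

# Start with a large X and lower it until training fits in GPU memory.
filtered = filter_by_max_duration(example_manifest, max_seconds=16.7)
print(f"kept {len(filtered)} of {len(example_manifest)} files")
```

Lowering `max_seconds` step by step and re-running training is the manual search for X that the comment suggests.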

okuchaiev commented on May 9, 2024

As @RobertInjac mentioned above, GPU memory usage depends a lot on the maximum length of the audio during training. For public datasets such as LibriSpeech and Mozilla Common Voice, we cap it at 16.7 seconds during training. So one option is to cut your audio files into smaller pieces (but don't go too small - you still want several words per audio sample).
Another option is to reduce the batch size per GPU. Note that you can still simulate a larger batch size by setting the batches_per_step parameter to more than 1 (see https://nvidia.github.io/NeMo/api-docs/nemo.html#module-nemo.core.neural_factory). This may also help with GPU utilization during multi-GPU/multi-node training.
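
To illustrate why several small batches can stand in for one large batch, here is a toy sketch of gradient accumulation in plain Python. It is not NeMo's implementation; the one-parameter model, the data, and the function names are assumptions chosen only to show that averaging the gradients of equally sized small batches gives the same gradient as one large batch.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, batch_size, batches_per_step):
    """Average the gradients of `batches_per_step` consecutive small batches."""
    total = 0.0
    for i in range(batches_per_step):
        start = i * batch_size
        end = start + batch_size
        total += grad_mse(w, xs[start:end], ys[start:end])
    return total / batches_per_step


# Toy data: four samples, processed either as one batch of 4 or two batches of 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 9.0]
w = 1.5

full_batch = grad_mse(w, xs, ys)
two_small_batches = accumulated_grad(w, xs, ys, batch_size=2, batches_per_step=2)
print(full_batch, two_small_batches)
```

Only one small batch has to fit in GPU memory at a time, while the optimizer sees a gradient for batch_size * batches_per_step samples. Note that batch normalization statistics are still computed per small batch, so accumulation does not make BN behave as if the batch were larger.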

Chen1399 commented on May 9, 2024

> What is the maximum duration of the audio files in your training set? Make a histogram of the durations and check for outliers (very long audio files). Try removing all audio files longer than X seconds (start with a large X, then lower it until the CUDA out of memory errors stop).

Thank you. The maximum length of the audio in my dataset is only 15 s.

Chen1399 commented on May 9, 2024

> As @RobertInjac mentioned above, GPU memory usage depends a lot on the maximum length of the audio during training. For public datasets such as LibriSpeech and Mozilla Common Voice, we cap it at 16.7 seconds during training. So one option is to cut your audio files into smaller pieces (but don't go too small - you still want several words per audio sample).
> Another option is to reduce the batch size per GPU. Note that you can still simulate a larger batch size by setting the batches_per_step parameter to more than 1 (see https://nvidia.github.io/NeMo/api-docs/nemo.html#module-nemo.core.neural_factory). This may also help with GPU utilization during multi-GPU/multi-node training.

Thank you. I solved the problem by enabling Apex O1 (mixed precision) and reducing my batch_size to 14. I will also try setting batches_per_step to 2.
