Comments (8)

adefossez avatar adefossez commented on July 24, 2024

Could you try to add some debug statements around this line: https://github.com/facebookresearch/denoiser/blob/master/denoiser/audio.py#L72

            try:
                out, sr = torchaudio.load(str(file), offset=offset, num_frames=num_frames)
            except Exception:
                print(file, examples, offset)
                raise

and then report the offending file? If you have ffmpeg installed, you can also run `ffprobe PATH_TO_FILE` on it so that I have more information to debug the issue.

from denoiser.

ntyoshi avatar ntyoshi commented on July 24, 2024

Hi @abhshkdz,

The code you gave me printed the following lines:

[2021-02-22 14:27:41,384][denoiser.solver][INFO] - Training...
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p228_301.wav 9 56000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p239_287.wav 12 72000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p259_464.wav 10 48000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p279_312.wav 14 88000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p278_021.wav 44 224000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav 10 72000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p259_006.wav 25 144000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p276_010.wav 13 96000
[2021-02-22 14:27:41,507][__main__][ERROR] - Some error happened
Traceback (most recent call last):
  File "train.py", line 104, in main
    _main(args)
  File "train.py", line 98, in _main
    run(args)
  File "train.py", line 79, in run
    solver.train()
  File "/data/home/ntyoshi/denoiser/denoiser/solver.py", line 137, in train
    train_loss = self._run_one_epoch(epoch)
  File "/data/home/ntyoshi/denoiser/denoiser/solver.py", line 200, in _run_one_epoch
    for i, data in enumerate(logprog):
  File "/data/home/ntyoshi/denoiser/denoiser/utils.py", line 126, in __next__
    value = next(self._iterator)
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/home/ntyoshi/denoiser/denoiser/data.py", line 96, in __getitem__
    return self.noisy_set[index], self.clean_set[index]
  File "/data/home/ntyoshi/denoiser/denoiser/audio.py", line 73, in __getitem__
    out, sr = torchaudio.load(str(file), offset=offset, num_frames=num_frames)
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torchaudio/__init__.py", line 85, in load
    filetype=filetype,
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torchaudio/_sox_backend.py", line 47, in load
    filetype
RuntimeError: Offset past EOF

/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p254_257.wav 17 80000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p259_016.wav 32 216000
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p279_283.wav 7 48000
[2021-02-22 14:27:41,566][denoiser.executor][ERROR] - Worker 0 died, killing all workers

When I ran ffprobe on one of the paths printed before the error occurred, I got this output:

$ ffprobe /data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p228_301.wav
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
  built with gcc 7.5.0 (crosstool-NG 1.24.0.131_87df0e6_dirty)
  configuration: --prefix=/home/ntyoshi/anaconda3/envs/denoiser --cc=/home/conda/feedstock_root/build_artifacts/ffmpeg_1602879523915/_build_env/bin/x86_64-conda-linux-gnu-cc --disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-gpl --enable-hardcoded-tables --enable-libfreetype --enable-libopenh264 --enable-libx264 --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, wav, from '/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p228_301.wav':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:00:02.78, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s

And for a path printed after the error:

$ ffprobe /data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p254_257.wav
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
  built with gcc 7.5.0 (crosstool-NG 1.24.0.131_87df0e6_dirty)
  configuration: --prefix=/home/ntyoshi/anaconda3/envs/denoiser --cc=/home/conda/feedstock_root/build_artifacts/ffmpeg_1602879523915/_build_env/bin/x86_64-conda-linux-gnu-cc --disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-gpl --enable-hardcoded-tables --enable-libfreetype --enable-libopenh264 --enable-libx264 --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, wav, from '/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p254_257.wav':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:00:04.14, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s

Sorry for the long comment!

adefossez avatar adefossez commented on July 24, 2024

Sorry, can you do the same while passing num_workers=0? This loads one file at a time in the main process, which avoids getting interleaved errors from parallel workers. Also replace this line:
https://github.com/facebookresearch/denoiser/blob/master/denoiser/audio.py#L63
with for (file, file_size), examples in zip(self.files, self.num_examples): and add file_size to the print call in the except block.

ntyoshi avatar ntyoshi commented on July 24, 2024

Modified audio.py:

...
        for (file, file_size), examples in zip(self.files, self.num_examples):
            if index >= examples:
                index -= examples
                continue
            num_frames = 0
            offset = 0
            if self.length is not None:
                offset = self.stride * index
                num_frames = self.length
            try:
                out, sr = torchaudio.load(str(file), offset=offset, num_frames=num_frames)
            except Exception:
                print(file, examples, offset, file_size)
                raise
...

conf/config.yml:

...
# Logging and printing, and does not impact training
num_prints: 5
device: cuda
num_workers: 0
verbose: 0
show: 0   # just show the model and its size and exit
...

I got these messages:

$ bash launch_valentini.sh
[2021-02-23 15:33:19,178][__main__][INFO] - For logs, checkpoints and samples check /data/workspace/ntyoshi/outputs/exp_bandmask=0.2,demucs.causal=1,demucs.hidden=48,demucs.resample=4,dset=valentini,remix=1,segment=4.5,shift=8000,shift_same=True,stft_loss=True,stride=0.5
[2021-02-23 15:33:19,708][denoiser.executor][INFO] - Starting 1 worker processes for DDP.
[2021-02-23 15:33:19,966][__main__][INFO] - For logs, checkpoints and samples check /data/workspace/ntyoshi/outputs/exp_bandmask=0.2,demucs.causal=1,demucs.hidden=48,demucs.resample=4,dset=valentini,remix=1,segment=4.5,shift=8000,shift_same=True,stft_loss=True,stride=0.5
[2021-02-23 15:33:23,088][denoiser.solver][INFO] - ----------------------------------------------------------------------
[2021-02-23 15:33:23,088][denoiser.solver][INFO] - Training...
/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav 10 72000 143813
[2021-02-23 15:33:23,108][__main__][ERROR] - Some error happened
Traceback (most recent call last):
  File "train.py", line 104, in main
    _main(args)
  File "train.py", line 98, in _main
    run(args)
  File "train.py", line 79, in run
    solver.train()
  File "/data/home/ntyoshi/denoiser/denoiser/solver.py", line 137, in train
    train_loss = self._run_one_epoch(epoch)
  File "/data/home/ntyoshi/denoiser/denoiser/solver.py", line 200, in _run_one_epoch
    for i, data in enumerate(logprog):
  File "/data/home/ntyoshi/denoiser/denoiser/utils.py", line 126, in __next__
    value = next(self._iterator)
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/home/ntyoshi/denoiser/denoiser/data.py", line 96, in __getitem__
    return self.noisy_set[index], self.clean_set[index]
  File "/data/home/ntyoshi/denoiser/denoiser/audio.py", line 74, in __getitem__
    out, sr = torchaudio.load(str(file), offset=offset, num_frames=num_frames)
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torchaudio/__init__.py", line 85, in load
    filetype=filetype,
  File "/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torchaudio/_sox_backend.py", line 47, in load
    filetype
RuntimeError: Offset past EOF

adefossez avatar adefossez commented on July 24, 2024

That is very weird. Can you check ffprobe /data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav, as well as

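import torchaudio

# Path of the file reported in the error (no leading space in the string).
file = '/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav'
siginfo, _ = torchaudio.info(file)
# Number of frames per channel, as seen by torchaudio right now.
length = siginfo.length // siginfo.channels
print(length)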

Maybe also try with a more recent version of torchaudio?

ntyoshi avatar ntyoshi commented on July 24, 2024

Thanks for your response!
Please see the results below.

ffprobe:

$ ffprobe /data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav
ffprobe version 4.3.1 Copyright (c) 2007-2020 the FFmpeg developers
  built with gcc 7.5.0 (crosstool-NG 1.24.0.131_87df0e6_dirty)
  configuration: --prefix=/home/ntyoshi/anaconda3/envs/denoiser --cc=/home/conda/feedstock_root/build_artifacts/ffmpeg_1602879523915/_build_env/bin/x86_64-conda-linux-gnu-cc --disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-gpl --enable-hardcoded-tables --enable-libfreetype --enable-libopenh264 --enable-libx264 --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, wav, from '/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:00:03.00, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s

  • Before updating torchaudio

torchaudio version:

$ pip show torchaudio
Name: torchaudio
Version: 0.5.1
Summary: An audio package for PyTorch
Home-page: https://github.com/pytorch/audio
Author: Soumith Chintala, David Pollack, Sean Naren, Peter Goldsborough
Author-email: [email protected]
License: UNKNOWN
Location: /data/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages
Requires: torch
Required-by: denoiser

python script result:

$ python
Python 3.7.9 (default, Aug 31 2020, 12:42:55) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchaudio
>>> file = '/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav'
>>> siginfo, _ = torchaudio.info(file)
>>> length = siginfo.length // siginfo.channels
>>> print(length)
47938

  • After updating torchaudio

torchaudio version:

$ pip show torchaudio
Name: torchaudio
Version: 0.7.0
Summary: An audio package for PyTorch
Home-page: https://github.com/pytorch/audio
Author: Soumith Chintala, David Pollack, Sean Naren, Peter Goldsborough
Author-email: [email protected]
License: UNKNOWN
Location: /data/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages
Requires: torch
Required-by: denoiser

python script result:

$ python
Python 3.7.9 (default, Aug 31 2020, 12:42:55) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchaudio
/home/ntyoshi/anaconda3/envs/denoiser/lib/python3.7/site-packages/torchaudio/backend/utils.py:54: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
  '"sox" backend is being deprecated. '
>>> file = '/data/workspace/ntyoshi/dataset/valentini/noisy_trainset_wav/p250_002.wav'
>>> siginfo, _ = torchaudio.info(file)
>>> length = siginfo.length // siginfo.channels
>>> print(length)
47938

I'm running this in a conda environment. Let me know if you need any other information.
Thanks again!

adefossez avatar adefossez commented on July 24, 2024

The file length measured here (47938) doesn't match what is stored in the JSON (143813).
The only explanation I can think of is that the file changed between when the list of files was computed and now. Could you try removing the clean.json and noisy.json files and regenerating them?
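
Editor's note: the numbers printed earlier in the thread (10 examples, offset 72000, stored length 143813) can be related to the measured length 47938 with a little arithmetic. The sketch below assumes a sample rate of 16000 Hz (from the ffprobe output) and segment=4.5, stride=0.5 seconds (from the experiment name in the log). The `num_examples` rule is an assumed padded-windowing formula, not copied from the repository; it is used here only because it reproduces the logged value of 10. The stored length is also almost exactly three times the measured one, which would be consistent with the files having been resampled from 48 kHz to 16 kHz after the JSON was written, though the thread does not confirm this.

```python
import math

SAMPLE_RATE = 16_000  # from the ffprobe output in the thread
SEGMENT_S = 4.5       # "segment=4.5" in the experiment name
STRIDE_S = 0.5        # "stride=0.5" in the experiment name

length = int(SEGMENT_S * SAMPLE_RATE)  # frames per training example
stride = int(STRIDE_S * SAMPLE_RATE)   # frames between example starts


def num_examples(file_length: int, length: int, stride: int) -> int:
    """Assumed rule: windows of `length` frames every `stride` frames,
    with the last window padded. Hypothetical, for illustration only."""
    if file_length < length:
        return 1
    return math.ceil((file_length - length) / stride) + 1


def last_offset(file_length: int, length: int, stride: int) -> int:
    """Start offset (in frames) of the last example of a file."""
    return stride * (num_examples(file_length, length, stride) - 1)


cached_length = 143_813  # length stored in the JSON
actual_length = 47_938   # length reported by torchaudio.info

offset = last_offset(cached_length, length, stride)
print(num_examples(cached_length, length, stride), offset)
# The offset derived from the stale length lies beyond the real end of file,
# which is what "Offset past EOF" reports.
print(offset > actual_length)
```

Under these assumptions, the stale length yields 10 examples with a last offset of 72000 frames, while the file on disk only has 47938 frames.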

ntyoshi avatar ntyoshi commented on July 24, 2024

Could you try removing the clean.json and noisy.json files and regenerating them?

It worked!
Thank you!
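
Editor's note: a stale file list like this one can be detected before training by comparing each stored length with the length of the file on disk. The following is a minimal sketch, assuming the JSON is a list of [path, length] pairs (as suggested by the `(file, file_size)` unpacking above) and that the files are PCM WAV readable by Python's standard `wave` module. The helper names `wav_num_frames` and `find_stale_entries` are hypothetical and not part of the denoiser repository.

```python
import json
import wave
from pathlib import Path


def wav_num_frames(path):
    """Return the number of frames per channel of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes()


def find_stale_entries(json_path):
    """Return (path, cached_length, actual_length) for every entry whose
    stored length differs from the length of the file on disk.

    Assumes the JSON holds a list of [path, length] pairs.
    """
    entries = json.loads(Path(json_path).read_text())
    stale = []
    for path, cached_length in entries:
        actual_length = wav_num_frames(path)
        if actual_length != cached_length:
            stale.append((path, cached_length, actual_length))
    return stale
```

Calling `find_stale_entries` on noisy.json and clean.json and getting a non-empty list would indicate that the JSON files should be regenerated.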
