so-vits-svc's People

Contributors

112292454, 2dipw, asdfw13, burtbai, cnchtu, erythrocyte3803, geraint-dou, huanlinoto, hxdnshx, innnky, kakaruhayate, limbang, magic-akari, misteo, miuzarte, mlbv, ms903x1, narusemioshirakana, njsgdd10086, quicksandznzn, ricecakey06, rvc-boss, sherkeyxd, stardust-minus, tps-f, umoufuton, xdedss, ylzz1997, zscharlie, zwa73

so-vits-svc's Issues

No response when resampling on Colab

Platform: Google Colab

Stage where the problem occurs: preprocessing / resampling to 44100 Hz

Python version: the default version used when installing dependencies in the Colab notebook

PyTorch version: 1.13.1+cu116

Branch: 4.0

Problem description: After extracting the dataset online into dataset_raw, running the next cell, "Resample to 44100hz", produces no response at all: no error log appears and the dataset folder is never created. Manually creating the dataset folder makes no difference.

Log screenshot:
(image)
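To surface whatever is failing silently, one can run a minimal version of the resampling step by hand and watch for exceptions. This is only a sketch: it assumes librosa and soundfile are installed and the project's usual layout of dataset_raw/<speaker>/*.wav resampled into dataset/44k.

import os
import librosa
import soundfile

src_root, dst_root = "dataset_raw", "dataset/44k"  # layout assumed from the project docs
for speaker in os.listdir(src_root):
    os.makedirs(os.path.join(dst_root, speaker), exist_ok=True)
    for name in os.listdir(os.path.join(src_root, speaker)):
        if not name.endswith(".wav"):
            continue
        wav, _ = librosa.load(os.path.join(src_root, speaker, name), sr=44100, mono=True)
        soundfile.write(os.path.join(dst_root, speaker, name), wav, 44100)
        print("resampled", speaker, name)  # any failure will raise here instead of hanging silently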

Comparison of content-extraction methods?

Hello, this project is very well done.
I see you used the 9th layer of HuBERT; what was the reasoning behind that choice? How do you weigh content-information loss against timbre leakage, and have you compared other layers, or an approach like Whisper?
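For context, here is a minimal sketch of pulling features from a specific HuBERT layer with fairseq, which is how such a layer comparison would presumably be run; it assumes the checkpoint_best_legacy_500.pt content encoder used by this repo and fairseq's HubertModel API, where output_layer selects the transformer layer.

import torch
from fairseq import checkpoint_utils

models, _, _ = checkpoint_utils.load_model_ensemble_and_task(["hubert/checkpoint_best_legacy_500.pt"])
hubert = models[0].eval()

wav = torch.randn(1, 16000)  # stand-in for 1 second of 16 kHz audio
with torch.no_grad():
    feats, _ = hubert.extract_features(source=wav, padding_mask=None, mask=False, output_layer=9)
print(feats.shape)  # (1, frames, hidden_dim); swap output_layer to compare layers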

Sharing a direct-download method for /hubert/checkpoint_best_legacy_500.pt from box.com

Asking for the Colab links

innnky's earlier Colab notebook no longer seems to work; training throws an error.
Has the project been moved here? Can the vits3.0 models still be used? I would like to ask for the vits3.0 and 4.0 Colab links.

validation loss

There are many types of losses. Given that a run can easily generate thousands of checkpoints, a validation loss would be particularly useful for deciding when to stop training. May I ask whether one is already there (and if so, which one)?
Screenshot from 2023-03-13 22-36-03
Thanks!

AttributeError: 'HParams' object has no attribute 'dataset_type'

Platform: Windows

Stage where the problem occurs: inference

Python version: 3.8

PyTorch version: 1.13.1+cu116

Branch: 4.0-v2

Dataset:

Authorization proof screenshot:

Problem description: Inference fails with AttributeError: 'HParams' object has no attribute 'dataset_type'; the 4.0 branch infers normally.

Log screenshot:

use_cuda, True
INFO:44k:{'log_interval': 200, 'eval_interval': 800, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 10240, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': True, 'max_speclen': 512, 'port': '8001', 'keep_ckpts': 10}
INFO:44k:{'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 44100, 'filter_length': 2048, 'hop_length': 512, 'win_length': 2048, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': 22050}
INFO:44k:{'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256, 'ssl_dim': 256, 'n_speakers': 200}
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
Traceback (most recent call last):
File "train.py", line 435, in
main()
File "train.py", line 57, in main
mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
File "D:\so-vits\so-vits-svc1\python38\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "D:\so-vits\so-vits-svc1\python38\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes while not context.join():
File "D:\so-vits\so-vits-svc1\python38\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "D:\so-vits\so-vits-svc1\python38\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
fn(i, *args)
File "D:\so-vits\so-vits-svc1\train.py", line 76, in run
dataset_constructor = DatasetConstructor(hps, num_replicas=n_gpus, rank=rank)
File "D:\so-vits\so-vits-svc1\data_utils.py", line 293, in init
self._get_components()
File "D:\so-vits\so-vits-svc1\data_utils.py", line 296, in _get_components
self._init_datasets()
File "D:\so-vits\so-vits-svc1\data_utils.py", line 301, in _init_datasets
self._train_dataset = self.dataset_function[self.hparams.data.dataset_type](self.hparams,
AttributeError: 'HParams' object has no attribute 'dataset_type'
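The trace dies on self.hparams.data.dataset_type, so the first thing worth checking is whether the config being loaded defines that key at all; a 4.0 config reused on the 4.0-v2 branch would not. A minimal sketch, assuming the key sits under the data section as the traceback implies:

import json

with open("configs/config.json") as f:
    cfg = json.load(f)
# None here means the config predates the 4.0-v2 schema and should be regenerated on this branch.
print(cfg.get("data", {}).get("dataset_type"))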

_pickle.UnpicklingError during training

Platform: CentOS 7.9
Stage where the problem occurs: training
Python version: 3.8.13
PyTorch version: 1.13.1+cu116
Branch: 4.0-v2
Dataset: my own voice
Authorization proof screenshot:

Problem description: Training fails shortly after starting and cannot continue; the error is below. Web searches suggest it is related to the torch version, but changing the torch version did not seem to help. It may be caused by torch.load().

Log:
Run with srun in a Slurm environment, on 4 GPUs.
Everything before this point was normal...
(Some directory names involve private information and have been replaced with asterisks.)

INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:44k:Train Epoch: 1 [0%]
INFO:44k:Losses: [4.583632469177246, 2.16941237449646, 11.800090789794922, 124.89070129394531, 616.9237060546875], step: 0, lr: 0.0002
/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/autograd/__init__.py:197: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed.  This is not an error, but may impair performance.
grad.sizes() = [32, 1, 4], strides() = [4, 1, 1]
bucket_view.sizes() = [32, 1, 4], strides() = [4, 4, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:325.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
INFO:44k:Saving model and optimizer state at iteration 1 to ./logs/44k/G_0.pth
INFO:44k:Saving model and optimizer state at iteration 1 to ./logs/44k/D_0.pth
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/autograd/__init__.py:197: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed.  This is not an error, but may impair performance.
grad.sizes() = [32, 1, 4], strides() = [4, 1, 1]
bucket_view.sizes() = [32, 1, 4], strides() = [4, 4, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:325.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
  File "train.py", line 310, in <module>
    main()
  File "train.py", line 51, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 2 terminated with the following error:
Traceback (most recent call last):
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/scratch/****/so-vits-svc/train.py", line 122, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler,
  File "/scratch/****/so-vits-svc/train.py", line 141, in train_and_evaluate
    for batch_idx, items in enumerate(train_loader):
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
_pickle.UnpicklingError: Caught UnpicklingError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/scratch/****/so-vits-svc/data_utils.py", line 88, in __getitem__
    return self.get_audio(self.audiopaths[index][0])
  File "/scratch/****/so-vits-svc/data_utils.py", line 51, in get_audio
    spec = torch.load(spec_filename)
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/****/project/conda_envs/sov/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

After that, it exits with code 1.
The same dataset was also tried on the Windows platform, where no error occurred.
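Since the failure happens inside torch.load(spec_filename), a likely culprit is a truncated or corrupted cached *.spec.pt file, for example one left behind by an interrupted preprocessing run. A sketch that scans for unreadable spec files, assuming the usual dataset/44k layout:

import pathlib
import torch

for spec in pathlib.Path("dataset/44k").rglob("*.spec.pt"):
    try:
        torch.load(spec, map_location="cpu")
    except Exception as err:
        print("unreadable:", spec, "->", err)
        # spec.unlink()  # uncomment to delete it so the training code regenerates it on the next run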

Pre-trained base models

Do the authors have a planned date for releasing pre-trained base models?
The base models I downloaded online (Hugging Face, etc.) have not worked especially well (for generating Japanese songs).
Alternatively, do you know where I can find good base models that support v4?

A question about the source-audio parameters in the WebUI

After training a model for roughly 2400 steps I tried it and it works. During conversion I found that some audio converts fine once saved as 44100 Hz 16-bit WAV, but one WAV source file, itself already 16-bit, fails with ValueError: Audio data cannot be converted to 16-bit int format. Even cutting a short segment out in Audition and re-saving it as 44100 Hz 16-bit gives the same error. Those are the only WAV parameters there are; what else about a source file could prevent it from being converted?
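That ValueError typically comes from soundfile being asked to write data it cannot represent as 16-bit integers, and odd container subtypes can trigger it even when the file claims to be 16-bit. One way to rule the container out is to decode the file and rewrite it explicitly as 16-bit PCM; a sketch assuming the soundfile package (file names are placeholders):

import soundfile as sf

data, sr = sf.read("input.wav", dtype="float32")         # decode whatever the container actually holds
sf.write("input_pcm16.wav", data, sr, subtype="PCM_16")  # rewrite as plain 16-bit PCM
print(sf.info("input_pcm16.wav"))                        # confirm the subtype before retrying the WebUI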

A more accurate English Model Introduction

I found the English description a bit hard to understand. The English version misses the crucial detail that you use VITS. The Simplified Chinese intro is easier to understand.

... to extract source audio speech features, and inputs them together with F0 to replace the original text input to achieve the effect of song conversion. ...

The following would be better in my humble opinion:

... to extract source audio speech features, and inputs them together with F0 into VITS instead of the original text input to achieve the effect of song conversion. ...

Thanks!

P.S. Sorry, there is no Simplified Chinese input method on this laptop, so I am writing in English.

Training cannot continue at the final step; Google turned up no solution, could someone please advise?

Platform: Windows

Stage where the problem occurs: training

Python version: Python 3.7.0

PyTorch version: 1.13.1

Branch: 4.0 e701955 Unlock the version of numpy

Problem description:
All the preceding commands succeeded without errors; only this final step fails. Running the training command (python train.py -c configs/config.json -m 44k) raises the following error:
assert torch.cuda.is_available(), "CPU training is not allowed."
AssertionError: CPU training is not allowed.

Log screenshot:
(venv) (base) PS E:\voice\so-vits-svc> python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.13.1+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 专业版
GCC version: (Rev10, Built by MSYS2 project) 12.2.0
Clang version: Could not collect
CMake version: version 3.26.0-rc5
Libc version: N/A

Python version: 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19041-SP0
Is CUDA available: False
CUDA runtime version: 11.7.64
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 2070 SUPER
Nvidia driver version: 517.40
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.20.0
[pip3] torch==1.13.1
[pip3] torch-tb-profiler==0.4.1
[pip3] torchaudio==0.13.1
[conda] blas 1.0 mkl https://repo.anaconda.com/pkgs/main
[conda] mkl 2021.4.0 haa95532_640 https://repo.anaconda.com/pkgs/main
[conda] mkl-service 2.4.0 py39h2bbff1b_0 https://repo.anaconda.com/pkgs/main
[conda] mkl_fft 1.3.1 py39h277e83a_0 https://repo.anaconda.com/pkgs/main
[conda] mkl_random 1.2.2 py39hf11a4ad_0 https://repo.anaconda.com/pkgs/main
[conda] numpy 1.21.5 py39h7a0a035_3 https://repo.anaconda.com/pkgs/main
[conda] numpy-base 1.21.5 py39hca35cd5_3 https://repo.anaconda.com/pkgs/main
[conda] numpydoc 1.4.0 py39haa95532_0 https://repo.anaconda.com/pkgs/main

How do I configure CUDA so that training can continue on the GPU?
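The collected environment already shows the cause: PyTorch version: 1.13.1+cpu is a CPU-only wheel, even though a CUDA 11.7 runtime and an RTX 2070 SUPER are present. A quick check, with the usual reinstall noted in comments (the index URL is PyTorch's standard one for CUDA 11.7 wheels; confirm against pytorch.org for your setup):

import torch

print(torch.__version__)          # a "+cpu" suffix means a CPU-only build
print(torch.cuda.is_available())  # must print True before train.py's assertion will pass
# If it prints False, reinstall a CUDA build from the shell, e.g.:
#   pip uninstall torch torchaudio
#   pip install torch==1.13.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117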

How can cross-gender inference be improved?

The training dataset I used is a female voice, trained to 9600 steps.

If the inference input is a female voice, the result is excellent! But if the inference input is a male voice, the result is noticeably worse. How should I improve this? My current idea is to train for more steps, to around 20000, and try again. If the result is still unsatisfactory, I will try converting the male voice to a female one by some other means first, and then run inference.

Are there any other suggestions?

Error at the dependency-installation step when running on Google Colab

Platform: Google Colab

Stage where the problem occurs: installing dependencies

Python version: Python 3.9.16

PyTorch version: 1.13.1+cu116

Branch: 4.0

Problem description: The dependency-installation step fails with the following error:

error: subprocess-exited-with-error

× Building wheel for fairseq (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for fairseq (pyproject.toml) ... error
ERROR: Failed building wheel for fairseq
Building wheel for antlr4-python3-runtime (setup.py) ... done
Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.8-py3-none-any.whl size=141231 sha256=6430468728cf59e967aae7f0619d2ccc82bc6e6626ac52048d1b2fee50c31878
Stored in directory: /root/.cache/pip/wheels/42/3c/ae/14db087e6018de74810afe32eb6ac890ef9c68ba19b00db97a
Successfully built pyworld antlr4-python3-runtime
Failed to build fairseq
ERROR: Could not build wheels for fairseq, which is required to install pyproject.toml-based projects

Log screenshot:
(image)
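Not a definitive diagnosis, but a failure to build a fairseq wheel on Colab usually means pip fell back to compiling from source and the build environment was lacking; a commonly tried first step is refreshing the build tooling before reinstalling, e.g. pip install --upgrade pip setuptools wheel, and then rerunning the dependency cell. The actual compiler output sits above the quoted summary, and that is where the real cause will be.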

KeyError: "param 'initial_lr' is not specified in param_groups[0] when resuming an optimizer"

I'm trying to fine-tune 4.0-v2 using this checkpoint I found: https://huggingface.co/cr941131/sovits-4.0-v2-hubert/tree/main
(not sure whether it is any good)
But when I try to start training, this error happens:

Traceback (most recent call last):
  File "/home/manjaro/.conda/envs/soft-vc/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/media/manjaro/NVME_2tb/NeuralNetworks/so-vits-svc-v2-44100/train.py", line 112, in run
    scheduler_g = torch.optim.lr_scheduler.ExponentialLR(optim_g, gamma=hps.train.lr_decay, last_epoch=epoch_str - 2)
  File "/home/manjaro/.conda/envs/soft-vc/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 583, in __init__
    super(ExponentialLR, self).__init__(optimizer, last_epoch, verbose)
  File "/home/manjaro/.conda/envs/soft-vc/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 42, in __init__
    raise KeyError("param 'initial_lr' is not specified "
KeyError: "param 'initial_lr' is not specified in param_groups[0] when resuming an optimizer"

Where can I find official checkpoints if that one is bad?
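For what it's worth, this KeyError is raised by PyTorch's ExponentialLR whenever it is constructed with last_epoch != -1 (i.e. resuming) on an optimizer whose param groups lack initial_lr, which happens when the optimizer state from the checkpoint was not restored. A sketch of the commonly suggested workaround, seeding the field before the scheduler is built (names follow the train.py shown in the trace and are assumptions):

# inside run(), just before the ExponentialLR construction:
for group in optim_g.param_groups:
    group.setdefault("initial_lr", hps.train.learning_rate)
scheduler_g = torch.optim.lr_scheduler.ExponentialLR(
    optim_g, gamma=hps.train.lr_decay, last_epoch=epoch_str - 2)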

Questions regarding significant reorganization of the project

https://github.com/34j/so-vits-svc-fork

I forked so-vits-svc 4.0 v1, did some major refactoring, and added some (external) features:

  • realtime voice conversion
  • unified CLI
  • GUI for inference
  • automatic download of pretrained models
  • pre-commit to format code
  • upload to PyPI using CI

I am considering sending a PR based on the repository above. However, if that would be a problem for the svc-develop-team (for example, trouble using git, GitHub CI, or pre-commit, objections to removing Chinese from the code, or simply too much hassle), I will stop. What do you think about such a refactoring?

If you would rather reject this, I would appreciate it if you could link to my project instead. Thank you.

Requesting expert help with Colab

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.9.0 requires jedi>=0.10, which is not installed.
cvxpy 1.2.3 requires setuptools<=64.0.2, but you have setuptools 67.6.0 which is incompatible.

This message appears during "Clone repository and install requirements"; do I need to worry about it?
I am using the 4.0 version of the Colab notebook.

Training error on version 4.0

Training on version 4.0 fails with the following error:
INFO:44k:{'train': {'log_interval': 200, 'eval_interval': 800, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 10240, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': True, 'max_speclen': 512, 'port': '8001', 'keep_ckpts': 3}, 'data': {'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 44100, 'filter_length': 2048, 'hop_length': 512, 'win_length': 2048, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': 22050}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256, 'ssl_dim': 256, 'n_speakers': 200}, 'spk': {'miaopeng': 0}, 'model_dir': './logs\44k'}
WARNING:44k:D:\Desktop\so-vits-svc-4.0 is not a git repository, therefore hash value comparison will be ignored.
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
./logs\44k\G_0.pth
error, emb_g.weight is not in the checkpoint
INFO:44k:emb_g.weight is not in the checkpoint
load
INFO:44k:Loaded checkpoint './logs\44k\G_0.pth' (iteration 1)
./logs\44k\D_0.pth
load
INFO:44k:Loaded checkpoint './logs\44k\D_0.pth' (iteration 1)
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration.
C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\autograd\__init__.py:197: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [32, 1, 4], strides() = [4, 1, 1]
bucket_view.sizes() = [32, 1, 4], strides() = [4, 4, 1] (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\reducer.cpp:339.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
File "D:\Desktop\so-vits-svc-4.0\train.py", line 310, in
main()
File "D:\Desktop\so-vits-svc-4.0\train.py", line 51, in main
mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes
while not context.join():
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in wrap
fn(i, *args)
File "D:\Desktop\so-vits-svc-4.0\train.py", line 119, in run
train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler,
File "D:\Desktop\so-vits-svc-4.0\train.py", line 202, in train_and_evaluate
scaler.step(optim_g)
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 313, in step
return optimizer.step(*args, **kwargs)
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\optim\lr_scheduler.py", line 68, in wrapper
return wrapped(*args, **kwargs)
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\optim\optimizer.py", line 140, in wrapper
out = func(*args, **kwargs)
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\optim\adamw.py", line 162, in step
adamw(params_with_grad,
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\optim\adamw.py", line 219, in adamw
func(params,
File "C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\optim\adamw.py", line 273, in single_tensor_adamw
exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
RuntimeError: output with shape [1, 256] doesn't match the broadcast shape [200, 256]

Here is my package information:
Python 3.9.16
Name: torch
Version: 1.13.1+cu117
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: c:\programdata\miniconda3\envs\so4.0\lib\site-packages
Requires: typing-extensions
Required-by: fairseq, torchaudio, torchvision

ChatGPT answered as follows:
Based on the information you provided, the following error message appeared during training:

C:\ProgramData\miniconda3\envs\so4.0\lib\site-packages\torch\autograd_init_.py:197: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [32, 1, 4], strides() = [4, 1, 1]
bucket_view.sizes() = [32, 1, 4], strides() = [4, 4, 1] (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\reducer.cpp:339.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
...
ProcessRaisedException:

-- Process 0 terminated with the following error:
Trace...
This error usually appears when using PyTorch's distributed data parallel (DDP) training. It indicates that a parameter's gradient and its gradient bucket have mismatched strides, which may hurt training performance.

The solution may vary from case to case, but the following steps may help:

Upgrade PyTorch: if you are using an older version of PyTorch, try upgrading to the latest release to see whether the problem has already been fixed.

Adjust the batch size: try different batch sizes to see whether the problem disappears.

Change the distributed data-parallel configuration: if you used any special DDP settings, try changing them and see whether the problem goes away.

Check the gradient shapes: use print statements or a debugger to verify that the gradients have the shapes you expect, to see whether the problem arises during gradient computation.
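Note that the real failure is the final RuntimeError: output with shape [1, 256] doesn't match the broadcast shape [200, 256], raised inside the AdamW update; together with the earlier emb_g.weight is not in the checkpoint line, it points at a speaker-embedding size mismatch between the loaded G_0.pth/D_0.pth (including their optimizer state) and the model the config builds with n_speakers: 200, or vice versa. A sketch for inspecting what the checkpoint actually holds (key names taken from the log above; the path is assumed):

import torch

ckpt = torch.load("logs/44k/G_0.pth", map_location="cpu")
emb = ckpt["model"].get("emb_g.weight")
print(None if emb is None else tuple(emb.shape))  # compare against [200, 256] implied by the config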

Dependency installation fails on an M1 Mac

Because numpy==1.20.3 does not yet support M1 Macs, installation fails with ERROR: Failed building wheel for numpy.
The numpy repository shows this is fixed from version 1.21.4 onward, which also resolves some x86_64 and pipenv installation issues:
numpy/numpy#17784 (comment)

May I ask which package's dependency pins numpy to 1.20, and could the project's numpy version please be upgraded?

A pile of "xxx is not in the checkpoint" errors after training starts

I get the same errors both locally and on Colab, and the environments are fine. After 10000 training steps, the inferred voice is nothing but noise.
Is it a problem with G_0 and D_0? The log shows both were loaded.
Could someone please take a look? Thanks!
INFO:44k:{'train': {'log_interval': 200, 'eval_interval': 800, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 10240, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': True, 'max_speclen': 512, 'port': '8001', 'keep_ckpts': 3}, 'data': {'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 44100, 'filter_length': 2048, 'hop_length': 512, 'win_length': 2048, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': 22050}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256, 'ssl_dim': 256, 'n_speakers': 200}, 'spk': {'owen': 0}, 'model_dir': './logs/44k'}
2023-03-13 13:35:44.410924: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
DEBUG:tensorflow:Falling back to TensorFlow client; we recommended you install the Cloud TPU client directly with pip install cloud-tpu-client.
2023-03-13 13:35:45.367774: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-03-13 13:35:45.367898: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-03-13 13:35:45.367920: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
DEBUG:h5py.conv:Creating converter from 7 to 5
DEBUG:h5py.conv:Creating converter from 5 to 7
DEBUG:h5py.conv:Creating converter from 7 to 5
DEBUG:h5py.conv:Creating converter from 5 to 7
DEBUG:jaxlib.mlir.mlir_libs:Initializing MLIR with module: site_initialize_0
DEBUG:jaxlib.mlir.mlir_libs:Registering dialects from initializer <module 'jaxlib.mlir.mlir_libs.site_initialize_0' from '/usr/local/lib/python3.9/dist-packages/jaxlib/mlir/mlir_libs/site_initialize_0.so'>
DEBUG:jax.src.path:etils.epath found. Using etils.epath for file I/O.
INFO:numexpr.utils:NumExpr defaulting to 2 threads.
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
./logs/44k/G_0.pth
error, emb_g.weight is not in the checkpoint
INFO:44k:emb_g.weight is not in the checkpoint
error, pre.weight is not in the checkpoint
INFO:44k:pre.weight is not in the checkpoint
error, pre.bias is not in the checkpoint
INFO:44k:pre.bias is not in the checkpoint
error, enc_p.proj.weight is not in the checkpoint
INFO:44k:enc_p.proj.weight is not in the checkpoint
error, enc_p.proj.bias is not in the checkpoint
INFO:44k:enc_p.proj.bias is not in the checkpoint
error, enc_p.f0_emb.weight is not in the checkpoint
INFO:44k:enc_p.f0_emb.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.0.emb_rel_k is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.emb_rel_k is not in the checkpoint
error, enc_p.enc_.attn_layers.0.emb_rel_v is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.emb_rel_v is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_q.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_q.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_q.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_q.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_k.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_k.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_k.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_k.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_v.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_v.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_v.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_v.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_o.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_o.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.0.conv_o.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.0.conv_o.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.1.emb_rel_k is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.emb_rel_k is not in the checkpoint
error, enc_p.enc_.attn_layers.1.emb_rel_v is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.emb_rel_v is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_q.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_q.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_q.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_q.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_k.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_k.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_k.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_k.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_v.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_v.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_v.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_v.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_o.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_o.weight is not in the checkpoint
error, enc_p.enc_.attn_layers.1.conv_o.bias is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.1.conv_o.bias is not in the checkpoint
error, enc_p.enc_.attn_layers.2.emb_rel_k is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.2.emb_rel_k is not in the checkpoint
error, enc_p.enc_.attn_layers.2.emb_rel_v is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.2.emb_rel_v is not in the checkpoint
error, enc_p.enc_.attn_layers.2.conv_q.weight is not in the checkpoint
INFO:44k:enc_p.enc_.attn_layers.2.conv_q.weight is not in the checkpoint
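These messages mean the loaded G_0.pth/D_0.pth do not contain the weights the current branch's model expects, so most of the generator starts from random initialization, which is consistent with noise-only output after 10000 steps. A quick way to see how far apart checkpoint and code are is to list the checkpoint's keys (the "model" key follows the load log above; the path is assumed):

import torch

saved = torch.load("logs/44k/G_0.pth", map_location="cpu")["model"]
print(len(saved), "tensors in the checkpoint")
print(sorted(saved)[:10])  # compare these names against the ones reported missing above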

Colab inference not working at all

Whenever I run the final inference step, it fails with this:

Traceback (most recent call last):
  File "/content/so-vits-svc/inference_main.py", line 101, in <module>
    main()
  File "/content/so-vits-svc/inference_main.py", line 47, in main
    svc_model = Svc(args.model_path, args.config_path, args.device, args.cluster_model_path)
  File "/content/so-vits-svc/inference/infer_tool.py", line 127, in __init__
    self.load_model()
  File "/content/so-vits-svc/inference/infer_tool.py", line 138, in load_model
    self.hps_ms.data.filter_length // 2 + 1,
AttributeError: 'HParams' object has no attribute 'filter_length'

Any fix?
image_2023-03-14_212606368

Not saving checkpoints

I am running the Colab 4.0 notebook and everything works very well, but when I ran the actual training step I noticed that it is not saving any checkpoints. It saves once at the beginning, but then never again. I have now run it for over 80 epochs with absolutely no updates.

如何正确提issues (How to properly raise issues)

How to properly raise issues

  1. Before asking, try to solve the problem yourself first, with the help of search engines (Google/Bing, etc.). Open an issue only if you really cannot solve it on your own, and before doing so, please read "How To Ask Questions The Smart Way" carefully;
  2. When asking, you must provide the following information so the problem can be located: system platform, the stage where the problem occurs, Python version, torch version, branch used, dataset used, screenshot of authorization proof, problem description, and complete log screenshots;
  3. Be friendly when asking.

Which issues will be closed

  1. Asking others to do everything for you;
  2. Anything related to one-click packages/bundled environments;
  3. Incomplete information;
  4. Training on an unauthorized dataset (game/anime characters are not counted in this category for now, but still train with care; if the rights holder can be reached, you must contact them first and clarify the situation).

Reference format (you can copy it directly)

Platform: fill in the platform you are using, e.g. Windows

Stage where the problem occurs: installing dependencies / inference / training / preprocessing / other

Python version: fill in the Python version you are using, obtainable with python -V

PyTorch version: fill in the PyTorch version you are using, obtainable with pip show torch

Branch: fill in the code branch you are using

Dataset: fill in the source of the dataset you trained on; if you are only doing inference, this may be left blank

Authorization proof screenshot:
add the authorization proof screenshot here; if the dataset is your own voice, or is a game/anime character, or you have no training needs, this may be left blank

Problem description: describe your problem here, in as much detail as possible

Log screenshot:
add complete log screenshots here to help locate the problem

ValueError: numpy.ndarray has the wrong size, try recompiling. Expected 88, got 96

Great work! This error occurred while extracting the HuBERT and f0 features. Is there anything wrong with how hubert/checkpoint_best_legacy_500.pt is used? It was downloaded from http://obs.cstcloud.cn/share/obs/sankagenkeshi/checkpoint_best_legacy_500.pt
BTW, I am on the 4.0 branch with Python 3.8. Thanks for your information.

Traceback (most recent call last):
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/data/sdb/mike/repo/so-vits-svc/preprocess_hubert_f0.py", line 44, in process_batch
process_one(filename, hmodel)
File "/data/sdb/mike/repo/so-vits-svc/preprocess_hubert_f0.py", line 34, in process_one
f0 = utils.compute_f0_dio(wav, sampling_rate=sampling_rate, hop_length=hop_length)
File "/data/sdb/mike/repo/so-vits-svc/utils.py", line 156, in compute_f0_dio
import pyworld
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/pyworld/__init__.py", line 7, in <module>
from .pyworld import *
File "__init__.pxd", line 199, in init pyworld.pyworld
ValueError: numpy.ndarray has the wrong size, try recompiling. Expected 88, got 96
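For what it's worth, this particular ValueError is the classic symptom of a binary ABI mismatch: the installed pyworld was compiled against a different numpy than the one now present, so it is unrelated to the HuBERT checkpoint itself. The usual suggestion is to rebuild pyworld against the current numpy, e.g. pip install --no-cache-dir --force-reinstall pyworld (standard pip flags; adjust to the environment).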

Many-speaker voice conversion task

Hi

If I use your model for a voice-conversion task with around 100 speakers, would its performance be better than that of FreeVC?
And could I get a link to the checkpoint repository?

Training is too slow

2023-03-19 17:26:01,202	44k	INFO	{'train': {'log_interval': 200, 'eval_interval': 800, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 10240, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': True, 'max_speclen': 512, 'port': '8001', 'keep_ckpts': 10}, 'data': {'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 44100, 'filter_length': 2048, 'hop_length': 512, 'win_length': 2048, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': 22050}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256, 'ssl_dim': 256, 'n_speakers': 200}, 'spk': {'miria': 0}, 'model_dir': './logs\\44k'}
2023-03-19 17:26:22,608	44k	INFO	Train Epoch: 1 [0%]
2023-03-19 17:26:22,608	44k	INFO	Losses: [5.9978346824646, 5.234327793121338, 0.53354412317276, 109.33016204833984, 392.0679626464844], step: 0, lr: 0.0001
2023-03-19 17:26:27,268	44k	INFO	Saving model and optimizer state at iteration 1 to ./logs\44k\G_0.pth
2023-03-19 17:26:28,176	44k	INFO	Saving model and optimizer state at iteration 1 to ./logs\44k\D_0.pth
2023-03-19 17:28:11,282	44k	INFO	====> Epoch: 1, cost 130.08 s
2023-03-19 17:29:18,080	44k	INFO	Train Epoch: 2 [61%]
2023-03-19 17:29:18,080	44k	INFO	Losses: [2.2427873611450195, 2.418828010559082, 4.7479071617126465, 51.7297477722168, 3.8419177532196045], step: 200, lr: 9.99875e-05
2023-03-19 17:29:54,746	44k	INFO	====> Epoch: 2, cost 103.46 s
2023-03-19 17:31:38,868	44k	INFO	====> Epoch: 3, cost 104.12 s
2023-03-19 17:32:10,387	44k	INFO	Train Epoch: 4 [23%]
2023-03-19 17:32:10,387	44k	INFO	Losses: [1.7529761791229248, 3.1881754398345947, 5.160671234130859, 43.579063415527344, 2.4448511600494385], step: 400, lr: 9.996250468730469e-05
2023-03-19 17:33:23,387	44k	INFO	====> Epoch: 4, cost 104.52 s
2023-03-19 17:34:53,327	44k	INFO	Train Epoch: 5 [84%]
2023-03-19 17:34:53,327	44k	INFO	Losses: [2.7421751022338867, 1.9685697555541992, 2.512032985687256, 37.84978103637695, 1.9366132020950317], step: 600, lr: 9.995000937421877e-05
2023-03-19 17:35:08,414	44k	INFO	====> Epoch: 5, cost 105.03 s
2023-03-19 17:36:53,399	44k	INFO	====> Epoch: 6, cost 104.99 s

(image)

Sorry, I could not find a CUDA utilization option under the Video Encode item; the only change made in config.json was the number of checkpoints to keep.

The training set has 747 clips. Training is currently very slow; people using a GPU seem to get a few seconds per step, so I would like to know how to speed training up.
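From the log, each epoch takes roughly 104 s at batch_size 6, which usually means either a CPU/disk bottleneck or a GPU running far below capacity. Things commonly checked, all environment-dependent suggestions rather than guarantees: whether the training process shows up under the CUDA or Compute graph in Task Manager (the Video Encode graph will not show compute work), or under nvidia-smi on the command line; whether train.batch_size in configs/config.json can be raised within VRAM limits; and whether fp16_run can be enabled.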

High-note distortion

After a period of training on conversational speech, my model has reached very good quality in the normal vocal range, but high notes are severely distorted. I also have recordings of the character singing, but after stripping the accompaniment and harmonies with Ultimate Vocal Remover's demucs and karaoke-uvr models, the audio quality is poor, nowhere near the speech training set, and training on it produced severe buzzing and all sorts of other problems.
Do you have any suggestions?

EOFError: Ran out of input

Hi, the error occurred when extracting the spectrogram: it is caused by loading an empty *.spec.pt file, which interrupts the training process.

Env: Ubuntu / Python 3.8 / branch 4.0
Stage: training. The HuBERT and f0 features were extracted before the training stage.

Traceback (most recent call last):
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File "/data/sdb/mike/repo/so-vits-svc/train.py", line 122, in run
train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler,
File "/data/sdb/mike/repo/so-vits-svc/train.py", line 141, in train_and_evaluate
for batch_idx, items in enumerate(train_loader):
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in next
data = self._next_data()
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1313, in _next_data
return self._process_data(data)
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/_utils.py", line 543, in reraise
raise exception
EOFError: Caught EOFError in DataLoader worker process 3.
Original Traceback (most recent call last):
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/sdb/mike/repo/so-vits-svc/data_utils.py", line 90, in getitem
return self.get_audio(self.audiopaths[index][0])
File "/data/sdb/mike/repo/so-vits-svc/data_utils.py", line 53, in get_audio
spec = torch.load(spec_filename)
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/serialization.py", line 795, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/mike/anaconda3/envs/sovits/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input

Training stopped at epoch 10000, but inference has severe noise

Training stops once it reaches 10000 epochs; inference still produces severe noise, though the speech can now be faintly heard.

Should I raise "epochs": 10000 in the config file so it keeps training? Or might one of the steps have gone wrong?

I did not use the pre-trained model files G_0.pth and D_0.pth; could that be related?

What causes this error during ONNX export?

After running the export, the output below appears and the CPU does work briefly, but the resulting ONNX model is only 114 MB (the pth is 517 MB), and it cannot be loaded in MoeSS.
Could someone advise how to solve this? Many thanks!
Also, my model was trained in the cloud; after downloading it locally, inference works but the ONNX conversion fails. Suspecting an environment mismatch, I also tried the conversion in the cloud, and it failed there too.

load
2023-03-20 22:40:07 | INFO | root | Loaded checkpoint 'checkpoints/tomovoice/model.pth' (iteration 204)
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:2020: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input c
warnings.warn(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:2020: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input f0
warnings.warn(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:2020: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input mel2ph
warnings.warn(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:2020: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input uv
warnings.warn(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:2020: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input noise
warnings.warn(
E:\AI-barbara.v4.0\utils.py:178: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert f0_coarse.max() <= 255 and f0_coarse.min() >= 1, (f0_coarse.max(), f0_coarse.min())
E:\AI-barbara.v4.0\modules\attentions.py:203: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert t_s == t_t, "Relative attention is only available for self-attention."
E:\AI-barbara.v4.0\modules\attentions.py:248: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
pad_length = max(length - (self.window_size + 1), 0)
E:\AI-barbara.v4.0\modules\attentions.py:249: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
slice_start_position = max((self.window_size + 1) - length, 0)
E:\AI-barbara.v4.0\modules\attentions.py:251: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_length > 0:
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\_internal\jit_utils.py:258: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\constant_fold.cpp:181.)
_C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\_internal\jit_utils.py:258: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:1888.)
_C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:687: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:1888.)
_C._jit_pass_onnx_graph_shape_type_inference(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:687: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\constant_fold.cpp:181.)
_C._jit_pass_onnx_graph_shape_type_inference(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:1178: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:1888.)
_C._jit_pass_onnx_graph_shape_type_inference(
C:\Users\TAKATSUKI\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\onnx\utils.py:1178: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\constant_fold.cpp:181.)
_C._jit_pass_onnx_graph_shape_type_inference(
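Before blaming MoeSS, one cheap sanity check is validating the exported file with the onnx package, which reports truncated or structurally invalid graphs (a sketch; the file name is a placeholder). Note also that the size drop alone proves little: the .pth stores generator, discriminator, and optimizer state, while the export contains only the inference graph's weights.

import onnx

model = onnx.load("model.onnx")   # placeholder path to the exported file
onnx.checker.check_model(model)   # raises if the graph is malformed or truncated
print("opset:", model.opset_import[0].version)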

Questions about multiple characters, voice material, and training counts

I would appreciate answers to the following questions:
I have prepared dry vocals for 3 characters, roughly 10,000+ clips per character, each 2 to 13 seconds long.
batch_size: 4, learning_rate: 0.0001, 3060 Laptop (6 GB), 24 GB RAM.
1. Is it better to train the characters together, or to train each character separately?
2. Is it necessary to train with all 10,000+ clips per character? Is more data better, or are more training steps better?
3. Epoch: per cost 236.25 s. If I train the 3 characters together and give it enough time, will all 30,000+ clips be run through once per epoch?
4. If I start with only a small amount of dry-vocal data, can I add more later and continue training? What is the safe way to do that?
5. If I start with single-character training, is it true that characters cannot be added later, and the only option is to train a separate model?

Thanks a lot!

When does training end?

Watching the training process, it seems to save once it reaches a certain point; is there a way to make it save manually at any time?
I am using the 4.0 Colab notebook.
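For reference, the logs elsewhere on this page show checkpoints being written on the eval_interval boundary ('eval_interval': 800), so lowering train.eval_interval in configs/config.json should make saves happen more often; as far as these logs show there is no built-in save-on-demand command, so interrupting between intervals loses progress since the last save.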

cuda out of memory

File "D:\so-vits-svc\inference_main.py", line 51, in
out_audio, out_sr = svc_model.infer(spk, tran, raw_path)
File "D:\so-vits-svc\inference\infer_tool.py", line 224, in infer
audio = self.net_g_ms.infer(x_tst, f0=f0, g=sid)[0,0].data.float()
File "D:\so-vits-svc\models.py", line 346, in infer
z_p, m_p, logs_p, c_mask = self.enc_p_(c, c_lengths, f0=f0_to_coarse(f0))
File "C:\Users\userAppData\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1194, in call_impl
return forward_call(*input, **kwargs)
File "D:\so-vits-svc\models.py", line 119, in forward
x = self.enc
(x * x_mask, x_mask)
File "C:\Users\userAppData\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\so-vits-svc\attentions.py", line 39, in forward
y = self.attn_layers[i](x, x, attn_mask)
File "C:\Users\userAppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\so-vits-svc\attentions.py", line 143, in forward
x, self.attn = self.attention(q, k, v, mask=attn_mask)
File "D:\so-vits-svc\attentions.py", line 160, in attention
scores_local = self._relative_position_to_absolute_position(rel_logits)
File "D:\so-vits-svc\attentions.py", line 221, in _relative_position_to_absolute_position
x = F.pad(x, commons.convert_pad_shape([[0,0],[0,0],[0,0],[0,1]]))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.39 GiB (GPU 0; 6.00 GiB total capacity; 3.23 GiB already allocated; 53.94 MiB free; 3.94 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
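The allocator hint at the end of the message is actionable: PYTORCH_CUDA_ALLOC_CONF is a real PyTorch environment variable, though the value below is only an example rather than a tuned setting. On a 6 GiB card, splitting the source audio into shorter clips before inference remains the more reliable fix, since attention memory grows quickly with input length.

import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:64"  # example value; must be set before CUDA initializes
import torch  # import torch only after setting the variable so the allocator picks it up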

Large amounts of unnecessary DEBUG output during training

Hello!

After pulling the latest repository, I see large amounts of DEBUG output during model training. Has a recent update turned on a DEBUG switch somewhere? I would suggest adding args-based verbosity control in train.py to keep unnecessary output from polluting the logs.

Many thanks!
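The DEBUG lines visible elsewhere on this page (h5py.conv, jaxlib, numexpr, tensorflow, etc.) come from third-party libraries inheriting a root logger left at DEBUG level, not from so-vits-svc's own logger. A sketch of quieting them near the top of train.py (logger names taken from the logs above; extend the tuple as needed):

import logging

logging.getLogger().setLevel(logging.INFO)  # stop third-party loggers from inheriting DEBUG
for name in ("h5py", "jaxlib", "numexpr", "tensorflow"):
    logging.getLogger(name).setLevel(logging.WARNING)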
