
aicovergen's People

Contributors

akirary, mrtimoxayt, sociallyineptweeb


aicovergen's Issues

UI produces error when generating despite functioning in terminal

The UI produces an error when clicking generate, despite the generation progressing normally within the terminal. Clicking generate again after the generation finishes in the terminal will then show the output correctly.

Edit: More precisely, in my case it seems you have to click the generate button repeatedly for the generation to finish, because the request times out after a second or two.

Issues on cpu "ValueError"

ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory

Traceback (most recent call last):
  File "/run/media/saminur/none/Coding Bug/Python/pi/RVC/AICoverGen/src/webui.py", line 10, in <module>
    from main import song_cover_pipeline
  File "/run/media/saminur/none/Coding Bug/Python/pi/RVC/AICoverGen/src/main.py", line 16, in <module>
    from rvc import Config, load_hubert, get_vc, rvc_infer
  File "/run/media/saminur/none/Coding Bug/Python/pi/RVC/AICoverGen/src/rvc.py", line 5, in <module>
    from fairseq import checkpoint_utils
  File "/run/media/saminur/none/Coding Bug/Python/lib/python3.11/site-packages/fairseq/__init__.py", line 20, in <module>
    from fairseq.distributed import utils as distributed_utils
  File "/run/media/saminur/none/Coding Bug/Python/lib/python3.11/site-packages/fairseq/distributed/__init__.py", line 7, in <module>
    from .fully_sharded_data_parallel import (
  File "/run/media/saminur/none/Coding Bug/Python/lib/python3.11/site-packages/fairseq/distributed/fully_sharded_data_parallel.py", line 10, in <module>
    from fairseq.dataclass.configs import DistributedTrainingConfig
  File "/run/media/saminur/none/Coding Bug/Python/lib/python3.11/site-packages/fairseq/dataclass/__init__.py", line 6, in <module>
    from .configs import FairseqDataclass
  File "/run/media/saminur/none/Coding Bug/Python/lib/python3.11/site-packages/fairseq/dataclass/configs.py", line 1104, in <module>
    @dataclass
     ^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 1223, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 1213, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory
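
The message comes from the standard library rather than from AICoverGen: starting with Python 3.11, `dataclasses` rejects any unhashable object as a field default, and a regular dataclass instance is unhashable. The sketch below reproduces the pattern with a hypothetical `CommonConfig` stand-in (not fairseq's real class) and shows the `default_factory` form that the error message asks for.

```python
from dataclasses import dataclass, field


@dataclass
class CommonConfig:
    """Hypothetical stand-in for fairseq's CommonConfig."""
    seed: int = 1


def define_broken_config():
    """Define a dataclass whose default is a shared CommonConfig instance.

    On Python 3.11+ this raises ValueError at class-definition time,
    because a plain dataclass instance is unhashable.
    """
    @dataclass
    class BrokenConfig:
        common: CommonConfig = CommonConfig()
    return BrokenConfig


@dataclass
class FixedConfig:
    # default_factory builds a fresh CommonConfig for every instance.
    common: CommonConfig = field(default_factory=CommonConfig)
```

Since the offending class lives inside fairseq, the practical route is probably to run the project on an older interpreter such as Python 3.9 or 3.10, which other reports on this page use, rather than patching the installed package by hand.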

Running on CPU instead of GPU

I have a 3060 6GB GPU on a laptop. When I run the webui, only the model is loaded onto the GPU, while all the processing is done by the CPU and RAM. Is this not optimized for GPU?

Auto-tune detector ?

A parameter that can fix or control autotune on a song, by removing autotune from the voice ("unauto_tune") and/or adding autotune to the AI vocal.

too many indices for array: array is 1-dimensional, but 2 were indexed, while local path song input

When I tried to input a song through a local path, this error came up.
I tried changing formats as well (.wav/.mp3/.mp4).

2023-08-18 12:05:20 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7860/api/predict "HTTP/1.1 500 Internal Server Error"
2023-08-18 12:05:21 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7860/reset "HTTP/1.1 200 OK"

0it [00:00, ?it/s]Traceback (most recent call last):
  File "/content/AICoverGen/src/main.py", line 247, in song_cover_pipeline
    orig_song_path, vocals_path, instrumentals_path, main_vocals_path, backup_vocals_path, main_vocals_dereverb_path = preprocess_song(song_input, mdx_model_params, song_id, is_webui, input_type, progress)
  File "/content/AICoverGen/src/main.py", line 149, in preprocess_song
    vocals_path, instrumentals_path = run_mdx(mdx_model_params, song_output_dir, os.path.join(mdxnet_models_dir, 'UVR-MDX-NET-Voc_FT.onnx'), orig_song_path, denoise=True, keep_orig=keep_orig)
  File "/content/AICoverGen/src/mdx.py", line 262, in run_mdx
    wave_processed = -(mdx_sess.process_wave(-wave, m_threads)) + (mdx_sess.process_wave(wave, m_threads))
  File "/content/AICoverGen/src/mdx.py", line 214, in process_wave
    waves = self.segment(wave, False, chunk)
  File "/content/AICoverGen/src/mdx.py", line 135, in segment
    cut = wave[:, start:end].copy()
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 442, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1392, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1097, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 703, in wrapper
    response = f(*args, **kwargs)
  File "/content/AICoverGen/src/main.py", line 274, in song_cover_pipeline
    raise_exception(str(e), is_webui)
  File "/content/AICoverGen/src/main.py", line 76, in raise_exception
    raise gr.Error(error_msg)
gradio.exceptions.Error: 'too many indices for array: array is 1-dimensional, but 2 were indexed'
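
The failing line, `wave[:, start:end]`, assumes a two-dimensional `(channels, samples)` array. `librosa.load(..., mono=False)` returns a one-dimensional array when the input file has a single channel, so a mono recording is one plausible cause. A minimal sketch of a guard, assuming that diagnosis is right; `ensure_stereo` is a hypothetical helper, not part of AICoverGen:

```python
import numpy as np


def ensure_stereo(wave):
    """Return audio as a (2, n_samples) array.

    A mono signal of shape (n_samples,) is duplicated into two channels,
    so that slicing like wave[:, start:end] works.
    """
    wave = np.asarray(wave)
    if wave.ndim == 1:
        wave = np.stack([wave, wave])
    return wave


mono = np.zeros(44100, dtype=np.float32)   # one second of silent mono audio
stereo = ensure_stereo(mono)
cut = stereo[:, 100:200].copy()            # the kind of slice that failed
```

Until something like this is in place, converting the input file to stereo before uploading it may work around the error.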

sox

Hi there,
I see this warning, is it required?
'sox' is not recognized as an internal or external command,
operable program or batch file.
SoX could not be found!

If you do not have SoX, proceed here:
 - - - http://sox.sourceforge.net/ - - -

If you do (or think that you should) have SoX, double-check your
path variables.

Input and weight type error

OS: Windows 11
Additional info: I use a Python VENV

<All keys matched successfully>
Traceback (most recent call last):
  File "D:\Programme\AICoverGen\src\main.py", line 192, in <module>
    cover_path = song_cover_pipeline(args.youtube_link, rvc_dirname, args.pitch_change)
  File "D:\Programme\AICoverGen\src\main.py", line 164, in song_cover_pipeline
    ai_vocals_path = voice_change(voice_model, main_vocals_dereverb_path, pitch_change)
  File "D:\Programme\AICoverGen\src\main.py", line 104, in voice_change
    rvc_infer(rvc_index_path, vocals_path, output_path, pitch_change, cpt, version, net_g, tgt_sr, vc, hubert_model)
  File "D:\Programme\AICoverGen\src\rvc.py", line 147, in rvc_infer
    audio_opt = vc.pipeline(hubert_model, net_g, 0, audio, input_path, times, pitch_change, 'crepe', index_path, 0.7, if_f0, 3, tgt_sr, 0, 0.25, version, 0.33, f0_file=None)
  File "D:\Programme\AICoverGen\src\vc_infer_pipeline.py", line 347, in pipeline
    self.vc(
  File "D:\Programme\AICoverGen\src\vc_infer_pipeline.py", line 185, in vc
    logits = model.extract_features(**inputs)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\fairseq\models\hubert\hubert.py", line 535, in extract_features
    res = self.forward(
  File "D:\Programme\AICoverGen\venv\lib\site-packages\fairseq\models\hubert\hubert.py", line 437, in forward
    features = self.forward_features(source)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\fairseq\models\hubert\hubert.py", line 392, in forward_features
    features = self.feature_extractor(source)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\fairseq\models\wav2vec\wav2vec2.py", line 895, in forward
    x = conv(x)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\torch\nn\modules\conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "D:\Programme\AICoverGen\venv\lib\site-packages\torch\nn\modules\conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

[Feature Request] Allow sharing of webui over local network

Instead of using a Gradio tunnel to reach the webui when it runs on another machine on the local network, why not add something similar to what textgen-webui does? For example, a flag for webui.py such as --listen, which would make the web server listen on a port that is reachable from the local network. This could be useful for people running a dedicated machine or VM on their home network.
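
A minimal sketch of what such a flag could look like. The flag names `--listen` and `--listen-port` and the helper `build_launch_kwargs` are hypothetical; `server_name` and `server_port` are existing keyword arguments of Gradio's `launch()`, and binding to `0.0.0.0` makes the server reachable from other machines on the network.

```python
import argparse


def build_launch_kwargs(argv=None):
    """Translate command-line flags into keyword arguments for launch()."""
    parser = argparse.ArgumentParser(description="webui launcher (sketch)")
    parser.add_argument("--listen", action="store_true",
                        help="accept connections from the local network")
    parser.add_argument("--listen-port", type=int, default=7860,
                        help="port to serve the web UI on")
    args = parser.parse_args(argv)
    return {
        # 127.0.0.1 is reachable only from this machine, 0.0.0.0 from the LAN.
        "server_name": "0.0.0.0" if args.listen else "127.0.0.1",
        "server_port": args.listen_port,
    }


# In webui.py this would be used roughly as:
#   app.launch(**build_launch_kwargs())   # app being the Gradio Blocks object
```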

Fairseq refuses to install from requirements

I'm on Python 3.9 and have Git, SoX and ffmpeg installed, and my paths are set. The error says something about C++ 2014 not existing; I have the 2015-2022 version installed because I'm on Windows 10. Programming is not my thing at all. This happened while I was trying to figure out why I can't get the MDX models downloaded for it.

invalid load key, '\x00'.

Traceback (most recent call last):
File "/content/AICoverGen/src/main.py", line 275, in song_cover_pipeline
voice_change(voice_model, main_vocals_dereverb_path, ai_vocals_path, pitch_change, index_rate, filter_radius, rms_mix_rate, protect, is_webui)
File "/content/AICoverGen/src/main.py", line 184, in voice_change
cpt, version, net_g, tgt_sr, vc = get_vc(device, config.is_half, config, rvc_model_path)
File "/content/AICoverGen/src/rvc.py", line 113, in get_vc
cpt = torch.load(model_path, map_location='cpu')
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 815, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1033, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x00'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1392, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1097, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "/content/AICoverGen/src/main.py", line 293, in song_cover_pipeline
raise_exception(str(e), is_webui)
File "/content/AICoverGen/src/main.py", line 81, in raise_exception
raise gr.Error(error_msg)
gradio.exceptions.Error: "invalid load key, '\x00'."
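
`invalid load key, '\x00'` means the first byte of the file is not something pickle understands, which usually points at a file that is not a PyTorch checkpoint at all (for example an incomplete download or an archive that was never extracted). A rough sanity check, assuming the usual signatures: checkpoints written by recent versions of `torch.save` are zip archives starting with `PK\x03\x04`, and legacy ones are pickle streams starting with `\x80`. `checkpoint_header_ok` is a hypothetical helper.

```python
def checkpoint_header_ok(path):
    """Return True if the file starts like a torch.save checkpoint."""
    with open(path, "rb") as f:
        head = f.read(4)
    is_zip = head == b"PK\x03\x04"     # zip-based checkpoint format
    is_pickle = head[:1] == b"\x80"    # legacy pickle stream
    return is_zip or is_pickle
```

If the check fails for the `.pth` file in the model folder, downloading the model again is the first thing to try.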

No such file or directory

Additional notes: I use a python venv
OS: Windows 11

16 series/10 series P40 forced single precision
Traceback (most recent call last):
  File "D:\Programme\AICoverGen\src\main.py", line 189, in <module>
    cover_path = song_cover_pipeline(args.youtube_link, rvc_dirname, args.pitch_change)
  File "D:\Programme\AICoverGen\src\main.py", line 161, in song_cover_pipeline
    ai_vocals_path = voice_change(voice_model, main_vocals_dereverb_path, pitch_change)
  File "D:\Programme\AICoverGen\src\main.py", line 98, in voice_change
    config = Config(device, is_half)
  File "D:\Programme\AICoverGen\src\rvc.py", line 24, in __init__
    self.x_pad, self.x_query, self.x_center, self.x_max = self.device_config()
  File "D:\Programme\AICoverGen\src\rvc.py", line 44, in device_config
    with open("trainset_preprocess_pipeline_print.py", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'trainset_preprocess_pipeline_print.py'

Options for Vocal Only Input

It would be beneficial to incorporate an option allowing users to indicate that the existing file contains vocals only, facilitating the bypassing of several processing steps. This feature would be particularly advantageous when working with speeches or monologues devoid of musical components.

File not found

I tried a few models, and I always get the following error.

Converting voice using RVC...
16 series/10 series P40 forced single precision
Traceback (most recent call last):
  File "D:\Programme\AICoverGen\src\main.py", line 189, in <module>
    cover_path = song_cover_pipeline(args.youtube_link, rvc_dirname, args.pitch_change)
  File "D:\Programme\AICoverGen\src\main.py", line 161, in song_cover_pipeline
    ai_vocals_path = voice_change(voice_model, main_vocals_dereverb_path, pitch_change)
  File "D:\Programme\AICoverGen\src\main.py", line 98, in voice_change
    config = Config(device, is_half)
  File "D:\Programme\AICoverGen\src\rvc.py", line 24, in __init__
    self.x_pad, self.x_query, self.x_center, self.x_max = self.device_config()
  File "D:\Programme\AICoverGen\src\rvc.py", line 40, in device_config
    with open(f"configs/{config_file}", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'configs/32k.json'

I use a local install on my own system
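
Both `configs/32k.json` here and `trainset_preprocess_pipeline_print.py` in the earlier report are opened through relative paths, so they are looked up in the current working directory rather than next to the source files. A sketch of resolving such a path against a fixed base directory instead; `config_path` and its `base_dir` parameter are hypothetical, and whether the file ships with the repository at all is a separate question:

```python
from pathlib import Path


def config_path(config_file, base_dir):
    """Build an absolute path to configs/<config_file> under base_dir."""
    return Path(base_dir).resolve() / "configs" / config_file


# Inside a script, base_dir would typically be the script's own folder:
#   base_dir = Path(__file__).resolve().parent
# so the lookup no longer depends on where `python` was started from.
```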

Error opening '*.webm': Format not recognised.

After several tests, I'm always getting the same error when trying to generate (with several audio models and youtube links). During the installation I haven't had any errors, although I haven't managed to activate the AVX2 support from faiss.

OS: win 10
GPU: Nvidia GeForce RTX 3080

Startup log
2023-08-03 22:40:48 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2023-08-03 22:40:48 | INFO | faiss.loader | Loading faiss with AVX2 support.
2023-08-03 22:40:48 | INFO | faiss.loader | Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
2023-08-03 22:40:48 | INFO | faiss.loader | Loading faiss.
2023-08-03 22:40:48 | INFO | faiss.loader | Successfully loaded faiss.
Running on local URL: http://127.0.0.1:7860

Error Log
Traceback (most recent call last):
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\librosa\core\audio.py", line 155, in load
context = sf.SoundFile(path)
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\soundfile.py", line 658, in __init__
self._file = self._open(file, mode_int, closefd)
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\soundfile.py", line 1216, in _open
raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'youtube_video.webm': Format not recognised.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\AICoverGen\src\webui.py", line 65, in song_cover_pipeline
orig_song_path, vocals_path, instrumentals_path, main_vocals_path, backup_vocals_path, main_vocals_dereverb_path = preprocess_song(yt_link, mdx_model_params, song_id, progress)
File "D:\AICoverGen\src\main.py", line 89, in preprocess_song
vocals_path, instrumentals_path = run_mdx(mdx_model_params, song_output_dir, os.path.join(mdxnet_models_dir, 'UVR-MDX-NET-Voc_FT.onnx'), orig_song_path, denoise=True, keep_orig=False)
File "D:\AICoverGen\src\mdx.py", line 257, in run_mdx
wave, sr = librosa.load(filename, mono=False, sr=44100)
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
return f(*args, **kwargs)
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\librosa\core\audio.py", line 174, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\librosa\core\audio.py", line 198, in _audioread_load
with audioread.audio_open(path) as input_file:
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\audioread\__init__.py", line 132, in audio_open
raise NoBackendError()
audioread.exceptions.NoBackendError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\gradio\routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\gradio\blocks.py", line 1392, in process_api
result = await self.call_function(
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\gradio\blocks.py", line 1097, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\igmnd.conda\envs\py39\lib\site-packages\gradio\utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "D:\AICoverGen\src\webui.py", line 99, in song_cover_pipeline
raise gr.Error(str(e))
gradio.exceptions.Error: ''
2023-08-03 22:41:25 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7860/api/predict "HTTP/1.1 500 Internal Server Error"
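
The chain of errors reads as follows: libsndfile cannot open `.webm`, librosa falls back to audioread, and audioread raises `NoBackendError` because it finds no decoder. audioread can decode through the `ffmpeg` command-line tool, so a missing or unreachable ffmpeg is a likely cause. A sketch that checks for ffmpeg and builds a command to convert the download to WAV first; `to_wav_command` and `ffmpeg_available` are hypothetical helpers, and the 44100 Hz stereo target is an assumption taken from the `sr=44100` in the traceback:

```python
import shutil


def to_wav_command(src, dst, ffmpeg="ffmpeg"):
    """Build an ffmpeg command that converts src to 44.1 kHz stereo WAV."""
    return [ffmpeg, "-y", "-i", src, "-ar", "44100", "-ac", "2", dst]


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


cmd = to_wav_command("youtube_video.webm", "youtube_video.wav")
# To actually run it: subprocess.run(cmd, check=True)
```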

[Questions] Can we make an option for vocals only?

Sometimes the AI skips a part and assigns it to the backup vocals, which makes the song feel incomplete. In "Idol" by YOASOBI, for example, it puts the vocals at 0:11 into the backup vocals, so the AI doesn't process them. Can you add an option to upload the vocals directly, like Mangio RVC Fork, and process all of it?

Issue with models only running on CPU, no GPU utilization

I have been playing around with the latest update and noticed that the models only use the CPU, with no GPU utilization. I am positive that I have all the required dependencies installed. To me this looks like an issue with accessing CUDA. I have the CUDA toolkit installed. I am running Ubuntu 22.04.3 LTS with 2x RTX 3060 12GB, driver version 535.54.03.

[W:onnxruntime:Default, onnxruntime_pybind_state.cc:640 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
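
The warning means ONNX Runtime could not initialise its CUDA provider and fell back to the CPU, which typically happens when the CPU-only `onnxruntime` package is installed or when the CUDA/cuDNN libraries it was built against cannot be loaded. A sketch of explicit provider selection; `pick_providers` is hypothetical, while `onnxruntime.get_available_providers()` and the `providers=` argument of `InferenceSession` are part of the ONNX Runtime Python API:

```python
def pick_providers(available):
    """Prefer the CUDA provider when present, then the CPU provider."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


# With onnxruntime installed, this would be used roughly as:
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession(model_path, providers=providers)
```

Printing the list of available providers is also a quick diagnostic: if `CUDAExecutionProvider` is missing from it, the installed package has no GPU support at all.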

run_mdx gpu

Hi, run_mdx is using the CPU instead of the GPU. Do you have any information about this?

[BUG] Regenerate Error

The WebUI does not regenerate when the pitch value changes. To reproduce, first generate with a pitch of 0, then try to generate again with another value.

[Questions] Can we add the ability to adjust vocal cleanup aggression?

With certain songs or songs from certain artists, it seems the AI likes clipping when the singer starts screaming in high pitches or when vocal effects are involved.

Are we able to add the ability to select presets for these instances? Or is there a trick to doing it in a better way?

The song that brought this to my attention was Mafumafu's I want to be a girl, at 1:20 into the song, the ai just struggles.

Issue with the webui command

It doesn't generate a URL and I have no idea why.

PS C:\Users\tothr\AICoverGen\AICoverGen\AICoverGen\AICoverGen> python src/webui.py
2023-08-15 00:02:05 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2023-08-15 00:02:05 | INFO | faiss.loader | Loading faiss with AVX2 support.
2023-08-15 00:02:05 | INFO | faiss.loader | Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
2023-08-15 00:02:05 | INFO | faiss.loader | Loading faiss.
2023-08-15 00:02:05 | INFO | faiss.loader | Successfully loaded faiss.
Traceback (most recent call last):
  File "C:\Users\tothr\AICoverGen\AICoverGen\AICoverGen\AICoverGen\src\webui.py", line 160, in <module>
    public_models = json.load(infile)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x88 in position 6904: character maps to <undefined>
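
The traceback shows `json.load` decoding the file with `cp1250`, the Windows locale encoding, because `open()` was called without an `encoding` argument. Assuming the JSON file is stored as UTF-8, passing the encoding explicitly avoids the problem. `load_json_utf8` is a hypothetical helper:

```python
import json


def load_json_utf8(path):
    """Load a JSON file as UTF-8 regardless of the system locale."""
    with open(path, "r", encoding="utf-8") as infile:
        return json.load(infile)
```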

many errors

Hello,
I have been trying to use this on my 4070 Ti with 12 GB, for context, without success. I keep getting many errors.
Here is the first one, which I get when I enter a YouTube link (not a playlist):

Screenshot 2023-08-13 203445
image

And this one when I upload a song. I tried putting it in the installation folder, but that didn't work.
Screenshot 2023-08-13 204232
Screenshot 2023-08-13 at 21-03-16 Memo Notepad

What do I do?

SoX could not be found!

Ran the update today,

I got the error:

'sox' is not recognized as an internal or external command,
operable program or batch file.
SoX could not be found!

If you do not have SoX, proceed here:
 - - - http://sox.sourceforge.net/ - - -

If you do (or think that you should) have SoX, double-check your
path variables.

But during the update process, this line was there:
Installing collected packages: sox
Successfully installed sox-1.4.1

I'm considering installing Sox directly from the link that the prompt gives, but the version is 14.4.2, will that cause any issues?
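
The two version numbers belong to different things: the `sox` package installed by pip (1.4.1) is a Python wrapper, and it still needs the separate SoX command-line program (14.4.2 on the linked site) to be installed and on PATH. A quick way to check whether the executable is visible to Python, using only the standard library:

```python
import shutil


def find_sox():
    """Return the full path of the SoX executable, or None if not on PATH."""
    return shutil.which("sox")


path = find_sox()
print("SoX found at " + path if path else "SoX is not on PATH")
```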

[Feature request] Prompting menu

Add a prompt menu so the character can voice any text. Also add controls for how the character speaks (shouting, whispering, etc.).

[Questions] Can we bring back the pitch range from -20 to 20?

Sometimes I want my vocals to be just a little deeper, but since the new update the step size has changed so much that setting the pitch to -1 turns female vocals into male vocals. Sometimes the pitch also changes uncontrollably, making the vocals deeper or softer than the voice in the original dataset. Can you bring the old range back?
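
For reference, pitch shifts are commonly counted in semitones, with 12 semitones to an octave, and a shift of n semitones multiplies the frequency by 2^(n/12). If the updated control counts octaves rather than semitones (an assumption based on the behaviour described in this report, not something confirmed here), a value of -1 would halve the pitch, which matches a female voice turning male. A small illustration with hypothetical helper names:

```python
def semitone_ratio(semitones):
    """Frequency ratio produced by shifting pitch by `semitones`."""
    return 2.0 ** (semitones / 12.0)


def octaves_to_semitones(octaves):
    """Convert a shift in octaves to the equivalent number of semitones."""
    return 12 * octaves
```

Under that assumption, the old range of -20 to 20 allowed shifts as small as about 6% in frequency per step, while one step of the new control doubles or halves it.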

Model file not found despite being there

I've put the model file in the folder specified in the webUI, but it isn't detected:
[WinError 267] The directory name is invalid: 'C:\\Users\\USER\\Downloads\\Easysongcover\\AICoverGen\\rvc_models\\[model].pth'
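
`WinError 267` is what Windows reports when a file path is used where a directory is expected, which suggests the loader wants a model folder and then looks for the `.pth` file inside it. Assuming a layout of `rvc_models/<model name>/<file>.pth` (not confirmed in this report), a sketch of a lookup that fails with a clearer message; `find_model_file` is hypothetical:

```python
from pathlib import Path


def find_model_file(model_dir):
    """Return the first .pth file inside model_dir."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        raise NotADirectoryError(
            f"{model_dir} is not a folder; put the .pth file inside a "
            "folder named after the model"
        )
    pth_files = sorted(model_dir.glob("*.pth"))
    if not pth_files:
        raise FileNotFoundError(f"no .pth file found in {model_dir}")
    return pth_files[0]
```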

some lyrics skipped

For some songs, some lyrics are skipped by the AI. I think the AI treats that vocal part as instrumental.
How can I fix it?

Torch not compiled with CUDA enabled

I got this error when generating.

I am running this code on a MacBook.

Traceback (most recent call last):
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/gradio/routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/gradio/blocks.py", line 1392, in process_api
result = await self.call_function(
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/gradio/blocks.py", line 1097, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/Caskroom/miniforge/base/envs/kita/lib/python3.9/site-packages/gradio/utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "/Users/darelllegoferdanu/Documents/AIART/AICoverGen-main/src/main.py", line 316, in song_cover_pipeline
raise_exception(str(e), is_webui)
File "/Users/darelllegoferdanu/Documents/AIART/AICoverGen-main/src/main.py", line 83, in raise_exception
raise gr.Error(error_msg)
gradio.exceptions.Error: 'Torch not compiled with CUDA enabled'

Download from a YouTube Playlist URL took forever

It seems that the download takes forever if I give a YouTube URL for a song that I played from a playlist.

Example URL:
https://www.youtube.com/watch?v=LrwC2Xu2POs&list=PLcEGijvykxZGPfuRkWle36VHV8Px9kvYT&index=29&ab_channel=spitzclips

I need to restart the runtime and give only the short video link for it to work:
https://www.youtube.com/watch?v=LrwC2Xu2POs

Not an important issue, but definitely a bit annoying when I accidentally provide the wrong format and need to reset the runtime 😅
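
Until the downloader handles playlist links itself, the extra query parameters can be stripped before the URL is passed on. A minimal sketch using only the standard library; `single_video_url` is a hypothetical helper that keeps the `v` parameter and drops `list`, `index` and the rest:

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def single_video_url(url):
    """Reduce a YouTube watch URL to its video ID parameter only."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    if "v" not in query:
        return url  # not a watch URL with a video ID, leave it untouched
    clean_query = urlencode({"v": query["v"][0]})
    return urlunparse(parts._replace(query=clean_query))


playlist_url = (
    "https://www.youtube.com/watch?v=LrwC2Xu2POs"
    "&list=PLcEGijvykxZGPfuRkWle36VHV8Px9kvYT&index=29&ab_channel=spitzclips"
)
print(single_video_url(playlist_url))
```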

No such file or directory UVR-MDX-NET-Voc_FT.onnx

Hello, I'm getting the following error when trying to use the program. I downloaded a youtube video in .wav format.


Traceback (most recent call last):
  File "C:\Users\jarur\AICoverGen\src\mdx.py", line 84, in get_hash
    with open(model_path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\jarur\\AICoverGen\\mdxnet_models\\UVR-MDX-NET-Voc_FT.onnx'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\jarur\AICoverGen\src\main.py", line 265, in song_cover_pipeline
    orig_song_path, vocals_path, instrumentals_path, main_vocals_path, backup_vocals_path, main_vocals_dereverb_path = preprocess_song(song_input, mdx_model_params, song_id, is_webui, input_type, progress)
  File "C:\Users\jarur\AICoverGen\src\main.py", line 167, in preprocess_song
    vocals_path, instrumentals_path = run_mdx(mdx_model_params, song_output_dir, os.path.join(mdxnet_models_dir, 'UVR-MDX-NET-Voc_FT.onnx'), orig_song_path, denoise=True, keep_orig=keep_orig)
  File "C:\Users\jarur\AICoverGen\src\mdx.py", line 245, in run_mdx
    model_hash = MDX.get_hash(model_path)
  File "C:\Users\jarur\AICoverGen\src\mdx.py", line 88, in get_hash
    model_hash = hashlib.md5(open(model_path, 'rb').read()).hexdigest()
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\jarur\\AICoverGen\\mdxnet_models\\UVR-MDX-NET-Voc_FT.onnx'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\routes.py", line 442, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1392, in process_api
    result = await self.call_function(
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1097, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\jarur\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 703, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\jarur\AICoverGen\src\main.py", line 292, in song_cover_pipeline
    raise_exception(str(e), is_webui)
  File "C:\Users\jarur\AICoverGen\src\main.py", line 80, in raise_exception
    raise gr.Error(error_msg)
gradio.exceptions.Error: "[Errno 2] No such file or directory: 'C:\\\\Users\\\\jarur\\\\AICoverGen\\\\mdxnet_models\\\\UVR-MDX-NET-Voc_FT.onnx'"

Any help would be greatly appreciated.

module 'numpy' has no attribute 'int'

Hey there,

After a painful but successful local installation of AICoverGen, I wasn't able to generate anything. At the "Converting voice using RVC" step I immediately hit this error:

"module 'numpy' has no attribute 'int'.
np.int was a deprecated alias for the builtin int. To avoid this error in existing code, use int by itself. Doing this will not modify any behavior and is safe. When replacing np.int, you may wish to use e.g. np.int64 or np.int32 to specify the precision. If you wish to review your current use, check the release note link for additional information.\nThe aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:\n https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations"

And here is the traceback from the console:

Traceback (most recent call last):
  File "D:\AICoverGen\src\main.py", line 275, in song_cover_pipeline
    voice_change(voice_model, main_vocals_dereverb_path, ai_vocals_path, pitch_change, f0_method, index_rate, filter_radius, rms_mix_rate, protect, crepe_hop_length, is_webui)
  File "D:\AICoverGen\src\main.py", line 187, in voice_change
    rvc_infer(rvc_index_path, index_rate, vocals_path, output_path, pitch_change, f0_method, cpt, version, net_g, filter_radius, tgt_sr, rms_mix_rate, protect, crepe_hop_length, vc, hubert_model)
  File "D:\AICoverGen\src\rvc.py", line 150, in rvc_infer
    audio_opt = vc.pipeline(hubert_model, net_g, 0, audio, input_path, times, pitch_change, f0_method, index_path, index_rate, if_f0, filter_radius, tgt_sr, 0, rms_mix_rate, version, protect, crepe_hop_length)
  File "D:\AICoverGen\src\vc_infer_pipeline.py", line 549, in pipeline
    pitch, pitchf = self.get_f0(
  File "D:\AICoverGen\src\vc_infer_pipeline.py", line 368, in get_f0
    f0_coarse = np.rint(f0_mel).astype(np.int)
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\numpy\__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\gradio\routes.py", line 442, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\gradio\blocks.py", line 1392, in process_api
    result = await self.call_function(
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\gradio\blocks.py", line 1097, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\PinkiePie\AppData\Roaming\Python\Python39\site-packages\gradio\utils.py", line 703, in wrapper
    response = f(*args, **kwargs)
  File "D:\AICoverGen\src\main.py", line 293, in song_cover_pipeline
    raise_exception(str(e), is_webui)
  File "D:\AICoverGen\src\main.py", line 81, in raise_exception
    raise gr.Error(error_msg)
gradio.exceptions.Error: "module 'numpy' has no attribute 'int'.\n`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.\nThe aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:\n    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations"
2023-08-27 11:39:25 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7860/api/predict "HTTP/1.1 500 Internal Server Error"
2023-08-27 11:39:25 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7860/reset "HTTP/1.1 200 OK"

Could that be fixed?

I am using Windows 11, 22621.1413 build with Python 3.9.13.

I want to run it locally because I do not support Google services at all, so I prefer local over cloud. I also have a good GPU (RTX 3070), which is fairly powerful, so I expect it to run well.

Thanks
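The error message itself names the fix: use the builtin `int` instead of `np.int`. The failing line in the traceback is `f0_coarse = np.rint(f0_mel).astype(np.int)` in `vc_infer_pipeline.py`. Below is a minimal sketch of that substitution as a text patch. The helper name `replace_np_int` is hypothetical and not part of AICoverGen; it only rewrites the bare alias and leaves `np.int32`, `np.int64` and `np.int_` alone.

```python
import re

# Matches the removed alias `np.int`, but not `np.int32`, `np.int64` or `np.int_`,
# because the trailing \b requires a non-word character after "int".
NP_INT_ALIAS = re.compile(r"\bnp\.int\b")


def replace_np_int(source):
    """Return `source` with every bare `np.int` replaced by the builtin `int`."""
    return NP_INT_ALIAS.sub("int", source)


# The line reported in the traceback above.
line = "f0_coarse = np.rint(f0_mel).astype(np.int)"
patched = replace_np_int(line)
print(patched)
```

The aliases were removed in NumPy 1.24, so installing a NumPy version below 1.24 should also avoid the error without touching the source, provided the other requirements accept that version.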

[Request] On-Off Function, Mono-Stereo

Hello! I am always grateful to you.
I'd like to ask if you can add additional functions.

In my case, I input only my voice (no music, no backup vocals), and I want a conversion without reverb.

Could you please make it possible to turn the music removal, backup vocal removal, and reverb functions on and off?

I also input mono files, but the output comes out in stereo. Could you add an option to output mono?

Thank you very much for your hard work.
(I'm using a translator, so I don't know what nuance my message carries, but my thanks are genuine 🥹)
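For the mono request, one common way to turn a stereo signal into mono is to average the left and right sample of each frame. The sketch below only illustrates that idea on plain Python lists. The function name `downmix_to_mono` and the sample values are made up for the example, and this is not how AICoverGen currently writes its output.

```python
def downmix_to_mono(frames):
    """Average each (left, right) pair into a single mono sample."""
    return [(left + right) / 2 for left, right in frames]


# Three hypothetical stereo frames with samples in the range -1.0 to 1.0.
stereo = [(0.5, 0.5), (1.0, -1.0), (0.25, 0.75)]
mono = downmix_to_mono(stereo)
print(mono)
```

If the input was mono to begin with, both channels of the stereo output should carry the same signal, in which case averaging them gives that signal back unchanged.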

CUDA driver version is insufficient for CUDA runtime version

Hi,
I have trouble installing fairseq; I often get this type of error:

Collecting omegaconf<2.1 (from fairseq)
  Downloading omegaconf-2.0.6-py3-none-any.whl (36 kB)
Collecting antlr4-python3-runtime==4.8 (from hydra-core<1.1,>=1.0.7->fairseq)
  Downloading antlr4-python3-runtime-4.8.tar.gz (112 kB)
     ---------------------------------------- 112.4/112.4 kB 6.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
ERROR: No .egg-info directory found in C:\Users\...\Local\Temp\pip-pip-egg-info-of2_fg80

Does anyone have an idea?

Dependency Issues

So when running the requirements installation, the provided numpy version (1.23.5) is not compatible with fairseq, librosa, and onnxruntime-gpu.

ERROR: Cannot install -r requirements.txt (line 14), -r requirements.txt (line 3), -r requirements.txt (line 4), numpy==1.23.5 and onnxruntime-gpu==1.15.1 because these package versions have conflicting dependencies.

The conflict is caused by:
The user requested numpy==1.23.5
fairseq 0.12.2 depends on numpy; python_version >= "3.7"
librosa 0.6.3 depends on numpy>=1.8.0
onnxruntime-gpu 1.15.1 depends on numpy>=1.24.2
The user requested numpy==1.23.5
fairseq 0.12.2 depends on numpy; python_version >= "3.7"
librosa 0.6.3 depends on numpy>=1.8.0
onnxruntime-gpu 1.15.0 depends on numpy>=1.24.2
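The core of the conflict listed above is that `numpy==1.23.5` is pinned while `onnxruntime-gpu` asks for `numpy>=1.24.2`, so no single version can satisfy both. Here is a minimal sketch of that comparison. The helper names `parse_version` and `satisfies_minimum` are hypothetical and not part of pip, and the sketch assumes plain dotted numeric versions.

```python
def parse_version(text):
    """Turn '1.24.2' into (1, 24, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(pinned, minimum):
    """Return True if the pinned version meets a '>=' requirement."""
    return parse_version(pinned) >= parse_version(minimum)


pinned_numpy = "1.23.5"             # from requirements.txt, per the pip output
required_by_onnxruntime = "1.24.2"  # from "onnxruntime-gpu ... depends on numpy>=1.24.2"

conflict = not satisfies_minimum(pinned_numpy, required_by_onnxruntime)
print(conflict)
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would rank "1.9.0" above "1.10.0".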

Now if I loosen the package range, I can install everything, but running it then produces the following error:

Traceback (most recent call last):
  File "C:\AICoverGen\src\main.py", line 12, in <module>
    from mdx import run_mdx
  File "C:\AICoverGen\src\mdx.py", line 8, in <module>
    import librosa
  File "C:\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\librosa\__init__.py", line 211, in <module>
    from . import core
  File "C:\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\librosa\core\__init__.py", line 9, in <module>
    from .constantq import * # pylint: disable=wildcard-import
    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\librosa\core\constantq.py", line 1059, in <module>
    dtype=np.complex,
    ^^^^^^^^^^
  File "C:\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\numpy\__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'complex'.
np.complex was a deprecated alias for the builtin complex. To avoid this error in existing code, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'complex_'?
'ab_channel' is not recognized as an internal or external command,
operable program or batch file.

I was just wondering how you resolved this issue? Since it seems that even the given error is a dependency issue related to librosa.
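Separately, the last two lines of the log (`'ab_channel' is not recognized as an internal or external command`) look unrelated to librosa. A likely cause is a YouTube URL containing `&ab_channel=...` being passed on the Windows command line without quotes, so that `cmd` treats `&` as a command separator and tries to run the rest. Wrapping the URL in double quotes avoids this. Alternatively, the parameter can be dropped before the URL is used, as in the sketch below. The function name `strip_ab_channel` and the example URL are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_ab_channel(url):
    """Return `url` without its `ab_channel` query parameter."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query) if key != "ab_channel"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URL of the shape that triggers the cmd error when left unquoted.
url = "https://www.youtube.com/watch?v=abc123&ab_channel=Example"
cleaned = strip_ab_channel(url)
print(cleaned)
```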

Got "error: Microsoft Visual C++ 14.0 or greater is required" while installing requirements.

I'm getting this error:

      (output omitted)
      copying fairseq\config\model\wav2vec2\wav2vec2_base.yaml -> build\lib.win-amd64-cpython-39\fairseq\config\model\wav2vec2
      copying fairseq\config\model\wav2vec2\wav2vec2_large.yaml -> build\lib.win-amd64-cpython-39\fairseq\config\model\wav2vec2
      running build_ext
      building 'fairseq.libbleu' extension
      error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for fairseq
  Building wheel for audioread (setup.py) ... done
  Created wheel for audioread: filename=audioread-3.0.0-py3-none-any.whl size=23738 sha256=92be0e4481346376a517c5866c147c1ac5edc80b0d52ce7582e90bccfde540d3
  Stored in directory: c:\users\kuroyukihime\appdata\local\pip\cache\wheels\e4\76\a4\cfb55573167a1f5bde7d7a348e95e509c64b2c3e8f921932c3
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.8-py3-none-any.whl size=141217 sha256=a67917deaf6ba187e7fbe8f1886ecc86d896efdc36dae2d118b08599fea181ae
  Stored in directory: c:\users\kuroyukihime\appdata\local\pip\cache\wheels\42\3c\ae\14db087e6018de74810afe32eb6ac890ef9c68ba19b00db97a
  Building wheel for ffmpy (setup.py) ... done
  Created wheel for ffmpy: filename=ffmpy-0.3.1-py3-none-any.whl size=5600 sha256=54fe7835b1ef6fbc1f39d1a560c07d7796506ecad41e7ce279287c05f6ec3d92
  Stored in directory: c:\users\kuroyukihime\appdata\local\pip\cache\wheels\1f\f1\8d\367922b023b526b7c2ced5db30932def7b18cf39d7ac6e8572
  Building wheel for future (setup.py) ... done
  Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492053 sha256=21ca2781bb1cb244f5a0a458c7d72feb075bb6ee3ddf3c708efac711380bd88e
  Stored in directory: c:\users\kuroyukihime\appdata\local\pip\cache\wheels\bf\5d\6a\2e53874f7ec4e2bede522385439531fafec8fafe005b5c3d1b
Successfully built audioread antlr4-python3-runtime ffmpy future
Failed to build fairseq
ERROR: Could not build wheels for fairseq, which is required to install pyproject.toml-based projects

Which package do I need to install?


Requirements.txt error

Screenshot_20230906-090922_1

I'm currently using the Colab, but it won't let me continue because of a requirements error.

[Suggestion] MDX Drop Down

Could you perhaps add a drop-down for vocal and instrument isolation, so we can choose which one works better for specific music?

[Question] Dereverb takes too much time

Hi, can you help me? Why does the dereverb step take so much time? As I remember, it didn't take this long on the first day I ran the code, but today it takes very long to dereverb a single audio file. Is there something like a cache that I can clear?

[Feature Request] save only vocal

Can you add a checkbox or a button that saves only the converted vocals?
Alternatively, make it possible to turn the music volume off completely.
(If there's a question about this request, sorry for not answering... time zone difference and school.)

Different UVR models

Is it possible to change the UVR model? The vocals and instruments seem to be poorly separated (a lot of backing vocals, and vocals still stuck in the instrumental). I would like to change it to UVR Kim for both vocals and instruments. How can I do that, if it is possible?
