voicefixer's Introduction

🗣️ 🔧 VoiceFixer

VoiceFixer aims to restore human speech regardless of how severely it is degraded. It can handle noise, reverberation, low resolution (2kHz~44.1kHz), and clipping (0.1-1.0 threshold) within one model.

This package provides:

  • A pretrained VoiceFixer, which is built on a neural vocoder.
  • A pretrained 44.1k universal speaker-independent neural vocoder.

  • If you find this repo helpful, please consider citing it or using "Buy Me A Coffee":
 @misc{liu2021voicefixer,   
     title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},   
     author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},  
     year={2021},  
     eprint={2109.13731},  
     archivePrefix={arXiv},  
     primaryClass={cs.SD}  
 }

Demo

Please visit the demo page to see what voicefixer can do.

Usage

Run Modes

Mode  Description
0     Original Model (suggested by default)
1     Add preprocessing module (remove higher frequency)
2     Train mode (might work sometimes on seriously degraded real speech)
all   Run all modes - will output 1 wav file for each supported mode.

Command line

First, install voicefixer via pip:

pip install git+https://github.com/haoheliu/voicefixer.git

Process a file:

# Specify the input .wav file. Output file is outfile.wav.
voicefixer --infile test/utterance/original/original.wav
# Or specify an output path
voicefixer --infile test/utterance/original/original.wav --outfile test/utterance/original/original_processed.wav

Process files in a folder:

voicefixer --infolder /path/to/input --outfolder /path/to/output
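
The same folder workflow can be approximated from Python. This is a minimal sketch, not the CLI's implementation: it assumes `restore` is a callable that accepts the keyword arguments of `voicefixer.restore` shown under Python Examples (`input`, `output`, `cuda`, `mode`). The helper name `restore_folder` and the non-recursive, `.wav`-only file filter are hypothetical choices made for illustration.

```python
from pathlib import Path
from typing import Callable, List, Tuple


def restore_folder(restore: Callable[..., object], infolder: str, outfolder: str,
                   mode: int = 0, cuda: bool = False) -> List[Tuple[str, str]]:
    """Call `restore` once per .wav file in `infolder`; return (input, output) pairs."""
    src = Path(infolder)
    dst = Path(outfolder)
    dst.mkdir(parents=True, exist_ok=True)  # create the output folder if needed
    pairs = []
    for wav in sorted(src.glob("*.wav")):  # non-recursive, .wav only (an assumption)
        out = dst / wav.name  # keep the original file name
        restore(input=str(wav), output=str(out), cuda=cuda, mode=mode)
        pairs.append((str(wav), str(out)))
    return pairs
```

With the package installed, a call would look like `restore_folder(VoiceFixer().restore, "/path/to/input", "/path/to/output")`.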

Change mode (The default mode is 0):

voicefixer --infile /path/to/input.wav --outfile /path/to/output.wav --mode 1

Run all modes:

# output file saved to `/path/to/output-modeX.wav`.
voicefixer --infile /path/to/input.wav --outfile /path/to/output.wav --mode all
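
The naming pattern in the comment above (`/path/to/output-modeX.wav`) can be reproduced with a small helper, which is handy when a script needs to locate the per-mode files afterwards. This is a sketch inferred only from that comment; `mode_output_path` is a hypothetical name and the CLI's actual naming code is not shown here.

```python
from pathlib import PurePosixPath


def mode_output_path(outfile: str, mode: int) -> str:
    """Insert '-mode<N>' between the file stem and its extension."""
    p = PurePosixPath(outfile)
    return str(p.with_name(f"{p.stem}-mode{mode}{p.suffix}"))


if __name__ == "__main__":
    for m in (0, 1, 2):
        print(mode_output_path("/path/to/output.wav", m))
```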

Pre-load the weights only without any actual processing:

voicefixer --weight_prepare

For more help, run:

voicefixer -h

Desktop App

Demo on Youtube (Thanks @Justin John)

Install voicefixer via pip:

pip install voicefixer

You can test audio samples on your desktop by running the web app (powered by streamlit):

  1. Clone the repo first.
git clone https://github.com/haoheliu/voicefixer.git
cd voicefixer

⚠️ For Windows users: please make sure WGET is installed and the wget command is on the system path (thanks @justinjohn0306).

  2. Initialize and start the web page.
# Run streamlit 
streamlit run test/streamlit.py
  • If you are running it for the first time, the web page may stay blank for several minutes while the models download. You can check the terminal for download progress.

  • You can use this low quality speech file we provided for a test run. The page after processing will look like the following.

[figure: screenshot of the web page after processing]

  • For users in mainland China: if you have difficulty downloading the checkpoints, you can alternatively get them from Baidu Netdisk (百度网盘; extraction code: qis6). Please download the two checkpoints inside and place them in the following folders.
    • Place vf.ckpt inside ~/.cache/voicefixer/analysis_module/checkpoints. (The "~" represents your home directory)
    • Place model.ckpt-1490000_trimed.pt inside ~/.cache/voicefixer/synthesis_module/44100. (The "~" represents your home directory)
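
After placing the files manually, it can help to confirm they are where the package is expected to look. The sketch below only checks the two locations listed above; `checkpoint_status` and `EXPECTED` are hypothetical names, and the zero-size check is a rough heuristic that will not detect a partially downloaded file.

```python
from pathlib import Path
from typing import Dict, Optional

# Locations relative to the home directory, taken from the instructions above.
EXPECTED = {
    "analysis": ".cache/voicefixer/analysis_module/checkpoints/vf.ckpt",
    "synthesis": ".cache/voicefixer/synthesis_module/44100/model.ckpt-1490000_trimed.pt",
}


def checkpoint_status(home: Optional[Path] = None) -> Dict[str, str]:
    """Report 'missing', 'empty', or 'present' for each expected checkpoint."""
    home = Path.home() if home is None else Path(home)
    status = {}
    for name, rel in EXPECTED.items():
        path = home / rel
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            status[name] = "present"
    return status


if __name__ == "__main__":
    for name, state in checkpoint_status().items():
        print(f"{name}: {state}")
```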

Python Examples

First, install voicefixer via pip:

pip install voicefixer

Then run the following scripts for a test run:

git clone https://github.com/haoheliu/voicefixer.git; cd voicefixer
python3 test/test.py # test script

We expect it will give you the following output:

Initializing VoiceFixer...
Test voicefixer mode 0, Pass
Test voicefixer mode 1, Pass
Test voicefixer mode 2, Pass
Initializing 44.1kHz speech vocoder...
Test vocoder using groundtruth mel spectrogram...
Pass

test/test.py mainly contains the test of the following two APIs:

  • voicefixer.restore
  • vocoder.oracle
...

# TEST VOICEFIXER
## Initialize a voicefixer
print("Initializing VoiceFixer...")
voicefixer = VoiceFixer()
# Mode 0: Original Model (suggested by default)
# Mode 1: Add preprocessing module (remove higher frequency)
# Mode 2: Train mode (might work sometimes on seriously degraded real speech)
for mode in [0,1,2]:
    print("Testing mode",mode)
    voicefixer.restore(input=os.path.join(git_root,"test/utterance/original/original.flac"), # low quality .wav/.flac file
                       output=os.path.join(git_root,"test/utterance/output/output_mode_"+str(mode)+".flac"), # save file path
                       cuda=False, # GPU acceleration
                       mode=mode)
    if(mode != 2):
        check("output_mode_"+str(mode)+".flac")
    print("Pass")

# TEST VOCODER
## Initialize a vocoder
print("Initializing 44.1kHz speech vocoder...")
vocoder = Vocoder(sample_rate=44100)

### read wave (fpath) -> mel spectrogram -> vocoder -> wave -> save wave (out_path)
print("Test vocoder using groundtruth mel spectrogram...")
vocoder.oracle(fpath=os.path.join(git_root,"test/utterance/original/p360_001_mic1.flac"),
               out_path=os.path.join(git_root,"test/utterance/output/oracle.flac"),
               cuda=False) # GPU acceleration

...

You can clone this repo and try to run test.py inside the test folder.

Docker

Currently the Docker image is not published and needs to be built locally, but this way you can be sure you're running it with all the expected configuration. The generated image is about 10GB, mainly due to the dependencies, which consume around 9.8GB on their own.

However, the layer containing voicefixer is the last added layer, making any rebuild if you change sources relatively small (~200MB at a time as the weights get refreshed on image build).

The Dockerfile can be viewed here.

After cloning the repo:

OS Agnostic

# To build the image
cd voicefixer
docker build -t voicefixer:cpu .

# To run the image
docker run --rm -v "$(pwd)/data:/opt/voicefixer/data" voicefixer:cpu <all_other_cli_args_here>

## Example: docker run --rm -v "$(pwd)/data:/opt/voicefixer/data" voicefixer:cpu --infile data/my-input.wav --outfile data/my-output.mode-all.wav --mode all

Wrapper script: Linux and MacOS

# To build the image
cd voicefixer
./docker-build-local.sh

# To run the image
./run.sh <all_other_cli_args_here>

## Example: ./run.sh --infile data/my-input.wav --outfile data/my-output.mode-all.wav --mode all

Other Features

  • How to use your own vocoder, like pre-trained HiFi-Gan?

First, write a helper function like the following around your model. It should be similar to the helper function in this repo: https://github.com/haoheliu/voicefixer/blob/main/voicefixer/vocoder/base.py#L35

    def convert_mel_to_wav(mel):
        """
        :param non normalized mel spectrogram: [batchsize, 1, t-steps, n_mel]
        :return: [batchsize, 1, samples]
        """
        return wav

Then pass this function to voicefixer.restore, for example:

voicefixer.restore(input="", # input wav file path
                   output="", # output wav file path
                   cuda=False, # whether to use gpu acceleration
                   mode = 0,
                   your_vocoder_func = convert_mel_to_wav)

Note:

  • For compatibility, your vocoder should work on 44.1kHz waveforms with 128 mel frequency bins.
  • The input mel spectrogram to the helper function should not be normalized by the width of each mel filter.
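
The shape contract described above can be enforced with a thin wrapper, so that a mismatched vocoder fails early with a readable message. This is a sketch under assumptions: `make_vocoder_func` is a hypothetical helper, `model` stands for any callable wrapping your own vocoder, and the inputs are only assumed to expose a `.shape` attribute (as PyTorch tensors and NumPy arrays do).

```python
from typing import Any, Callable

N_MEL = 128  # number of mel bins required by the note above


def make_vocoder_func(model: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap `model` so the documented input and output shapes are checked."""

    def convert_mel_to_wav(mel):
        shape = tuple(mel.shape)
        # Expected input: [batchsize, 1, t-steps, n_mel]
        if len(shape) != 4 or shape[1] != 1 or shape[3] != N_MEL:
            raise ValueError(f"expected [batch, 1, t-steps, {N_MEL}], got {list(shape)}")
        wav = model(mel)
        out = tuple(wav.shape)
        # Expected output: [batchsize, 1, samples]
        if len(out) != 3 or out[0] != shape[0] or out[1] != 1:
            raise ValueError(f"expected [batch, 1, samples], got {list(out)}")
        return wav

    return convert_mel_to_wav
```

The returned function can then be passed as `your_vocoder_func`, as in the `voicefixer.restore` call shown earlier.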

Change log

See CHANGELOG.md.

voicefixer's People

Contributors

anonymous20211004, appleholic, camille-vanhoffelen, chrisbaume, francocm, haoheliu, haoheliu2, manmay-nakhashi, welbornt, wetdog, zfturbo

voicefixer's Issues

some questions

Hi, thanks for your great work.
After reading your paper, I have a question here.

  1. Why use the two-stage algorithm? is it to facilitate more types of speech restoration?
  2. Since there is no information about the speed of the model in the paper, what is the training and inference speed of the model?

cannot use my gpu

How do I use my GPU? I have a GTX 1650, but no matter what I do, the program keeps using my CPU instead.
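
One thing worth checking for reports like this is whether the installed PyTorch can see a CUDA device at all, since the `cuda` argument of `voicefixer.restore` (see Python Examples) can only help if it can. The sketch below is a generic diagnostic, not a confirmed cause of this report; `cuda_available` is a hypothetical helper name.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA usable by PyTorch:", cuda_available())
```

If this prints `False` on a machine with an NVIDIA GPU, the PyTorch build or driver setup is the first place to look. If it prints `True`, the result can be passed directly, for example `cuda=cuda_available()`.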

not working anymore

hi!
I have this installed and it used to work alright, but since recently it will not start up anymore.
what has happened? I know you don't work on this repo anymore, but then how did it break? 🤕

RuntimeError: Error(s) in loading state_dict for VoiceFixer: Missing key(s) in state_dict: "f_helper.istft.ola_window". Unexpected key(s) in state_dict: "f_helper.istft.reverse.weight", "f_helper.istft.overlap_add.weight".

Traceback:
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "D:\AI\VOICEFIXER\voicefixer\test\streamlit.py", line 18, in <module>
voice_fixer = init_voicefixer()
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 211, in wrapper
return cached_func(*args, **kwargs)
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 240, in __call__
return self._get_or_create_cached_value(args, kwargs)
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 266, in _get_or_create_cached_value
return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 320, in _handle_cache_miss
computed_value = self._info.func(*func_args, **func_kwargs)
File "D:\AI\VOICEFIXER\voicefixer\test\streamlit.py", line 14, in init_voicefixer
return VoiceFixer()
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\voicefixer\base.py", line 23, in __init__
self._model.load_state_dict(
File "C:\Users\madwu\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(

where to find the model(*.pth) to test the effect with my own input wav?

Hi, I just want to test the powerful effect of voicefixer with my own distorted wav, so I followed your instructions under Python Examples, but running python3 test/test.py failed. The error information is as follows:
Initializing VoiceFixer...
Traceback (most recent call last):
File "test/test.py", line 39, in <module>
voicefixer = VoiceFixer()
File "/root/anaconda3.8/lib/python3.8/site-packages/voicefixer/base.py", line 12, in __init__
self._model = voicefixer_fe(channels=2, sample_rate=44100)
File "/root/anaconda3.8/lib/python3.8/site-packages/voicefixer/restorer/model.py", line 140, in __init__
self.vocoder = Vocoder(sample_rate=44100)
File "/root/anaconda3.8/lib/python3.8/site-packages/voicefixer/vocoder/base.py", line 14, in __init__
self._load_pretrain(Config.ckpt)
File "/root/anaconda3.8/lib/python3.8/site-packages/voicefixer/vocoder/base.py", line 19, in _load_pretrain
checkpoint = load_checkpoint(pth, torch.device("cpu"))
File "/root/anaconda3.8/lib/python3.8/site-packages/voicefixer/vocoder/model/util.py", line 92, in load_checkpoint
checkpoint = torch.load(checkpoint_path, map_location=device)
File "/root/anaconda3.8/lib/python3.8/site-packages/torch/serialization.py", line 600, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/root/anaconda3.8/lib/python3.8/site-packages/torch/serialization.py", line 242, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
It seems that the pretrained model file cannot be found.
I manually searched for the *.pth files but did not find them, so I am seeking your help.
Thank you!

unable to run voicefixer

when I'm trying to run voicefixer, I'm getting these messages:
Downloading the weight of neural vocoder: TFGAN
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1346, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1285, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1331, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1280, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1040, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 980, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1454, in connect
self.sock = self._context.wrap_socket(self.sock,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1040, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/bin/voicefixer", line 5, in <module>
from voicefixer import VoiceFixer
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/voicefixer/__init__.py", line 13, in <module>
from voicefixer.vocoder.base import Vocoder
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/voicefixer/vocoder/__init__.py", line 20, in <module>
urllib.request.urlretrieve(
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 239, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 517, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 534, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1389, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1349, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)>

Any idea how this can be fixed?

Can I use it in my commercial project?

Hi, you developed a great project here. It saved me a lot of time and effort. I want to use the voice fixer with the pretrained model in my commercial project to increase audio quality. Is it legal? Can I use this for commercial purposes without any license issues?

Inconsistency in the generator architecture

Thanks for releasing the code publicly. I am slightly confused by the implementation of the generator mentioned here. As per Fig.3(a) in the paper, a mask is predicted from the input noisy audio and then multiplied with the input to get the clean audio, but in the implementation it seems that after the masking operation the result is further passed through a UNet. The loss is also calculated for both outputs. Can you please clarify the inconsistency? Thanks in advance.

URLError: <urlopen error [Errno 11004] getaddrinfo failed>

URLError: <urlopen error [Errno 11004] getaddrinfo failed>
Traceback:
File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "C:\Users\Administrator\voicefixer\test\streamlit.py", line 9, in <module>
from voicefixer import VoiceFixer
File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\voicefixer\__init__.py", line 13, in <module>
from voicefixer.vocoder.base import Vocoder
File "c:\users\administrator\appdata\local\programs\python\python38\lib\site-packages\voicefixer\vocoder\__init__.py", line 20, in <module>
urllib.request.urlretrieve(
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 542, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 1393, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "c:\users\administrator\appdata\local\programs\python\python38\lib\urllib\request.py", line 1353, in do_open
raise URLError(err)

run windows
[screenshot: Snipaste_2023-02-17_10-10-21]

Artifacts on 's' sounds

Hello! Awesome project, and I totally understand that this isn't your main focus anymore, but I just love the results this gives over almost everything else I've tried for speech restoration.

However, I'm getting some interesting 's' sounds being dropped occasionally, and was wondering if there was perhaps a way of avoiding that, that you knew of?

UnvoiceFixed
Voicefixed

Any ideas would be great, thanks!

Streamlit says Numpy isn't available when I already have it installed.

RuntimeError: Numpy is not available
Traceback:

File "C:\Users\daftp\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\script_runner.py", line 430, in _run_script
exec(code, module.__dict__)
File "C:\Users\daftp\AppData\Local\Programs\Python\Python310\test\streamlit.py", line 49, in <module>
pred_wav = voice_fixer.restore_inmem(audio, mode=mode, cuda=is_cuda)
File "C:\Users\daftp\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\Users\daftp\AppData\Local\Programs\Python\Python310\voicefixer\base.py", line 102, in restore_inmem
sp,mel_noisy = self._pre(self._model, segment, cuda)
File "C:\Users\daftp\AppData\Local\Programs\Python\Python310\voicefixer\base.py", line 59, in _pre
input = torch.from_numpy(input)

some questions

I have some questions, which I would like to clarify
does this work with any language?
is it useful for many voices speaking at the same time?
If the above is false, do you know of a program or project that does the same?

Numpy error

(Voicefixer) C:\Temp\voicefixer>voicefixer --help
Traceback (most recent call last):
File "", line 189, in run_module_as_main
File "", line 148, in get_module_details
File "", line 112, in get_module_details
File "C:\Temp\voicefixer\voicefixer\__init__.py", line 13, in <module>
from voicefixer.vocoder.base import Vocoder
File "C:\Temp\voicefixer\voicefixer\vocoder\base.py", line 2, in <module>
from voicefixer.tools.wav import read_wave, save_wave
File "C:\Temp\voicefixer\voicefixer\tools\wav.py", line 6, in <module>
import librosa
File "C:\Temp\voicefixer\venv\Lib\site-packages\librosa\__init__.py", line 211, in <module>
from . import core
File "C:\Temp\voicefixer\venv\Lib\site-packages\librosa\core\__init__.py", line 9, in <module>
from .constantq import * # pylint: disable=wildcard-import
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Temp\voicefixer\venv\Lib\site-packages\librosa\core\constantq.py", line 1059, in <module>
dtype=np.complex,
^^^^^^^^^^
File "C:\Temp\voicefixer\venv\Lib\site-packages\numpy\__init__.py", line 338, in __getattr__
raise AttributeError(former_attrs[attr])
AttributeError: module 'numpy' has no attribute 'complex'.
np.complex was a deprecated alias for the builtin complex. To avoid this error in existing code, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'complex_'?

Some problems and questions.

Hello!
I installed your neural network and ran it in Desktop App mode, but I don't see the "Turn on GPU" switch here. This is the first question.
Second question: How do I use the models from the demo page? GSR_UNet, VF_Unet, Oracle?

Thanks in advance for the answer!

no effect in the treble area

Hi, thanks for your great work!
When I try the test/test.py , I find that the repair effect is better in the bass area, but has no effect in the treble area. How can I solve this?

Missing `yaml` dependency in package requirements

Requirements do not include PyYAML:

voicefixer/setup.py

Lines 37 to 44 in 310c0ad

REQUIRED = [
"librosa>=0.8.1",
"matplotlib",
"torch>=1.7.0",
"progressbar",
"torchlibrosa==0.0.7",
"GitPython",
]

but the tools/io.py module imports one named yaml:

This is raising an error when using the ssr_eval package:

  File "/home/louis/dev/mdx/ssr_eval/NVSR/nvsr_unet.py", line 8, in <module>
    from voicefixer import Vocoder
  File "/home/louis/miniconda3/envs/ssr_eval/lib/python3.9/site-packages/voicefixer/__init__.py", line 14, in <module>
    from voicefixer.base import VoiceFixer
  File "/home/louis/miniconda3/envs/ssr_eval/lib/python3.9/site-packages/voicefixer/base.py", line 4, in <module>
    from voicefixer.restorer.model import VoiceFixer as voicefixer_fe
  File "/home/louis/miniconda3/envs/ssr_eval/lib/python3.9/site-packages/voicefixer/restorer/model.py", line 16, in <module>
    from voicefixer.tools.io import load_json, write_json
  File "/home/louis/miniconda3/envs/ssr_eval/lib/python3.9/site-packages/voicefixer/tools/io.py", line 4, in <module>
    import yaml
ModuleNotFoundError: No module named 'yaml'

This could be resolved by adding it to the package requirements (in the meantime I will install it manually but wanted to report it).
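
Until the requirement is declared, a quick pre-flight check can tell whether the `yaml` module is importable before `voicefixer` is imported. This sketch uses only the standard library; `missing_modules` is a hypothetical helper and it is meant for top-level module names only.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'yaml' is the module imported by tools/io.py; it is provided by the PyYAML package.
    print(missing_modules(["yaml"]))
```

If `yaml` is reported missing, installing PyYAML manually (`pip install PyYAML`) is the workaround mentioned in the report above.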

Issue with defining Module

I'm trying to make a Google Colab with the code of this one, but it somehow returned this error: NameError: name 'VoiceFixer' is not defined. I even actually defined VoiceFixer using one of the definitions from line 9 of base.py. So I changed the definition with line 93 of model.py, still got the same error. Do you know any fixes? If yes, reply.

data format

Data format error? I use .flac file with 44.1 kHz sampling rate.

Google Colab error

When I try running the Colab, step 2 setup throws this error

OSError                                   Traceback (most recent call last)
<ipython-input-3-8baae7857564> in <module>()
     21 from IPython import display; import note_seq; from scipy.io import wavfile
     22 import tensorflow.compat.v2 as tf
---> 23 import augment; import librosa; from matplotlib import cm; import matplotlib.pyplot as plt
     24 
     25 # Import VoiceFixer!

6 frames
/usr/local/lib/python3.7/dist-packages/augment/__init__.py in <module>()
      4 # LICENSE file in the root directory of this source tree
      5 
----> 6 from .effects import (
      7     EffectChain,
      8     shutdown_sox,

/usr/local/lib/python3.7/dist-packages/augment/effects.py in <module>()
      8 import torch
      9 import numpy as np
---> 10 import torchaudio
     11 from torchaudio.sox_effects.sox_effects import effect_names as get_effect_names
     12 

/usr/local/lib/python3.7/dist-packages/torchaudio/__init__.py in <module>()
----> 1 from torchaudio import _extension  # noqa: F401
      2 from torchaudio import (
      3     compliance,
      4     datasets,
      5     functional,

/usr/local/lib/python3.7/dist-packages/torchaudio/_extension.py in <module>()
     25 
     26 
---> 27 _init_extension()

/usr/local/lib/python3.7/dist-packages/torchaudio/_extension.py in _init_extension()
     19     # which depends on `libtorchaudio` and dynamic loader will handle it for us.
     20     if path.exists():
---> 21         torch.ops.load_library(path)
     22         torch.classes.load_library(path)
     23     # This import is for initializing the methods registered via PyBind11

/usr/local/lib/python3.7/dist-packages/torch/_ops.py in load_library(self, path)
    108             # static (global) initialization code in order to register custom
    109             # operators with the JIT.
--> 110             ctypes.CDLL(path)
    111         self.loaded_libraries.add(path)
    112 

/usr/lib/python3.7/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    362 
    363         if handle is None:
--> 364             self._handle = _dlopen(self._name, mode)
    365         else:
    366             self._handle = handle

OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory

Padding error with certain input lengths

Hello everyone, first of all nice work on the library! Very cool stuff and good out-of-the-box results.

I've run into a bug though (or at least it looks a lot like one). Certain input lengths trigger padding errors, probably due to how the split-and-concat strategy for larger inputs work in restore_inmem:

import voicefixer
import numpy as np

model = voicefixer.VoiceFixer()
model.restore_inmem(np.random.random(44100*30 + 1))

>>>
RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (1024, 1024) at dimension 2 of input [1, 1, 2]

I have a rough idea on how to patch it, so let me know if you'd like a PR.

Thanks,

"voicefixer" is not recognized as an internal or external command

Hello!
I plan to use voicefixer from the command line. In accordance with the instructions, I produce the following commands on the command line:

  1. pip install voicefixer==0.1.1
  2. git clone https://github.com/haoheliu/voicefixer.git
  3. cd voicefixer

all these commands run without any error, ok.
But then, as soon as I try to run the «voicefixer» command (for example, this command):
voicefixer --infile /path/to/input.wav --outfile /path/to/output.wav
a message is displayed:
“voicefixer” is not recognized as an internal or external command…and so on
even when I just write one word “voicefixer” on the command line, the same message is displayed
As I understand it, this executable file cannot be found for some reason.
How to fix it? I use Windows 10, I also installed the recommended WGET.
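
A first diagnostic for this kind of message is to check whether a `voicefixer` executable is visible on the PATH and where the current Python installs its scripts. The sketch below uses only the standard library; `locate_cli` is a hypothetical helper, and it cannot tell whether the installed release ships the command at all. Note that the Command line section installs from the GitHub repository rather than from a pinned PyPI release.

```python
import shutil
import sysconfig
from typing import Dict, Optional


def locate_cli(name: str = "voicefixer") -> Dict[str, Optional[str]]:
    """Report where `name` resolves on PATH and where this Python puts its scripts."""
    return {
        "on_path": shutil.which(name),  # None if the command is not found
        "scripts_dir": sysconfig.get_path("scripts"),  # e.g. ...\Scripts on Windows
    }


if __name__ == "__main__":
    for key, value in locate_cli().items():
        print(f"{key}: {value}")
```

If `on_path` is `None` but an executable exists inside `scripts_dir`, adding that directory to the PATH may be enough.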

vocoder error

ImportError: cannot import name 'Vocoder' from 'voicefixer' (unknown location)

Lack of user information

What do "modes" do? for example

Change mode (The default mode is 0):

voicefixer --infile /path/to/input.wav --outfile /path/to/output.wav --mode 1

Run all modes:

# output file saved to `/path/to/output-modeX.wav`.
voicefixer --infile /path/to/input.wav --outfile /path/to/output.wav --mode all

Also, the app says it uses cuda but even having the required hardware and drivers set up in my system, I see the app uses only my CPU. I did not use the "--disable-cuda" arg.

I used this app on a 30 minute old radio show in spanish from 2001 that has horrible quality (a home recording of the radio show apparently) and the result was 20 minutes later (apparently it did not use CUDA) and had horrible quality, the words could not be understood anymore, they sounded like a person with difficulties to talk (was kinda funny though)

the version of librosa needs to >=0.8.1 and <0.9.0

In setup.py, librosa needs to be >=0.8.1 and <0.9.0
When I first install librosa==0.9.0, an error occurs as the following:

Traceback (most recent call last):
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/streamlit/script_runner.py", line 379, in _run_script
    exec(code, module.__dict__)
  File "/Users/li/PycharmProjects/voicefixer/test/streamlit.py", line 18, in <module>
    voice_fixer = init_voicefixer()
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/streamlit/caching/cache_utils.py", line 145, in wrapper
    return get_or_create_cached_value()
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/streamlit/caching/cache_utils.py", line 137, in get_or_create_cached_value
    return_value = func(*args, **kwargs)
  File "/Users/li/PycharmProjects/voicefixer/test/streamlit.py", line 14, in init_voicefixer
    return VoiceFixer()
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/voicefixer/base.py", line 12, in __init__
    self._model = voicefixer_fe(channels=2, sample_rate=44100)
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/voicefixer/restorer/model.py", line 158, in __init__
    freeze_parameters=freeze_parameters,
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/voicefixer/tools/modules/fDomainHelper.py", line 24, in __init__
    pad_mode=pad_mode, freeze_parameters=freeze_parameters)
  File "/Users/li/opt/anaconda3/envs/voicefixer/lib/python3.7/site-packages/torchlibrosa/stft.py", line 177, in __init__
    fft_window = librosa.util.pad_center(fft_window, n_fft)
TypeError: pad_center() takes 1 positional argument but 2 were given

The reason is librosa==0.9.0 modified librosa.util.pad_center
librosa==0.9.0 pad_center document:https://librosa.org/doc/0.9.0/generated/librosa.util.pad_center.html?highlight=pad_center#librosa.util.pad_center
librosa==0.8.1 pad_center document:https://librosa.org/doc/0.8.1/generated/librosa.util.pad_center.html?highlight=pad_center#librosa.util.pad_center
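
The version range proposed in this report (`>=0.8.1` and `<0.9.0`) can be checked at runtime before the model is initialized. The sketch below is a simplified comparison that looks only at the first three numeric components; `parse_version` and `librosa_supported` are hypothetical helpers, not part of voicefixer or librosa.

```python
from itertools import takewhile
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'X.Y.Z...' into three integers, ignoring suffixes such as 'rc1'."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:  # pad short versions such as '0.9'
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def librosa_supported(version: str) -> bool:
    """True if the version lies in the range suggested above: >=0.8.1 and <0.9.0."""
    return (0, 8, 1) <= parse_version(version) < (0, 9, 0)
```

With librosa installed, the installed version string can be obtained via `importlib.metadata.version("librosa")` (Python 3.8+) and passed to `librosa_supported`. A matching pin would be `pip install "librosa>=0.8.1,<0.9.0"`.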

How to use my own trained model from voicefixer_main?

Hello.

I am having an issue running your inference code with a model I trained using voicefixer_main, rather than the pretrained model.
Is it possible to use my own trained model with test.py?

I tried replacing vf.ckpt with my trained model in the original directory, but it did not work.
It produced the following error:

CleanShot 2022-11-04 at 16 34 01@2x

It seems that the pretrained voicefixer model and the model trained with voicefixer_main differ in size:
the pretrained model is about 489.3 MB
the one from voicefixer_main is about 1.3 GB

Lots of noise is added to the unspoken parts and overall quality is not worse - files provided

My audio is from my lecture video: https://www.youtube.com/watch?v=2zY1dQDGl3o

I want to improve the overall quality to make it easier to understand.

Here is my raw audio: https://drive.google.com/file/d/1gGxH1J3Z_I8NNjqBvbrVB5MA0gh4qCD7/view?usp=share_link

mode 0 output : https://drive.google.com/file/d/1MRFQecxx9Ikevnsyk9Ivx6Ofr_dqdwFi/view?usp=share_link

mode 1 output : https://drive.google.com/file/d/1sva-o7Py6beEIWbcA4f0LS1-ikGmvlUC/view?usp=share_link

mode 2 output : https://drive.google.com/file/d/1sva-o7Py6beEIWbcA4f0LS1-ikGmvlUC/view?usp=share_link

For example, listen at 1.00.40 and you will hear noise.

Also, the improvement is not very good in parts of the video where I am not talking much.

Check the later parts of the sound files and you will find that mode 1 and mode 2 are actually worse.

For example, at 1.02.40 in mode 1 there is noise and bad sound quality.

For example, at 1.32.55 in mode 2 there is bad quality and noise glitches.

Maybe you can test and experiment with my speech to improve the model even further.

Thank you very much, and keep up the good work.

Incompatible libraries

Hello.

I have this incompatibility issue:

pip install voicefixer==0.1.2

voicefixer 0.1.2 requires librosa<0.9.0,>=0.8.1, but you have librosa 0.9.1 which is incompatible.

Then I tried to install 0.9.0 instead:

torchcrepe 0.0.18 requires librosa==0.9.1, but you have librosa 0.9.0 which is incompatible.

I tried to run it anyway with 0.9.0, but the program failed with this error:

  File "E:\Miniconda\envs\softvc\lib\site-packages\torchlibrosa\stft.py", line 177, in __init__
    fft_window = librosa.util.pad_center(fft_window, n_fft)
TypeError: pad_center() takes 1 positional argument but 2 were given

Please advise.

M.
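
The two pip messages above describe constraints that cannot both hold. As a rough illustration, the sketch below encodes only the two requirements quoted in those messages and checks a few candidate librosa versions against them. The helper names are made up for this example, and the version parsing is deliberately simplistic (plain dotted integers only).

```python
def parse(version):
    """Turn '0.9.1' into (0, 9, 1) for simple numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def ok_for_voicefixer(version):
    # voicefixer 0.1.2 requires librosa>=0.8.1,<0.9.0 (from the pip message)
    return parse("0.8.1") <= parse(version) < parse("0.9.0")


def ok_for_torchcrepe(version):
    # torchcrepe 0.0.18 requires librosa==0.9.1 (from the pip message)
    return parse(version) == parse("0.9.1")


candidates = ["0.8.1", "0.9.0", "0.9.1"]
compatible = [v for v in candidates if ok_for_voicefixer(v) and ok_for_torchcrepe(v)]
print(compatible)  # empty: no version satisfies both packages
```

Since no single librosa version satisfies both packages, one option is to install them in separate virtual environments.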

Voicefixer not using CUDA

I created a venv and ran pip install voicefixer==0.1.2

On running voicefixer, only the CPU is used. The GPU is not utilized.

I did not use the --disable-cuda argument.
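
A first diagnostic step, independent of voicefixer, is to check whether the PyTorch build in the venv can see a GPU at all. The sketch below is only a check, not a fix. It assumes, without confirmation from this report, that a CPU-only PyTorch build is one possible cause.

```python
import importlib.util


def cuda_status():
    """Report whether the installed PyTorch build can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    # torch.version.cuda is None for CPU-only builds
    return (
        f"CUDA not available (torch {torch.__version__}, "
        f"built with CUDA: {torch.version.cuda})"
    )


status = cuda_status()
print(status)
```

If this reports that CUDA is not available, the problem lies in the PyTorch installation or the driver rather than in voicefixer itself.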

Update dependencies

Hi,

Voicefixer is a great library to use in tandem with other workflows, but many other libraries have updated their dependencies while voicefixer has not. Is there a roadmap for updating the dependencies? If not, would others find an update useful if I gave this a go and made a pull request?

Possibility of running on Windows?

Hello, I stumbled on this repo and found it really interesting. The demos in particular impressed me. I have some old/bad quality speech recordings I'd like to try and enhance, but I'm having trouble running any of the code.

I am running Windows 10 home, Python 3.9.12 at the moment. No GPU present right now, so that may be a problem? I understand that the code is not well tested on Windows yet. Nevertheless, I am completely ignorant when it comes to getting these sorts of things to run; without clear steps to follow, I am lost.

If there are legitimate issues running on Windows, I'd like to do my part in making them known, but I'm taking a shot in the dark here. I still hope I can be helpful though!

I assume that the intended workflow for testing is to read an audio file eg. wav, aiff, raw PCM data etc. and process it, creating a new output file? But please correct me if I'm wrong.

I followed instructions in readme.md to try and use the Streamlit app. Specifically, I ran these commands:
pip install voicefixer==0.0.17
git clone https://github.com/haoheliu/voicefixer.git
cd voicefixer
pip install streamlit
streamlit run test/streamlit.py
At this point a Windows firewall dialog comes up and I click allow.
Throughout this process, no errors seem to show up. But the models do not appear to download (no terminal updates, and I let it sit for about a day with no changes). Streamlit page remains blank. The last thing I see in terminal is:
"  You can now view your Streamlit app in your browser.
  Local URL: http://localhost:8501
  Network URL: http://10.0.0.37:8501"
That local URL is the one shown in the address bar.

So yeah I'm quite lost. What do you advise?
Thanks in advance!

Numpy error no attribute complex

  File "/home/jordancruz/.local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "/home/jordancruz/Tools/voicefixer/test/streamlit.py", line 4, in <module>
    import librosa
  File "/home/jordancruz/.local/lib/python3.11/site-packages/librosa/__init__.py", line 211, in <module>
    from . import core
  File "/home/jordancruz/.local/lib/python3.11/site-packages/librosa/core/__init__.py", line 9, in <module>
    from .constantq import *  # pylint: disable=wildcard-import
    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jordancruz/.local/lib/python3.11/site-packages/librosa/core/constantq.py", line 1059, in <module>
    dtype=np.complex,
    ^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])

Ask for batch inference example

Hello! Could you add an example of batch processing from in-memory samples, so that voicefixer can be placed in front of an ASV/ASR system? From what I have found so far, the proposed function would look much like restore_inmem.
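
Until such an example exists, a simple loop over in-memory samples may be enough. The sketch below is hypothetical: `restore_batch` is a made-up helper, and the commented-out call to `restore_inmem` assumes a signature (`wav`, `cuda`, `mode`) that should be checked against the installed voicefixer version. It processes samples one at a time and is not true batched inference.

```python
def restore_batch(samples, restore_fn):
    """Apply restore_fn to each in-memory waveform and collect the outputs."""
    return [restore_fn(wav) for wav in samples]


# Hypothetical wiring, left commented out because it needs voicefixer and its weights:
# from voicefixer import VoiceFixer
# vf = VoiceFixer()
# outputs = restore_batch(wavs, lambda w: vf.restore_inmem(w, cuda=False, mode=0))

# Dummy restore function so the sketch runs without the model
demo = restore_batch([[0.1, -0.2], [0.3]], lambda wav: [2 * x for x in wav])
print(demo)
```

Passing the restore function in as an argument keeps the loop testable without loading the model.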

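TFGAN quality

Hello, I ran TFGAN inference on its own, but the results were not ideal and the speech did not sound very clear. I also ran inference with your voicefixer and the results were very good, which shows that the TFGAN (vocoder) you trained performs well. However, your project does not provide the TFGAN (vocoder) model separately. Could you add a well-performing TFGAN (vocoder) model? Thank you very much!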

License for pre-trained checkpoints

I am writing to inquire about the licensing of the pre-trained checkpoints that you have released. Understanding the license is crucial to ensure that I use them in compliance with your requirements. Are they licensed under MIT as well? If not, could you provide the licensing information?

Thanks

Installing voicefixer for the first time and at run i got following error

Collecting voicefixer==0.1.2
  Using cached voicefixer-0.1.2-py3-none-any.whl (52 kB)
Requirement already satisfied: librosa<0.9.0,>=0.8.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from voicefixer==0.1.2) (0.8.1)
Requirement already satisfied: matplotlib in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from voicefixer==0.1.2) (3.7.1)
Requirement already satisfied: torch>=1.7.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from voicefixer==0.1.2) (2.0.1)
Collecting progressbar (from voicefixer==0.1.2)
  Using cached progressbar-2.5-py3-none-any.whl
Collecting torchlibrosa==0.0.7 (from voicefixer==0.1.2)
  Using cached torchlibrosa-0.0.7-py3-none-any.whl (10 kB)
Requirement already satisfied: GitPython in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from voicefixer==0.1.2) (3.1.32)
Requirement already satisfied: streamlit>=1.12.0pyyaml in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from voicefixer==0.1.2) (1.24.1)
Requirement already satisfied: audioread>=2.0.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (3.0.0)
Requirement already satisfied: numpy>=1.15.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (1.24.3)
Requirement already satisfied: scipy>=1.0.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (1.11.0)
Requirement already satisfied: scikit-learn!=0.19.0,>=0.14.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (1.2.2)
Requirement already satisfied: joblib>=0.14 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (1.2.0)
Requirement already satisfied: decorator>=3.0.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (5.1.1)
Requirement already satisfied: resampy>=0.2.2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (0.4.2)
Requirement already satisfied: numba>=0.43.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (0.57.1)
Requirement already satisfied: soundfile>=0.10.2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (0.12.1)
Requirement already satisfied: pooch>=1.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (1.7.0)
Requirement already satisfied: packaging>=20.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (23.1)
Requirement already satisfied: altair<6,>=4.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (5.0.1)
Requirement already satisfied: blinker<2,>=1.0.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (1.6.2)
Requirement already satisfied: cachetools<6,>=4.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (5.3.1)
Requirement already satisfied: click<9,>=7.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (8.1.3)
Requirement already satisfied: importlib-metadata<7,>=1.4 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (6.8.0)
Requirement already satisfied: pandas<3,>=0.25 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2.0.3)
Requirement already satisfied: pillow<10,>=6.2.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (9.5.0)
Requirement already satisfied: protobuf<5,>=3.20 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (4.23.3)
Requirement already satisfied: pyarrow>=4.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (12.0.1)
Requirement already satisfied: pympler<2,>=0.9 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (1.0.1)
Requirement already satisfied: python-dateutil<3,>=2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2.8.2)
Requirement already satisfied: requests<3,>=2.4 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2.31.0)
Requirement already satisfied: rich<14,>=10.11.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (13.4.2)
Requirement already satisfied: tenacity<9,>=8.0.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (8.2.2)
Requirement already satisfied: toml<2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.10.2)
Requirement already satisfied: typing-extensions<5,>=4.0.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (4.6.3)
Requirement already satisfied: tzlocal<5,>=1.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (4.3.1)
Requirement already satisfied: validators<1,>=0.2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.20.0)
Requirement already satisfied: pydeck<1,>=0.1.dev5 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.8.1b0)
Requirement already satisfied: tornado<7,>=6.0.3 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (6.3.2)
Requirement already satisfied: watchdog in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (3.0.0)
Requirement already satisfied: gitdb<5,>=4.0.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from GitPython->voicefixer==0.1.2) (4.0.10)
Requirement already satisfied: filelock in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from torch>=1.7.0->voicefixer==0.1.2) (3.12.2)
Requirement already satisfied: sympy in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from torch>=1.7.0->voicefixer==0.1.2) (1.12)
Requirement already satisfied: networkx in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from torch>=1.7.0->voicefixer==0.1.2) (3.1)
Requirement already satisfied: jinja2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from torch>=1.7.0->voicefixer==0.1.2) (3.1.2)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from matplotlib->voicefixer==0.1.2) (1.1.0)
Requirement already satisfied: cycler>=0.10 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from matplotlib->voicefixer==0.1.2) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from matplotlib->voicefixer==0.1.2) (4.40.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from matplotlib->voicefixer==0.1.2) (1.4.4)
Requirement already satisfied: pyparsing>=2.3.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from matplotlib->voicefixer==0.1.2) (3.1.0)
Requirement already satisfied: jsonschema>=3.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from altair<6,>=4.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (4.18.3)
Requirement already satisfied: toolz in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from altair<6,>=4.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.12.0)
Requirement already satisfied: colorama in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from click<9,>=7.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.4.6)
Requirement already satisfied: smmap<6,>=3.0.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from gitdb<5,>=4.0.1->GitPython->voicefixer==0.1.2) (5.0.0)
Requirement already satisfied: zipp>=0.5 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from importlib-metadata<7,>=1.4->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (3.16.2)
Requirement already satisfied: llvmlite<0.41,>=0.40.0dev0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from numba>=0.43.0->librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (0.40.1)
Requirement already satisfied: pytz>=2020.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from pandas<3,>=0.25->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from pandas<3,>=0.25->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2023.3)
Requirement already satisfied: platformdirs>=2.5.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from pooch>=1.0->librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (3.8.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from jinja2->torch>=1.7.0->voicefixer==0.1.2) (2.1.3)
Requirement already satisfied: six>=1.5 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from python-dateutil<3,>=2->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (1.16.0)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from requests<3,>=2.4->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from requests<3,>=2.4->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from requests<3,>=2.4->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from requests<3,>=2.4->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2023.5.7)
Requirement already satisfied: markdown-it-py>=2.2.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from rich<14,>=10.11.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from rich<14,>=10.11.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2.15.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from scikit-learn!=0.19.0,>=0.14.0->librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (3.1.0)
Requirement already satisfied: cffi>=1.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from soundfile>=0.10.2->librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (1.15.1)
Requirement already satisfied: pytz-deprecation-shim in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from tzlocal<5,>=1.1->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.1.0.post0)
Requirement already satisfied: mpmath>=0.19 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from sympy->torch>=1.7.0->voicefixer==0.1.2) (1.3.0)
Requirement already satisfied: pycparser in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from cffi>=1.0->soundfile>=0.10.2->librosa<0.9.0,>=0.8.1->voicefixer==0.1.2) (2.21)
Requirement already satisfied: attrs>=22.2.0 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (23.1.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (2023.6.1)
Requirement already satisfied: referencing>=0.28.4 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.29.1)
Requirement already satisfied: rpds-py>=0.7.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.8.10)
Requirement already satisfied: mdurl~=0.1 in c:\users\beats\appdata\local\programs\python\python310\lib\site-packages (from markdown-it-py>=2.2.0->rich<14,>=10.11.0->streamlit>=1.12.0pyyaml->voicefixer==0.1.2) (0.1.2)
DEPRECATION: voicefixer 0.1.2 has a non-standard dependency specifier streamlit>=1.12.0pyyaml. pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of voicefixer or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: progressbar, torchlibrosa, voicefixer
Successfully installed progressbar-2.5 torchlibrosa-0.0.7 voicefixer-0.1.2

C:\Users\Beats>git clone https://github.com/haoheliu/voicefixer.git
Cloning into 'voicefixer'...
remote: Enumerating objects: 452, done.
remote: Counting objects: 100% (31/31), done.
remote: Compressing objects: 100% (25/25), done.
Receiving objects: 100% (452/452), 3.83 MiB | 5.23 MiB/s, done.
Resolving deltas: 100% (246/246), done.

C:\Users\Beats>cd voicefixer

C:\Users\Beats\voicefixer>streamlit run test/streamlit.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://10.0.0.4:8501

2023-07-26 11:27:38.530 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\Beats\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\Beats\voicefixer\test\streamlit.py", line 4, in <module>
    import librosa
  File "C:\Users\Beats\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\__init__.py", line 211, in <module>
    from . import core
  File "C:\Users\Beats\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\__init__.py", line 9, in <module>
    from .constantq import *  # pylint: disable=wildcard-import
  File "C:\Users\Beats\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\constantq.py", line 1059, in <module>
    dtype=np.complex,
  File "C:\Users\Beats\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'complex'.
`np.complex` was a deprecated alias for the builtin `complex`. To avoid this error in existing code, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

I'm new to this and have no idea what went wrong or how to fix it.
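
The pip log above shows numpy 1.24.3 alongside librosa 0.8.1. The `np.complex` alias that librosa 0.8.1 uses was deprecated in NumPy 1.20 (as the error text says) and removed in NumPy 1.24, which is why the import fails. The sketch below is a small illustrative check; the helper name is made up for this example.

```python
def np_complex_alias_removed(numpy_version):
    """Return True if this NumPy version no longer provides the np.complex alias."""
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) >= (1, 24)


installed = "1.24.3"  # numpy version reported in the pip log above
print(np_complex_alias_removed(installed))

# One commonly used workaround (not verified against this exact setup)
# is to install an older NumPy, for example:
#   pip install "numpy<1.24"
```

Upgrading librosa instead is not an option with voicefixer 0.1.2, because it requires librosa below 0.9.0.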

![image](https://github.com/haoheliu/voicefixer/assets/9324274/b0b7d466-9f8d-49ff-bab8-3014e1738bbf)

colab error

WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
  Running command git clone --filter=blob:none -q https://github.com/facebookresearch/WavAugment.git /tmp/pip-req-build-8bon4dtg
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-3-bb7743a3a444> in <module>()
     24 
     25 # Import VoiceFixer!
---> 26 from voicefixer import VoiceFixer, Vocoder
     27 
     28 download = files.download

13 frames
/usr/local/lib/python3.7/dist-packages/torchtext/vocab.py in <module>()
     11 from typing import Dict, List, Optional, Iterable
     12 from collections import Counter, OrderedDict
---> 13 from torchtext._torchtext import (
     14     Vocab as VocabPybind,
     15 )

ImportError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZN3c106ivalue6Future15extractDataPtrsERKNS_6IValueE

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

How to test the model for a single task?

I ran test/reference.py on my distorted speech, and the result was GSR (general speech restoration). How can I test the model on a single task, such as audio super-resolution only?
Also, what is the latency of voicefixer?

Unable to test, error in state_dict

Hello,

I am trying to test the code on a wav file. But I receive the following message:

RuntimeError: Error(s) in loading state_dict for VoiceFixer:
Missing key(s) in state_dict: "f_helper.istft.ola_window".
Unexpected key(s) in state_dict: "f_helper.istft.reverse.weight", "f_helper.istft.overlap_add.weight".

This seems to be caused by the following line in the code:
self._model = self._model.load_from_checkpoint(os.path.join(os.path.expanduser('~'), ".cache/voicefixer/analysis_module/checkpoints/epoch=15_trimed_bn.ckpt"))

Do you have an idea on how to resolve this issue?
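
All of the mismatched keys sit under `f_helper.istft`, which suggests that the ISTFT module in the installed environment differs from the one the checkpoint was saved with. A torchlibrosa version mismatch is one possible cause, but that is an assumption. The sketch below reproduces the missing/unexpected comparison on plain key lists; `diff_keys` is a made-up helper, and the key names are copied from the error message.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, as in a strict state_dict load."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Keys taken from the error message above
model_keys = ["f_helper.istft.ola_window"]
checkpoint_keys = [
    "f_helper.istft.reverse.weight",
    "f_helper.istft.overlap_add.weight",
]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

Comparing the installed torchlibrosa version with the one pinned by the voicefixer release is a reasonable next step.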
