manmay-nakhashi / tortoise-tts-fastest
This project forked from 152334h/tortoise-tts-fast
Faster Tortoise inference than Tortoise Fast Fork
License: GNU Affero General Public License v3.0
First of all, congrats on the awesome job you did speeding up Tortoise. Tortoise is, hands down, the best TTS engine out there, and you (and the tortoise-tts-fast project) have made it usable. Now, with that out of the way...
I seem to have encountered a rather peculiar bug: sometimes the end of a sentence gets repeated. Here is one example:
I was now coming near the gates and it seemed that our journey was nearly over and we had escaped, when I suddenly thought I heard the sound of many marching feet and my father looking out through the darkness cried: Run, my son, run.
Using the last commit with
voice=train_dotrice
preset=ultra_fast
enable_redaction=False
results in the "Run, my son, run." part being repeated. Now, I know Aeneas' father is extremely worried about his son's safety, but I think the dude got it the first time he said it :)
I tried to run the conversion again and got the same result. I'm attaching a sample audio file: repeat.zip
FROM nvidia/cuda:12.2.0-base-ubuntu22.04
RUN apt-get update && \
apt-get install -y --allow-unauthenticated --no-install-recommends \
curl \
git \
build-essential \
python3 \
python3-pip \
python3-dev \
&& apt-get autoremove -y \
&& apt-get clean -y \
&& rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/manmay-nakhashi/tortoise-tts-fastest /app
RUN cd /app \
&& pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117 \
&& python3 -m pip install -e . \
&& pip3 install git+https://github.com/152334H/BigVGAN.git \
&& curl -sSL https://install.python-poetry.org | python3 \
&& /root/.local/bin/poetry install
ENTRYPOINT ["streamlit", "run", "/app/scripts/app.py"]
docker build . -t tts-fastest
docker run --gpus all -p 8501:8501 --entrypoint bash -it tts-fastest
root@b80ffe41cbfc:/app# streamlit run scripts/app.py
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
You can now view your Streamlit app in your browser.
Network URL: http://172.16.0.3:8501
External URL: http://98.56.18.179:8501
2023-09-06 02:43:32.306 Uncaught app exception
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "/app/scripts/app.py", line 8, in <module>
from tortoise.api import MODELS_DIR
File "/app/tortoise/api.py", line 13, in <module>
from tortoise.models.autoregressive import UnifiedVoice
File "/app/tortoise/models/autoregressive.py", line 12, in <module>
import deepspeed
ModuleNotFoundError: No module named 'deepspeed'
I don't know what I'm doing, and I tried some other random commands that didn't help.
For example, I tried this one, which I found somewhere on Stack Overflow:
conda install -c conda-forge nvcc_linux-64
I also tried:
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch -c hcc
I then installed deepspeed with pip:
root@b80ffe41cbfc:/# pip3 install deepspeed
Collecting deepspeed
Using cached deepspeed-0.10.2.tar.gz (858 kB)
Preparing metadata (setup.py) ... done
Collecting hjson (from deepspeed)
...junk
...stuff
...omitted
...
Building wheels for collected packages: deepspeed, lit
Building wheel for deepspeed (setup.py) ... done
Created wheel for deepspeed: filename=deepspeed-0.10.2-py3-none-any.whl size=898080 sha256=917a3df6d5998cdcec2405f3a7f81c8b4b3a97226e2da0b4b33b2ae123c8c354
Stored in directory: /root/.cache/pip/wheels/3b/3b/95/06c454917f34fa4952021f717b90aebd36ade7196d58ac9652
Building wheel for lit (pyproject.toml) ... done
Created wheel for lit: filename=lit-16.0.6-py3-none-any.whl size=93584 sha256=a3d5626b8f9e7032e920035063325495d32c0f22c4a907f9b234f9ca62426d6b
Stored in directory: /root/.cache/pip/wheels/ab/84/e4/5af8c76af9e5bee472e825f1451c18bb3b261d80a7b3ec7f8a
Successfully built deepspeed lit
Installing collected packages: py-cpuinfo, ninja, mpmath, lit, hjson, cmake, typing-extensions, sympy, psutil, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, numpy, networkx, MarkupSafe, filelock, pydantic, nvidia-cusolver-cu11, nvidia-cudnn-cu11, jinja2, triton, torch, deepspeed
Successfully installed MarkupSafe-2.1.3 cmake-3.27.4.1 deepspeed-0.10.2 filelock-3.12.3 hjson-3.1.0 jinja2-3.1.2 lit-16.0.6 mpmath-1.3.0 networkx-3.1 ninja-1.11.1 numpy-1.25.2 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 psutil-5.9.5 py-cpuinfo-9.0.0 pydantic-1.10.12 sympy-1.12 torch-2.0.1 triton-2.0.0 typing-extensions-4.7.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
I haven't been able to fix this issue. I'd appreciate some hand-holding since I'm unfamiliar with this stuff. I'll open a PR with the Dockerfile if I get it working. Thanks!
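One way to narrow this down is to check which modules the interpreter that runs Streamlit can actually see, since the `pip3 install deepspeed` above reports success while the app still fails on `import deepspeed`. The following is only a minimal diagnostic sketch: the helper name `missing_modules` is made up for illustration, and the module list is simply taken from the traceback and the Dockerfile above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Print the interpreter path so it can be compared with the one pip installed into.
    print("interpreter:", sys.executable)
    # Module names taken from the traceback and the Dockerfile in this report.
    print("missing:", missing_modules(["deepspeed", "torch", "streamlit"]))
```

Running it inside the container with the same `python3` that launches Streamlit should show whether deepspeed is absent from that environment or installed for a different interpreter.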
I'm trying to make this new version work with the script tortoise_tts.py, but I'm getting this error. Any idea what I have to change?
Traceback (most recent call last):
File "/content/gdrive/Othercomputers/Mi portátil/Dubme/FastTortoise/tortoise-tts-fastest/scripts/./tortoise_tts.py", line 352, in <module>
gen = tts.tts_with_preset(
File "/content/tortoise-tts-fastest/tortoise/api.py", line 534, in tts_with_preset
return self.tts(text, **settings)
File "/content/tortoise-tts-fastest/tortoise/api.py", line 672, in tts
codes = autoregressive.inference_speech(
File "/content/tortoise-tts-fastest/tortoise/models/autoregressive.py", line 652, in inference_speech
gen = self.ds_engine.module.generate(
File "/usr/local/lib/python3.10/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1565, in generate
return self.sample(
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2612, in sample
outputs = self(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/tortoise-tts-fastest/tortoise/models/autoregressive.py", line 126, in forward
transformer_outputs = self.transformer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/gpt2/modeling_gpt2.py", line 926, in forward
hidden_states = self.ln_f(hidden_states)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/normalization.py", line 190, in forward
return F.layer_norm(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2515, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Float but found Half
Dockerfile:
FROM nvidia/cuda:12.3.1-devel-ubuntu20.04
RUN apt-get update && \
apt-get install -y --allow-unauthenticated --no-install-recommends \
git \
wget \
build-essential
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
bash Miniconda3-latest-Linux-x86_64.sh -b -p /miniconda && \
rm Miniconda3-latest-Linux-x86_64.sh
ENV PATH="/miniconda/bin:${PATH}"
RUN git clone https://github.com/manmay-nakhashi/tortoise-tts-fastest /app
WORKDIR /app
RUN conda create -n ttts-fast python=3.8 && \
echo "source activate ttts-fast" > ~/.bashrc
ENV PATH /miniconda/envs/ttts-fast/bin:$PATH
SHELL ["conda", "run", "-n", "ttts-fast", "/bin/bash", "-c"]
RUN conda install -y pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 -c pytorch -c nvidia && \
conda install -c anaconda gdbm && \
pip install -e . && \
pip install git+https://github.com/152334H/BigVGAN.git && \
pip install streamlit
RUN pip install deepspeed==0.9.0
EXPOSE 8501
ENV NAME tortoise-tts
CMD ["streamlit", "run", "scripts/app.py"]
docker build . -t tts
docker run --gpus all -p 8501:8501 tts
==========
== CUDA ==
==========
CUDA Version 12.3.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
You can now view your Streamlit app in your browser.
Network URL: http://172.16.0.2:8501
External URL: http://98.56.18.179:8501
/miniconda/envs/ttts-fast/lib/python3.8/site-packages/pydantic/_internal/_config.py:321: UserWarning: Valid config keys have changed in V2:
* 'allow_population_by_field_name' has been renamed to 'populate_by_name'
* 'validate_all' has been renamed to 'validate_default'
warnings.warn(message, UserWarning)
/miniconda/envs/ttts-fast/lib/python3.8/site-packages/pydantic/_internal/_fields.py:149: UserWarning: Field "model_persistence_threshold" has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
warnings.warn(
2024-01-19 14:55:32.213 Uncaught app exception
Traceback (most recent call last):
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "/app/scripts/app.py", line 8, in <module>
from tortoise.api import MODELS_DIR
File "/app/tortoise/api.py", line 13, in <module>
from tortoise.models.autoregressive import UnifiedVoice
File "/app/tortoise/models/autoregressive.py", line 12, in <module>
import deepspeed
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/__init__.py", line 16, in <module>
from . import module_inject
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/module_inject/__init__.py", line 6, in <module>
from .replace_module import replace_transformer_layer, revert_transformer_layer, ReplaceWithTensorSlicing, GroupQuantizer, generic_injection
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 732, in <module>
from ..pipe import PipelineModule
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/pipe/__init__.py", line 6, in <module>
from ..runtime.pipe import PipelineModule, LayerSpec, TiedLayerSpec
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/pipe/__init__.py", line 6, in <module>
from .module import PipelineModule, LayerSpec, TiedLayerSpec
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/pipe/module.py", line 19, in <module>
from ..activation_checkpointing import checkpointing
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 25, in <module>
from deepspeed.runtime.config import DeepSpeedConfig
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/config.py", line 28, in <module>
from .zero.config import get_zero_config, ZeroStageEnum
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/zero/__init__.py", line 6, in <module>
from .partition_parameters import ZeroParamType
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 569, in <module>
class Init(InsertPostInitMethodToModuleSubClasses):
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 571, in Init
param_persistence_threshold = get_config_default(DeepSpeedZeroConfig, "param_persistence_threshold")
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/config_utils.py", line 115, in get_config_default
assert not config.__fields__.get(
AttributeError: 'FieldInfo' object has no attribute 'required'
/miniconda/envs/ttts-fast/lib/python3.8/site-packages/pydantic/_internal/_config.py:321: UserWarning: Valid config keys have changed in V2:
* 'validate_all' has been renamed to 'validate_default'
warnings.warn(message, UserWarning)
2024-01-19 14:55:43.098 Uncaught app exception
Traceback (most recent call last):
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "/app/scripts/app.py", line 8, in <module>
from tortoise.api import MODELS_DIR
File "/app/tortoise/api.py", line 13, in <module>
from tortoise.models.autoregressive import UnifiedVoice
File "/app/tortoise/models/autoregressive.py", line 12, in <module>
import deepspeed
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/__init__.py", line 16, in <module>
from . import module_inject
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/module_inject/__init__.py", line 6, in <module>
from .replace_module import replace_transformer_layer, revert_transformer_layer, ReplaceWithTensorSlicing, GroupQuantizer, generic_injection
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 732, in <module>
from ..pipe import PipelineModule
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/pipe/__init__.py", line 6, in <module>
from ..runtime.pipe import PipelineModule, LayerSpec, TiedLayerSpec
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/pipe/__init__.py", line 6, in <module>
from .module import PipelineModule, LayerSpec, TiedLayerSpec
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/pipe/module.py", line 19, in <module>
from ..activation_checkpointing import checkpointing
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 25, in <module>
from deepspeed.runtime.config import DeepSpeedConfig
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/runtime/config.py", line 31, in <module>
from ..monitor.config import get_monitor_config
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/monitor/config.py", line 63, in <module>
class DeepSpeedMonitorConfig(DeepSpeedConfigModel):
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/deepspeed/monitor/config.py", line 76, in DeepSpeedMonitorConfig
def check_enabled(cls, values):
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/pydantic/deprecated/class_validators.py", line 231, in root_validator
return root_validator()(*__args) # type: ignore
File "/miniconda/envs/ttts-fast/lib/python3.8/site-packages/pydantic/deprecated/class_validators.py", line 237, in root_validator
raise PydanticUserError(
pydantic.errors.PydanticUserError: If you use `@root_validator` with pre=False (the default) you MUST specify `skip_on_failure=True`. Note that `@root_validator` is deprecated and should be replaced with `@model_validator`.
For further information visit https://errors.pydantic.dev/2.5/u/root-validator-pre-skip
Stopping...
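Both tracebacks above come from pydantic 2.x code paths (the error page is under errors.pydantic.dev/2.5), while deepspeed 0.9.0 is calling `@root_validator` and `FieldInfo.required`, which suggests that this deepspeed release was written against the pydantic 1.x API. A minimal sketch for confirming the installed version follows; the helper names and the `max_major=1` threshold are assumptions for illustration, not anything taken from deepspeed's documentation.

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '2.5.3'."""
    return int(version.split(".")[0])


def pydantic_looks_too_new(pydantic_version, max_major=1):
    """True if the pydantic major version exceeds the assumed supported one."""
    return major_version(pydantic_version) > max_major


if __name__ == "__main__":
    try:
        installed = metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        print("pydantic is not installed in this environment")
    else:
        print("pydantic", installed, "too new:", pydantic_looks_too_new(installed))
```

If the check reports a 2.x version, one possible workaround (untested here) is to pin pydantic below 2 in the Dockerfile before installing deepspeed.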
Is there any definitive guide on installing DeepSpeed on Windows?
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[1], line 14
11 from tortoise.utils.audio import load_audio, load_voice, load_voices
13 # This will download all the models used by Tortoise from the HuggingFace hub.
---> 14 tts = TextToSpeech()
File /workspace/tortoise-tts-fast/tortoise/api.py:271, in TextToSpeech.__init__(self, autoregressive_batch_size, models_dir, enable_redaction, device, high_vram, kv_cache, ar_checkpoint, clvp_checkpoint, diff_checkpoint, vocoder)
254 self.autoregressive = (
255 UnifiedVoice(
256 max_mel_tokens=604,
(...)
268 .eval()
269 )
270 ar_path = ar_checkpoint or get_model_path("autoregressive.pth", models_dir)
--> 271 self.autoregressive.load_state_dict(torch.load(ar_path))
272 self.autoregressive.post_init_gpt2_config(kv_cache)
274 diff_path = diff_checkpoint or get_model_path(
275 "diffusion_decoder.pth", models_dir
276 )
File /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1671, in Module.load_state_dict(self, state_dict, strict)
1666 error_msgs.insert(
1667 0, 'Missing key(s) in state_dict: {}. '.format(
1668 ', '.join('"{}"'.format(k) for k in missing_keys)))
1670 if len(error_msgs) > 0:
-> 1671 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
1672 self.__class__.__name__, "\n\t".join(error_msgs)))
1673 return _IncompatibleKeys(missing_keys, unexpected_keys)
RuntimeError: Error(s) in loading state_dict for UnifiedVoice:
Unexpected key(s) in state_dict: "gpt.h.0.attn.bias", "gpt.h.0.attn.masked_bias", "gpt.h.1.attn.bias", "gpt.h.1.attn.masked_bias", "gpt.h.2.attn.bias", "gpt.h.2.attn.masked_bias", "gpt.h.3.attn.bias", "gpt.h.3.attn.masked_bias", "gpt.h.4.attn.bias", "gpt.h.4.attn.masked_bias", "gpt.h.5.attn.bias", "gpt.h.5.attn.masked_bias", "gpt.h.6.attn.bias", "gpt.h.6.attn.masked_bias", "gpt.h.7.attn.bias", "gpt.h.7.attn.masked_bias", "gpt.h.8.attn.bias", "gpt.h.8.attn.masked_bias", "gpt.h.9.attn.bias", "gpt.h.9.attn.masked_bias", "gpt.h.10.attn.bias", "gpt.h.10.attn.masked_bias", "gpt.h.11.attn.bias", "gpt.h.11.attn.masked_bias", "gpt.h.12.attn.bias", "gpt.h.12.attn.masked_bias", "gpt.h.13.attn.bias", "gpt.h.13.attn.masked_bias", "gpt.h.14.attn.bias", "gpt.h.14.attn.masked_bias", "gpt.h.15.attn.bias", "gpt.h.15.attn.masked_bias", "gpt.h.16.attn.bias", "gpt.h.16.attn.masked_bias", "gpt.h.17.attn.bias", "gpt.h.17.attn.masked_bias", "gpt.h.18.attn.bias", "gpt.h.18.attn.masked_bias", "gpt.h.19.attn.bias", "gpt.h.19.attn.masked_bias", "gpt.h.20.attn.bias", "gpt.h.20.attn.masked_bias", "gpt.h.21.attn.bias", "gpt.h.21.attn.masked_bias", "gpt.h.22.attn.bias", "gpt.h.22.attn.masked_bias", "gpt.h.23.attn.bias", "gpt.h.23.attn.masked_bias", "gpt.h.24.attn.bias", "gpt.h.24.attn.masked_bias", "gpt.h.25.attn.bias", "gpt.h.25.attn.masked_bias", "gpt.h.26.attn.bias", "gpt.h.26.attn.masked_bias", "gpt.h.27.attn.bias", "gpt.h.27.attn.masked_bias", "gpt.h.28.attn.bias", "gpt.h.28.attn.masked_bias", "gpt.h.29.attn.bias", "gpt.h.29.attn.masked_bias".
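The unexpected keys are all of the form `gpt.h.N.attn.bias` or `gpt.h.N.attn.masked_bias`. These look like GPT-2 attention mask buffers that the checkpoint carries but the installed transformers version no longer registers in the state dict; that reading is an assumption, not something confirmed in this thread. Under that assumption, one workaround is to drop exactly those keys before loading. The helper name `drop_mask_buffers` below is hypothetical.

```python
import re

# Matches only the per-layer buffers named in the error, e.g. "gpt.h.0.attn.bias".
# Real weights such as "gpt.h.0.attn.c_attn.bias" do not match and are kept.
_MASK_BUFFER = re.compile(r"^gpt\.h\.\d+\.attn\.(bias|masked_bias)$")


def drop_mask_buffers(state_dict):
    """Return a copy of state_dict without the attention mask buffer entries."""
    return {key: value for key, value in state_dict.items()
            if not _MASK_BUFFER.match(key)}
```

It could then be applied where the traceback shows the load, for example `self.autoregressive.load_state_dict(drop_mask_buffers(torch.load(ar_path)))`. Installing an older transformers release that still expects these keys may be an alternative.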