🍏 Applio-RVC-Fork

Note

Applio-RVC-Fork is designed to complement existing repositories, and as such, certain features may be in experimental stages, potentially containing bugs. Additionally, there might be instances of coding practices that could be improved or refined. It is not intended to replace any other repository.

📚 Table of Contents

This README has been enhanced by incorporating the features introduced in Applio-RVC-Fork to the original Mangio-RVC-Fork README, along with additional details and explanations.

  1. Improvements of Applio Over RVC
  2. Additional Features of This Repository
  3. Todo Tasks
  4. Installation
  5. Running the Web GUI (Inference & Train)
  6. Running the CLI (Inference & Train)
  7. Credits
  8. Thanks to all RVC, Mangio and Applio contributors

🎯 Improvements of Applio-RVC-Fork Over RVC

The comparisons are with respect to the original Retrieval-based-Voice-Conversion-WebUI repository.

f0 Inference Algorithm Overhaul

  • Applio features a comprehensive overhaul of the f0 inference algorithm, including:
    • Addition of the pyworld dio f0 method.
    • Alternative method for calculating crepe f0.
    • Introduction of the torchcrepe crepe-tiny model.
    • Customizable crepe_hop_length for the crepe algorithm via both the web GUI and CLI.

f0 Crepe Pitch Extraction for Training

  • Works on Paperspace machines but not on local macOS/Windows machines (potential memory leak).

Paperspace Integration (Under maintenance, so it cannot be used for the moment.)

  • Applio seamlessly integrates with Paperspace, providing the following features:
    • Paperspace argument on infer-web.py (--paperspace) for sharing a Gradio link.
    • A dedicated Makefile tailored for Paperspace users.

Access to Tensorboard

  • Applio grants easy access to Tensorboard via a Makefile and a Python script.

CLI Functionality

  • Applio introduces command-line interface (CLI) functionality, with the addition of the --cli flag in infer-web.py for CLI system usage.

f0 Hybrid Estimation Method

  • Applio offers a novel f0 hybrid estimation method that computes the nanmedian across a specified array of f0 methods, combining their estimates into a single pitch curve (CLI exclusive).
  • This hybrid estimation method is also available for f0 feature extraction during training.
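The hybrid estimation idea can be sketched as follows. This is a minimal illustration, not the code used in Applio: the function name `hybrid_f0` is hypothetical, and it assumes each method returns one f0 value per frame with unvoiced frames marked as NaN.

```python
import warnings

import numpy as np


def hybrid_f0(f0_tracks):
    """Combine several f0 tracks (one per method) with a per-frame nanmedian.

    Assumption: unvoiced frames are NaN; the result uses 0.0 for frames
    where every method reported NaN.
    """
    # Methods may return slightly different lengths, so trim to the shortest.
    min_len = min(len(track) for track in f0_tracks)
    stacked = np.stack(
        [np.asarray(track, dtype=float)[:min_len] for track in f0_tracks]
    )
    # nanmedian ignores NaNs per frame; it warns on all-NaN frames,
    # which we silence and map to 0.0 afterwards.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        combined = np.nanmedian(stacked, axis=0)
    return np.nan_to_num(combined, nan=0.0)


# Toy tracks standing in for the output of three f0 methods.
pm = [100.0, 102.0, np.nan, 110.0]
harvest = [101.0, 400.0, np.nan, 111.0]  # 400.0 is an octave-style outlier
crepe = [99.0, 103.0, np.nan, 112.0, 113.0]

result = hybrid_f0([pm, harvest, crepe])
print(result)
```

The median is used rather than the mean so that a single method's outlier (the 400.0 above) does not pull the combined pitch away from the other estimates.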

UI Changes

Inference:

  • A complete interface redesign enhances user experience, with notable features such as:
    • Audio recording directly from the interface.
    • Convenient drop-down menus for audio and .index file selection.
    • An advanced settings section with new features like autotune and formant shifting.

Training:

  • Improved training features include:
    • A total epoch slider now limited to 10,000.
    • Increased save frequency limit to 100.
    • Default recommended options for smoother setup.
    • Better adaptation to high-resolution screens.
    • A drop-down menu for dataset selection.
    • Enhanced saving system options, including Save all files, Save G and D files, and Save model for inference.

UVR:

  • Applio ensures compatibility with all VR/MDX models for an extended range of possibilities.

TTS (Text-to-Speech, New):

  • Introducing a new Text-to-Speech (TTS) feature using RVC models.
  • Support for multiple languages and Edge-tts/Google-tts.

Resources (New):

  • Users can now upload models, backups, datasets, and audios from various storage services like Drive, Huggingface, Discord, and more.
  • Download audios from YouTube with the ability to automatically separate instrumental and vocals, offering advanced options and UVR support.

Extra (New):

  • Combine instrumental and vocals with ease, including independent volume control for each track and the option to add effects like reverb, compressor, and noise gate.
  • Significant improvements in the processing interface, allowing tasks such as merging models, modifying information, obtaining information, or extracting models effortlessly.
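Combining instrumental and vocals with independent volume control can be sketched as a gain-weighted sum. This is only an illustration under stated assumptions: `mix_tracks` and its parameters are hypothetical names, both tracks are assumed to be mono float arrays in the range [-1, 1] at the same sample rate, and effects such as reverb, compressor, and noise gate are left out.

```python
import numpy as np


def mix_tracks(vocals, instrumental, vocal_gain=1.0, instrumental_gain=1.0):
    """Mix two mono tracks with an independent gain for each one."""
    vocals = np.asarray(vocals, dtype=float)
    instrumental = np.asarray(instrumental, dtype=float)
    # Trim to the shorter track so the arrays line up sample by sample.
    length = min(len(vocals), len(instrumental))
    mixed = vocal_gain * vocals[:length] + instrumental_gain * instrumental[:length]
    # Hard-clip to the valid float range to avoid overflow on export.
    return np.clip(mixed, -1.0, 1.0)


vocals = np.array([0.5, -0.5, 0.8])
instrumental = np.array([0.2, 0.2, 0.6, 0.1])
mixed = mix_tracks(vocals, instrumental, vocal_gain=1.0, instrumental_gain=0.5)
print(mixed)
```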

⚙️ Additional Features of This Repository

In addition to the aforementioned improvements, this repository offers the following features:

Enhanced Tone Leakage Reduction

  • Implements tone leakage reduction by replacing source features with training-set features using top1 retrieval. This helps in achieving cleaner audio results.
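A minimal sketch of top1 retrieval, assuming features are rows of a NumPy array. It uses a brute-force nearest-neighbour search for clarity rather than an index file, and the blending by `index_ratio` is an assumption about how a "feature index ratio" could be applied; `top1_replace` is a hypothetical name.

```python
import numpy as np


def top1_replace(source_feats, train_feats, index_ratio=1.0):
    """Replace each source feature with its nearest training-set feature.

    index_ratio=1.0 replaces fully; lower values blend the retrieved
    feature with the original source feature (assumed behaviour).
    """
    # Squared L2 distance between every source and training vector.
    diffs = source_feats[:, None, :] - train_feats[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # top1: keep only the single closest training vector per source frame.
    nearest = np.argmin(dists, axis=1)
    retrieved = train_feats[nearest]
    return index_ratio * retrieved + (1.0 - index_ratio) * source_feats


train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
source = np.array([[0.9, 1.2], [4.0, 4.5]])

replaced = top1_replace(source, train)
blended = top1_replace(source, train, index_ratio=0.5)
print(replaced)
print(blended)
```

Because the output features come from the training set, traces of the source speaker's timbre carried in the source features are reduced.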

Efficient Training

  • Provides a seamless and speedy training experience, even on relatively modest graphics cards. The system is optimized for efficient resource utilization.

Data Efficiency

  • Supports training on small datasets and yields good results, especially with at least 10 minutes of low-noise speech.

Overtraining Detection

  • This feature keeps track of the current progress trend and stops the training if no improvement is found after 100 epochs.
    • During the 100 epochs with no improvement, no progress is saved. This allows you to continue training from the best-found epoch.
    • A .pth file of the best epoch is saved in the logs folder under name_[epoch].pth, and in the weights folder as name_fittest.pth. These files are the same.
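The stopping rule described above resembles patience-based early stopping. The sketch below assumes the tracked quantity is a per-epoch loss where lower is better; the actual metric used by the repository is not stated here, and `find_stop_epoch` is a hypothetical helper.

```python
def find_stop_epoch(losses, patience=100):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch losses.

    stop_epoch is None if training never went `patience` epochs
    without improving on the best loss.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            # A real trainer would save the best checkpoint here.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training.
            return best_epoch, epoch
    return best_epoch, None


losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(find_stop_epoch(losses, patience=3))
print(find_stop_epoch(losses))
```

With `patience=3` the toy run stops at epoch 6 and reports epoch 3 as the best one, which is the checkpoint training would resume from.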

Mode Collapse Detection

  • This feature restarts training before a mode collapse by lowering the batch size until it can progress past the mode collapse.
    • If a mode collapse is overcome but another one occurs later, it will reset the batch size to its initial setting. This helps maintain training speed when dealing with multiple collapses.

📝 Todo Tasks

  • Investigate GPU Detection Issue: Address the GPU detection problem and ensure proper utilization of Nvidia GPU.
  • Fix Mode Collapse Prevention Feature: Refine the mode collapse prevention feature to maintain graph consistency during retraining.
  • Resolve CUDA Compatibility Issue: Investigate and resolve the cuFFT error related to CUDA compatibility.
  • Refactor infer-web.py: Organize the code of infer-web.py into different files for each tab, enhancing modularity.
  • Expand UVR Model Options: Integrate additional UVR models to provide users with more options and flexibility.
  • Enhance Installation Process: Improve the system installation process for better user experience and clarity (Applio Installer.exe).
  • Implement Automatic Updates: Add automatic update functionality to keep the application current with the latest features.
  • Multilingual Support: Include more translations for various languages.
  • Diversify TTS Methods: Introduce new TTS methods and enhance customization options for a richer user experience.
  • CLI Improvement: Enhance the CLI functionality and introduce a pipeline for a more streamlined user experience.
  • Dependency Updates: Keep dependencies up-to-date by regularly updating to the latest versions.
  • Dataset Creation Assistant: Develop an assistant for creating datasets to simplify and guide users through the process.

✨ Installation

Automatic installation (Windows):

To quickly and effortlessly install Applio along with all the necessary models and configurations on Windows, you can use the Applio Installer.exe or the install_Applio.bat script available in the releases section.

Manual installation (Windows/MacOS):

Note for MacOS Users: When using faiss 1.7.2 under MacOS, you may encounter a Segmentation Fault: 11 error. To resolve this issue, install faiss-cpu 1.7.0 using the following command if you're installing it manually with pip:

pip install faiss-cpu==1.7.0

Additionally, you can install Swig on MacOS using brew:

brew install swig

Install requirements: First install ffmpeg, wget, git, and Python (on Linux, this fork works only with Python 3.9.x), then run:

wget https://github.com/IAHispano/Applio-RVC-Fork/releases/download/v2.0.0/install_Applio-linux.sh
chmod +x install_Applio-linux.sh && ./install_Applio-linux.sh

Manual installation (Paperspace):

cd Applio-RVC-Fork
make install # Do this every time you start your Paperspace machine

🪄 Running the Web GUI (Inference & Train)

Add --paperspace or --colab when running on a cloud system.

python infer-web.py --pycmd python --port 3000

💻 Running the CLI (Inference & Train)

python infer-web.py --pycmd python --cli
Applio-RVC-Fork CLI

Welcome to the CLI version of RVC. Please read the documentation on README.MD to understand how to use this app.

You are currently in 'HOME':
    go home            : Takes you back to home with a navigation list.
    go infer           : Takes you to inference command execution.

    go pre-process     : Takes you to training step.1) pre-process command execution.
    go extract-feature : Takes you to training step.2) extract-feature command execution.
    go train           : Takes you to training step.3) begin or continue training command execution.
    go train-feature   : Takes you to the train feature index command execution.

    go extract-model   : Takes you to the extract small model command execution.

HOME:

Typing a command such as 'go infer' takes you to the corresponding page, where you can enter the arguments for that page. For example, 'go infer' takes you here:

HOME: go infer
You are currently in 'INFER':
    arg 1) model name with .pth in ./weights: mi-test.pth
    arg 2) source audio path: myFolder\MySource.wav
    arg 3) output file name to be placed in './audio-outputs': MyTest.wav
    arg 4) feature index file path: logs/mi-test/added_IVF3042_Flat_nprobe_1.index
    arg 5) speaker id: 0
    arg 6) transposition: 0
    arg 7) f0 method: harvest (pm, harvest, crepe, crepe-tiny)
    arg 8) crepe hop length: 160
    arg 9) harvest median filter radius: 3 (0-7)
    arg 10) post resample rate: 0
    arg 11) mix volume envelope: 1
    arg 12) feature index ratio: 0.78 (0-1)
    arg 13) Voiceless Consonant Protection (Less Artifact): 0.33 (Smaller number = more protection. 0.50 means Dont Use.)

Example: mi-test.pth saudio/Sidney.wav myTest.wav logs/mi-test/added_index.index 0 -2 harvest 160 3 0 1 0.95 0.33

INFER: <INSERT ARGUMENTS HERE OR COPY AND PASTE THE EXAMPLE>
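To make the positional order of the 13 arguments easier to follow, here is a small sketch that splits an argument line into named fields. It is not the CLI's own parser: the field names and `parse_infer_line` are hypothetical, and splitting on whitespace means paths containing spaces are not handled.

```python
# Hypothetical field names, in the order listed on the INFER page.
INFER_FIELDS = [
    ("model_name", str),
    ("source_audio_path", str),
    ("output_file_name", str),
    ("feature_index_path", str),
    ("speaker_id", int),
    ("transposition", int),
    ("f0_method", str),
    ("crepe_hop_length", int),
    ("harvest_median_filter_radius", int),
    ("post_resample_rate", int),
    ("mix_volume_envelope", float),
    ("feature_index_ratio", float),
    ("voiceless_consonant_protection", float),
]


def parse_infer_line(line):
    """Split a whitespace-separated argument line into a dict of typed values."""
    parts = line.split()
    if len(parts) != len(INFER_FIELDS):
        raise ValueError(
            f"expected {len(INFER_FIELDS)} arguments, got {len(parts)}"
        )
    return {name: cast(value) for (name, cast), value in zip(INFER_FIELDS, parts)}


example = (
    "mi-test.pth saudio/Sidney.wav myTest.wav "
    "logs/mi-test/added_index.index 0 -2 harvest 160 3 0 1 0.95 0.33"
)
args = parse_infer_line(example)
for name, value in args.items():
    print(f"{name}: {value}")
```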

🏆 Credits

Applio owes its existence to the collaborative efforts of various repositories, including Mangio-RVC-Fork, and all the other credited contributors. Without their contributions, Applio would not have been possible. Therefore, we kindly request that if you appreciate the work we've accomplished, you consider exploring the projects mentioned in our credits.

Our goal is not to supplant RVC or Mangio; rather, we aim to provide a contemporary and up-to-date alternative for the entire community.

Warning

If you believe you've made contributions to the code utilized in Applio and should be acknowledged in the credits, please feel free to open a pull request (PR). It's possible that we may have unintentionally overlooked your contributions, and we appreciate your proactive approach in ensuring proper recognition.

🙏 Thanks to all RVC, Mangio and Applio contributors

RVC:

Applio & Mangio:

Contributors:

aitronssesin, aldair502, alexlnkp, anthonyxd22, bastianmarin, blaise-tk, deiantv, dependabot[bot], dschogo, entropyriser, fumiama, github-actions[bot], h-exos, junityzhan, kalomaze, kawaiianpizza, l4ph, mangio621, mrm0dz, ms903x1, nadare881, narusemioshirakana, pengoosedev, ricecakey06, rinlovesyou, rvc-boss, sgsavu, spice-z, tps-f, vidalnt
