Comments (4)
I also know that Coqui shut down. There is a really new TTS model here, https://github.com/PolyAI-LDN/pheme, that claims to be really fast too. If both this and Parakeet got integrated into https://github.com/KoljaB/LocalAIVoiceChat, I believe it would be a big boost: better performance at faster speed. You could also apply some tricks to them to make them even faster!
from realtimestt.
The NVIDIA STT looks very promising. Its word error rate is better than Whisper's, and if it's even faster, it's for sure a great candidate. I hope it handles all languages well and not only English. I think it currently does not scale down to low-VRAM systems, whereas Whisper offers a tiny model...
Pheme looks good, but tbh so do a lot of engines currently. For pure speed, for example, StyleTTS2 is a really great engine: 6-7x faster than XTTS.
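For anyone who wants to check the word error rate comparison on their own recordings, here is a minimal sketch of how WER is usually computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` and the simple lowercase/whitespace normalization are assumptions for illustration, not the evaluation setup behind any published NVIDIA or Whisper numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Normalization here is only lowercasing and whitespace splitting
    (an assumption; published benchmarks normalize text more carefully).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One inserted word against a two-word reference.
    print(word_error_rate("hello world", "hello there world"))  # 0.5
```

Running the same reference transcripts through each STT engine and comparing the resulting values gives a rough, like-for-like accuracy check.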
OK, got it 👍 I just wanted to let you know: there is also a really new MIT-licensed model that claims to be better than Mistral 7B, so it will most likely be compatible with Zephyr! It is only 2.7B, so I bet it will be really fast: https://huggingface.co/microsoft/phi-2
You might want to integrate it into LocalAIVoiceChat for better speed while keeping the same accuracy!
Now I will close the issue.