I'm AlexandaJerry, a postgraduate student studying phonetics.
- 🔭 I’m currently working on speech-related research.
- 🌱 I’m currently learning Python programming and deep learning.
- 👯 I’m interested in Praat scripting and R visualization.
Name: Alexanda
Type: User
Bio: Postgraduate student of Phonetics
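The code for 《语音信号处理试验教程》 (Speech Signal Processing Experiment Tutorial, Liang Ruiyu et al.) is mostly implemented in MATLAB; since Python is now more popular, most of this project has been rewritten in Python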
Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark
🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Preprocess Audio for training
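Automated dataset creation script
This project handles data preprocessing. The first step processes the collected audio, using FunASR timestamps to remove empty background audio. It also includes the labels prepared before the data is fed to BERT
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automatic speech recognition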
A list of tools for annotating data, managing annotations, etc.
Curated list of python software and packages related to scientific research in audio
A book about Text-to-Speech (TTS) in Chinese.
Easily take an entire YouTube playlist and turn it into high quality transcripts using Whisper.
An offline version of CapsWriter, a handy speech input tool for PC
Charsiu: A neural phonetic aligner.
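Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow, using ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and review responses
This is a fully automatic (audio) video translation project. It uses Whisper to recognize speech and a large AI model to translate the subtitles, then merges the subtitles with the video to produce the translated video.
Personal notes on Andrew Ng's machine learning course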
CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis(MM2020)
Text to speech alignment using CTC forced alignment
A VITS dataset generation tool for converting video to short audio, based on Damo Academy video cutting technology
deeplearning.ai (notes and resources for Andrew Ng's deep learning course)
Deep Learning on Human Language Processing (2020, Spring) NTU-EECS
Expressive Anechoic Recordings of Speech (EARS)
Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
faster_whisper GUI with PySide6
Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Everything about visual novels, aiming to build the most complete resource library on the web
📦 Flatpak Package of gMKVExtractGUI, a small GUI utility to extract tracks, chapters and CUE sheets from mkv files