nuaazs's Projects
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Flask backend for SARU (MR -> CT transform network)
ResUnet for MR-only BNCT planning
ScanNetAI is a self-supervised deep learning model tailored for CT image analysis. It is designed for processing large-scale CT data in tasks such as image segmentation, medical image conversion, and dose prediction.
speech_microservice
Speech Algorithms
A PyTorch-based Speech Toolkit
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
A Python toolkit for sound source separation.
Stable Diffusion web UI
stray_light_suppression
SoftVC VITS Singing Voice Conversion
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
A Python-based TOPAS script generation tool that automatically calculates the optimal beam direction from the provided CT and tumor mask files (nii/nrrd). Other settings, such as field size and forwardness, can be set through specific templates.
torch cookbook
simple energy vad
A simple energy-based Voice Activity Detection (VAD) algorithm written in C++.
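For readers unfamiliar with the technique, energy-based VAD can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's C++ code: the function name `energy_vad`, the frame and hop sizes, and the -40 dB threshold are all assumptions chosen for the example.

```python
import math


def energy_vad(samples, frame_len=400, hop_len=160, threshold_db=-40.0):
    """Return one boolean per frame: True when the frame energy exceeds the threshold.

    samples: sequence of floats, assumed to be normalized to [-1.0, 1.0].
    frame_len, hop_len, threshold_db: hypothetical defaults
    (25 ms frames with a 10 ms hop at 16 kHz).
    """
    decisions = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        mean_square = sum(x * x for x in frame) / frame_len
        # Convert to decibels; the small constant avoids log10(0) on pure silence.
        energy_db = 10.0 * math.log10(mean_square + 1e-12)
        decisions.append(energy_db > threshold_db)
    return decisions


# Synthetic input: one second of silence followed by one second of a 440 Hz tone.
sample_rate = 16000
silence = [0.0] * sample_rate
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
flags = energy_vad(silence + tone)
```

A fixed threshold like this is the simplest variant; practical detectors often adapt the threshold to the noise floor and smooth the per-frame decisions over time.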
Backend of an anti-fraud system based on speaker identification (voiceprint recognition) technology.
Voiceprint Anti-Fraud System (frontend). Frontend of an anti-fraud system based on voiceprint recognition.
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
Visual Speech Recognition for Multiple Languages
Vision Transformer for 3D medical image registration (PyTorch).
A collection of telephony channel audio processing tools. #Voiceprint #Speaker Recognition
WaveRNN Vocoder + TTS
Production First and Production Ready End-to-End Speech Recognition Toolkit
Research and Production Oriented Speaker Recognition Toolkit