Comments (3)
The channel with speech will match the position in the tracking results, e.g. if I have a tracked source in position 2, then the 2nd channel will contain the corresponding separated stream.
However, be careful: feeding the postfiltered output directly to an ASR system that was not trained on a dataset filtered in the same way will not produce good results, because there will be a domain mismatch. Postfiltering makes the audio easier on the human ear, but it also introduces artifacts that the ASR picks up. I suggest you use the separated stream instead, which does not include those artifacts.
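To make the slot-to-channel mapping concrete, here is a minimal sketch of picking the speech channel out of the separated stream. It rests on assumptions that are not confirmed in this thread: the tracking result is taken to be a list of slots where `id == 0` marks an empty slot, and the separated audio is taken to be interleaved 16-bit samples with one channel per slot. The helper names `active_slots` and `extract_channel` are made up for illustration, so check them against your own configuration.

```python
from array import array


def active_slots(tracked_sources):
    """Return the 0-based slot indices that currently hold a tracked source.

    Assumption: each slot is a dict with an "id" field, and id == 0
    means the slot is empty.
    """
    return [i for i, src in enumerate(tracked_sources) if src.get("id", 0) != 0]


def extract_channel(interleaved, n_channels, slot):
    """Pull one channel out of interleaved samples (frame-major order)."""
    if not 0 <= slot < n_channels:
        raise ValueError("slot index out of range")
    # Every n_channels-th sample, starting at the slot offset.
    return interleaved[slot::n_channels]


# Toy data: 4 slots, only the 2nd one (index 1) is tracking a source.
tracked = [{"id": 0}, {"id": 7}, {"id": 0}, {"id": 0}]

# Three frames of 4-channel audio; only channel index 1 carries signal.
samples = array("h", [0, 10, 0, 0,
                      0, 20, 0, 0,
                      0, 30, 0, 0])

slots = active_slots(tracked)
speech = extract_channel(samples, n_channels=4, slot=slots[0])
```

With real data you would read the separated (not postfiltered) stream and pass the selected channel on to the ASR.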
Cheers!
from odas.
Thank you for your reply.
If two people speak at the same time from different positions, two sources will be tracked. Which source is the best choice?
It really depends on which source you want to use for voice recognition. However, let me warn you: you'll probably get quite bad ASR results if two people are speaking at the same time, because each source will still corrupt the other one (even after separation, interference is reduced but not removed). Unless you use a language model with a very limited dictionary, trained in similar conditions, the WER (word error rate) will be quite high. The cocktail party problem is still a very hot research topic right now. Deep learning makes things better, but we still have to work hard to get human-like recognition performance.
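If you want to measure how much the overlap hurts, WER is usually computed as the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The following is a small standalone sketch of that standard definition. It is not part of ODAS, and the function name `word_error_rate` and the example sentences are only illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("on" -> "off") and one deletion ("kitchen")
# against a 5-word reference.
wer = word_error_rate("turn on the kitchen light", "turn off the light")
```

Running the same recordings through your ASR with one speaker and then with two overlapping speakers, and comparing the two WER values, gives a rough idea of how much interference remains after separation.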