Some useful features for speech processing, such as MFCC, gammatone filterbank, GFCC, spectrum (power spectrum and log-power spectrum), Amplitude Modulation Spectrum (AMS), and so on.
I use the function synthesis_speech in synthesis_speech.py to synthesize the original signal.
When I get an IRM whose shape is (64, frame_num), the code in the Python file is masked_abs = np.abs(stft[:win_len//2+1]) * mk[:, i],
which multiplies the STFT by the mask. The shapes do not match along this axis. So is the Python file only meant to synthesize from masks extracted with the STFT, and not from masks extracted with the gammatone filterbank?
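To make the mismatch concrete, here is a minimal sketch of the shapes involved. It does not reproduce synthesis_speech.py; only the masking expression is taken from the line quoted above. The window length of 320, the random data, and the helper name apply_mask are assumptions made for illustration.

```python
import numpy as np

win_len = 320                # assumed window length, not taken from the repository
n_bins = win_len // 2 + 1    # 161 one-sided STFT bins
n_channels = 64              # gammatone channels, as in the (64, frame_num) IRM
frame_num = 10

rng = np.random.default_rng(0)
# One frame of a full-length complex spectrum (random stand-in data).
stft = rng.standard_normal(win_len) + 1j * rng.standard_normal(win_len)
stft_mask = rng.uniform(size=(n_bins, frame_num))          # mask on the STFT grid
gammatone_mask = rng.uniform(size=(n_channels, frame_num))  # mask on the gammatone grid


def apply_mask(stft, mk, i, win_len):
    """Hypothetical helper wrapping the expression quoted in the question."""
    return np.abs(stft[:win_len // 2 + 1]) * mk[:, i]


# A mask with one value per STFT bin multiplies element-wise without trouble.
ok = apply_mask(stft, stft_mask, 0, win_len)
print(ok.shape)  # (161,)

# A 64-channel mask cannot be broadcast against 161 bins.
try:
    apply_mask(stft, gammatone_mask, 0, win_len)
    mismatch = False
except ValueError as exc:
    mismatch = True
    print("shape mismatch:", exc)
```

Under these assumptions the expression only works when the mask has win_len//2+1 rows, which is consistent with the observation in the question.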
When I ran ams_extractor.py, the following error was shown:
Traceback (most recent call last):
  File "/Users/jhm/Desktop/Speech_Separation/code/speech_feature_extractor-master/ams_extractor.py", line 4, in <module>
    from feature_extractor import stft_extractor
  File "/Users/jhm/Desktop/Speech_Separation/code/speech_feature_extractor-master/feature_extractor.py", line 4, in <module>
    from scikits.talkbox import lpc
  File "/Users/jhm/anaconda/lib/python3.6/site-packages/scikits/talkbox/__init__.py", line 3, in <module>
    from tools import *
ModuleNotFoundError: No module named 'tools'
I want to know what "tools" is and how to solve this.
Can anyone help me?
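The last frame of the traceback, from tools import * inside scikits/talkbox/__init__.py, looks like a Python 2 style implicit relative import. Python 3 treats such a statement as an absolute import, so it searches for a top-level module and fails even when a module of that name sits inside the package. The sketch below reproduces the behaviour with a throwaway package; demo_pkg and demo_helpers are hypothetical names standing in for the package and its tools module, and nothing here touches the real library.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(init_line):
    """Build a temporary package whose __init__.py contains init_line, then import it."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "demo_pkg"
        pkg.mkdir()
        # demo_helpers plays the role of the module the package tries to import.
        (pkg / "demo_helpers.py").write_text("def helper():\n    return 'ok'\n")
        (pkg / "__init__.py").write_text(init_line + "\n")
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_pkg")
            return module.helper()
        except ModuleNotFoundError as exc:
            return "ModuleNotFoundError: " + str(exc)
        finally:
            # Undo the path change and forget the package so each call starts clean.
            sys.path.remove(root)
            for name in list(sys.modules):
                if name == "demo_pkg" or name.startswith("demo_pkg."):
                    del sys.modules[name]


# Python 2 style: resolved as a top-level import on Python 3, so it fails.
py2_style = try_import("from demo_helpers import *")
# Explicit relative import: resolved inside the package, so it succeeds.
py3_style = try_import("from .demo_helpers import *")

print(py2_style)
print(py3_style)
```

If this is indeed the cause, the options are to run the code under Python 2 or to change the import in the installed package to the explicit relative form. Whether the rest of the package then works on Python 3 is not something this sketch checks.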
Hi ZhihaoDU,
I can't find "mix_by_db" when I run the speech_synthesis.py file. Can you give me some tips? Please also share the dataset you used in this work. Thank you very much!
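The source of mix_by_db is not shown in this thread, so its behaviour cannot be confirmed here. Assuming the name refers to mixing a clean signal with noise at a target signal-to-noise ratio in dB, a stand-in could look like the sketch below. The function name mix_at_snr, its signature, and the random test signals are all hypothetical and are not the author's implementation.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Hypothetical stand-in: scale noise so that speech/noise power matches snr_db."""
    noise = noise[: len(speech)]            # assumes noise is at least as long as speech
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain g chosen so that speech_power / (g**2 * noise_power) == 10 ** (snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)        # random stand-in for a clean utterance
noise = 0.3 * rng.standard_normal(20000)   # random stand-in for a noise recording

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=5.0)
achieved_snr = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))
print(round(achieved_snr, 3))
```

Returning the scaled noise alongside the mixture is a design choice made here because ratio masks such as the IRM need the speech and noise components separately.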
What's the difference between 'cochleagram_extractor_wdl' and 'cochleagram_extractor' in feature_extractor.py?
I mean, I know that one of them takes a sum and the other uses the mean. But why? What is the difference in how they are used?
Thank you
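On the sum versus mean point, one thing can be shown without the repository code: if both variants reduce the same squared samples over frames of equal length, the results differ only by a constant factor equal to the frame length, which becomes a constant offset after taking the log. The sketch below is not the code from feature_extractor.py; frame_energy, the frame length of 320, the shift of 160, and the random subband signals are assumptions for illustration.

```python
import numpy as np


def frame_energy(subbands, win_len, shift, reduce):
    """Hypothetical helper: per-channel, per-frame reduction of squared samples."""
    n_channels, n_samples = subbands.shape
    n_frames = 1 + (n_samples - win_len) // shift
    out = np.empty((n_channels, n_frames))
    for t in range(n_frames):
        segment = subbands[:, t * shift : t * shift + win_len] ** 2
        out[:, t] = reduce(segment, axis=1)
    return out


rng = np.random.default_rng(0)
subbands = rng.standard_normal((64, 1600))  # stand-in for 64 gammatone channel outputs
win_len, shift = 320, 160                   # assumed frame length and hop

by_sum = frame_energy(subbands, win_len, shift, np.sum)
by_mean = frame_energy(subbands, win_len, shift, np.mean)

# The sum is the mean multiplied by the number of samples in a frame.
scale_matches = np.allclose(by_sum, by_mean * win_len)
# In the log domain that factor is the same additive offset everywhere.
log_offset = np.log(by_sum) - np.log(by_mean)

print(by_sum.shape, scale_matches)
```

Under these assumptions the choice affects the scale of the features rather than their shape, so it matters mainly when features from the two variants are mixed or compared against a fixed threshold. Why the repository offers both is a question only the author can answer.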