supikiti / pncc
An implementation of Power Normalized Cepstral Coefficients: PNCC
Home Page: https://www.eurasip.org/Proceedings/Eusipco/Eusipco2015/papers/1570104069.pdf
License: MIT License
When you raise a number to the power (1/15), the number can be negative too.
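To make the point above concrete, here is a minimal sketch of what NumPy does with a fractional power of a negative value, together with one possible guard. The helper name `power_nonlinearity` and its `floor` parameter are hypothetical and not part of this repository. Clamping to zero assumes the negative values are numerical artifacts rather than meaningful data.

```python
import numpy as np


def power_nonlinearity(power, exponent=1.0 / 15.0, floor=0.0):
    """Apply x ** exponent after clamping values below `floor` (hypothetical helper)."""
    power = np.asarray(power, dtype=float)
    return np.maximum(power, floor) ** exponent


values = np.array([-1e-3, 0.0, 1.0, 32768.0])

# A fractional power of a negative real has no real-valued result,
# so NumPy returns nan for that element (and normally emits a RuntimeWarning).
with np.errstate(invalid="ignore"):
    raw = values ** (1.0 / 15.0)

# Clamping first keeps every output finite.
safe = power_nonlinearity(values)

print(raw)
print(safe)
```

Whether clamping is the right choice depends on where the negative values come from; if an earlier stage can legitimately go negative, fixing that stage may be the better option.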
Thank you for the code.
I want to use this PNCC code to train a GMM (Gaussian mixture model), but it gives me the above error when I pass the output of pncc_feature to the GMM.
import _pickle
import numpy as np
#from scipy.io.wavfile import read
import soundfile as sf
from sklearn.mixture import GaussianMixture
from PNCC import pncc_feature
#from speakerfeatures import extract_features
import warnings
warnings.filterwarnings("ignore")
source = ("mobi_data/")
dest = ("Device_models_zcr/")
train_file = "trainData_mobiphone.txt"
file_paths = open(train_file,'r')
count = 1
features = np.asarray(())
for path in file_paths:
    path = path.strip()
    print (path)
    # read the audio
    sr,audio = sf.read(source + path)
    # extract 40 dimensional MFCC & delta MFCC features
    #vector = pncc(audio,sr)
    vector = pncc_feature(audio,sr)
    if features.size == 0:
        features = vector
    else:
        features = np.vstack((features, vector))
    # when features of 5 files of speaker are concatenated, then do model training
    # -> if count == 5: --> edited below
    if count == 15:
        gmm = GaussianMixture(n_components = 16, n_iter = 200, covariance_type='diag',n_init = 3)
        gmm.fit(features)
        # dumping the trained gaussian model
        picklefile = path.split("-")[0]+".gmm"
        _pickle.dump(gmm,open(dest + picklefile,'w'))
        print ('+ modeling completed for speaker:',picklefile," with data point = ",features.shape)
        features = np.asarray(())
        count = 0
    count = count + 1
#PNCC code
from librosa.core import stft
from librosa import filters
from librosa import to_mono
#from librosa import resample
import numpy as np
#import soundfile as sf
import scipy
#import librosa
from PNCC_all import medium_time_power_calculation
from PNCC_all import asymmetric_lawpass_filtering
from PNCC_all import halfwave_rectification
from PNCC_all import temporal_masking
from PNCC_all import switch_excitation_or_non_excitation
from PNCC_all import weight_smoothing
from PNCC_all import time_frequency_normalization
from PNCC_all import mean_power_normalization
from PNCC_all import power_function_nonlinearity
#wave_path = 'C:\Users\Lenovo\Desktop\ME-SE\dataset\mi A2snehalLab.wav'
#sr, audio_wave = scipy.io.wavfile.read(str(wave_path))
#audio, sr = sf.read('mi A2snehalLab.wav', dtype='float32')
#audio_resam = librosa.resample(audio, sr, 16000)
def pncc_feature(audio,sr,n_fft=882,winlen=0.020,winstep=0.010,n_mels=128,n_pncc=13,weight_N=4,power=2):
    pre_emphasis_signal = scipy.signal.lfilter([1.0, -0.97], 1, audio)
    mono_wave = to_mono(pre_emphasis_signal.T)
    stft_pre_emphasis_signal = np.abs(stft(mono_wave,
                                           n_fft=n_fft,
                                           hop_length=int(sr * winstep),
                                           win_length=int(sr * winlen),
                                           window=np.ones(int(sr * winlen)),
                                           center=False)) ** power
    mel_filter = np.abs(filters.mel(sr, n_fft=n_fft, n_mels=n_mels)) ** power
    power_stft_signal = np.dot(stft_pre_emphasis_signal.T, mel_filter.T)
    medium_time_power = medium_time_power_calculation(power_stft_signal)
    lower_envelope = asymmetric_lawpass_filtering(
        medium_time_power, 0.999, 0.5)
    subtracted_lower_envelope = medium_time_power - lower_envelope
    rectified_signal = halfwave_rectification(subtracted_lower_envelope)
    floor_level = asymmetric_lawpass_filtering(rectified_signal)
    temporal_masked_signal = temporal_masking(rectified_signal)
    final_output = switch_excitation_or_non_excitation(
        temporal_masked_signal, floor_level, lower_envelope,
        medium_time_power)
    spectral_weight_smoothing = weight_smoothing(
        final_output, medium_time_power, L=n_mels)
    transfer_function = time_frequency_normalization(
        power_stft_signal,
        spectral_weight_smoothing)
    normalized_power = mean_power_normalization(
        transfer_function, final_output, L=n_mels)
    power_law_nonlinearity = power_function_nonlinearity(normalized_power)
    dct = np.dot(power_law_nonlinearity, filters.dct(
        n_pncc, power_law_nonlinearity.shape[1]).T)
    combined = np.hstack(dct)
    #specials = np.sort(list(self.specials.values()))
    return combined
    #return dct
Thank you
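Apart from the PNCC function itself, three details in the training script above are worth checking. `soundfile.read` returns `(data, samplerate)`, so it would be unpacked as `audio, sr`. `GaussianMixture` takes `max_iter`, not `n_iter`. Pickle files need to be opened in binary mode. The sketch below shows the last two points on synthetic data; `fit_gmm`, the random feature matrix, and the file name `speaker.gmm` are stand-ins for illustration, not part of the original script.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.mixture import GaussianMixture

# soundfile.read returns (data, samplerate), so the read would be unpacked as:
#   audio, sr = sf.read(source + path)


def fit_gmm(features, n_components=16):
    """Fit a diagonal-covariance GMM on a (n_frames, n_features) matrix."""
    gmm = GaussianMixture(n_components=n_components, max_iter=200,
                          covariance_type="diag", n_init=3)
    gmm.fit(features)
    return gmm


# Stand-in for stacked per-frame features: 500 frames, 13 coefficients each.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 13))
gmm = fit_gmm(features)

# Pickle writes bytes, so the file is opened with "wb" and read back with "rb".
with tempfile.TemporaryDirectory() as dest:
    model_path = os.path.join(dest, "speaker.gmm")
    with open(model_path, "wb") as fh:
        pickle.dump(gmm, fh)
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)

print(restored.means_.shape)
```

Note that `gmm.fit` expects a two-dimensional array with one row per frame, so the feature extractor should return a `(frames, coefficients)` matrix rather than a flattened vector.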
I see you use a mel filter bank rather than the gammatone filter bank used in the paper. Why?
First of all, thanks for sharing the code; it really helps a lot.
One small mistake: when I ran pncc on my wav file, I received this error:
File "", line 99, in mean_power_normalization
myu[m] = lam_myu * myu[m - 1] + (1.0 - lam_myu) / L * sum([transfer_function[m, s] for s in range(0, L - 1)])
IndexError: index 40 is out of bounds for axis 1 with size 40
This happens because L defaults to 80, while our transfer_function has shape (n, 40).
I'm new to PNCC. My question is: shouldn't the result of pncc have shape (number of frames, n_pncc), i.e. 13 columns in our case? Also, in the pncc function, although n_pncc = 13 is defined, it seems it is never used?
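One way to avoid the mismatch described above is to take the channel count from the array itself instead of a default `L`. The following is a minimal sketch under that assumption; `running_mean_power`, its `initial` value, and the random input are hypothetical and not the repository's code. It averages over all channels, whereas the line in the traceback sums over `range(0, L - 1)`. The second part shows how truncating a DCT to `n_pncc` columns gives a `(frames, n_pncc)` result, and that `np.hstack` on that 2-D array flattens it to one dimension.

```python
import numpy as np
from scipy.fft import dct


def running_mean_power(transfer_function, lam_myu=0.999, initial=1.0):
    """Recursive mean power per frame; channel count comes from the input shape."""
    n_frames, n_channels = transfer_function.shape
    myu = np.empty(n_frames)
    previous = initial  # assumed starting value for the recursion
    for m in range(n_frames):
        # mean() averages over all n_channels, so no separate L is needed
        previous = lam_myu * previous + (1.0 - lam_myu) * transfer_function[m].mean()
        myu[m] = previous
    return myu


# Stand-in for a (frames, channels) power matrix with 40 channels.
rng = np.random.default_rng(0)
transfer_function = rng.random((50, 40))
myu = running_mean_power(transfer_function)

# Keeping the first n_pncc DCT coefficients yields one row per frame.
n_pncc = 13
compressed = transfer_function ** (1.0 / 15.0)
cepstra = dct(compressed, type=2, axis=1, norm="ortho")[:, :n_pncc]

# np.hstack concatenates the rows of a 2-D array into a single 1-D vector.
flattened = np.hstack(cepstra)

print(myu.shape, cepstra.shape, flattened.shape)
```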