pncc's People

Contributors

indianquant, supikiti

pncc's Issues

AxisError: axis -1 is out of bounds for array of dimension 0

Thank you for the code.
I want to use this PNCC code to train a GMM (Gaussian mixture model), but I get the error above when I pass the output of pncc_feature to the GMM.

GMM Model

import _pickle
import numpy as np
#from scipy.io.wavfile import read
import soundfile as sf
from sklearn.mixture import GaussianMixture
from PNCC import pncc_feature
#from speakerfeatures import extract_features
import warnings
warnings.filterwarnings("ignore")

source = ("mobi_data/")
dest = ("Device_models_zcr/")
train_file = "trainData_mobiphone.txt"
file_paths = open(train_file,'r')
count = 1

# Extracting features for each speaker (5 files per speakers)

features = np.asarray(())
for path in file_paths:
    path = path.strip()
    print (path)

    # read the audio
    sr,audio = sf.read(source + path)

    # extract 40 dimensional MFCC & delta MFCC features
    #vector   = pncc(audio,sr)
    vector   = pncc_feature(audio,sr)


    if features.size == 0:
        features = vector
    else:
        features = np.vstack((features, vector))
    # when features of 5 files of speaker are concatenated, then do model training
    # -> if count == 5: --> edited below
    if count == 15:
        gmm = GaussianMixture(n_components = 16, n_iter = 200, covariance_type='diag',n_init = 3)
        gmm.fit(features)

        # dumping the trained gaussian model
        picklefile = path.split("-")[0]+".gmm"
        _pickle.dump(gmm,open(dest + picklefile,'w'))
        print ('+ modeling completed for speaker:',picklefile," with data point = ",features.shape)
        features = np.asarray(())
        count = 0
    count = count + 1
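
One possible cause of the AxisError, offered as a guess rather than a confirmed diagnosis: `soundfile.read` returns `(data, samplerate)`, in that order, so `sr,audio = sf.read(...)` leaves `audio` holding the integer sample rate. A scalar has zero dimensions, which would match "array of dimension 0" once it reaches an axis-based operation such as the pre-emphasis filter in pncc_feature. The sketch below reproduces the mix-up with a stand-in reader; `fake_read` and `check_audio` are hypothetical names, not part of this repo or of soundfile.

```python
import numpy as np


def fake_read():
    """Stand-in for an audio reader that returns (data, samplerate)."""
    samplerate = 16000
    data = np.zeros(samplerate)  # one second of silence
    return data, samplerate


def check_audio(audio, sr):
    """Hypothetical guard: fail early with a readable message."""
    audio = np.asarray(audio, dtype=float)
    if audio.ndim == 0:
        raise ValueError(
            "audio is a scalar (dimension 0); "
            "were the samples and the sample rate swapped?")
    if np.ndim(sr) != 0:
        raise ValueError("sr must be a single number")
    return audio, int(sr)


# Unpacking in the wrong order puts the sample rate where the samples belong.
sr_wrong, audio_wrong = fake_read()
print(np.ndim(audio_wrong))  # 0

# Unpacking in the reader's own order gives a 1-D signal.
audio, sr = check_audio(*fake_read())
print(audio.shape, sr)
```

If this is what happened, swapping the two names on the left-hand side of the `sf.read` call would be the first thing to try.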

#PNCC code
from librosa.core import stft
from librosa import filters
from librosa import to_mono
#from librosa import resample
import numpy as np
#import soundfile as sf
import scipy
#import librosa
from PNCC_all import medium_time_power_calculation
from PNCC_all import asymmetric_lawpass_filtering
from PNCC_all import halfwave_rectification
from PNCC_all import temporal_masking
from PNCC_all import switch_excitation_or_non_excitation
from PNCC_all import weight_smoothing
from PNCC_all import time_frequency_normalization
from PNCC_all import mean_power_normalization
from PNCC_all import power_function_nonlinearity

#wave_path = 'C:\Users\Lenovo\Desktop\ME-SE\dataset\mi A2snehalLab.wav'
#sr, audio_wave = scipy.io.wavfile.read(str(wave_path))

#audio, sr = sf.read('mi A2snehalLab.wav', dtype='float32')
#audio_resam = librosa.resample(audio, sr, 16000)
def pncc_feature(audio,sr,n_fft=882,winlen=0.020,winstep=0.010,n_mels=128,n_pncc=13,weight_N=4,power=2):

    pre_emphasis_signal = scipy.signal.lfilter([1.0, -0.97], 1, audio)
    mono_wave = to_mono(pre_emphasis_signal.T)
    stft_pre_emphasis_signal = np.abs(stft(mono_wave,
                                           n_fft=n_fft,
                                           hop_length=int(sr * winstep),
                                           win_length=int(sr * winlen),
                                           window=np.ones(int(sr * winlen)),
                                           center=False)) ** power

    mel_filter = np.abs(filters.mel(sr, n_fft=n_fft, n_mels=n_mels)) ** power
    power_stft_signal = np.dot(stft_pre_emphasis_signal.T, mel_filter.T)

    medium_time_power = medium_time_power_calculation(power_stft_signal)

    lower_envelope = asymmetric_lawpass_filtering(
        medium_time_power, 0.999, 0.5)

    subtracted_lower_envelope = medium_time_power - lower_envelope

    rectified_signal = halfwave_rectification(subtracted_lower_envelope)

    floor_level = asymmetric_lawpass_filtering(rectified_signal)

    temporal_masked_signal = temporal_masking(rectified_signal)

    final_output = switch_excitation_or_non_excitation(
        temporal_masked_signal, floor_level, lower_envelope,
        medium_time_power)

    spectral_weight_smoothing = weight_smoothing(
        final_output, medium_time_power, L=n_mels)

    transfer_function = time_frequency_normalization(
        power_stft_signal,
        spectral_weight_smoothing)

    normalized_power = mean_power_normalization(
        transfer_function, final_output, L=n_mels)

    power_law_nonlinearity = power_function_nonlinearity(normalized_power)

    dct = np.dot(power_law_nonlinearity, filters.dct(
        n_pncc, power_law_nonlinearity.shape[1]).T)
    combined = np.hstack(dct)
    #specials = np.sort(list(self.specials.values()))


    return combined
    #return dct
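
A note on the last step of the function above, as a sketch with made-up shapes rather than the repo's actual output: `np.hstack` applied to a 2-D array concatenates its rows, so a `(frames, n_pncc)` matrix becomes a single 1-D vector. Stacking several files with `np.vstack` then only works if every file happens to have the same number of frames, and each file would contribute one row instead of one row per frame. Keeping the 2-D matrix preserves one row per frame.

```python
import numpy as np

n_frames, n_pncc = 5, 13  # illustrative sizes, not taken from real audio

# Stand-in for the (frames, n_pncc) matrix produced by the DCT step.
dct = np.arange(n_frames * n_pncc, dtype=float).reshape(n_frames, n_pncc)

# hstack treats the 2-D array as a sequence of rows and joins them end to end.
flattened = np.hstack(dct)
print(flattened.shape)  # (65,)

# Two files with different frame counts stack cleanly when kept 2-D.
file_a = np.ones((5, n_pncc))
file_b = np.ones((7, n_pncc))
stacked = np.vstack((file_a, file_b))
print(stacked.shape)  # (12, 13)
```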

Thank you

gammatone feature

I see you use a mel filter bank, which is not the same as the paper, which uses a gammatone filter bank. Why?
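
For context on the difference, gammatone filter banks are usually laid out on the ERB-rate scale rather than the mel scale. The sketch below only computes ERB-spaced centre frequencies using the Glasberg and Moore formula; it does not build the filters themselves. The channel count and frequency range are illustrative assumptions, and `erb_space` is a hypothetical helper, not something this repo provides.

```python
import numpy as np


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb, dtype=float) / 21.4) - 1.0) / 0.00437


def erb_space(f_min, f_max, n_channels):
    """Centre frequencies spaced uniformly on the ERB-rate scale."""
    erb_points = np.linspace(hz_to_erb_rate(f_min),
                             hz_to_erb_rate(f_max),
                             n_channels)
    return erb_rate_to_hz(erb_points)


# Assumed layout for illustration: 40 channels between 200 Hz and 8 kHz.
centre_freqs = erb_space(200.0, 8000.0, 40)
print(centre_freqs[:3], centre_freqs[-1])
```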

IndexError: index 40 is out of bounds for axis 1 with size 40

First of all, thanks for sharing the code; it really helps a lot.

One small mistake: when I ran pncc on my WAV file, I received this error:

File "", line 99, in mean_power_normalization
myu[m] = lam_myu * myu[m - 1] + (1.0 - lam_myu) / L * sum([transfer_function[m, s] for s in range(0, L - 1)])

IndexError: index 40 is out of bounds for axis 1 with size 40

This happens because L defaults to 80, while our transfer_function has shape (n, 40).
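
A minimal sketch of one way to avoid the mismatch, assuming `transfer_function` is a 2-D array of shape `(frames, channels)`: take the channel count from the array itself instead of a fixed `L`. The function name, the `lam_myu` default, and the initial value `myu[0]` are assumptions for illustration, not the repo's code. Note that it averages over all channels, whereas `range(0, L - 1)` in the traceback stops one channel short.

```python
import numpy as np


def running_mean_power(transfer_function, lam_myu=0.999):
    """Recursive mean-power estimate, one value per frame.

    The channel count is read from the input, so a 40-channel and an
    80-channel filter bank both work without passing L.
    """
    transfer_function = np.asarray(transfer_function, dtype=float)
    n_frames, n_channels = transfer_function.shape

    myu = np.zeros(n_frames)
    # Assumption: seed the recursion with the first frame's channel mean.
    myu[0] = transfer_function[0].sum() / n_channels
    for m in range(1, n_frames):
        frame_mean = transfer_function[m].sum() / n_channels
        myu[m] = lam_myu * myu[m - 1] + (1.0 - lam_myu) * frame_mean
    return myu


example = np.ones((5, 40))  # 5 frames, 40 channels
myu = running_mean_power(example)
print(myu.shape)
```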

I'm new to PNCC. My question is: shouldn't the result of pncc have shape (number of frames, n_pncc), with n_pncc = 13 in our case? Also, although n_pncc = 13 is defined in the pncc function, it seems it is never used.
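
To illustrate the shape being asked about, here is a sketch of a final DCT step under the assumption that the preceding stage yields a `(frames, channels)` matrix. The DCT-II basis is built by hand with NumPy so the example is self-contained; `dct_basis` is a hypothetical helper and the input is random data, not real PNCC output.

```python
import numpy as np


def dct_basis(n_out, n_in):
    """Orthonormal DCT-II basis with n_out rows and n_in columns."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)
    basis[0] *= 1.0 / np.sqrt(2.0)  # scale the DC row for orthonormality
    return basis


n_frames, n_channels, n_pncc = 100, 40, 13  # illustrative sizes
rng = np.random.default_rng(0)
channel_features = rng.random((n_frames, n_channels))

# Keeping only the first n_pncc DCT rows gives n_pncc coefficients per frame.
basis = dct_basis(n_pncc, n_channels)
coeffs = channel_features @ basis.T
print(coeffs.shape)  # (100, 13)
```

With this layout, `n_pncc` controls the number of columns in the result, which is the behaviour the question expects.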
