AxisError: axis -1 is out of bounds for array of dimension 0

Thank you for the code.
I want to use this PNCC code to train a GMM (Gaussian mixture model), but it gives me the above error when I pass the output of pncc_feature to the GMM.

GMM Model

import _pickle
import numpy as np
#from scipy.io.wavfile import read
import soundfile as sf
from sklearn.mixture import GaussianMixture
from PNCC import pncc_feature
#from speakerfeatures import extract_features
import warnings
warnings.filterwarnings("ignore")

source = ("mobi_data/")
dest = ("Device_models_zcr/")
train_file = "trainData_mobiphone.txt"
file_paths = open(train_file,'r')
count = 1

# Extracting features for each speaker (5 files per speaker)

features = np.asarray(())
for path in file_paths:
    path = path.strip()
    print (path)

    # read the audio
    sr,audio = sf.read(source + path)

    # extract 40 dimensional MFCC & delta MFCC features
    #vector   = pncc(audio,sr)
    vector   = pncc_feature(audio,sr)

    if features.size == 0:
        features = vector
    else:
        features = np.vstack((features, vector))
    # when features of 5 files of speaker are concatenated, then do model training
    # -> if count == 5: --> edited below
    if count == 15:
        gmm = GaussianMixture(n_components = 16, n_iter = 200, covariance_type='diag',n_init = 3)
        gmm.fit(features)

        # dumping the trained gaussian model
        picklefile = path.split("-")[0]+".gmm"
        _pickle.dump(gmm,open(dest + picklefile,'w'))
        print ('+ modeling completed for speaker:',picklefile," with data point = ",features.shape)
        features = np.asarray(())
        count = 0
    count = count + 1
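
A side note on the training step above, separate from the reported AxisError: as far as I know, sklearn's GaussianMixture takes max_iter rather than n_iter (n_iter belonged to the older GMM class), and pickle writes bytes, so the file has to be opened in binary mode ('wb'). Below is a minimal sketch of fitting and saving a model that way. The random features, the component count and the file name speaker.gmm are placeholders for illustration, not values from the post.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder data standing in for stacked (n_frames, n_pncc) feature rows.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 13))

# max_iter (not n_iter) limits the number of EM iterations.
gmm = GaussianMixture(n_components=4, max_iter=200,
                      covariance_type="diag", n_init=3, random_state=0)
gmm.fit(features)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "speaker.gmm")  # hypothetical file name
    # pickle produces bytes, so write and read in binary mode.
    with open(model_path, "wb") as fh:
        pickle.dump(gmm, fh)
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)

print(restored.means_.shape)
```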

#PNCC code
from librosa.core import stft
from librosa import filters
from librosa import to_mono
#from librosa import resample
import numpy as np
#import soundfile as sf
import scipy
#import librosa
from PNCC_all import medium_time_power_calculation
from PNCC_all import asymmetric_lawpass_filtering
from PNCC_all import halfwave_rectification
from PNCC_all import temporal_masking
from PNCC_all import switch_excitation_or_non_excitation
from PNCC_all import weight_smoothing
from PNCC_all import time_frequency_normalization
from PNCC_all import mean_power_normalization
from PNCC_all import power_function_nonlinearity

#wave_path = 'C:\Users\Lenovo\Desktop\ME-SE\dataset\mi A2snehalLab.wav'
#sr, audio_wave = scipy.io.wavfile.read(str(wave_path))

#audio, sr = sf.read('mi A2snehalLab.wav', dtype='float32')
#audio_resam = librosa.resample(audio, sr, 16000)
def pncc_feature(audio,sr,n_fft=882,winlen=0.020,winstep=0.010,n_mels=128,n_pncc=13,weight_N=4,power=2):

    pre_emphasis_signal = scipy.signal.lfilter([1.0, -0.97], 1, audio)
    mono_wave = to_mono(pre_emphasis_signal.T)
    stft_pre_emphasis_signal = np.abs(stft(mono_wave,
                                           n_fft=n_fft,
                                           hop_length=int(sr * winstep),
                                           win_length=int(sr * winlen),
                                           window=np.ones(int(sr * winlen)),
                                           center=False)) ** power

    mel_filter = np.abs(filters.mel(sr, n_fft=n_fft, n_mels=n_mels)) ** power
    power_stft_signal = np.dot(stft_pre_emphasis_signal.T, mel_filter.T)

    medium_time_power = medium_time_power_calculation(power_stft_signal)

    lower_envelope = asymmetric_lawpass_filtering(
        medium_time_power, 0.999, 0.5)

    subtracted_lower_envelope = medium_time_power - lower_envelope

    rectified_signal = halfwave_rectification(subtracted_lower_envelope)

    floor_level = asymmetric_lawpass_filtering(rectified_signal)

    temporal_masked_signal = temporal_masking(rectified_signal)

    final_output = switch_excitation_or_non_excitation(
        temporal_masked_signal, floor_level, lower_envelope,
        medium_time_power)

    spectral_weight_smoothing = weight_smoothing(
        final_output, medium_time_power, L=n_mels)

    transfer_function = time_frequency_normalization(
        power_stft_signal,
        spectral_weight_smoothing)

    normalized_power = mean_power_normalization(
        transfer_function, final_output, L=n_mels)

    power_law_nonlinearity = power_function_nonlinearity(normalized_power)

    dct = np.dot(power_law_nonlinearity, filters.dct(
        n_pncc, power_law_nonlinearity.shape[1]).T)
    combined = np.hstack(dct)
    #specials = np.sort(list(self.specials.values()))

    return combined
    #return dct

Thank you
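
A possible cause, in case it helps: soundfile's read returns (data, samplerate), in that order, while the loop above unpacks it as sr,audio. That would put the integer sample rate into audio, so the first step of pncc_feature (the lfilter pre-emphasis) receives a 0-dimensional input, which would match the "array of dimension 0" in the error. The sketch below shows the swap and a small guard against it. fake_read and validate_audio are hypothetical helpers written for this illustration; fake_read only mimics the return order so the example runs without an audio file.

```python
import numpy as np

def fake_read(path):
    """Hypothetical stand-in for soundfile.read: returns (data, samplerate)."""
    data = np.zeros(16000, dtype="float32")  # one second of silence
    return data, 16000

def validate_audio(audio, sr):
    """Fail early if the samples and the sample rate look swapped."""
    audio = np.asarray(audio)
    if audio.ndim == 0:
        raise ValueError("audio is a scalar; were the return values swapped?")
    if not np.isscalar(sr):
        raise ValueError("sr is not a scalar; were the return values swapped?")
    return audio, int(sr)

# Order used in the post: the sample rate lands in `audio`.
sr, audio = fake_read("example.wav")
swapped_ndim = np.ndim(audio)  # 0, i.e. a bare number rather than a signal
try:
    validate_audio(audio, sr)
    swap_detected = False
except ValueError:
    swap_detected = True

# Order matching the return value: samples first, then the sample rate.
audio, sr = fake_read("example.wav")
audio, sr = validate_audio(audio, sr)
print(swapped_ndim, swap_detected, audio.shape, sr)
```

If this is the cause, changing the line in the loop to audio, sr = sf.read(source + path) should be enough to get past the AxisError.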

gammatone feature

I see you use a mel filter bank, which differs from the paper, where a gammatone filter bank is used. Why?

IndexError: index 40 is out of bounds for axis 1 with size 40

First of all, thanks for sharing the code, it really helps a lot.

One small mistake: when I tried to run pncc on my wav file, I received this error:

File "", line 99, in mean_power_normalization
myu[m] = lam_myu * myu[m - 1] + (1.0 - lam_myu) / L * sum([transfer_function[m, s] for s in range(0, L - 1)])

IndexError: index 40 is out of bounds for axis 1 with size 40

This happens because L defaults to 80, while our transfer_function has shape (n, 40), so the sum indexes past the last channel.
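
One way to avoid the mismatch is to take the channel count from the array itself instead of relying on a fixed default. The following is only a sketch of that idea, built around the recursion in the traceback line. running_mean_power is a hypothetical name, the starting value myu[0] = 0 is an assumption, and unlike the quoted line (which uses range(0, L - 1)) it sums over all L channels.

```python
import numpy as np

def running_mean_power(transfer_function, lam_myu=0.999, L=None):
    """Recursive mean power estimate over frames (illustrative sketch)."""
    n_frames, n_channels = transfer_function.shape
    if L is None:
        L = n_channels  # derive the channel count from the data
    if L > n_channels:
        raise ValueError(
            f"L={L} exceeds the {n_channels} channels in transfer_function")
    myu = np.zeros(n_frames)  # myu[0] = 0 is an assumed starting value
    for m in range(1, n_frames):
        channel_mean = transfer_function[m, :L].sum() / L
        myu[m] = lam_myu * myu[m - 1] + (1.0 - lam_myu) * channel_mean
    return myu

# Example with 40 channels, as in the report above.
tf = np.ones((5, 40))
myu = running_mean_power(tf, lam_myu=0.5)
print(myu.shape)
```

With this shape check, passing L=80 for a 40-channel input raises a ValueError that names both sizes instead of an IndexError deep inside the loop.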

I'm new to PNCC. My question is: shouldn't the result of pncc have shape (number of frames, n_pncc), with n_pncc = 13 in our case? Also, in the pncc function, although n_pncc = 13 is defined, it seems that it is never used?
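
On the shape question: if the final step projects each frame's channel vector onto the first n_pncc DCT basis vectors, the result would be (number of frames, n_pncc). The sketch below illustrates only that last step with random data. dct_basis is a hypothetical helper that builds an orthonormal DCT-II basis with NumPy, and the frame and channel counts are made up for the example; it is not the repository's code.

```python
import numpy as np

def dct_basis(n_coeffs, n_channels):
    """Orthonormal DCT-II basis of shape (n_coeffs, n_channels)."""
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_channels)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_channels))
    basis *= np.sqrt(2.0 / n_channels)
    basis[0] *= np.sqrt(0.5)  # scale the constant row to unit norm
    return basis

n_frames, n_channels, n_pncc = 98, 40, 13  # made-up sizes
rng = np.random.default_rng(0)
# Stand-in for the output of the power-law nonlinearity stage.
nonlinearity_output = rng.random((n_frames, n_channels))

basis = dct_basis(n_pncc, n_channels)
pncc = nonlinearity_output @ basis.T
print(pncc.shape)  # (98, 13)
```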

supikiti / pncc


pncc's Issues

How will you handle powers of negative numbers? In power_function_nonlinearity() function

test

