imfing / audio-classification

Environmental sound classification using Deep Learning with extracted features

Python 100.00%

deep-learning

audio-classification's Introduction

Environmental Sound Classification using Deep Learning

A project from Digital Signal Processing course

Dependencies

Python 3.6
numpy
librosa
pysoundfile
sounddevice
matplotlib
scikit-learn
tensorflow
keras

Dataset

The dataset can be downloaded from Dataverse or GitHub.

I'd recommend using ESC-10 for the sake of convenience.

Example:

├── 001 - Cat
│  ├── cat_1.ogg
│  ├── cat_2.ogg
│  ├── cat_3.ogg
│  ...
...
└── 002 - Dog
   ├── dog_barking_0.ogg
   ├── dog_barking_1.ogg
   ├── dog_barking_2.ogg
   ...
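
The folder names in the example carry both a numeric label and a class name. A minimal sketch of how such names could be parsed, assuming every class folder follows the `NNN - Name` pattern shown above. The function name `parse_class_dir` is hypothetical and not part of this repository.

```python
import re

# Assumed pattern: three digits, " - ", then the class name, e.g. "001 - Cat".
CLASS_DIR = re.compile(r"^(\d{3}) - (.+)$")


def parse_class_dir(dirname):
    """Return (label, name) for a class folder, or None if it does not match."""
    match = CLASS_DIR.match(dirname)
    if match is None:
        return None  # e.g. hidden folders or stray files
    return int(match.group(1)), match.group(2)


print(parse_class_dir("001 - Cat"))           # (1, 'Cat')
print(parse_class_dir("002 - Dog"))           # (2, 'Dog')
print(parse_class_dir(".ipynb_checkpoints"))  # None
```

Returning None for names that do not match makes it easy to skip hidden folders before they are treated as classes.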

Feature Extraction

Put the audio files (.wav is untested) under the data directory and run the following command:

python feat_extract.py

Features and labels will be generated and saved as feat.npy and label.npy.

Classify with SVM

Make sure you have scikit-learn installed and that feat.npy and label.npy are in the same directory. Run svm.py to see the result.

Classify with Multilayer Perceptron

Install tensorflow and keras first. Run nn.py to train and test the network.

Classify with Convolutional Neural Network

Run cnn.py -t to train and test a CNN. Optionally set how many epochs to train on.
Predict files by either:
- Putting target files under predict/ directory and running cnn.py -p
- Recording on the fly with cnn.py -P
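
The three modes above could be wired up roughly as follows. This is a sketch, not the actual cnn.py: the short flags `-t`, `-p` and `-P` come from the README, and `args.train` / `args.predict` appear in the tracebacks quoted in the issues, but `--real-time-predict`, the `-e/--epochs` flag and its default value are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Train or run the CNN (sketch)")
    parser.add_argument("-t", "--train", action="store_true",
                        help="train and test the CNN")
    parser.add_argument("-p", "--predict", action="store_true",
                        help="predict files placed under predict/")
    # Hypothetical long name for the on-the-fly recording mode.
    parser.add_argument("-P", "--real-time-predict", action="store_true",
                        help="record audio and predict on the fly")
    # Hypothetical flag: the README only says the epoch count is optional.
    parser.add_argument("-e", "--epochs", type=int, default=50,
                        help="number of training epochs")
    return parser


args = build_parser().parse_args(["-t", "-e", "10"])
print(args.train, args.predict, args.epochs)  # True False 10
```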

audio-classification's Issues

Error! feat.npy is not UTF-8 encoded

Hi, I am using the same structure as you described, but when I run feat_extract.py it runs smoothly and gives me this output:

extract .ipynb_checkpoints features done
extract 001 - Dog bark features done
extract 002 - Rain features done
extract 003 - Sea waves features done
extract 004 - Baby cry features done
extract 005 - Clock tick features done
extract 006 - Person sneeze features done
extract 007 - Helicopter features done
extract 008 - Chainsaw features done
extract 009 - Rooster features done
extract 010 - Fire crackling features done

But when I open feat.npy, it says: Error! feat.npy is not UTF-8 encoded

Please help me understand this and how I can resolve it.
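
A note on this message: .npy is NumPy's binary array format, so a text editor or file viewer is likely to complain that the file is not UTF-8 text. That message alone does not indicate that extraction failed. A minimal sketch of inspecting such a file with np.load, using a small stand-in array written to a temporary directory rather than the real features:

```python
import os
import tempfile

import numpy as np

# Stand-in for the extracted features: 4 samples with 3 features each.
features = np.arange(12, dtype=np.float64).reshape(4, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "feat.npy")
    np.save(path, features)  # writes a binary .npy file, not text
    loaded = np.load(path)   # reads it back as an array

print(loaded.shape)  # (4, 3)
```

For the real files, calling np.load("feat.npy") and printing the shape is a quick way to confirm the features were written.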

UnboundLocalError: local variable 'X_predict' referenced before assignment

The following is the error I have encountered. Could you please guide me out of it?

Traceback (most recent call last):
  File "cnn.py", line 139, in <module>
    main(args)
  File "cnn.py", line 125, in main
    elif args.predict: predict(args)
  File "cnn.py", line 88, in predict
    X_predict = np.expand_dims(X_predict, axis=2)
UnboundLocalError: local variable 'X_predict' referenced before assignment

Still getting an error after executing the command cnn.py -p

Steps I followed:
1) Downloaded the ESC-10 dataset and put it in the data directory
2) Executed the command python feat_extract.py
3) Four files were created: feat.py, lable.py, predict_feat, predict_filenames
4) Executed svm.py; the output was: fitting... acc=0.756
5) Executed nn.py
6) Executed cnn.py -t
7) Put a single sample audio file in the predict folder and executed the command cnn.py -t, but an error occurred:

Traceback (most recent call last):
  File "cnn.py", line 123, in <module>
    main(args)
  File "cnn.py", line 109, in main
    elif args.predict: predict(args)
  File "cnn.py", line 73, in predict
    pred = model.predict_classes(X_predict)
  File "C:\Users\Abhijeet\Anaconda3\envs\tfp3.6\lib\site-packages\keras\engine\sequential.py", line 268, in predict_classes
    if proba.shape[-1] > 1:
AttributeError: 'list' object has no attribute 'shape'

What are the steps to create a custom model so that the cnn.py -p command can run?

errors in cnn.py

I got two errors when executing the file "cnn.py".

1. Unresolved reference 'feat_extract' when defining the function train(args).
   Solution: use "import feat_extract" instead of "from feat_extract import *".
2. Unresolved reference 'X_predict' when defining the function predict(args).
   In "X_predict = np.expand_dims(X_predict, axis=2)", the "X_predict" on the right-hand side is neither pre-defined nor pre-produced.
   Solution: replace "X_predict" with "predict_feat_path".
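
On the second error: X_predict is read on the right-hand side before anything has been assigned to it, which matches the UnboundLocalError reported above. Passing the path string itself to np.expand_dims would probably not be enough, since that call expects array data; one plausible fix is to load the array from that path first. A sketch, assuming predict_feat_path names a .npy file of shape (samples, features); the helper name `load_predict_features` is hypothetical.

```python
import os
import tempfile

import numpy as np


def load_predict_features(predict_feat_path):
    """Load saved features and add a trailing channel axis."""
    # Assign X_predict before it is used on the right-hand side.
    X_predict = np.load(predict_feat_path)
    return np.expand_dims(X_predict, axis=2)


# Demonstration with a stand-in file: 2 samples, 5 features each.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "predict_feat.npy")
    np.save(demo_path, np.zeros((2, 5)))
    X_demo = load_predict_features(demo_path)

print(X_demo.shape)  # (2, 5, 1)
```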

what accuracy can this cnn model reach?

What is the best accuracy this CNN model can reach once fully trained? Could this be added to the README? Thanks.

error on svm.py

I got the following error after running svm.py. How can I solve this one?

The class labels seem to differ from the actual labels.

When I tried to predict the classes, there seemed to be some kind of pattern. The class number given by the folder differs from the predicted one:
Name         Predicted  Actual
rain         0          2
seawaves     1          3
baby         2          4
clock        3          5
sneeze       4          6
helicopter   5          7
chainsaw     6          8
rooster      7          9
firecracker  8          10
dog          13         1
What could be the reason?

Error occurs when python cnn.py -p is executed: 'list' object has no attribute 'shape'

Can I use .wav to train instead of .ogg?

I want the versions of these packages

numpy
librosa
pysoundfile
sounddevice
matplotlib
scikit-learn
tensorflow
keras

index out of bound

Executing nn.py gives the following error:

Using TensorFlow backend.
Traceback (most recent call last):
  File "nn.py", line 37, in <module>
    y_train = keras.utils.to_categorical(y_train-1, num_classes=10)
  File "C:\Users\TUSHAR\Anaconda3\lib\site-packages\keras\utils\np_utils.py", line 32, in to_categorical
    categorical[np.arange(n), y] = 1
IndexError: index 21 is out of bounds for axis 1 with size 10
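
This IndexError means that some label, used as a column index, is larger than num_classes allows: index 21 does not fit into 10 columns. That can happen when the label values are not the contiguous range expected by the script, for example when more than ten class folders are present. One way to handle arbitrary label values is to remap them to 0..K-1 before one-hot encoding. A sketch with made-up labels, not the repository's own code:

```python
import numpy as np

# Made-up folder labels that are neither zero-based nor contiguous.
y = np.array([1, 22, 7, 22, 1])

# classes holds the sorted distinct labels; y_index maps each sample
# to the position of its label in classes, i.e. a value in 0..K-1.
classes, y_index = np.unique(y, return_inverse=True)
print(classes.tolist())  # [1, 7, 22]
print(y_index.tolist())  # [0, 2, 1, 2, 0]

# One-hot encode with exactly as many columns as there are classes.
one_hot = np.eye(len(classes))[y_index]
print(one_hot.shape)     # (5, 3)
```

Keeping `classes` around allows a predicted index to be translated back into the original folder label.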

Attribute error

Hello, when I run nn.py and cnn.py, Spyder shows AttributeError: 'ProgbarLogger' object has no attribute 'log_values'. It seems that something is wrong with Keras.

Classes

What are the classes if I use ESC-10? Is it like: 0 - dog bark, 1 - ***, ..., etc.

I put a dog bark .ogg file from the training data into the predict folder, but it outputs 13. What is wrong with it?

Thank you.

prediction

Can you guide the prediction part?

keras.utils.to_categorical(y, num_classes) question

Hello,
I have a question.

nn.py (below line 34)

# Convert label to onehot
y_train = keras.utils.to_categorical(y_train-1, num_classes=num_classes)
y_test = keras.utils.to_categorical(y_test-1, num_classes=num_classes)

Why did you use (y_train - 1), not just (y_train), when you call keras.utils.to_categorical()?
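
A likely reason, assuming the labels come from folder numbers that start at 001: to_categorical treats each label as a zero-based column index, so labels 1..10 have to be shifted to 0..9 to fit into 10 columns. The function below is a simplified pure-Python stand-in for keras.utils.to_categorical, written only to illustrate the indexing.

```python
def one_hot(labels, num_classes):
    """Simplified stand-in for to_categorical: one row per label."""
    rows = []
    for label in labels:
        row = [0] * num_classes
        row[label] = 1  # raises IndexError when label >= num_classes
        rows.append(row)
    return rows


labels = [1, 2, 3]  # 1-based labels, as derived from folder numbers

# Shifting by one makes the labels valid zero-based column indices.
shifted = one_hot([label - 1 for label in labels], num_classes=3)
print(shifted)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Without the shift, the largest label falls outside the columns.
try:
    one_hot(labels, num_classes=3)
    failed = False
except IndexError:
    failed = True
print(failed)  # True
```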

Frequency band exceeds Nyquist. Reduce either fmin or n_bands.

I am experiencing the same issue. Should I prep the .wav files in a specific way?

Also, if I reduce fmin to 120.0, I receive the following error:
librosa.util.exceptions.ParameterError: Filter pass-band lies beyond Nyquist

Please advise.

Okay, thanks. It's a problem related to the audio files.
I'll close this issue for now.

Originally posted by @mtobeiyf in #10 (comment)

local variable 'batch_index' referenced before assignment

I have run feat_extract.py, and feat.npy and label.npy were generated. When I run cnn.py, the error above occurs.

how to get .ogg file

I get an index 12 out of bounds when I run the cnn.py -t

So this happened with both my own OGG data and the ESC data, with one OGG file in the predict folder.

Everything went normally; the SVM and MLP worked. Then I tried the CNN:

Phonecian:audio-classification skiwheelr$ python3 cnn.py -t
Using TensorFlow backend.
2020-04-03 23:29:32.002190: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-03 23:29:32.015288: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8e13dfbac0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-03 23:29:32.015310: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Traceback (most recent call last):
  File "cnn.py", line 124, in <module>
    main(args)
  File "cnn.py", line 109, in main
    if args.train: train(args)
  File "cnn.py", line 51, in train
    y_train = keras.utils.to_categorical(y_train, num_classes=class_count)
  File "/usr/local/lib/python3.7/site-packages/keras/utils/np_utils.py", line 52, in to_categorical
    categorical[np.arange(n), y] = 1
IndexError: index 12 is out of bounds for axis 1 with size 12

What could be causing this? There is enough data.

wav file does not work

When I put .wav files in the data directory, this error appears:
[Error] extract feature error in data_wav/6/148835-6-0-0.wav. Frequency band exceeds Nyquist. Reduce either fmin or n_bands.
Solution: tune the "fmin" parameter in the function "extract_feature(file_name=None)" as follows:
contrast = np.mean(librosa.feature.spectral_contrast(S=stft, sr=sample_rate, fmin=180.0).T, axis=0)
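
Why lowering fmin can help: assuming librosa's spectral_contrast builds octave bands with lower edges fmin, 2*fmin, 4*fmin and so on, and raises this error when the highest lower edge, fmin * 2**(n_bands - 1), reaches the Nyquist frequency sr / 2, the arithmetic can be checked without librosa. The sketch below reproduces only that arithmetic under this assumption, which may differ between librosa versions. The defaults fmin=200.0 and n_bands=6 are librosa's documented defaults; the helper names are hypothetical.

```python
def highest_band_edge(fmin=200.0, n_bands=6):
    """Highest lower band edge, assuming octave-spaced bands from fmin."""
    return fmin * 2 ** (n_bands - 1)


def bands_fit(sr, fmin=200.0, n_bands=6):
    """True if every assumed band edge lies below the Nyquist frequency."""
    return highest_band_edge(fmin, n_bands) < sr / 2


print(highest_band_edge(200.0))        # 6400.0
print(bands_fit(22050, fmin=200.0))    # True:  6400 < 11025
print(bands_fit(12000, fmin=200.0))    # False: 6400 >= 6000
print(bands_fit(12000, fmin=180.0))    # True:  5760 < 6000
```

If a file's sample rate is low enough, fmin=180.0 may still not be sufficient; reducing n_bands or checking the sample rate the file was loaded with are the other options the error message hints at.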