
orcacnn's People

Contributors

yosoyjay, zer-0-ne


orcacnn's Issues

Deciding on ChunkSize

This would be the best time to decide on the chunkSize, since we'll be starting development of the detection and classification models soon; it's better to settle this properly now.
First of all, let me show the difference between the chunkSizes of 1s, 2s and 3s:

  1. When chunkSize = 1s (shows 3 spectrograms of 1s each):
    (merged spectrogram image: merge_from_ofoct)

  2. When chunkSize = 2s:
    (spectrogram image: Field Recordings NGOS_2008 field acoustic recordings_20080614 E03_1007_0000_0000)

  3. When chunkSize = 3s:
    (spectrogram image: Field Recordings NGOS_2008 field acoustic recordings_20080614 E03_1007_0000_0000)

Why would I choose the 1s chunkSize?

Keeping in mind the further classification and template matching/extraction methods:

  • The first reason is that the calls can be seen distinctly in each spectrogram. There is a much lower chance of calls overlapping across spectrograms for 1s chunks than for 2s or 3s chunks. Since they won't overlap, the calls for the further classification models can be taken directly as the whole spectrogram, without any template-extraction step.
  • Though the 1s chunks are somewhat blurry, the input to a CNN model is often an image of around 224x224, which is smaller than each of the above chunks (around 600x400), so the 1s chunk is still a good choice.
  • The 3s chunks are also a decent choice, but much of their time-frequency area is merely noise; it's better to avoid it than to crop each of them individually.

Template matching also works out well for 1s chunks, as I just tried; we can simply loop over the 1s directory with the template for the given pod.

  • Template (I've just cut out a portion spanning the width of the call, since that should be enough, right?):

  • Matched image result:
    (template-matching result image: temp_match)
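Roughly, the loop could look like the sketch below; this assumes OpenCV is available, the 1s spectrograms live in a directory such as chunks_1s/ and the pod template is saved as template.png (all of these names are placeholders, not the actual paths):

import glob

import cv2

# Placeholder paths; the real template/chunk locations would differ.
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

for path in glob.glob("chunks_1s/*.png"):
    chunk = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Normalized cross-correlation between the pod template and the 1s spectrogram.
    result = cv2.matchTemplate(chunk, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val > 0.6:  # illustrative threshold, would need tuning
        print(f"{path}: possible match at {max_loc} (score {max_val:.2f})")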

So, is it a go for 1s chunks? @yosoyjay
I would like to know if you have any ideas, criticism or suggestions.

Normalization of audio does not produce any significant change

Hi
I am trying to normalize the data that we have, but I see no significant change in the output waveform.
Do we need to consider this when pre-processing our data, or is it enough to just have all the samples downsampled to a particular sampling frequency and a fixed length?

import numpy as np

def audio_norm(data):
    # Min-max normalize the waveform to the range [-0.5, 0.5].
    max_data = np.max(data)
    min_data = np.min(data)
    data = (data - min_data) / (max_data - min_data + 1e-6)
    return data - 0.5

(screenshot: 2019-03-24 20-11-50)
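Just for reference, here is a minimal way to reproduce the check; it assumes librosa is used for loading and sample.wav is a stand-in path. Since audio_norm is only an affine rescale (scale plus offset), the normalized waveform is expected to keep exactly the same shape as the original, which would explain why the plots look the same:

import librosa
import numpy as np

# Stand-in path; any mono recording would do.
data, sr = librosa.load("sample.wav", sr=None)
normed = audio_norm(data)

# An affine rescale preserves the waveform shape, so the correlation
# between the original and the normalized signal is 1.
print(np.corrcoef(data, normed)[0, 1])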

Develop killer whale detection model

The goal here is to develop the killer whale detection model trained on passive acoustic data from SE Alaska and, perhaps, augmented with data from other sources.

  • 1. Develop a model to recognize kw calls
  • 2. Use the model developed in 1. to generate labeled kw samples
    (1 and 2 are likely to be iterative)
  • Research potential sources of humpback whale sounds to improve model discrimination between humpback and kw
  • Develop a test to distinguish between kw and humpback calls

Build model to distinguish between pods in SE Alaska sourced data

The goal is to build and develop a model that can identify, from a given sample call, the pod from which the killer whale call originated.

  1. Develop test and training sets from the SE Alaska call catalogue
  2. Develop the model
  3. Integrate the model into a command-line tool that can be used to label the pod of a set of killer whale calls.

Average predictions from the 4 ML models to find start and end times of orca calls

Currently, there are 4 models uploaded to help in the prediction of orca calls. Some of these are checkpoint models which were saved while training during the GSoC period.

The 4 models offer different accuracy and a different number of predicted 1-second calls for the same unsampled acoustic data. Unfortunately, at this point, there is no single model that works best in all cases.

The idea here is to average the predictions from the base models so that we can be more accurate about the predicted orca calls and their duration of occurrence in the acoustic/input data.

Since the autonomous recordings (the input data) are divided into 1-second chunks, the whole process becomes easier. For example, if 3 of the 4 base models predict that there is an orca call at the 37th second of the input data, we consider there to be a high probability of an orca call at that time. To generalize this, we consider the presence of an orca call to be high-confidence only if two or more of the base models predict it.

Writing all of the predictions to a .csv file this way seems like a good idea at this point. In addition, there can be 2 sections in the csv file:

  • predictions with high probability
  • predictions where further human assistance/intervention is required
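A minimal sketch of this voting-plus-CSV idea, assuming each base model gives a list of 0/1 flags per 1-second chunk (the function names, two-vote threshold and CSV columns below are illustrative, not final):

import csv

def vote_on_chunks(predictions_per_model, min_votes=2):
    # predictions_per_model: one list of 0/1 flags per model, where index i
    # corresponds to the i-th 1-second chunk of the input recording.
    n_chunks = len(predictions_per_model[0])
    results = []
    for i in range(n_chunks):
        votes = sum(preds[i] for preds in predictions_per_model)
        if votes >= min_votes:
            results.append((i, votes, "high probability"))
        elif votes > 0:
            results.append((i, votes, "needs human review"))
    return results

def write_predictions_csv(results, path="orca_call_predictions.csv"):
    # The two "sections" are represented here as a label column; the exact
    # layout of the csv is still open for discussion.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["second", "votes", "section"])
        writer.writerows(results)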

Evaluate model designs

The primary job of this tool is to detect orca in passive acoustic datasets. The tool is a pipeline consisting of stages to: 1) preprocess and standardize input data, 2) apply a model to identify the presence of orca within the dataset, 3) apply additional models to extract additional information about the orca call, including signal type and pod, and 4) create a summary of the pipeline's results indicating the presence of orca along with the time and any additional information.

The issue here is to develop and evaluate models for stage 2 of the pipeline. Different model designs will be described and applied to the dataset developed in #1.

Develop summarizer

The primary job of this tool is to detect orca in passive acoustic datasets. The tool is a pipeline consisting of stages to: 1) preprocess and standardize input data, 2) apply a model to identify the presence of orca within the dataset, 3) apply additional models to extract additional information about the orca call, including signal type and pod, and 4) create a summary of the pipeline's results indicating the presence of orca along with the time and any additional information.

The issue here is to develop a tool that summarizes what was learned in #3 to create stage 4 of the pipeline. The output should encapsulate all of the information extracted from the pipeline and be output in formats amenable to humans (HTML?) and APIs (JSON?).
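Purely as a strawman for the API-facing side, a JSON summary could look something like the snippet below; every field name and value here is an assumption to be discussed, not a settled schema:

import json

# Hypothetical summary for one processed recording; all fields are illustrative.
summary = {
    "recording": "example_recording.wav",
    "detections": [
        {"start_s": 37, "end_s": 38, "confidence": 0.92, "pod": "unknown", "call_type": "unknown"},
    ],
}

with open("summary.json", "w") as f:
    json.dump(summary, f, indent=2)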

Develop data pre-processing methods

The goal here is to develop the code to prepare the data for model development. Existing methods have been prototyped before the start of the project and will form the basis of the methods.

  • Develop methods to standardize data
  • Develop methods to visualize the data
  • Convert exploratory code to a Python script that can be deployed in the backend of a web app
  • Ensure documentation is sufficient for new users to quickly pick up and apply the methods

MBARI data credit

Please credit MBARI in this source code if their sound data was used in conjunction with Dan Olsen's data for validating model performance. This is per the collaboration agreement you have in place with them. Thank you!

Modification in the padding function in preprocessing.py

# Existing behaviour in preprocessing.py: when the clip is longer than
# input_length, a random input_length-sized window is cut out of it.
if len(data) > input_length:
    max_offset = len(data) - input_length
    offset = np.random.randint(max_offset)
    data = data[offset:(input_length + offset)]

For the case when input_length is less than the length of the actual data, wouldn't downsampling be better than chopping the data? Either downsample the signal through averaging or by skipping elements.

Let n = length of the actual data
Let m = the expected length
So the signal is averaged over blocks of roughly n/m elements, as below, to produce an array of size m:

def average(arr, l):
    # Average consecutive blocks of l samples, where l is roughly n // m.
    l = int(l)
    end = l * (len(arr) // l)
    return np.mean(arr[:end].reshape(-1, l), axis=1)

And then pad with some constant for the remaining values if n/m is not an integer?
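To make the idea concrete, a rough sketch of the combined shrink-and-pad step could look like this, assuming data is a 1-D NumPy array and input_length plays the role of m (the helper name shrink_to_length is just illustrative):

import numpy as np

def shrink_to_length(data, input_length):
    # Alternative to random cropping: block-average the signal down to
    # roughly input_length samples, then zero-pad any remainder.
    block = max(1, len(data) // input_length)
    end = block * (len(data) // block)
    shrunk = np.mean(data[:end].reshape(-1, block), axis=1)
    if len(shrunk) < input_length:
        shrunk = np.pad(shrunk, (0, input_length - len(shrunk)), mode="constant")
    return shrunk[:input_length]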

Create input standardizer to preprocess audio data

The primary job of this tool is to detect orca in passive acoustic datasets. The tool is a pipeline consisting of stages to: 1) preprocess and standardize input data, 2) apply a model to identify the presence of orca within the dataset, 3) apply additional models to extract additional information about the orca call, including signal type and pod, and 4) create a summary of the pipeline's results indicating the presence of orca along with the time and any additional information.

The issue here is to develop part 1 of the pipeline: both the format of the standardized data and the tool to create such a dataset.

Develop detection tool

The primary job of this tool is to detect orca in passive acoustic datasets. The tool is a pipeline consisting of stages to: 1) preprocess and standardize input data, 2) apply a model to identify the presence of orca within the dataset, 3) apply additional models to extract additional information about the orca call, including signal type and pod, and 4) create a summary of the pipeline's results indicating the presence of orca along with the time and any additional information.

The issue here is to apply the model(s) developed in #2 to create stage 2 of the pipeline.

Minor typo in README

Noticed that, to be consistent with the surrounding explanation, 67391498.180916010013 should instead be 67391498.180916010313. That makes it consistent with the subsequently stated time of 1:03 am (and 13 seconds).
