ZooniverseData

This repository contains two Jupyter notebooks and the datasets described in the arXiv paper "HumBug Zooniverse: a crowd-sourced acoustic mosquito dataset".

The notebooks folder contains two files:

metadata.ipynb shows how to process and visualise the full dataset using the labels in labels/coarse_data_2sec.csv, with a breakdown of the recording sources and the number of labels obtained from crowdsourcing on Zooniverse.

baseline.ipynb details an example use, in which labels from the overlapping 2 second audio recordings are aggregated to form the dataset audio_1sec in data, with its corresponding labels audio_1sec.csv in labels. The baseline is a simple convolutional neural network that classifies log-mel features computed with librosa. We supply a simple cross-entropy weighting to account for class imbalance.
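As an illustration of what a cross-entropy weighting for class imbalance can look like, here is a minimal sketch that derives per-class weights from label frequencies. It uses the common "balanced" heuristic (n_samples / (n_classes * class_count)), which is an assumption and not necessarily the scheme used in baseline.ipynb; the function name balanced_class_weights and the toy label list are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a {class: weight} dict that up-weights rare classes.

    Assumed heuristic: weight = n_samples / (n_classes * class_count).
    The notebook's own weighting may differ.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy example: 8 "no mosquito" (0) segments and 2 "mosquito" (1) segments.
toy_labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(toy_labels)
print(weights)
```

A dictionary of this shape can typically be passed to Keras through the class_weight argument of model.fit, so that errors on the rarer class contribute more to the loss.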

We strongly recommend using the processed audio in non-overlapping 1 second segments, as the aggregation has been performed for the four recording groups. The votes supplied in audio_1sec.csv are given in the categories {yes, no, not_sure}.
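If a single label per segment is needed, the vote counts can be reduced to one category. The sketch below is one possible rule, not the aggregation used by the authors: it takes the category with the most votes and falls back to not_sure on a tie or when there are no votes. The function name majority_label and the dictionary layout are hypothetical; the actual column names in audio_1sec.csv should be checked before adapting it.

```python
def majority_label(votes):
    """Pick the category with the most votes from a {category: count} dict.

    Assumed rule: ties and empty vote sets resolve to "not_sure".
    """
    if not votes or max(votes.values()) == 0:
        return "not_sure"
    best = max(votes.values())
    winners = [category for category, count in votes.items() if count == best]
    return winners[0] if len(winners) == 1 else "not_sure"


# Hypothetical vote counts for three segments.
print(majority_label({"yes": 3, "no": 1, "not_sure": 0}))
print(majority_label({"yes": 2, "no": 2, "not_sure": 0}))
print(majority_label({"yes": 0, "no": 0, "not_sure": 0}))
```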

The data is available to download at http://humbug.ac.uk/public/Zooniverse_audio_1sec.zip for the 1 second segments, and at http://humbug.ac.uk/public/Zooniverse_audio_2sec.zip for the overlapping original data. The wave files should be extracted to create the paths: ZooniverseData/data/audio_1sec and ZooniverseData/data/Zoo_segment.
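Once an archive has been downloaded, the extraction step could be scripted along these lines. This is a minimal sketch using only the standard library; the helper name extract_archive is hypothetical, and whether the archives contain a top-level folder is not stated here, so the resulting layout should be checked against the paths given above.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract a zip archive into target_dir and list the .wav files found there."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Search recursively in case the archive nests files in a subfolder.
    return sorted(target.rglob("*.wav"))


# Example call (assumes the archive was already downloaded to the working directory):
# wav_files = extract_archive("Zooniverse_audio_1sec.zip", "ZooniverseData/data/audio_1sec")
# print(len(wav_files), "wave files extracted")
```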

The required packages for reproducing the code are given in the first cell of each notebook. This code has been tested in:

Windows 10, Anaconda3 5.2.0, Python 3.6, and keras-gpu 2.3.1 (installed via conda install -c anaconda keras-gpu).
