birdclef-2021

Bird call identification

See BirdCLEF 2021 - Birdcall Identification Kaggle competition.

Getting started

Install dependencies:

$ pip install -e '.[dev]'

On Linux, you also need to run sudo apt-get install libsndfile1.

Also install TensorFlow if it is not already installed in your environment:

$ pip install -e '.[tf]'

Pull data files

$ dvc pull

Create down-sampled dataset

Create a smaller dataset in the data/ folder:

$ python -m src.sample
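The down-sampling idea can be sketched as follows. This is not the actual `src.sample` implementation, only a minimal illustration that assumes the metadata rows carry a `primary_label` column (an assumption here) and keeps at most a fixed number of rows per label. The function name `sample_rows` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def sample_rows(rows, label_key="primary_label", per_label=2, seed=0):
    """Keep at most `per_label` rows for each label, chosen reproducibly.

    `rows` is a list of dicts (for example from csv.DictReader).
    `label_key` is an assumed column name, not confirmed by the project.
    """
    rng = random.Random(seed)

    # Group rows by their label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Draw a fixed-size sample from each group, in sorted label order
    # so that the result is deterministic for a given seed.
    sampled = []
    for label in sorted(by_label):
        group = by_label[label]
        sampled.extend(rng.sample(group, min(per_label, len(group))))
    return sampled
```

Capping the count per label, rather than sampling rows uniformly, keeps rare species represented in the small dataset.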

Download a subset of files from the Google Cloud Storage bucket:

$ python -m src.download

Train model

$ python -m src.train --model smoke-test --data-dir data --metadata-csv train_metadata_small.csv
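For orientation, the flags in the command above could be parsed along these lines. This is a sketch, not the project's real `src.train` entry point: the helper names `build_parser` and `metadata_path` are hypothetical, and resolving the metadata CSV relative to the data directory is an assumption.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser for the three flags used in the training command."""
    parser = argparse.ArgumentParser(prog="src.train")
    parser.add_argument("--model", required=True, help="Model name, e.g. smoke-test")
    parser.add_argument("--data-dir", type=Path, required=True, help="Folder containing the data")
    parser.add_argument("--metadata-csv", required=True, help="Metadata CSV file name")
    return parser


def metadata_path(args):
    """Resolve the metadata CSV inside the data directory (assumed layout)."""
    return args.data_dir / args.metadata_csv


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"model={args.model} metadata={metadata_path(args)}")
```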

Set up Jupyter

Install kernel:

$ python -m ipykernel install --user --name bird-3.8.1 --display-name "Python (bird-3.8.1)"

Resources

Tips

Downloading data

Create a GCP VM with at least 100 GB of disk space. Give the VM write access to the Google Cloud Storage API.

SSH to the instance using:

$ gcloud compute ssh INSTANCE_NAME

Install pip, tmux and unzip:

$ sudo apt install python3-pip tmux unzip

Install Kaggle CLI:

$ pip3 install kaggle

Create the directory ~/.kaggle on the VM, then transfer kaggle.json from your local machine:

$ scp ~/.kaggle/kaggle.json USERNAME@VM_IP:~/.kaggle/kaggle.json

Create a new tmux session:

$ tmux new -s kimmo

Restrict permissions on the credentials file and download the data to the data/ folder:

$ chmod 600 ~/.kaggle/kaggle.json
$ ./.local/bin/kaggle competitions download birdclef-2021 -p data

Detach from the session with Ctrl+b d and attach with tmux a -t kimmo.

Extract the data and copy it to the Google Cloud Storage bucket bird-clef-kimmo:

$ unzip data/birdclef-2021.zip -d data
$ gsutil -m rsync -r data gs://bird-clef-kimmo/data

List and stop instances:

$ gcloud compute instances list
$ gcloud compute instances stop INSTANCE_NAME

Setting up a development instance in Vertex AI

Create a user-managed notebook in Vertex AI Workbench.

SSH to the instance as the jupyter user:

$ gcloud compute ssh jupyter@bird-explore

Set up the SSH configuration:

$ gcloud compute config-ssh

Change User to jupyter for the host in ~/.ssh/config:

# ~/.ssh/config
Host some-host
  User jupyter

Connecting from VS Code using the SSH host should now log in as the jupyter user, so you can work with files under /home/jupyter and save remotely.

You can also set up port forwarding to localhost with:

$ gcloud compute ssh jupyter@bird-explore -- -N -L 8080:localhost:8080

Setting up Kaggle

Sign in to Kaggle. Follow the instructions to prepare the ~/.kaggle/kaggle.json file.

Working with data

See the Data page.

Download the full 39 GiB dataset:

$ kaggle competitions download birdclef-2021 -p data

Download a single file:

$ kaggle competitions download birdclef-2021 -p data/train_short_audio/acafly -f train_short_audio/acafly/XC109605.ogg
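Note that the target folder passed with -p mirrors the file's directory inside the dataset. When downloading many individual files, that pattern could be generated with a small helper like the one below. It is a sketch: `single_file_command` is a hypothetical name, and it only builds the argument list using the same -p and -f flags as the command above without running anything.

```python
import posixpath
import shlex


def single_file_command(remote_path, competition="birdclef-2021", data_dir="data"):
    """Build the Kaggle CLI arguments for downloading one file.

    The local target folder mirrors the file's folder in the dataset,
    so files from different species do not end up in the same place.
    """
    remote_dir = posixpath.dirname(remote_path)
    target_dir = posixpath.join(data_dir, remote_dir) if remote_dir else data_dir
    return [
        "kaggle", "competitions", "download", competition,
        "-p", target_dir,
        "-f", remote_path,
    ]


if __name__ == "__main__":
    cmd = single_file_command("train_short_audio/acafly/XC109605.ogg")
    # Print the command for inspection; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```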

List all files in CSV format:

$ kaggle competitions files birdclef-2021 --csv
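The CSV listing can be summarized with a few lines of Python, for instance to count files per top-level folder. The sketch below assumes the output has a header row with a `name` column holding the file path; that column name is an assumption, so adjust `name_column` if your CLI version prints something different. The function name `count_by_top_level` is hypothetical.

```python
import csv
import io
from collections import Counter


def count_by_top_level(csv_text, name_column="name"):
    """Count listed files by the first component of their path."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # "train_short_audio/acafly/XC109605.ogg" -> "train_short_audio"
        top_level = row[name_column].split("/", 1)[0]
        counts[top_level] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: kaggle competitions files birdclef-2021 --csv | python this_script.py
    for name, count in count_by_top_level(sys.stdin.read()).most_common():
        print(name, count)
```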

Download train_metadata.csv:

$ kaggle competitions download birdclef-2021 -p data -f train_metadata.csv
$ unzip data/train_metadata.csv.zip -d data

Contributors

ksaaskil
