BOTANY 2022 Workshop - How to train an object classifier

Training an object classifier

In this module we will learn how to train a simple object classifier, which is a machine learning network capable of assigning a class prediction to an entire image. This type of ML network is useful for many tasks including filtering large datasets and is often one component of more complex software.
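As a rough illustration of what "assigning a class prediction to an entire image" means, the sketch below turns one list of raw per-class scores into a single predicted label with a probability. The class names and scores are made up for illustration; in the workshop, the scores would come from the trained network, and this is not the notebook's actual code.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores, class_names):
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical class names and scores, for illustration only.
class_names = ["ruler_type_a", "ruler_type_b", "fail"]
label, confidence = predict_class([2.0, 0.5, -1.0], class_names)
print(label, round(confidence, 3))
```

The whole image receives exactly one label, which is what distinguishes a classifier from the detection and segmentation networks described next.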

Other types of ML networks can be used for other tasks: object detectors place bounding boxes or polygons around items, semantic segmentation identifies the pixels of an image that correspond to each desired object class, and panoptic segmentation identifies the pixels of an image that correspond to each instance of a desired object class and can even learn object associations.

Here we will train an object classifier to predict the class for a set of rulers that are commonly present in herbarium specimen vouchers. We have a directory filled with ruler images that are already grouped into 22 classes, including a “fail” class for non-ruler objects. These are our ground truth training data. For simplicity, we will not split our training data into train/validation/test groups in this workshop. This is nevertheless a crucial step of ML development, and it is standard practice to partition annotated data into train/validation/test groups in a roughly 70/15/15 or 60/20/20 ratio.
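Although the workshop skips it, a train/validation/test partition can be sketched in a few lines. The example below is a minimal sketch using only the Python standard library; the function name `split_dataset`, the file names, and the fixed seed are assumptions made for illustration and are not part of the workshop code.

```python
import random

def split_dataset(items, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle items and split them into train/validation/test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# Hypothetical file names standing in for the images of one ruler class.
images = [f"ruler_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

Applying the split separately within each class folder keeps rare classes represented in all three groups.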

Getting Started

Here is a link to the Colab notebook: Open In Colab

As soon as you open the link, make a copy of the notebook in your own account so that you can edit and save any changes to the code.

Please run the first 4 code blocks as soon as you open your copy of the notebook; installing the required packages and downloading the data will take a few minutes.

Once you open the Colab notebook and download the data, you will see three different training datasets: tiny, small, and large. For the workshop, please use only the tiny dataset because training on more images will take too much time. We will also need to switch the default CPU runtime in our Colab notebook to a GPU runtime. Training on a GPU is roughly 5 times faster than training on a CPU.
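In Colab the runtime is typically changed through the Runtime > Change runtime type menu. After switching, a quick check like the one below can confirm that a GPU is actually visible. This is a minimal sketch that assumes PyTorch is the framework in use (as it is for the local setup); the helper name `pick_device` is hypothetical.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumed to be installed by the notebook's setup cells
    except ImportError:
        return "cpu"  # without PyTorch, fall back to reporting the CPU
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Training will run on: {device}")
```

If this still reports `cpu` after changing the runtime type, the notebook may need to be reconnected and the setup cells run again.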

Trained ML networks are also provided if you want to skip the training step.

Evaluate Model

In the Evaluate Model section, we can choose a trained ML model and see how well it works. You will notice that models trained on the larger datasets perform better. Models that are trained until their performance plateaus will also outperform models whose training was stopped before reaching a plateau.
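One simple way to quantify "how well it works" is accuracy, both overall and per class. The sketch below is not the notebook's evaluation code; the function names and labels are hypothetical and only show the arithmetic.

```python
from collections import Counter

def accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted class matches the ground truth."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

def per_class_accuracy(true_labels, predicted_labels):
    """Accuracy computed separately for each ground-truth class."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: hits[cls] / totals[cls] for cls in totals}

# Hypothetical labels, for illustration only.
truth = ["metric", "metric", "fail", "inch", "fail"]
preds = ["metric", "inch", "fail", "inch", "fail"]
print(accuracy(truth, preds))
print(per_class_accuracy(truth, preds))
```

Per-class accuracy is worth checking alongside the overall number because a model can score well overall while failing on classes that have few examples.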

Running Locally

If you want to modify this code for your own dataset or train it locally, please go to Will Weaver’s GitHub page and clone the “Botany Workshop 2022 - Ruler Classifier” repo. It includes instructions for setting up a Python virtual environment, installing PyTorch, and running the code.

Download these files: Dropbox

Unzip and arrange the files:

File Structure
