datacraft-ai / agml

This project forked from project-agml/agml

AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.

License: Apache License 2.0

Shell 0.29% C++ 1.07% Python 98.46% CMake 0.18%

agml's Introduction

Want to join the AI Institute for Food Systems team and help lead AgML development?

We're looking to hire a postdoc with both Python library development and ML experience. Send your resume and GitHub profile link to [email protected]!


Overview

AgML is a comprehensive library for agricultural machine learning. Currently, AgML provides access to a wealth of public agricultural datasets for common agricultural deep learning tasks. In the future, AgML will provide ag-specific ML functionality related to data, training, and evaluation. Here's a conceptual diagram of the overall framework.

[Figure: conceptual diagram of the AgML framework]

AgML supports both the TensorFlow and PyTorch machine learning frameworks.

Installation

To install the latest release of AgML, run the following command:

pip install agml

Quick Start

AgML is designed for easy usage of agricultural data in a variety of formats. You can start off by using the AgMLDataLoader to download and load a dataset into a container:

import agml

loader = agml.data.AgMLDataLoader('apple_flower_segmentation')

You can then use the in-built processing methods to get the loader ready for your training and evaluation pipelines. This includes, but is not limited to, batching data, shuffling data, splitting data into training, validation, and test sets, and applying transforms.

import albumentations as A

# Batch the dataset into collections of 8 pieces of data:
loader.batch(8)

# Shuffle the data:
loader.shuffle()

# Apply transforms to the input images and output annotation masks:
loader.mask_to_channel_basis()
loader.transform(
    transform = A.RandomContrast(),
    dual_transform = A.Compose([A.RandomRotate90()])
)

# Split the data into train/val/test sets.
loader.split(train = 0.8, val = 0.1, test = 0.1)

The split datasets can be accessed using loader.train_data, loader.val_data, and loader.test_data. Any further processing applied to the main loader is also applied to the split datasets until the split attributes are accessed; after that point, you need to apply processing independently to each of the loaders. You can also toggle processing on and off using the loader.eval(), loader.reset_preprocessing(), and loader.disable_preprocessing() methods.
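To make the 0.8/0.1/0.1 proportions above concrete, here is a minimal sketch of how a proportional split could partition a dataset's indices. It is not AgML's implementation: `split_indices`, its `seed` parameter, and the choice to give the rounding remainder to the test set are all assumptions made for illustration.

```python
import random


def split_indices(n_items, train=0.8, val=0.1, test=0.1, seed=0):
    """Partition range(n_items) into train/val/test index lists by proportion.

    Hypothetical helper for illustration only; not part of AgML.
    """
    if abs(train + val + test - 1.0) > 1e-9:
        raise ValueError("split proportions must sum to 1")
    indices = list(range(n_items))
    # Shuffle with a fixed seed so each split is a reproducible random sample.
    random.Random(seed).shuffle(indices)
    n_train = int(n_items * train)
    n_val = int(n_items * val)
    # The test split takes the remainder, so no item is lost to rounding.
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


# 148 is the image count listed for apple_flower_segmentation.
train_idx, val_idx, test_idx = split_indices(148)
print(len(train_idx), len(val_idx), len(test_idx))  # 118 14 16
```

Because 148 does not divide evenly, the three sizes are only approximately 80/10/10; a real loader may round differently.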

You can visualize data using the agml.viz module, which supports multiple different types of visualization for different data types:

# Disable processing and batching for the test data:
test_ds = loader.test_data
test_ds.batch(None)
test_ds.reset_preprocessing()

# Visualize the image and mask side-by-side:
agml.viz.visualize_image_and_mask(test_ds[0])

# Visualize the mask overlaid onto the image:
agml.viz.visualize_overlaid_masks(test_ds[0])

AgML supports both TensorFlow and PyTorch as backends, and can export your loaders to native TensorFlow and PyTorch formats when you want to use them in a training pipeline. You can either export the AgMLDataLoader to a tf.data.Dataset or torch.utils.data.DataLoader, or convert the data internally within the AgMLDataLoader itself, which keeps its core functionality available.

# Export the loader as a `tf.data.Dataset`:
train_ds = loader.train_data.export_tensorflow()

# Convert to PyTorch tensors without exporting.
train_ds = loader.train_data
train_ds.as_torch_dataset()

You're now ready to use AgML for training your own models!

Public Dataset Listing

| Dataset | Task | Number of Images |
|---|---|---|
| bean_disease_uganda | Image Classification | 1295 |
| carrot_weeds_germany | Semantic Segmentation | 60 |
| plant_seedlings_aarhus | Image Classification | 5539 |
| soybean_weed_uav_brazil | Image Classification | 15336 |
| sugarcane_damage_usa | Image Classification | 153 |
| crop_weeds_greece | Image Classification | 508 |
| sugarbeet_weed_segmentation | Semantic Segmentation | 1931 |
| rangeland_weeds_australia | Image Classification | 17509 |
| fruit_detection_worldwide | Object Detection | 565 |
| leaf_counting_denmark | Image Classification | 9372 |
| apple_detection_usa | Object Detection | 2290 |
| mango_detection_australia | Object Detection | 1730 |
| apple_flower_segmentation | Semantic Segmentation | 148 |
| apple_segmentation_minnesota | Semantic Segmentation | 670 |
| rice_seedling_segmentation | Semantic Segmentation | 224 |
| plant_village_classification | Image Classification | 55448 |
| autonomous_greenhouse_regression | Image Regression | 389 |
| grape_detection_syntheticday | Object Detection | 448 |
| grape_detection_californiaday | Object Detection | 126 |
| grape_detection_californianight | Object Detection | 150 |
| guava_disease_pakistan | Image Classification | 306 |
| apple_detection_spain | Object Detection | 967 |
| apple_detection_drone_brazil | Object Detection | 689 |
| plant_doc_classification | Image Classification | 2598 |
| plant_doc_detection | Object Detection | 2598 |
| wheat_head_counting | Object Detection | 6512 |
| peachpear_flower_segmentation | Semantic Segmentation | 42 |
| red_grapes_and_leaves_segmentation | Semantic Segmentation | 258 |
| white_grapes_and_leaves_segmentation | Semantic Segmentation | 273 |
| ghai_romaine_detection | Object Detection | 500 |
| ghai_green_cabbage_detection | Object Detection | 500 |
| ghai_iceberg_lettuce_detection | Object Detection | 500 |
| riseholme_strawberry_classification_2021 | Image Classification | 3520 |
| ghai_broccoli_detection | Object Detection | 500 |
| bean_synthetic_earlygrowth_aerial | Semantic Segmentation | 2500 |
| ghai_strawberry_fruit_detection | Object Detection | 500 |

Usage Information

Using Public Agricultural Data

AgML aims to provide easy access to a range of existing public agricultural datasets. The core of AgML's public data pipeline is AgMLDataLoader. You can use the AgMLDataLoader or agml.data.download_public_dataset() to download a dataset locally, after which it is loaded automatically from disk on future runs. From there, the data within the loader can be split into train/val/test sets, batched, augmented and transformed, and converted into a training-ready dataset (including batching, tensor conversion, and image formatting).
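The "download once, then load from disk" behavior described above follows a common caching pattern. The sketch below illustrates that pattern only; it does not show how AgML implements it. `load_or_download` and `fake_download` are hypothetical names, and the download step is replaced by a stand-in that just creates a directory.

```python
import tempfile
from pathlib import Path


def load_or_download(name, root, download):
    """Return the local dataset directory, calling `download` only if it is missing.

    Hypothetical helper for illustration only; not part of AgML.
    """
    target = Path(root) / name
    if not target.exists():
        download(target)  # first run: fetch the data to disk
    return target         # later runs: reuse what is already there


calls = []


def fake_download(target):
    """Stand-in for a real download: record the call and create the directory."""
    calls.append(target.name)
    target.mkdir(parents=True)


with tempfile.TemporaryDirectory() as root:
    first = load_or_download("bean_disease_uganda", root, fake_download)
    second = load_or_download("bean_disease_uganda", root, fake_download)
    print(first == second, len(calls))  # True 1
```

The second call returns the same path without triggering another download, which is the property that makes repeated runs cheap.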

To see the various ways in which you can use AgML datasets in your training pipelines, check out the example notebook.

Annotation Formats

A core aim of AgML is to provide datasets in a standardized format, so that multiple datasets can be combined into a single training pipeline. To this end, we provide annotations in the following formats:

  • Image Classification: Image-To-Label-Number
  • Object Detection: COCO JSON
  • Semantic Segmentation: Dense Pixel-Wise
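As a rough picture of what the three formats listed above look like as data, here is a small sketch using plain Python structures. The file names, class names, and values are invented for illustration, and the exact on-disk layout AgML uses may differ. The COCO fields shown (`images`, `annotations`, `categories`, and `bbox` as `[x, y, width, height]`) follow the standard COCO detection format.

```python
# Image classification: each image maps to an integer label.
# (File names and label values are invented for illustration.)
classification = {"image_001.jpg": 0, "image_002.jpg": 2}

# Object detection: COCO JSON keeps images, annotations, and categories in
# separate lists linked by ids; each bbox is [x, y, width, height] in pixels.
coco = {
    "images": [{"id": 1, "file_name": "image_001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "apple"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 50, 40], "area": 2000, "iscrowd": 0},
    ],
}

# Semantic segmentation: a dense mask stores one class id per pixel
# (here 0 = background, 1 = a foreground class) at the image's resolution.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]


def bbox_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def pixels_per_class(dense_mask):
    """Count how many pixels in a dense mask belong to each class id."""
    counts = {}
    for row in dense_mask:
        for class_id in row:
            counts[class_id] = counts.get(class_id, 0) + 1
    return counts


print(bbox_to_corners(coco["annotations"][0]["bbox"]))  # (100, 120, 150, 160)
print(pixels_per_class(mask))  # {0: 6, 1: 6}
```

Keeping every dataset of a given task in the same structure is what allows the same helper, such as `bbox_to_corners` here, to work across datasets without per-dataset parsing.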

Contributions

We welcome contributions! If you would like to contribute a new feature, fix an issue that you've noticed, or even just mention a bug or feature that you would like to see implemented, please don't hesitate to use the Issues tab to bring it to our attention. See the contributing guidelines for more information.

Funding

This project is partly funded by the [National AI Institute for Food Systems (AIFS)](https://aifs.ucdavis.ed

agml's People

Contributors

amogh7joshi, masonearles, heesup, alexolenskyj, dariojavo, github-actions[bot], pranav-raja-scale, ctyeong, momtanu-ag
