Dataset Governance Policy (DGP)

To ensure the traceability, reproducibility, and standardization of all ML datasets and models generated and consumed within Toyota Research Institute (TRI), we developed the Dataset Governance Policy (DGP), which codifies the schema and maintenance of all of TRI's Autonomous Vehicle (AV) datasets.

Components

  • Schema: Protobuf-based schemas for raw data, annotations and dataset management.
  • DataLoaders: Universal PyTorch DatasetClass to load all DGP-compliant datasets.
  • CLI: Main CLI for handling DGP datasets and the entrypoint of visualization tools.

Getting Started

Please see Getting Started for environment setup.

Getting started is as simple as initializing a dataset class with the relevant dataset JSON, raw data sensor names, annotation types, and split information. Below is an example of initializing a PyTorch dataset for multi-modal learning from 2D and 3D bounding boxes.

from dgp.datasets import SynchronizedSceneDataset

# Load synchronized pairs of camera and lidar frames, with 2d and 3d
# bounding box annotations.
dataset = SynchronizedSceneDataset('<dataset_name>_v0.0.json',
    datum_names=('camera_01', 'lidar'),
    requested_annotations=('bounding_box_2d', 'bounding_box_3d'),
    split='train')
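
The four inputs named above (dataset JSON, sensor names, annotation types, and split) can be gathered and checked in one place before the dataset is constructed. The following is a minimal sketch, not part of DGP: `DatasetRequest`, its field names, its checks, and `load` are hypothetical helpers introduced here for illustration. Only the `SynchronizedSceneDataset` call itself mirrors the example above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DatasetRequest:
    """Hypothetical helper (not part of DGP) holding the four inputs."""
    scene_dataset_json: str
    datum_names: Tuple[str, ...]
    requested_annotations: Tuple[str, ...]
    split: str

    def validate(self) -> None:
        # These checks are illustrative assumptions, not DGP rules.
        if not self.scene_dataset_json.endswith('.json'):
            raise ValueError('expected a dataset JSON path')
        if not self.datum_names:
            raise ValueError('at least one datum name is required')
        if not self.split:
            raise ValueError('a split name is required')

    def to_args(self):
        """Return (positional, keyword) arguments mirroring the call above."""
        self.validate()
        return (self.scene_dataset_json,), {
            'datum_names': self.datum_names,
            'requested_annotations': self.requested_annotations,
            'split': self.split,
        }


def load(request: DatasetRequest):
    # Imported lazily so the helper can be inspected without DGP installed.
    from dgp.datasets import SynchronizedSceneDataset
    args, kwargs = request.to_args()
    return SynchronizedSceneDataset(*args, **kwargs)


request = DatasetRequest(
    scene_dataset_json='<dataset_name>_v0.0.json',
    datum_names=('camera_01', 'lidar'),
    requested_annotations=('bounding_box_2d', 'bounding_box_3d'),
    split='train',
)
args, kwargs = request.to_args()
```

Calling `load(request)` would then build the same dataset as the example above, provided DGP is installed and the placeholder JSON path is replaced with a real one.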

Examples

A list of starter scripts is provided in the examples directory.

  • examples/load_dataset.py: Simple example script to load a multi-modal dataset based on the Getting Started section above.

Build and run tests

You can build the base Docker image and run the tests within a Docker container via:

make docker-build
make docker-run-tests

Contributing

We appreciate all contributions to DGP! To learn more about making a contribution to DGP, please see Contribution Guidelines.

CI Ecosystem

  • docker-build: Docker build and push to container registry
  • pre-merge: Pre-merge testing
  • doc-gen: GitHub Pages doc generation
  • coverage: Code coverage metrics and badge generation

💬 Where to file bug reports

  • 🚨 Bug Reports: GitHub Issue Tracker
  • 🎁 Feature Requests: GitHub Issue Tracker

👩‍💻 The Team 👨‍💻

DGP is developed and currently maintained by Quincy Chen, Arjun Bhargava, Chao Fang, Chris Ochoa, and Kuan-Hui Lee from the ML-Engineering team at Toyota Research Institute (TRI), with contributions from the ML-Research team at TRI, Woven Planet, and Parallel Domain.
