monocongo / cvdata
Tools for creating and manipulating computer vision datasets
License: MIT License
Add Darknet as a supported output annotation format (currently only PASCAL VOC is supported).
See here for details.
Make the KITTI IDs file argument optional when converting to KITTI format. Currently the argument is optional at the CLI level, but when converting to KITTI it's expected to be present. In some cases it's not important to have a file containing the file IDs corresponding to a dataset with KITTI annotations, so we should allow for this use case without requiring a KITTI IDs file argument.
Create a script that allows for downloading one or more image classes from OpenImages with annotations in various formats (initially in PASCAL VOC).
Create a function within convert.py that will convert PNG image files to JPG format and update any associated annotation files that reference the file by name/extension (such as PASCAL).
Add the ability to use the convert.py module to perform bulk conversion of images from PNG to JPG format.
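A minimal sketch of the annotation-update side of the PNG-to-JPG conversion described above, assuming PASCAL VOC XML annotations with `<filename>` and `<path>` elements (the function name is an assumption, not the actual convert.py API):

```python
import xml.etree.ElementTree as ET

def update_pascal_filename(xml_path: str, new_extension: str = ".jpg") -> str:
    """Rewrite the filename/path elements of a PASCAL VOC annotation so they
    reference the image with a new extension (e.g. after a PNG-to-JPG
    conversion). Returns the updated image file name."""
    tree = ET.parse(xml_path)
    root = tree.getroot()
    new_name = ""
    for tag in ("filename", "path"):
        element = root.find(tag)
        if element is not None and element.text:
            # swap the extension while preserving the stem (and any directory path)
            stem, _, _ = element.text.rpartition(".")
            element.text = stem + new_extension
            if tag == "filename":
                new_name = element.text
    tree.write(xml_path)
    return new_name
```

The image pixels themselves would be converted separately (e.g. with Pillow); this handles only the annotation bookkeeping.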
In order to remove labels/bounding boxes from annotations, add an optional argument to the CLI and corresponding functions of the clean.py module to specify the labels to be removed.
Build and upload to PyPI:
$ rm -rf dist
$ python setup.py sdist bdist_wheel
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
$ twine upload dist/*
Create a script/module for splitting a dataset into train/valid/test subsets.
Currently, the exclusion filter module exclude.py expects arguments for both image and annotation directories. Modify the code so we can exclude image files only.
Follow the instructions here.
In order to avoid duplicate downloads of OpenImages CSV files, we can allow a cache directory to be specified, which is where we'll first look for the various CSV files used when downloading from OpenImages. If the directory and/or CSV files don't exist, they should be created/downloaded.
Add a function to convert.py to perform conversion of annotations from KITTI to Darknet format.
Add a module with CLI for locating and removing duplicate images and associated annotations.
The current script resizes images and associated annotations. Make the script accept only an images directory, to specify that only images should be resized and no associated annotations modified.
Create a function and corresponding CLI option(s) for the conversion of datasets annotated in KITTI and/or PASCAL format to TFRecords that are suitable as input for TensorFlow object detection models. This will likely involve adding functions to convert.py, e.g. pascal_to_tfrecord() and/or kitti_to_tfrecord().
Create a module with CLI for the conversion of dataset annotation files to/from various formats. Formats we intend to initially support are PASCAL VOC, Darknet, COCO, and TFRecord.
Add a module with CLI for filtering a dataset to contain only a certain set of class labels.
An example use case:
A dataset with a lopsided distribution of annotated images:
dog: 80k images/annotations
cat: 10k images/annotations
panda: 3k images/annotations
We'd like to have a more evenly distributed dataset, so to filter the dataset down to a maximum of 5000 annotated images (bounding boxes) per image class we will provide a module with a CLI like so:
$ python filter.py --format kitti --src_images /data/original/images \
> --src_annotations /data/original/kitti \
> --dest_images /data/filtered/images \
> --dest_annotations /data/filtered/kitti \
> --boxes_per_class dog:5000 cat:5000 panda:5000
We might instead have a single maximum-count argument, but the boxes-per-class approach above allows for finer-grained filtering, in case we want different numbers of boxes per class or only a subset of the available image classes.
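The boxes-per-class arguments above could be parsed with a small helper along these lines (a sketch; the function name and error handling are assumptions):

```python
def parse_boxes_per_class(pairs):
    """Parse CLI values such as ["dog:5000", "cat:5000"] into a
    {label: max_box_count} dictionary."""
    counts = {}
    for pair in pairs:
        label, _, count = pair.partition(":")
        if not label or not count.isdigit():
            raise ValueError(f"invalid class:count argument: {pair!r}")
        counts[label] = int(count)
    return counts
```

With argparse this would pair naturally with `nargs="+"` on the `--boxes_per_class` option.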
The exclusion module is missing any mention in the README. Add a basic usage section for this module's CLI.
To use the cvdata.split module we can invoke it from the CLI like so:
$ cd ${CVDATA_HOME}
$ python cvdata/split.py --annotations_dir /data/rifle/kitti/label_2 \
> --images_dir /data/rifle/kitti/image_2 \
> --train_annotations_dir /data/rifle/split/kitti/trainval/label_2 \
> --train_images_dir /data/rifle/split/kitti/trainval/image_2 \
> --val_annotations_dir /data/rifle/split/kitti/trainval/label_2 \
> --val_images_dir /data/rifle/split/kitti/trainval/image_2 \
> --test_annotations_dir /data/rifle/split/kitti/test/label_2 \
> --test_images_dir /data/rifle/split/kitti/test/image_2 \
> --format kitti --copy
Add script/module for analysis of a dataset.
Add support for running tests using tox.
Our visualization for TFRecords is utilizing a non-standard format used by the NVIDIA Transfer Learning Toolkit. We'll add the ability to visualize TFRecords that are in the standard format used for input to TensorFlow object detection models (described here).
Rename create_split_files to create_split_files_darknet, since it's only applicable for Darknet-related datasets.
Consolidate the --copy and --no_copy arguments into a single --move argument which defaults to false (i.e. only move files if --move is specified).
As with the OpenImages module (cvdata.openimages), we should provide a module that allows for retrieval of images and corresponding annotations from the YouTube-BoundingBoxes dataset. This will likely involve the extraction of frames from video clips with ffmpeg (some basic tips here).
Within the module clean.py we're checking the min/max bounding box values for Darknet using min/max X/Y values, as with KITTI/PASCAL. Instead, we need to modify it to use the actual Darknet bounding box representation of center X/Y and box width/height.
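For reference, Darknet boxes are (center X, center Y, width, height), all expressed as fractions of the image dimensions; a validity check would look something like this sketch (the function name is an assumption):

```python
def darknet_box_is_valid(center_x, center_y, width, height):
    """Validate a Darknet-format bounding box: center and size are
    fractions of the image dimensions, and the box edges derived from
    them must stay within the [0, 1] image extent."""
    if not (0.0 < width <= 1.0 and 0.0 < height <= 1.0):
        return False
    # derive the edges from the center/size representation
    min_x = center_x - (width / 2)
    min_y = center_y - (height / 2)
    max_x = center_x + (width / 2)
    max_y = center_y + (height / 2)
    return min_x >= 0.0 and min_y >= 0.0 and max_x <= 1.0 and max_y <= 1.0
```

The point of the issue is that comparing raw center X/Y values against image bounds (as for KITTI/PASCAL corner coordinates) misses boxes whose edges fall outside the image.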
Add KITTI as a supported output annotation format (currently only PASCAL VOC is supported).
Create a module and script for resizing images and corresponding annotation files. Preserve aspect ratio and pad on right/bottom (initially with zeros).
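The resize geometry described above can be sketched in pure arithmetic (scale to fit, then pad on the right/bottom); the actual pixel work would use OpenCV or Pillow, and the function name here is an assumption:

```python
def fit_dimensions(orig_width, orig_height, target_width, target_height):
    """Compute the scaled size that fits inside the target while preserving
    aspect ratio, plus the right/bottom padding (in pixels) needed to reach
    the exact target size."""
    scale = min(target_width / orig_width, target_height / orig_height)
    scaled_w = int(round(orig_width * scale))
    scaled_h = int(round(orig_height * scale))
    pad_right = target_width - scaled_w
    pad_bottom = target_height - scaled_h
    return scaled_w, scaled_h, pad_right, pad_bottom
```

Because the padding is applied only on the right/bottom, the image origin is unchanged, so annotation coordinates only need multiplying by the same scale factor.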
Create a module and script for renaming labels in annotation files.
Add the ability to visualize a dataset downloaded from OpenImages and/or a dataset in the OpenImages CSV format resulting from a call to one of the functions of cvdata/convert.py (i.e. *_to_openimages()).
Currently, there is a default 0.7/0.2/0.1 train/valid/test split for several conversion functions. Make this optional so as to facilitate using the conversion functions on datasets that are already split.
Provide a script/module that allows a user to apply a list of files to be excluded from a dataset. Use CLI arguments for the annotation format, the directory for images, and associated annotations (either directory or CSV file).
Add an option such as --limit to the openimages.py CLI and associated module functions to limit the number of images downloaded from the OpenImages dataset.
The Darknet labels file is being rewritten at every file iteration; it needs to move out of the inner (per-file) loop.
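The fix is a standard loop-hoisting change; a sketch of the intended shape (the names here are assumptions, not the actual clean.py code):

```python
def write_darknet_labels(labels, labels_path, annotation_files, process_file):
    """Write the Darknet labels file once, before iterating over the
    per-image annotation files, instead of rewriting it per file."""
    # hoisted out of the loop: the labels file is written exactly once
    with open(labels_path, "w") as labels_file:
        for label in labels:
            labels_file.write(label + "\n")
    # the per-file work no longer touches the labels file
    for annotation_file in annotation_files:
        process_file(annotation_file)
```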
Add a recipe script for creating a dataset including people (person image class), vehicles (Car and Truck image classes), and weapons (Handgun, Rifle, and Shotgun label classes) from OpenImages, plus a handgun dataset from the University of Granada for weapons ("handgun" class). Optionally include images and labels from custom directory locations.
Follow the steps laid out here.
Somehow the tests for resizing have gone bad. Fix these (and try to determine where/how they went off the rails).
Create a dataset API that allows a user to specify
Extend the PyTorch Dataset class; maintain dataset info (bounding boxes, file locations, etc.) internally using a pandas DataFrame; read/write info in CSV files similar to OpenImages, etc.
Update the version in setup.py and __init__.py, then build and upload to PyPI:
$ rm -rf dist
$ python setup.py sdist bdist_wheel
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
$ twine upload dist/*
The current CLI for the split module is tailored for datasets (i.e. collections of images and corresponding annotations). We will extend the CLI to allow for splits of images only, essentially triggered by an absence of annotation-related CLI arguments.
Create a visualization tool to display images with bounding boxes to facilitate validation of files.
In order to bulk rename files, we will add a module that can get all files in a directory and rename them using a prefix and enumeration. The initial use case is when we have a directory full of image files with disparate names and we want them to all be uniformly named, i.e. abc_000000.jpg, abc_000001.jpg, abc_000002.jpg, etc.
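A sketch of the renaming logic (building the old-to-new mapping first keeps the actual os.rename step trivial and easy to dry-run; the function name is an assumption):

```python
import os

def build_rename_map(directory, prefix, digits=6):
    """Map each file in the directory to a uniformly enumerated name such
    as abc_000000.jpg, preserving the original extension. Files are sorted
    so the enumeration is deterministic."""
    rename_map = {}
    for index, name in enumerate(sorted(os.listdir(directory))):
        _, extension = os.path.splitext(name)
        rename_map[name] = f"{prefix}_{index:0{digits}d}{extension}"
    return rename_map
```

Applying the mapping is then a loop of `os.rename(os.path.join(d, old), os.path.join(d, new))`, and any sidecar annotation files could be renamed from the same mapping.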
To facilitate training an SSD/MobileNet model, we should provide a script to convert a dataset to the OpenImages CSV format and splits suggested by the open_images_downloader.py script used in the example for retraining the model on data from the OpenImages dataset.
The result should be three separate directories for the training/validation/testing images (default to a 70/20/10 split) and three corresponding CSV files with the following columns header:
ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,id,ClassName
For example:
<data_dir>
    test
    train
    validation
    sub-test-annotations-bbox.csv
    sub-train-annotations-bbox.csv
    sub-validation-annotations-bbox.csv
This is closely related to issue #3, and as such we may provide this as a conversion function such as convert_pascal_to_openimages(images_dir, pascal_dir, openimages_dir, copy_images=True).
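Writing one of those CSV files with the column header shown above is straightforward with the stdlib csv module (a sketch; the function name is an assumption, and the row content would come from whatever annotations are being converted):

```python
import csv

OPENIMAGES_COLUMNS = [
    "ImageID", "Source", "LabelName", "Confidence",
    "XMin", "XMax", "YMin", "YMax",
    "IsOccluded", "IsTruncated", "IsGroupOf",
    "IsDepiction", "IsInside", "id", "ClassName",
]

def write_openimages_csv(csv_path, rows):
    """Write bounding box rows (dictionaries keyed by column name) to a
    CSV file with the OpenImages-style header; missing keys are left blank."""
    with open(csv_path, "w", newline="") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=OPENIMAGES_COLUMNS, restval="")
        writer.writeheader()
        writer.writerows(rows)
```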
Add a BibTeX entry in the README and documentation. For example:
@misc{climate_indices,
author = "James Adams",
title = "climate_indices, an open source Python library providing reference implementations of commonly used climate indices",
url = "https://github.com/monocongo/climate_indices",
month = "may",
year = "2017--"
}
Add a module with CLI for fetching images using the Google Images Download package. We can possibly create corresponding annotations using a pre-trained object detection model such as YOLOv3 etc. if the model has been trained to detect the class of object we're interested in annotating in the images. This script is one example of what can be used for this annotation step.
It's not really practical to have exclusions within separate files, so we'll consolidate them into a single exclusions.txt file.
The current implementation requires a prototxt file to be present defining the labels in the dataset to be converted to TFRecord. Instead of this, we'll add the ability to generate this file based on the labels present in the annotations.
Create an option in the cvdata/clean.py script for moving all image files within an image directory to another directory if there is no corresponding annotation file within the specified annotations directory.
TFRecord files can come in various formats depending on how they're written (there's no standard layout from what I can tell). We'll first provide support for the TFRecord files created by the NVIDIA TLT framework since those are the ones we're typically using day-to-day.