Utility script for handling COCO-style data
Requires Python >= 3.9.
git clone [email protected]:hoel-bagard/coco_utils.git --recurse-submodules
TODO
If necessary, merge several COCO JSON files with:
python src/merge_coco.py <path_to_first_json> <path_to_second_json>
python src/merge_coco.py ../data/second_dataset/vott-json-export_brown/coco_annotations.json ../data/second_dataset/vott-json-export_blue/coco_annotations.json ../data/second_dataset/first_dataset/coco_annotations.json
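For reference, the idea behind merging can be sketched as follows. This is a minimal illustration and not the code in `src/merge_coco.py`: the function name `merge_coco` and the choice to match categories by name are assumptions. The point it shows is that image, annotation and category ids must be renumbered so that entries from the two files do not collide.

```python
def merge_coco(first: dict, second: dict) -> dict:
    """Merge two COCO-style dicts, renumbering ids to avoid collisions.

    Hypothetical helper: categories with the same name are treated as
    the same category (an assumption, not necessarily what the repo does).
    """
    merged = {"images": [], "annotations": [], "categories": []}
    category_ids = {}  # category name -> id in the merged dataset

    for dataset in (first, second):
        # Map this dataset's category ids onto the shared, name-based ids.
        cat_map = {}
        for cat in dataset.get("categories", []):
            if cat["name"] not in category_ids:
                category_ids[cat["name"]] = len(category_ids) + 1
                merged["categories"].append({**cat, "id": category_ids[cat["name"]]})
            cat_map[cat["id"]] = category_ids[cat["name"]]

        # Give every image a fresh id and remember the old -> new mapping.
        img_map = {}
        for img in dataset.get("images", []):
            new_id = len(merged["images"]) + 1
            img_map[img["id"]] = new_id
            merged["images"].append({**img, "id": new_id})

        # Rewrite each annotation so it points at the new image/category ids.
        for ann in dataset.get("annotations", []):
            merged["annotations"].append({
                **ann,
                "id": len(merged["annotations"]) + 1,
                "image_id": img_map[ann["image_id"]],
                "category_id": cat_map[ann["category_id"]],
            })
    return merged
```

The dicts would typically come from `json.load` on each annotation file, and the result can be written back with `json.dump`.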
Resize the dataset (if needed):
python src/resize_coco.py <path_to_image_dir> <path_to_json_annotations> <output_path> <size1> <size2>
python src/resize_coco.py ../data/original_dataset/train/images/ ../data/original_dataset/train/annotations.json ../data/resized_dataset/train 550 550
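When images are resized, the bounding boxes in the annotation file have to be scaled by the same factors. The sketch below only illustrates that arithmetic and is not taken from `src/resize_coco.py`. The helper name `scale_bbox` is hypothetical, and it assumes that sizes are given as (width, height), which may not match the order of `<size1> <size2>` in the script. COCO bounding boxes are stored as `[x, y, width, height]`.

```python
def scale_bbox(bbox, old_size, new_size):
    """Scale a COCO bbox [x, y, width, height] from old_size to new_size.

    Both sizes are (width, height) tuples; this ordering is an assumption.
    """
    scale_x = new_size[0] / old_size[0]
    scale_y = new_size[1] / old_size[1]
    x, y, w, h = bbox
    # Horizontal values use the x factor, vertical values the y factor.
    return [x * scale_x, y * scale_y, w * scale_x, h * scale_y]


# A box in a 1100x550 image, after resizing the image to 550x550.
print(scale_bbox([100, 50, 200, 100], (1100, 550), (550, 550)))
# → [50.0, 50.0, 100.0, 100.0]
```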
Change all the ids from strings to ints:
python src/coco_ids_to_int.py <path_to_annotation_file>
python src/coco_ids_to_int.py ../data/train/annotations.json
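As a rough illustration of what such a conversion involves (this is a sketch, not the code in `src/coco_ids_to_int.py`, and the function name `ids_to_int` is hypothetical): each string id is replaced by a sequential integer, and every reference to it in the annotations is updated to match.

```python
def ids_to_int(coco: dict) -> dict:
    """Return a copy of a COCO-style dict whose ids are sequential ints.

    Hypothetical helper; ids are assigned in file order, starting at 1.
    """
    # Old id (string or int) -> new integer id.
    image_ids = {img["id"]: i for i, img in enumerate(coco["images"], start=1)}
    category_ids = {cat["id"]: i for i, cat in enumerate(coco["categories"], start=1)}

    return {
        **coco,
        "images": [{**img, "id": image_ids[img["id"]]} for img in coco["images"]],
        "categories": [{**cat, "id": category_ids[cat["id"]]} for cat in coco["categories"]],
        "annotations": [
            {
                **ann,
                "id": i,
                # Keep the references consistent with the renumbered entries.
                "image_id": image_ids[ann["image_id"]],
                "category_id": category_ids[ann["category_id"]],
            }
            for i, ann in enumerate(coco["annotations"], start=1)
        ],
    }
```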
Convert images to grayscale if desired:
python src/imgs_to_grayscale.py <path_to_image_folder>
python src/imgs_to_grayscale.py ../data/validation/images/
Split the dataset into train and validation datasets:
python src/split_train_val.py <path to image folder> <path to annotation file> <output path>
python src/split_train_val.py ../data/original_dataset ../data/original_dataset/coco_annotations.json ../data/split_dataset/
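A split of this kind can be sketched as below. This is only an illustration under assumptions, not the logic of `src/split_train_val.py`: the function name `split_coco`, the `val_ratio` and `seed` parameters, and the random per-image split are all hypothetical. The important part is that annotations follow their image into the same subset.

```python
import random


def split_coco(coco: dict, val_ratio: float = 0.2, seed: int = 0):
    """Split a COCO-style dict into (train, val) dicts by image.

    Hypothetical helper: images are shuffled with a fixed seed and the
    first `val_ratio` fraction goes to the validation set.
    """
    images = list(coco["images"])
    random.Random(seed).shuffle(images)
    n_val = int(len(images) * val_ratio)
    val_ids = {img["id"] for img in images[:n_val]}

    def subset(keep_val: bool) -> dict:
        # Keep images on the requested side, plus the annotations that
        # reference them; other keys (e.g. categories) are copied as-is.
        return {
            **coco,
            "images": [i for i in coco["images"] if (i["id"] in val_ids) == keep_val],
            "annotations": [
                a for a in coco["annotations"] if (a["image_id"] in val_ids) == keep_val
            ],
        }

    return subset(False), subset(True)
```

The image files themselves would still need to be copied into the matching output folders.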
Finally, check that everything works as expected by using:
python src/visualize_coco_data.py <path to image folder> <path to annotation file>
python src/visualize_coco_data.py ../data/train/images/ ../data/train/annotations.json
pip install -r requirements-dev.txt
pre-commit install
A few tests have been written using Pytest. Once Pytest is installed, those tests can be run with:
python -m pytest -v
The code tries to follow various PEP conventions (notably PEP 8). To have a similar dev environment, you can install the following packages:
pip install flake8 flake8-docstrings pep8-naming flake8-import-order flake8-bugbear flake8-quotes flake8-comprehensions
It's a good idea to run the hooks against all of the files when adding new hooks (by default, pre-commit only runs on the changed files during git hooks):
pre-commit run --all-files