tri-ml / dgp
ML Dataset Governance Policy for Autonomous Vehicle Datasets
Home Page: https://tri-ml.github.io/dgp/
License: MIT License
See the live documentation here. The upper-left corner still says DGP 1.0, but at the time of this writing we are on v1.3.
Either automate part of the release process to bump version numbers in the docs before building them, for example as part of a GitHub Actions workflow, or do this manually around the time of a version release.
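One way to automate this, offered as a minimal sketch that assumes the docs are built with Sphinx and that the installed distribution is named dgp: have conf.py read the version from the package metadata so the docs can never drift from the release.
# docs/conf.py (sketch; assumes Sphinx-based docs and an installed `dgp` distribution)
from importlib.metadata import version as pkg_version

release = pkg_version("dgp")                 # full version string, e.g. "1.3.0"
version = ".".join(release.split(".")[:2])   # short "X.Y" shown by most themes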
I'm trying to use the 'depth' information, but the visualization result looks very strange.
I followed the DDAD.ipynb example to draw a depth image, but it looks like an empty image.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.cm import get_cmap
from dgp.datasets.synchronized_dataset import SynchronizedSceneDataset

plasma_color_map = get_cmap('plasma')
ddad_train = SynchronizedSceneDataset(
    json_path,
    split='train',
    datum_names=['lidar', 'CAMERA_01', 'CAMERA_05', 'CAMERA_06', 'CAMERA_07', 'CAMERA_08', 'CAMERA_09'],
    generate_depth_from_datum='lidar'
)
sample_0 = ddad_train[0]
camera_01 = sample_0[0][0]
# color-mapped depth (keep RGB channels only)
depth_map = plasma_color_map(camera_01['depth'])[:, :, :3]
plt.imshow((camera_01['depth'] * 255).astype(np.uint8))
Is there a way to get the DDAD depth information and show it like the image below?
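A minimal sketch of one way to make the projected lidar depth visible, assuming the depth returned with generate_depth_from_datum='lidar' is a sparse HxW float array containing zeros wherever no lidar return projects; normalizing over only the valid pixels before applying the colormap usually avoids the "empty image" look:
import numpy as np
import matplotlib.pyplot as plt

depth = camera_01['depth']                      # assumed HxW float array, 0 = no lidar return
valid = depth > 0
vis = np.zeros_like(depth)
vis[valid] = depth[valid] / depth[valid].max()  # normalize only over valid pixels
plt.imshow(vis, cmap='plasma')
plt.axis('off')
plt.show()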
dgp/dgp/scripts/visualize_dataset.py attempts to import SynchronizedDataset from dgp.datasets.synchronized_dataset, but the class is actually named _SynchronizedDataset.
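Assuming the script is meant to use that class directly rather than a public alias, the corrected import would be the one-liner below:
from dgp.datasets.synchronized_dataset import _SynchronizedDataset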
The pre-push hook prevents pushing to a fork of the DGP repository, failing with the following error message.
Here, the virtual environment was created and activated by following this doc.
(dev) nehal@device:~/dgp$ git push nehaldgp feat/nehal/point-line-polygon-3d-proto
************* Module .pylintrc
.pylintrc:1: [E0015(unrecognized-option), ] Unrecognized option found: accept-no-param-doc, accept-no-return-doc, accept-no-yields-doc
Aborting push due to files with lint.
error: failed to push some refs to '[email protected]:nehalmamgain/dgp.git'
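A guess at the cause, offered as an assumption rather than a confirmed diagnosis: accept-no-param-doc, accept-no-return-doc, and accept-no-yields-doc are options of pylint's docparams extension, so the installed pylint only recognizes them when that plugin is loaded. A minimal .pylintrc sketch:
[MASTER]
# load the checker that defines the accept-no-*-doc options
load-plugins=pylint.extensions.docparams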
Hi, thanks for your good work! When I use the DDAD_tiny dataset with dgp, I encounter some errors. The code I used is as follows.
from dgp.datasets.synchronized_dataset import SynchronizedSceneDataset

DDAD_TRAIN_VAL_JSON_PATH = '/DDAD_tiny/ddad_tiny.json'
DATUMS = ['camera_01']
ddad_train = SynchronizedSceneDataset(
    DDAD_TRAIN_VAL_JSON_PATH,
    split='train',
    datum_names=DATUMS,
    generate_depth_from_datum='lidar'
)
The error report is:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home1/wangyufei/anaconda3/envs/tri/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home1/wangyufei/anaconda3/envs/tri/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home4/user_from_home1/wangyufei/dgp-1.0/dgp/datasets/base_dataset.py", line 1071, in _datum_index_for_scene
return scene.datum_index
File "/home4/user_from_home1/wangyufei/dgp-1.0/dgp/datasets/base_dataset.py", line 375, in datum_index
assert len(datum_key_to_idx_in_scene) == bad_datums + num_datums, "Duplicated datum_key"
AssertionError: Duplicated datum_key
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home1/wangyufei/anaconda3/envs/tri/lib/python3.6/code.py", line 91, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "/home1/wangyufei/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "/home1/wangyufei/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home4/user_from_home1/wangyufei/dc/packnet-sfm/test_dataset.py", line 30, in <module>
generate_depth_from_datum='lidar'
File "/home4/user_from_home1/wangyufei/dgp-1.0/dgp/datasets/synchronized_dataset.py", line 424, in __init__
only_annotated_datums=only_annotated_datums
File "/home4/user_from_home1/wangyufei/dgp-1.0/dgp/datasets/synchronized_dataset.py", line 83, in __init__
requested_autolabels=requested_autolabels
File "/home4/user_from_home1/wangyufei/dgp-1.0/dgp/datasets/base_dataset.py", line 704, in __init__
self.datum_index = self._build_datum_index()
File "/home4/user_from_home1/wangyufei/dgp-1.0/dgp/datasets/base_dataset.py", line 1080, in _build_datum_index
datum_index = list(proc.map(BaseDataset._datum_index_for_scene, self.scenes))
File "/home1/wangyufei/anaconda3/envs/tri/lib/python3.6/multiprocessing/pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home1/wangyufei/anaconda3/envs/tri/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
AssertionError: Duplicated datum_key
Would you provide some help?
Difficult to fetch from the repository after running git commit
Steps to reproduce the behavior:
When following Getting Started, presumably at make setup-linters, pre-commit run --all-files, or git commit inside the docker, permissions under .git change from user to root for the following files:
-rw-r--r-- 1 root root 73 Dec 7 11:46 COMMIT_EDITMSG
-rw-r--r-- 1 root root 0 Dec 7 12:34 FETCH_HEAD
-rw-r--r-- 1 root root 23 Dec 7 12:28 HEAD
-rw-r--r-- 1 root root 322 Dec 7 12:30 config
-rw-r--r-- 1 root root 39355 Dec 7 12:28 index
This prevents git operations outside the docker like
~/dgp$ git pull
error: cannot open .git/FETCH_HEAD: Permission denied
and inside the docker like
/home/dgp# git pull
Bad owner or permissions on /root/.ssh/config
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
On shared development machines where users do not have root permissions, it is impossible to restore the permissions changed by the container.
The desired behavior is to be able to pull from either inside or outside the container; with the above constraint (no root permissions), both workflows are blocked unless the repository is set up again from scratch.
The best fix would probably be for the above file permissions not to change to root in the first place.
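As a possible stop-gap (an assumption on my side, not something the repo documents): start the container with the host user's UID/GID so that files written by the hooks stay owned by the user; the image tag and mount path below mirror the ones used elsewhere in these reports.
docker run --rm -it --user "$(id -u):$(id -g)" -v "$PWD":/home/dgp dgp:latest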
Hi, thanks for your great work!
I found that there are no 3D annotations in the dataset. Did I miss something?
Hello,
I am unable to download the ParallelDomain GUDA dataset from the provided link (https://paralleldomain.com/public-datasets).
The link to the dataset appears to be broken/incorrect.
I use: curl -s https://tri-ml-public.s3.amazonaws.com/github/vidar/datasets/PD_guda.tar | tar xv -C vidar/
The PD GUDA data should be downloadable from the link provided on this page: https://paralleldomain.com/public-datasets
Thanks for your great work! Would you release LiDAR labels and 3D bounding box annotations for all scenes in the future?
Hello!
When installing
https://github.com/TRI-ML/packnet-sfm
I got the error:
Step 38/47 : RUN git clone https://github.com/TRI-ML/dgp.git && cd dgp && pip3 install -r requirements.txt
---> Running in 8fe2dbd1205b
Cloning into 'dgp'...
Requirement already satisfied: torch==1.4.0 in /usr/local/lib/python3.6/dist-packages (from -r requirements.txt (line 17)) (1.4.0)
Requirement already satisfied: torchvision==0.5.0 in /usr/local/lib/python3.6/dist-packages (from -r requirements.txt (line 18)) (0.5.0)
Collecting attrs==19.1.0
Downloading attrs-19.1.0-py2.py3-none-any.whl (35 kB)
Collecting awscli==1.16.192
Downloading awscli-1.16.192-py2.py3-none-any.whl (1.7 MB)
Requirement already satisfied: docutils>=0.10 in /usr/local/lib/python3.6/dist-packages (from awscli==1.16.192->-r requirements.txt (line 2)) (0.15.2)
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of attrs to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install -r requirements.txt (line 2) and botocore==1.12.79 because these package versions have conflicting dependencies.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
The conflict is caused by:
The user requested botocore==1.12.79
awscli 1.16.192 depends on botocore==1.12.182
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
The command '/bin/bash -cu git clone https://github.com/TRI-ML/dgp.git && cd dgp && pip3 install -r requirements.txt' returned a non-zero code: 1
So it looks like this needs a bit of a fix!
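Based on the resolver output above, one possible fix (a sketch under the assumption that nothing else in requirements.txt constrains botocore differently) is to align the botocore pin with what awscli 1.16.192 declares:
# requirements.txt (relevant lines only)
awscli==1.16.192
botocore==1.12.182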
The build-docker workflow fails on master with
#4 [internal] load metadata for docker.io/nvidia/cuda:11.1-devel-ubuntu18.04
#4 ERROR: docker.io/nvidia/cuda:11.1-devel-ubuntu18.04: not found
------
> [internal] load metadata for docker.io/nvidia/cuda:11.1-devel-ubuntu18.04:
------
error: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: docker.io/nvidia/cuda:11.1-devel-ubuntu18.04: not found
Error: buildx failed with: error: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: docker.io/nvidia/cuda:11.1-devel-ubuntu18.04: not found
Run the build-docker workflow on master manually or trigger it via a merge to master.
Update the base image in our Dockerfile.
Formerly:
FROM nvidia/cuda:11.1-devel-ubuntu18.04
I don't see this tag in the Docker registry any more; I suspect it was renamed or replaced with an alternative.
Fix:
FROM nvidia/cuda:11.1.1-devel-ubuntu18.04
The new image is here.
DGP features various linters: pylint, YAPF, SuperLinter, and even a commit linter (CI-only?). To run all linters, users have to do so manually or attempt a commit locally to trigger the git hooks.
While this introduces yet another tool, let's use something like pre-commit to manage our various linting tools. pre-commit provides a common entrypoint to containerized linting, making it easy to run the same checks that we use in CI locally so that people can fix linting issues before going to CI. We could configure and use (most of?) the same linters we currently use, and we'd remove .githooks/ entirely.
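A minimal .pre-commit-config.yaml sketch of what this could look like; the rev values and the exact hook selection are assumptions, not a settled proposal:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0          # assumed pin
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/pre-commit/mirrors-yapf
    rev: v0.32.0         # assumed pin
    hooks:
      - id: yapf
  - repo: local
    hooks:
      - id: pylint
        name: pylint
        entry: pylint
        language: system
        types: [python]
Developers would then run pre-commit install once, and pre-commit run --all-files reproduces the CI checks locally.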
At the time of this writing, make test fails in the Docker environment after following the getting-started instructions.
docker pull ghcr.io/tri-ml/dgp:master
docker image tag ghcr.io/tri-ml/dgp:master dgp:latest
make docker-start-interactive
# either of the following fails
make build-proto
make test
For example, make test fails with:
root@hostname:/home/dgp# make test
python3 setup.py clean && \
rm -rf build dist && \
find . -name "*.pyc" | xargs rm -f && \
find . -name "__pycache__" | xargs rm -rf
Traceback (most recent call last):
File "setup.py", line 5, in <module>
from setuptools import find_packages, setup
ModuleNotFoundError: No module named 'setuptools'
Makefile:33: recipe for target 'clean' failed
make: *** [clean] Error 1
python3 is actually 3.6.9:
root@hostname:/home/dgp# python3
Python 3.6.9 (default, Jun 29 2022, 11:45:57)
but setuptools is installed for 3.7:
root@hostname:/home/dgp# pip show setuptools
Name: setuptools
Version: 63.2.0
Summary: Easily download, build, install, upgrade, and uninstall Python packages
Home-page: https://github.com/pypa/setuptools
Author: Python Packaging Authority
Author-email: [email protected]
License:
Location: /usr/local/lib/python3.7/dist-packages
Requires:
Required-by: astroid, grpcio-tools, xarray
Two solutions: one is to not install Python 3.6 at all so that python3 symlinks to 3.7; another is to update the Makefile to use python3.7 specifically.
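A minimal sketch of the second option, assuming the clean target looks roughly like the recipe echoed above (the variable name is illustrative):
# Makefile (sketch)
PYTHON ?= python3.7

clean:
	$(PYTHON) setup.py clean && \
	rm -rf build dist && \
	find . -name "*.pyc" | xargs rm -f && \
	find . -name "__pycache__" | xargs rm -rf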
Are there plans to publish the DDAD15M dataset?
Since 'key_line_2d' is not defined in ONTOLOGY_REGISTRY, an exception is raised when instantiating FrameSceneDataset():
FrameSceneDataset(
/usr/local/lib/python3.8/dist-packages/dgp/datasets/frame_dataset.py:211: in __init__
dataset_metadata = DatasetMetadata.from_scene_containers(scenes, requested_annotations, requested_autolabels)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = <class 'dgp.datasets.base_dataset.DatasetMetadata'>
scene_containers = [SceneContainer[<path_to_scene>][Samples: 100], SceneContainer[<path_to_scene>][Samples: 100], SceneContainer[<path_to_scene>][Samples: 100], ...]
requested_annotations = ['key_line_2d'], requested_autolabels = []
@classmethod
def from_scene_containers(cls, scene_containers, requested_annotations=None, requested_autolabels=None):
"""Load DatasetMetadata from Scene Dataset JSON.
Parameters
----------
scene_containers: list of SceneContainer
List of SceneContainer objects.
requested_annotations: List(str)
List of annotations, such as ['bounding_box_3d', 'bounding_box_2d']
requested_autolabels: List(str)
List of autolabels, such as['model_a/bounding_box_3d', 'model_a/bounding_box_2d']
"""
assert len(scene_containers), 'SceneContainers is empty.'
requested_annotations = [] if requested_annotations is None else requested_annotations
requested_autolabels = [] if requested_autolabels is None else requested_autolabels
if not requested_annotations and not requested_autolabels:
# Return empty ontology table
return cls(scene_containers, directory=os.path.dirname(scene_containers[0].directory), ontology_table={})
# For each annotation type, we enforce a consistent ontology across the
# dataset (i.e. 2 different `bounding_box_3d` ontologies are not
# permitted). However, an autolabel may support a different ontology
# for the same annotation type. For example, the following
# ontology_table is valid:
# {
# "bounding_box_3d": BoundingBoxOntology,
# "bounding_box_2d": BoundingBoxOntology,
# "my_autolabel_model/bounding_box_3d": BoundingBoxOntology
# }
dataset_ontology_table = {}
logging.info('Building ontology table.')
st = time.time()
# Determine scenes with unique ontologies based on the ontology file basename.
unique_scenes = {
os.path.basename(f): scene_container
for scene_container in scene_containers
for _, _, filenames in os.walk(os.path.join(scene_container.directory, ONTOLOGY_FOLDER)) for f in filenames
}
# Parse through relevant scenes that have unique ontology keys.
for _, scene_container in unique_scenes.items():
for ontology_key, ontology_file in scene_container.ontology_files.items():
# Keys in `ontology_files` may correspond to autolabels,
# so we strip those prefixes when instantiating `Ontology` objects
_autolabel_model, annotation_key = os.path.split(ontology_key)
# Look up ontology for specific annotation type
if annotation_key in ONTOLOGY_REGISTRY:
# Skip if we don't require this annotation/autolabel
if _autolabel_model:
if ontology_key not in requested_autolabels:
continue
else:
if annotation_key not in requested_annotations:
continue
ontology_spec = ONTOLOGY_REGISTRY[annotation_key]
# No need to add ontology-less tasks to the ontology table.
if ontology_spec is None:
continue
# If ontology and key have not been added to the table, add it.
if ontology_key not in dataset_ontology_table:
dataset_ontology_table[ontology_key] = ontology_spec.load(ontology_file)
# If we've already loaded an ontology for this annotation type, make sure other scenes have the same ontology
else:
assert dataset_ontology_table[ontology_key] == ontology_spec.load(
ontology_file
), "Inconsistent ontology for key {}.".format(ontology_key)
# In case an ontology type is not implemented yet
else:
> raise Exception(f"Ontology for key {ontology_key} not found in registry!")
E Exception: Ontology for key key_line_2d not found in registry!
/usr/local/lib/python3.8/dist-packages/dgp/datasets/base_dataset.py:592: Exception
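A possible workaround sketch until the registry supports this key natively; the import paths and the KeyLineOntology name are assumptions about the installed dgp version (they may not exist in older releases), so treat this as illustrative only:
# register an ontology handler for key_line_2d before building the dataset
from dgp.annotations import ONTOLOGY_REGISTRY
from dgp.annotations.ontology import KeyLineOntology  # hypothetical/assumed class

ONTOLOGY_REGISTRY['key_line_2d'] = KeyLineOntology
# ... then instantiate FrameSceneDataset(...) exactly as before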