kitti360_ros_player

ROS publisher node for the KITTI-360 Dataset

This ROS package reads each sensor stream of the KITTI-360 dataset and publishes it in ROS. Playback can be paused, sped up or slowed down, and stepped frame by frame.

If you have comments or questions please contact [email protected].

Many thanks to Panagiotis Grigoriadis for creating a port to ROS2, which can be found in the ros2 branch. For questions about the ROS2 version please contact Panos via [email protected]. If you encounter any issues, please check his fork, as some changes may not have been merged yet.

Quick Start

Installation: Clone this repository, install python dependencies with pip install -r requirements.txt, and run catkin_make from the root of your catkin workspace.

Launch Simulation: To start the simulation, run roslaunch with the launch file launch/Kitti360.launch and the absolute path of the directory where the KITTI-360 files are located.

roslaunch launch/Kitti360.launch directory:=/path/to/kitti/files/...

This will start the simulation and publish all available sensors in ROS. To control the simulation, check out the keyboard mappings that are printed upon start (and below).

Mandatory KITTI-360 files

For a given sequence, at least the velodyne timestamps and the poses must be present. Everything else is optional.

 KITTI-360
 |- data_3d_raw
    |- 2013_05_28_drive_{seq:0>4}_sync
        |- velodyne_points
            |- timestamps.txt
 |- data_poses
    |- 2013_05_28_drive_{seq:0>4}_sync
        |- poses.txt
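
As a rough illustration of the layout above, the following sketch checks whether the two mandatory files exist for a sequence before the player is launched. The function name missing_mandatory_files is hypothetical and not part of this package; only the relative paths come from the tree above.

```python
from pathlib import Path

# Relative locations of the two mandatory files, copied from the tree above.
MANDATORY_FILES = (
    "data_3d_raw/2013_05_28_drive_{seq:0>4}_sync/velodyne_points/timestamps.txt",
    "data_poses/2013_05_28_drive_{seq:0>4}_sync/poses.txt",
)


def missing_mandatory_files(directory, seq):
    """Return the mandatory files that are absent for the given sequence.

    `directory` is the KITTI-360 root directory and `seq` the sequence
    number, e.g. "00" or 3. (Hypothetical helper, not part of the package.)
    """
    root = Path(directory)
    # "{seq:0>4}" left-pads the sequence with zeros to four characters.
    candidates = [root / pattern.format(seq=str(seq)) for pattern in MANDATORY_FILES]
    return [path for path in candidates if not path.is_file()]
```

An empty return value means both files were found; otherwise the list names what still needs to be downloaded.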

Example in RVIZ

Below you can see what the simulation may look like in rviz. Both perspective and fisheye cameras as well as the velodyne pointcloud and the derived bounding boxes are visualized. Note that the size of the pointcloud markers is increased by a factor of 5 for this screenshot.

RVIZ example with bounding boxes

Labeled Velodyne pointclouds

We provide a script (scripts/label_velodyne) which labels the velodyne pointcloud using the semantic 3d data provided by KITTI-360.

For each point in a velodyne pointcloud frame, we take the closest point in the 3d semantics dataset within a 20 cm radius and use its semantic label. If no point is found within that radius, the point is marked as unlabeled. The obtained label is saved to the ring field.
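
The nearest-neighbour rule described above can be sketched as follows. This is not the code of scripts/label_velodyne: it uses a brute-force search for readability (a spatial index such as a KD-tree would be the usual choice for full pointclouds), and the names label_points and UNLABELED as well as the label value 0 for "unlabeled" are assumptions made for the example.

```python
import math

UNLABELED = 0          # assumption: label ID used for "unlabeled" points
MAX_DISTANCE_M = 0.20  # 20 cm search radius, as described above


def label_points(velodyne_points, semantic_points, semantic_labels):
    """Give each velodyne point the label of its nearest semantic point.

    All points are (x, y, z) tuples in the same coordinate frame.
    Brute-force O(N*M) search, kept simple for illustration only.
    """
    labels = []
    for point in velodyne_points:
        best_label = UNLABELED
        best_distance = MAX_DISTANCE_M
        for candidate, label in zip(semantic_points, semantic_labels):
            distance = math.dist(point, candidate)
            # Only candidates inside the radius can replace the current best.
            if distance <= best_distance:
                best_distance = distance
                best_label = label
        labels.append(best_label)
    return labels
```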

This labeled pointcloud is also published if available. An example can be seen in the following screenshot.

RVIZ example with bounding boxes

Usage Guide

Required Directory Structure

This package assumes that the sensor data is structured as specified in the KITTI-360 documentation.

Keyboard Mappings

+--------------------------------------------------------------------+
| key mapping for simulation control via terminal                    |
|    *       : print this                                            |
|    s       : step to next frame by VELODYNE (skips 3 SICK frames)  |
|    d       : step to next frame by SICK                            |
|    <space> : pause/unpause simulation                              |
|    k       : increase playback speed factor by 0.1                 |
|    j       : decrease playback speed factor by 0.1                 |
|    [0-9]   : seek to x0% of simulation (e.g. 6 -> 60%)             |
|    b       : print duration of each publishing step                |
+--------------------------------------------------------------------+
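
To make the mappings concrete, here is a minimal sketch of how key presses could update a playback state. PlaybackState and apply_key are hypothetical names and do not reflect the package's actual implementation; the lower bound of 0.1 on the speed factor is likewise an assumption, since the documented behaviour only states the step size.

```python
from dataclasses import dataclass


@dataclass
class PlaybackState:
    """Hypothetical container for the player state (not the package's class)."""
    paused: bool = False
    rate: float = 1.0      # playback speed factor
    progress: float = 0.0  # fraction of the sequence already played (0.0 to 1.0)


def apply_key(state, key):
    """Update the playback state for one key press, following the table above."""
    if key == " ":
        state.paused = not state.paused
    elif key == "k":
        state.rate = round(state.rate + 0.1, 1)
    elif key == "j":
        # Assumption: the rate is kept positive; the real lower bound is not documented.
        state.rate = max(0.1, round(state.rate - 0.1, 1))
    elif len(key) == 1 and key in "0123456789":
        state.progress = int(key) / 10  # e.g. "6" seeks to 60%
    return state
```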

Basic Configuration

The following parameters can be configured on launch:

| parameter name | default  | description                                  |
|----------------|----------|----------------------------------------------|
| rate           | 1        | The playback speed as a factor               |
| looping        | True     | Whether to loop back to the start at the end |
| start          | 0.0      | Start N seconds into the simulation          |
| end            | 99999999 | Stop N seconds into the simulation           |
| sequence       | 00       | The KITTI-360 dataset sequence to play       |
| directory      |          | The path to the KITTI-360 directory          |
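
When launching the player from a script, the name:=value arguments can be assembled programmatically. The helper below is a small sketch under that assumption; build_launch_command is a hypothetical name, and only the launch file path and the argument syntax are taken from this README.

```python
def build_launch_command(directory, **params):
    """Return the roslaunch argument list for the given launch parameters.

    `directory` is the absolute path to the KITTI-360 files; every keyword
    argument is passed through unchanged as a `name:=value` launch argument.
    """
    command = ["roslaunch", "launch/Kitti360.launch", f"directory:={directory}"]
    for name, value in params.items():
        command.append(f"{name}:={value}")
    return command


# Example usage (requires a ROS environment with this package built):
#   import subprocess
#   subprocess.run(build_launch_command("/path/to/kitti/files", rate=2, looping=False))
```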

Timestamps

Timestamps in the KITTI-360 dataset contain the actual time the data was recorded in the real world. For simplicity, the timestamps within the simulation start at 0 for each sequence. The per-sequence offsets that are subtracted for this purpose are hardcoded in the function read_timestamps().
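
The idea of shifting recorded timestamps so that a sequence starts at 0 can be sketched as below. This is not the package's read_timestamps(): the sketch assumes timestamp lines of the form "2013-05-28 08:46:02.904295936" (date, time, up to nine fractional digits) and, unless an offset is passed in, uses the first timestamp as the offset instead of a hardcoded value. Both function names are hypothetical.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def parse_timestamp_ns(line):
    """Parse 'YYYY-MM-DD HH:MM:SS.fffffffff' into integer nanoseconds.

    Assumed line format; integers are used so that nanosecond precision
    is not lost to floating point rounding.
    """
    whole, _, fraction = line.strip().partition(".")
    delta = datetime.strptime(whole, "%Y-%m-%d %H:%M:%S") - EPOCH
    whole_seconds = delta.days * 86400 + delta.seconds
    nanoseconds = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return whole_seconds * 1_000_000_000 + nanoseconds


def relative_timestamps(lines, offset_ns=None):
    """Return timestamps in seconds relative to `offset_ns`.

    If no offset is given, the first timestamp is used (an assumption of
    this sketch), so that the sequence starts at 0.
    """
    absolute = [parse_timestamp_ns(line) for line in lines if line.strip()]
    if not absolute:
        return []
    if offset_ns is None:
        offset_ns = absolute[0]
    return [(t - offset_ns) / 1e9 for t in absolute]
```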

Custom Message Types

We use custom message types for semantics. These are specified in the /msg folder.

Performance

The main bottleneck of the simulation is reading the *.png files, which contain the semantics, from the hard drive. With all sensors enabled, the simulation cannot keep up with real time and skips frames, even with a fast NVMe M.2 SSD.

To figure out which publishing step takes the most time you can press b. This will print an overview of how much time the data of each sensor takes to publish as follows:

[INFO] [...]: -------------------------------------------------------
[INFO] [...]: image_02_data_rgb         = 0.661s (51.4%)
[INFO] [...]: 3d semantics static       = 0.417s (32.5%)
[INFO] [...]: image_03_data_rgb         = 0.069s (5.3%)
[INFO] [...]: image_00_data_rgb         = 0.023s (1.8%)
[INFO] [...]: image_00_data_rect        = 0.022s (1.7%)
[INFO] [...]: image_01_data_rect        = 0.019s (1.5%)
[INFO] [...]: image_01_data_rgb         = 0.020s (1.5%)
[INFO] [...]: 2d confidence left        = 0.018s (1.4%)
[INFO] [...]: 2d semantic rgb left      = 0.013s (1.0%)
[INFO] [...]: 2d instanceID left        = 0.008s (0.7%)
[INFO] [...]: bounding boxes            = 0.004s (0.3%)
[INFO] [...]: 2d semanticID left        = 0.004s (0.3%)
[INFO] [...]: sick points               = 0.002s (0.2%)
[INFO] [...]: velodyne                  = 0.002s (0.2%)
[INFO] [...]: 3d semantics dynamic      = 0.003s (0.2%)
[INFO] [...]: total                     = 1.285s
[INFO] [...]: -------------------------------------------------------
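
For readers who want to produce a similar overview in their own tooling, the sketch below turns a mapping of step name to duration into report lines with the share of the total. format_timing_report is a hypothetical helper and not the package's logging code; it sorts strictly by duration, which may differ slightly from the order shown above.

```python
def format_timing_report(durations):
    """Return report lines for a mapping of step name to duration in seconds."""
    total = sum(durations.values())
    lines = []
    # Slowest publishing step first.
    for name, seconds in sorted(durations.items(), key=lambda item: item[1], reverse=True):
        share = 100.0 * seconds / total if total > 0 else 0.0
        lines.append(f"{name:<25} = {seconds:.3f}s ({share:.1f}%)")
    lines.append(f"{'total':<25} = {total:.3f}s")
    return lines
```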

Disabling Sensors

All sensors are enabled by default. Individual sensors can be disabled when launching the simulation. For example:

roslaunch launch/Kitti360.launch directory:=... pub_sick_points:=False

The precise parameter name for each sensor is listed below:

| parameter name                    | description                                           |
|-----------------------------------|-------------------------------------------------------|
| pub_velodyne                      | velodyne pointcloud                                   |
| pub_sick_points                   | sick points                                           |
| pub_perspective_rectified_left    | images of left perspective camera                     |
| pub_perspective_rectified_right   | images of right perspective camera                    |
| pub_perspective_unrectified_left  | images of left perspective camera (unrectified)       |
| pub_perspective_unrectified_right | images of right perspective camera (unrectified)      |
| pub_fisheye_left                  | images of left fisheye camera                         |
| pub_fisheye_right                 | images of right fisheye camera                        |
| pub_bounding_boxes                | bounding boxes                                        |
| pub_bounding_boxes_rviz_marker    | bounding boxes visualization for rviz                 |
| pub_2d_semantics_left             | semantic ID of each pixel (left cam)                  |
| pub_2d_semantics_right            | semantic ID of each pixel (right cam)                 |
| pub_2d_semantics_rgb_left         | color-coded semantic label for each pixel (left cam)  |
| pub_2d_semantics_rgb_right        | color-coded semantic label for each pixel (right cam) |
| pub_2d_instance_left              | instance label of each pixel (left cam)               |
| pub_2d_instance_right             | instance label of each pixel (right cam)              |
| pub_2d_confidence_left            | confidence map (left cam)                             |
| pub_2d_confidence_right           | confidence map (right cam)                            |
| pub_3d_semantics_static           | static 3d semantics                                   |
| pub_3d_semantics_dynamic          | dynamic 3d semantics                                  |
| pub_camera_intrinsics             | camera intrinsics                                     |

If the required files are not present for an enabled sensor, the sensor disables itself automatically and prints an error message in the logs. The simulation simply continues without it.
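
The self-disabling behaviour can be pictured with the following sketch. It is not the package's code: filter_available_sensors is a hypothetical helper, and the mapping from parameter names to required paths has to be supplied by the caller because the exact files each sensor needs are not listed in this README.

```python
import logging
from pathlib import Path

logger = logging.getLogger("kitti360_player_sketch")


def filter_available_sensors(enabled, required_paths):
    """Return which sensors stay active after checking their required files.

    `enabled` maps a pub_* parameter name to True/False. `required_paths`
    maps the same names to lists of paths (hypothetical, caller-supplied).
    A sensor that is enabled but misses a file is logged and switched off.
    """
    active = {}
    for name, is_enabled in enabled.items():
        missing = [p for p in required_paths.get(name, []) if not Path(p).exists()]
        if is_enabled and missing:
            logger.error("Disabling %s, missing: %s", name, ", ".join(map(str, missing)))
        active[name] = bool(is_enabled and not missing)
    return active
```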

Transformation Tree

[Figure: transformation tree]

Python requirements

Python >= 3.8 is required. The package dependencies are listed in requirements.txt.

Contribution

As of now, this package has barely been used and is not sufficiently tested. If you encounter any issues please open a GitHub issue or contact me via email [email protected]. Also, please do not hesitate to send pull requests!

Notes / Todos

If you have comments on this or would like to contribute please email me ([email protected]) or open a pull request.

  • Sick Point Transforms: KITTI-360 only provides poses at the rate of the Velodyne publishing speed. Sick points are published at four times the rate, which means most of the transforms for sick points have to rely on "old" position data.
  • OXTS measurements: We are not yet publishing these.
  • Camera Intrinsics: We are not 100% sure whether the camera intrinsics are published correctly (see NOTEs in code).
  • Static Transforms: These are published from the launch file (Kitti360.launch). We computed the parameters by hand from the sketch on the KITTI360 website. There may be errors.
  • 3D Semantics and Bounding Boxes: The frame ranges specified in the filenames overlap in most cases (by ~15 frames). If the current frame falls into two ranges, we publish the one with the higher range, i.e., prefer 90-150 over 50-100 when the current frame is 95.
  • Bounding Boxes: static and dynamic bounding boxes are handled the same way at the moment. This may need to be changed. The custom bounding box message contains a bool that represents whether it is dynamic or static.
  • 3D Semantics: 3d semantics are accumulated over a range of frames in KITTI360. If there is no pointcloud for a given frame range, we publish nothing.
  • Backwards Simulation and RVIZ: when going backwards in time, either by stepping or seeking, the visualization in rviz may not be correct. For example, when displaying the TF tree, rviz shows the transform with the latest timestamp rather than the most recently published one.
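
Two of the notes above describe small selection rules, which the following sketch illustrates: choosing between overlapping frame ranges, and falling back to the most recent pose for a sick scan. Both functions are hypothetical and written for this README only. Ranges are given as (first_frame, last_frame) tuples, and pose times are assumed to be sorted in ascending order.

```python
from bisect import bisect_right


def select_range(frame, ranges):
    """Pick the (first_frame, last_frame) range to publish for `frame`.

    If several ranges contain the frame, the one that starts latest wins,
    e.g. (90, 150) over (50, 100) for frame 95. Returns None if no range
    contains the frame, in which case nothing would be published.
    """
    matching = [r for r in ranges if r[0] <= frame <= r[1]]
    return max(matching, default=None)


def latest_pose_index(pose_times, query_time):
    """Return the index of the most recent pose at or before `query_time`.

    `pose_times` must be sorted ascending. Returns None if the query lies
    before the first pose. Sick scans arriving between two poses therefore
    reuse the older pose.
    """
    index = bisect_right(pose_times, query_time) - 1
    return index if index >= 0 else None
```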

Acknowledgements
