Giter Club home page Giter Club logo

semantic_slam's Introduction

Semantic SLAM

Author: Xuan Zhang, Supervisor: David Filliat

Research internship @ENSTA ParisTech

Semantic SLAM can generate a 3D voxel based semantic map using only a hand held RGB-D camera (e.g. Asus xtion) in real time. We use ORB_SLAM2 as SLAM backend, a CNN (PSPNet) to produce semantic prediction and fuse semantic information into a octomap. Note that our system can also be configured to generate rgb octomap without semantic information.

Semantic octomap RGB octomap
alt text alt text

Project Report & Demo:

Acknowledgement

This work cannot be done without many open source projets. Special thanks to

License

This project is released under a GPLv3 license.

Overview

alt text

Dependencies

  • Openni2_launch
sudo apt-get install ros-kinetic-openni2-launch
  • ORB_SLAM2

We use ORB_SLAM2 as SLAM backend. Please refer to the official repo for installation dependencies.

Installation

Build ORB_SLAM2

After installing dependencies for ORB_SLAM. You should first build the library.

cd ORB_SLAM2
./build.sh

Install dependencies

rosdep install semantic_slam

Make

cd <your_catkin_work_space>
catkin_make

Run with camera

Launch rgbd camera

roslaunch semantic_slam camera.launch

Run ORB_SLAM2 node

roslaunch semantic_slam slam.launch

When the slam system has finished initialization, try to move the camera and check if the camera trajectory in the viewer is reasonable, reset SLAM if not.

Run semantic_mapping

You can now run the semantic_cloud node and the octomap_generator node. You will have to provide trained models, see links below.

roslaunch semantic_slam semantic_mapping.launch

This will also launch rviz for visualization.

You can then move around the camera and construct semantic map. Make sure SLAM is not losing itself.

If you are constructing a semantic map, you can toggle the display color between semantic color and rgb color by running

rosservice call toggle_use_semantic_color

Run with ros bag

If you want to test semantic mapping without a camera, you can also run a rosbag.

Download rosbag with camera position (tracked by SLAM)

demo.bag

Run semantic_mapping

roslaunch semantic_slam semantic mapping.launch

Play ROS bag

rosbag play demo.bag

Trained models

Run time

Evaluation is done on a computer with 6 Xeon 1.7 GHz CPU and one GeForce GTX Titan Z GPU. Input image size is 640×480 recorded by a camera Asus Xtion Pro.

When our system works together, ORB-SLAM works at about 15 Hz (the setting is 30 Hz). Point cloud generation alone can run at 30 Hz. Semantic segmentation runs at about 2 to 3 Hz. Octomap insertion and visualization works at about 1 Hz. Please refer to section 3.6.2 of the project report for more analysis of run times.

Configuration

You can change parameters for launch. Parameters are in ./semantic_slam/params folder.

Note that you can set octomap/tree_type and semantic_cloud/point_type to 0 to generate a map with rgb color without doing semantic segmantation.

Parameters for octomap_generator node (octomap_generator.yaml)

namespace octomap

  • pointcloud_topic
    • Topic of input point cloud topic
  • tree_type
    • OcTree type. 0 for ColorOcTree, 1 for SemanticsOcTree using max fusion (keep the most confident), 2 for SemanticsOcTree using bayesian fusion (fuse top 3 most confident semantic colors). See project report for details of fusion methods.
  • world_frame_id
    • Frame id of world frame.
  • resolution
    • Resolution of octomap, in meters.
  • max_range
    • Maximum distance of a point from camera to be inserted into octomap, in meters.
  • raycast_range
    • Maximum distance of a point from camera be perform raycasting to clear free space, in meters.
  • clamping_thres_min
    • Octomap parameter, minimum octree node occupancy during update.
  • clamping_thres_max
    • Octomap parameter, maximum octree node occupancy during update.
  • occupancy_thres
    • Octomap parameter, octree node occupancy to be considered as occupied
  • prob_hit
    • Octomap parameter, hitting probability of the sensor model.
  • prob_miss
    • Octomap parameter, missing probability of the sensor model.
  • save_path
    • Octomap saving path. (not tested)

Parameters for semantic_cloud node (semantic_cloud.yaml)

namespace camera

  • fx, fy, cx, cy
    • Camera intrinsic matrix parameters.
  • width, height
    • Image size.

namespace semantic_pcl

  • color_image_topic
    • Topic for input color image.
  • depth_image_topic
    • Topic for input depth image.
  • point_type
    • Point cloud type, should be same as octomap/tree_type. 0 for color point cloud, 1 for semantic point cloud including top 3 most confident semanic colors and their confidences, 2 for semantic including most confident semantic color and its confident. See project report for details of point cloud types.
  • frame_id
    • Point cloud frame id.
  • dataset
    • Dataset on which PSPNet is trained. "ade20k" or "sunrgbd".
  • model_path
    • Path to pytorch trained model.

semantic_slam's People

Contributors

floatlazer avatar

Watchers

James Cloos avatar Chang Chen(陈常) avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.