
drive-any-robot's Introduction

GNM: A General Navigation Model to Drive Any Robot

Dhruv Shah*, Ajay Sridhar*, Arjun Bhorkar, Noriaki Hirose, Sergey Levine

Berkeley AI Research

Project Page | arXiv | Summary Video


Update Oct 2023: This repository is subsumed by https://github.com/robodhruv/visualnav-transformer, which includes training and deployment infrastructure for GNM and all subsequent general navigation models.


Overview

This repository contains code for training a GNM with your own data, pre-trained model checkpoints, and example code to deploy it on a TurtleBot2/LoCoBot robot.

  • ./train/train.py: training script to train or fine-tune the GNM model on your custom data.
  • ./train/process_*.py: scripts to process rosbags or other formats of robot trajectories into training data.
  • ./deployment/src/record_bag.sh: script to collect a demo trajectory in the target environment on the robot. This trajectory is subsampled to generate a topological graph of the environment.
  • ./deployment/src/navigate.sh: script that deploys a trained GNM model on the robot to navigate to a desired goal in the generated topological graph. Please see relevant sections below for configuration settings.

Train

This subfolder contains code for processing datasets and training a GNM from your own data.

Pre-requisites

The codebase assumes access to a workstation running Ubuntu (tested on 18.04 and 20.04), Python 3.7+, and a GPU with CUDA 10+. It also assumes access to conda, but you can modify it to work with other virtual environment packages, or a native setup.

Setup

Run the commands below inside the drive-any-robot/ (topmost) directory:

  1. Set up the conda environment:
    conda env create -f train/environment.yml
  2. Source the conda environment:
    conda activate gnm_train
    
  3. Install the gnm_train packages:
    pip install -e train/
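
Before moving on, you can sanity-check the environment with a short script like the one below (a minimal sketch, assuming the environment file installed PyTorch; the file name is hypothetical):

# sanity_check.py -- illustrative only, not part of the repository
import torch

# Confirm that the training environment can see a CUDA device.
print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:            ", torch.cuda.get_device_name(0))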

Data-Wrangling

In the GNM paper, we train on a combination of publicly available and unreleased datasets. Below is a list of publicly available datasets used for training; please contact the respective authors for access to the unreleased data.

We recommend downloading these (and any other datasets you may want to train on) and running the processing steps below.

Data Processing

We provide some sample scripts to process these datasets, either directly from a rosbag or from a custom format like HDF5s:

  1. Run process_bags.py with the relevant args, or process_recon.py for processing RECON HDF5s. You can also manually add your own dataset by following our structure below (if you are adding a custom dataset, please check out the Custom Datasets section).
  2. Run data_split.py on your dataset folder with the relevant args.

After step 1 of data processing, the processed dataset should have the following structure:

├── <dataset_name>
│   ├── <name_of_traj1>
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   ├── ...
│   │   ├── T_1.jpg
│   │   └── traj_data.pkl
│   ├── <name_of_traj2>
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   ├── ...
│   │   ├── T_2.jpg
│   │   └── traj_data.pkl
│   ├── ...
│   └── <name_of_trajN>
│       ├── 0.jpg
│       ├── 1.jpg
│       ├── ...
│       ├── T_N.jpg
│       └── traj_data.pkl

Each *.jpg file contains a forward-facing RGB observation from the robot, and the files are labeled in temporal order. The traj_data.pkl file contains the odometry data for the trajectory. It is a pickled dictionary with the following keys:

  • "position": An np.ndarray [T, 2] of the xy-coordinates of the robot at each image observation.
  • "yaw": An np.ndarray [T,] of the yaws of the robot at each image observation.

After step 2 of data processing, the processed data-split should have the following structure inside gnm_release/train/gnm_train/data/data_splits/:

├── <dataset_name>
│   ├── train
│   │   └── traj_names.txt
│   └── test
│       └── traj_names.txt
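
Each traj_names.txt file simply lists trajectory folder names, one per line. The supported way to generate these files is data_split.py, but the sketch below illustrates the idea (the paths and split ratio are hypothetical):

# make_split_sketch.py -- illustrative only; use train/data_split.py in practice
import os
import random

dataset_dir = "<path_to_the_dataset>"                          # hypothetical
split_dir = "train/gnm_train/data/data_splits/<dataset_name>"  # see layout above
train_fraction = 0.8                                           # hypothetical

trajs = sorted(
    d for d in os.listdir(dataset_dir)
    if os.path.isdir(os.path.join(dataset_dir, d))
)
random.shuffle(trajs)
cut = int(train_fraction * len(trajs))

for split, names in {"train": trajs[:cut], "test": trajs[cut:]}.items():
    os.makedirs(os.path.join(split_dir, split), exist_ok=True)
    with open(os.path.join(split_dir, split, "traj_names.txt"), "w") as f:
        f.write("\n".join(names) + "\n")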

Training your GNM

Run this inside the gnm_release/train directory:

python train.py -c <path_of_train_config_file>

The premade config yaml files are in the train/config directory.

Custom Config Files

You can use one of the premade yaml files as a starting point and change the values as needed. config/gnm/gnm_public.yaml is a good choice since its arguments are commented. config/defaults.yaml contains the default config values (don't train directly with this config file, since it does not specify any datasets for training).

Custom Datasets

Make sure your dataset and data-split directories follow the structures provided in the Data Processing section. Locate train/gnm_train/data/data_config.yaml and append the following:

<dataset_name>:
    metric_waypoints_distance: <average_distance_in_meters_between_waypoints_in_the_dataset>

Locate your training config file and add the following text under the datasets argument (feel free to change the values of end_slack, goals_per_obs, and negative_mining):

<dataset_name>:
    data_folder: <path_to_the_dataset>
    train: data/data_splits/<dataset_name>/train/ 
    test: data/data_splits/<dataset_name>/test/ 
    end_slack: 0 # how many timesteps to cut off from the end of each trajectory  (in case many trajectories end in collisions)
    goals_per_obs: 1 # how many goals are sampled per observation
    negative_mining: True # negative mining from the ViNG paper (Shah et al.)
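
Before launching a long run, it can be worth loading the merged config and checking the dataset entries (a minimal sketch, assuming PyYAML; the custom config file name is hypothetical and the training script performs its own config handling):

# check_config.py -- illustrative only, not part of the repository
import yaml

with open("config/defaults.yaml") as f:
    config = yaml.safe_load(f)
with open("config/my_custom_config.yaml") as f:  # hypothetical file name
    config.update(yaml.safe_load(f))

for name, dataset in config["datasets"].items():
    print(name, dataset["data_folder"],
          "negative_mining:", dataset.get("negative_mining", False))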

Training your GNM from a checkpoint

Instead of training from scratch, you can also load an existing checkpoint from the published results. Add load_run: <project_name>/<log_run_name> to your .yaml config file in gnm_release/train/config/. The *.pth file you are loading must be saved in the following structure and renamed to latest.pth: gnm_release/train/logs/<project_name>/<log_run_name>/latest.pth. This makes it easy to resume training from the checkpoint of a previous run, since logs are saved this way by default. Note: if you are loading a checkpoint from a previous run, check the run's name in gnm_release/train/logs/<project_name>/, since the code appends the date to each run_name specified in the config yaml file to avoid duplicate run names.

If you want to use our checkpoints, you can download the *.pth files from this link.
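
If you want to peek inside a downloaded checkpoint before pointing load_run at it, a standard PyTorch load is enough (a minimal sketch; the exact dictionary layout of the checkpoint may differ):

# inspect_checkpoint.py -- illustrative only, not part of the repository
import torch

ckpt_path = "train/logs/<project_name>/<log_run_name>/latest.pth"  # see layout above
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Checkpoints are typically dictionaries; list their top-level keys.
if isinstance(checkpoint, dict):
    print("checkpoint keys:", list(checkpoint.keys()))
else:
    print("checkpoint type:", type(checkpoint))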

Deployment

This subfolder contains code to load a pre-trained GNM and deploy it on the open-source LoCoBot indoor robot platform. It can easily be adapted to run on other robots; researchers have independently deployed it on the Clearpath Jackal, DJI Tello, Unitree A1, TurtleBot2, and Vizbot, as well as in simulated environments like CARLA.

LoCoBot Setup

This software was tested on a LoCoBot running Ubuntu 16.04 (now legacy, but should be forward compatible).

Software Installation (in this order)

  1. ROS: ros-kinetic
  2. ROS packages:
    sudo apt-get install ros-kinetic-usb-cam ros-kinetic-joy
  3. PyRobot
  4. Conda
    • Install anaconda/miniconda/etc. for managing environments
    • Make conda env with environment.yml (run this inside the gnm_release/ directory)
      conda env create -f deployment/environment.yml
    • Source env
      conda activate gnm_deployment
    • (Recommended) add to ~/.bashrc:
      echo "conda activate gnm_deployment" >> ~/.bashrc
  5. Install the gnm_train packages (run this inside the gnm_release/ directory):
    pip install -e train/
  6. (Recommended) Install tmux if not present. Many of the bash scripts rely on tmux to launch multiple screens with different commands. This will be useful for debugging because you can see the output of each screen.

Hardware Requirements

  • LoCoBot: http://locobot.org
  • A wide-angle RGB camera: Example. The gnm_locobot.launch file uses camera parameters that work with cameras like the ELP fisheye wide angle; feel free to modify them for your own camera. Adjust the camera parameters in gnm_release/deployment/config/camera.yaml for your camera accordingly (these are used for visualization).
  • Joystick/keyboard teleop that works with Linux. Add the index mapping for the deadman_switch on the joystick to the gnm_release/deployment/config/joystick.yaml. You can find the mapping from buttons to indices for common joysticks in the wiki.

Loading the model weights

Save the model weights (*.pth file) in the gnm_release/deployment/model_weights folder. Our model weights are available at this link.

Collecting a Topological Map

Make sure to run these scripts inside the gnm_release/deployment/src/ directory.

This section discusses a simple way to create a topological map of the target environment for deployment. For simplicity, we will use the robot in “path-following” mode, i.e. given a single trajectory in an environment, the task is to follow the same trajectory to the goal. The environment may have new/dynamic obstacles, lighting variations etc.

Record the rosbag:

./record_bag.sh <bag_name>

Run this command to teleoperate the robot with the joystick and camera. This command opens up three windows:

  1. roslaunch gnm_locobot.launch: This launch file opens the usb_cam node for the camera, the joy node for the joystick, and several nodes for the robot’s mobile base.
  2. python joy_teleop.py: This python script starts a node that reads inputs from the joy topic and outputs them on topics that teleoperate the robot’s base.
  3. rosbag record /usb_cam/image_raw -o <bag_name>: This command isn’t run immediately (you have to press Enter). It will be run in the gnm_release/deployment/topomaps/bags directory, where we recommend you store your rosbags.

Once you are ready to record the bag, run the rosbag record command and teleoperate the robot along the path you want it to follow. When you are finished recording the path, kill the rosbag record command, and then kill the tmux session.

Make the topological map:

./create_topomap.sh <topomap_name> <bag_filename>

This command opens up 3 windows:

  1. roscore
  2. python create_topomap.py --dt 1 --dir <topomap_dir>: This command creates a directory in gnm_release/deployment/topomaps/images and saves an image as a node in the map every second the bag is played (a standalone offline sketch of this subsampling appears after this list).
  3. rosbag play -r 5 <bag_filename>: This command plays the rosbag at 5x speed, so the python script is actually recording nodes 5 seconds apart. The <bag_filename> should be the entire bag name with the .bag extension. You can change this value in the create_topomap.sh file. The command does not run until you press Enter, which you should only do once the python script prints its waiting message. Once you play the bag, move to the screen where the python script is running so you can kill it when the rosbag stops playing.

When the bag stops playing, kill the tmux session.
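
The node subsampling itself is conceptually simple. The sketch below reads image messages directly from the bag file instead of subscribing to a live topic while the bag plays, so it is not a drop-in replacement for create_topomap.py, but it illustrates the same logic (assuming the rosbag, cv_bridge, and OpenCV Python packages; paths are hypothetical):

# offline_topomap_sketch.py -- illustrative alternative to create_topomap.py
import os

import cv2
import rosbag
from cv_bridge import CvBridge

bag_path = "../topomaps/bags/<bag_filename>"   # hypothetical
out_dir = "../topomaps/images/<topomap_name>"  # hypothetical
dt = 5.0  # seconds of trajectory time between nodes (dt 1 at 5x playback above)

os.makedirs(out_dir, exist_ok=True)
bridge = CvBridge()
node_idx, last_t = 0, None

with rosbag.Bag(bag_path) as bag:
    for _, msg, t in bag.read_messages(topics=["/usb_cam/image_raw"]):
        if last_t is None or (t - last_t).to_sec() >= dt:
            img = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
            cv2.imwrite(os.path.join(out_dir, f"{node_idx}.jpg"), img)
            node_idx, last_t = node_idx + 1, t

print(f"Saved {node_idx} topological nodes to {out_dir}")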

Running the model

Make sure to run this script inside the gnm_release/deployment/src/ directory.

./navigate.sh "--model <model_name> --dir <topomap_dir>"

To deploy one of the models from the published results, we are releasing model checkpoints that you can download from this link.

The <model_name> is the name of the model in the gnm_release/deployment/config/models.yaml file. In this file, you specify the following parameters for each model (defaults shown):

  • path (str, default: large_gnm.pth): path of the *.pth file in gnm_release/deployment/model_weights/
  • image_size (List[int, int], default: [85, 64]): [width, height] of the input images
  • model_type (str, default: gnm): one of these [gnm, stacked, siamese]
  • context (int, default: 5): context length
  • len_traj_pred (int, default: 5): number of future waypoints to predict
  • normalize (bool, default: True): whether or not to normalize the waypoints
  • learn_angle (bool, default: True): whether or not to learn the yaw for each waypoint
  • obs_encoding_size (int, default: 1024): observation encoding dimension (only for the GNM and the siamese model)
  • goal_encoding_size (int, default: 1024): goal encoding dimension (only for the GNM and the siamese model)
  • obsgoal_encoding_size (int, default: 2048): observation + goal encoding dimension (only for the stacked model)

Make sure these configurations match the ones you used to train the model. The configurations for the models whose weights we provide are included in the yaml file for reference.
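
As a quick consistency check, you can load a model's entry from models.yaml and compare it against the training config you used (a minimal sketch, assuming PyYAML and that the two files use the same key names; adjust the paths and keys to your setup):

# check_model_config.py -- illustrative only, not part of the repository
import yaml

model_name = "<model_name>"  # as listed in models.yaml

with open("../config/models.yaml") as f:
    deploy_cfg = yaml.safe_load(f)[model_name]
with open("<path_of_train_config_file>") as f:  # the config used for training
    train_cfg = yaml.safe_load(f)

shared_keys = ["image_size", "model_type", "context", "len_traj_pred",
               "normalize", "learn_angle"]
for key in shared_keys:
    if key in deploy_cfg and key in train_cfg and deploy_cfg[key] != train_cfg[key]:
        print(f"Mismatch in {key}: {deploy_cfg[key]} (deploy) vs {train_cfg[key]} (train)")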

The <topomap_dir> is the name of the directory in gnm_release/deployment/topomaps/images that has the images corresponding to the nodes in the topological map. The images are ordered by name from 0 to N.

This command opens up 4 windows:

  1. roslaunch gnm_locobot.launch: This launch file opens the usb_cam node for the camera, the joy node for the joystick, and several nodes for the robot’s mobile base.
  2. python navigate.py --model <model_name> --dir <topomap_dir>: This python script starts a node that reads in image observations from the /usb_cam/image_raw topic, inputs the observations and the map into the model, and publishes actions to the /waypoint topic.
  3. python joy_teleop.py: This python script starts a node that reads inputs from the joy topic and outputs them on topics that teleoperate the robot’s base.
  4. python pd_controller.py: This python script starts a node that reads messages from the /waypoint topic (waypoints from the model) and outputs velocities to navigate the robot’s base.

When the robot is finished navigating, kill the pd_controller.py script, and then kill the tmux session. If you want to take control of the robot while it is navigating, the joy_teleop.py script allows you to do so with the joystick.
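
If you are porting the controller to a different base, the core of pd_controller.py's job (turning a relative waypoint into velocity commands) is small. Below is a minimal, hypothetical sketch of a proportional controller for that conversion; it is not the repository's pd_controller.py, and the gains and limits stand in for the values in robot.yaml:

# waypoint_controller_sketch.py -- illustrative only, not pd_controller.py
import math

MAX_V, MAX_W = 0.4, 1.0   # hypothetical linear/angular velocity limits
KP_V, KP_W = 1.0, 2.0     # hypothetical proportional gains


def waypoint_to_velocity(dx, dy):
    """Convert a waypoint (dx, dy) in the robot frame into (v, w) commands."""
    heading_error = math.atan2(dy, dx)
    # Drive forward only when roughly facing the waypoint; otherwise turn in place.
    v = KP_V * dx * max(0.0, math.cos(heading_error))
    w = KP_W * heading_error
    return max(-MAX_V, min(MAX_V, v)), max(-MAX_W, min(MAX_W, w))


if __name__ == "__main__":
    print(waypoint_to_velocity(0.5, 0.1))  # mostly forward
    print(waypoint_to_velocity(0.0, 0.5))  # turn in place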

Adapting this code to different robots

We hope that this codebase is general enough to allow you to deploy it to your favorite ROS-based robots. You can change the robot configuration parameters in gnm_release/deployment/config/robot.yaml, such as the maximum angular and linear velocities of the robot and the topics used to teleoperate and control it. Please feel free to create a GitHub issue or reach out to the authors at [email protected].

Citing

@inproceedings{shah2022gnm,
   author    = {Dhruv Shah and Ajay Sridhar and Arjun Bhorkar and Noriaki Hirose and Sergey Levine},
   title     = {{GNM: A General Navigation Model to Drive Any Robot}},
   booktitle = {arXiv},
   year      = {2022},
   url       = {https://arxiv.org/abs/2210.03370}
}

drive-any-robot's People

Contributors

ajaysridhar0, eltociear, robodhruv


drive-any-robot's Issues

Some questions about running the project in the CARLA simulator

Hi,
thanks for the great work!
I tried to run your work in the CARLA simulator, but encountered some problems.
First of all, whether I use the large model or the medium model, the car always turns in circles. What is the reason? Must the initial point of the car be consistent with the initial point recorded in the rosbag?
In addition, I found that the vehicle does not avoid obstacles when it encounters them. Is that because this work does not add an obstacle avoidance algorithm, i.e., it is pure path planning?
Finally, can you provide more information on how your work was run in the CARLA simulator?
Looking forward to your reply, thank you!

Code and data release

Hi, I very much enjoyed reading your paper.

In the paper you write that both the code and data will be made publicly available. I wanted to ask when you plan to do this or, if it is not cleaned up yet, whether it would be possible to make them available privately to other researchers like myself in the meantime? That would be fantastic.

Dataset Sampling

Hello author,

Why do larger datasets have fewer training points, such as SCAND?

Code and data release

Hello author, I'm very interested in your research; thanks for your great work.
I would like to know when the code and data will be open-sourced.

Training Dataset

Hi, I really appreciate your excellent work !!!!!
I have the following problems while reproducing the project:

  1. Dataset: Is the "Berkeley [34] Jackal 2m/s 4h suburban" dataset mentioned in Table I of the paper available for download? Where can I get this dataset?

  2. Is the platform listed as ATV in "NeBula [40] ATV 10m/s 10h off-road" in Table I of the paper a mistake? I checked the original NeBula paper (https://arxiv.org/pdf/2103.11470.pdf) and did not see any ATV robots in it. I would like to know which dataset is used in the GNM project.

Thanks a lot!!

GNM models cannot avoid obstacles

Hello, your work is excellent, thank you for sharing. When I tested the GNM model in Gazebo, I found that it did not perform obstacle avoidance. Does the model not have this capability during navigation, or does it have only limited obstacle avoidance?

Does GNM work with any camera?

There are many types of cameras in the robotics domain. If a different camera is used in training and deployment, does GNM still work well? Is there any reason to recommend a wide-angle RGB camera for deployment?

Some questions about deployment with CARLA

I use the default model that you offer, and CARLA's images are 800×600.
In models.yaml, the input size is:
image_size: [85, 64] # [width, height] of input images
Should I set image_size to [800, 600]? I get an error with 800×600 and also an error with 85×64 (error screenshots omitted).
Thanks for your reply.

rosbag file for deployment

Hi @PrieureDeSion,
Is it enough to have the image and odometry topics in the rosbag file? And is the accuracy of the odometry an important factor in graph generation?

Two forward passes during training

What is the reason for doing two forward passes through the model during training to calculate the losses for distance and action? Is it just an implementation detail, or is there some reason related to backpropagating the loss?

https://github.com/PrieureDeSion/drive-any-robot/blob/b58b2d75d4153f28f3c0ce9a68d291822b9cf263/train/gnm_train/training/train_utils.py#L239
https://github.com/PrieureDeSion/drive-any-robot/blob/b58b2d75d4153f28f3c0ce9a68d291822b9cf263/train/gnm_train/training/train_utils.py#L246

Could not find Dijkstra’s algorithm

Hi

Thank you for sharing this great work! In the paper, I found that you used Dijkstra’s algorithm to compute the optimal sequence. However, I could not find any code related to Dijkstra’s algorithm in the 'deployment' directory.

Please let me know if I missed anything.

Question on data sampling rate

Dear authors,

Thank you for the amazing work and for releasing the datasets! I have a question regarding the RECON dataset. In the process_bags.py script the sample rate is set by default to 4 Hz. In the process_recon.py script I didn't find anything related to the sample rate. Do you assume that the sample rate in the RECON dataset is also 4 Hz?

Looking forward to your reply!

Some questions about create_topomap.py

Hi, I am trying to deploy it with CARLA. I collected the .bag file and ran
./create_topomap.sh topomap_test_name /home/gahho/code/drive-any-robot/deployment/topomaps/bags/test202301191036_2023-01-20-10-53-19.bag
which produces an error. Running the default
python create_topomap.py
produces the same error (error screenshot omitted).
Thanks for your reply.

Deployed in the simulation environment

Hello author, I'm very interested in your research; thanks for your great work.

I see that GNM can be deployed in a simulation environment, and I would like to ask how to deploy it to simulated environments.
