
radar_depth's Introduction

Depth Estimation from Monocular Images and Sparse Radar Data

This is the official implementation of the paper Depth Estimation from Monocular Images and Sparse Radar Data. In this repo, we provide code for dataset preprocessing, training, and evaluation.

Some parts of the implementation are adapted from sparse-to-dense. We thank the authors for sharing their implementation.

Updates

  • Training and evaluation code.

  • Trained models.

  • Download instructions for the processed dataset.

  • Detailed documentation for the processed dataset.

  • Code and instructions to process data from the official nuScenes dataset.

Installation

git clone https://github.com/brade31919/radar_depth.git
cd radar_depth

Dataset preparation

Use our processed files

We provide our processed files specifically for the RGB + Radar depth estimation task. The download and setup instructions are:

mkdir DATASET_PATH # Set the path you want to use on your own PC/cluster.
cd DATASET_PATH
wget https://data.vision.ee.ethz.ch/daid/NuscenesRadar/Nuscenes_depth.tar.gz
tar -zxvf Nuscenes_depth.tar.gz

⚠️ Since the processed dataset is adapted (for non-commercial purposes) from the official nuScenes dataset, its contents are also subject to the official nuScenes terms of use and licenses.

Package installation

cd radar_depth # Go back to the project root
pip install -r requirements.txt

If you encounter an error message like "ImportError: libSM.so.6: cannot open shared object file: No such file or directory" from cv2, you can try:

sudo apt-get install libsm6 libxrender1 libfontconfig1

Project configuration setting

We put the important path settings in config/config_nuscenes.py. You need to modify them to the paths you use on your own PC/cluster.

Project and dataset root setting

In lines 14 and 18, please specify your PROJECT_ROOT and DATASET_ROOT:

PROJECT_ROOT = "YOUR_PATH/radar_depth"
DATASET_ROOT = "DATASET_PATH"

Experiment path setting

In line 53, please specify your EXPORT_ROOT (the path where you want the experiment outputs to be written).

EXPORT_ROOT = "YOUR_EXP_PATH"
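
For example, a filled-in config_nuscenes.py might look like this (the paths are purely illustrative; use your own):

# Illustrative values only; replace them with the paths on your own PC/cluster.
PROJECT_ROOT = "/home/user/radar_depth"               # line 14
DATASET_ROOT = "/data/Nuscenes_depth"                 # line 18
EXPORT_ROOT = "/home/user/radar_depth_experiments"    # line 53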

Training

Download the pre-trained models

We provide some pretrained models. They are not the original models used to produce the numbers in the paper, but they have similar performance (I lost the original checkpoints due to some cluster issue...).

Please download the pretrained models from here, and put them into the pretrained/ folder so that the directory structure looks like this:

pretrained/
├── resnet18_latefusion.pth.tar
└── resnet18_multistage.pth.tar
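
If you want to verify that the downloaded checkpoints load correctly, a minimal sanity check (my own sketch, assuming PyTorch is installed) is:

import torch

# List the top-level keys of each downloaded checkpoint.
for name in ["pretrained/resnet18_latefusion.pth.tar",
             "pretrained/resnet18_multistage.pth.tar"]:
    checkpoint = torch.load(name, map_location="cpu")
    print(name, "->", sorted(checkpoint.keys()))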

Train the late fusion model yourself

python main.py \
    --arch resnet18_latefusion \
    --data nuscenes \
    --modality rgbd \
    --decoder upproj \
    -j 12 \
    --epochs 20 \
    -b 16 \
    --max-depth 80 \
    --sparsifier radar

Train the full multi-stage model

To make sure that the training process is stable, we'll initialize each stage from the resnet18_latefusion model. If you want to skip the training of resnet18_latefusion, you can use our pre-trained models.

python main.py \
    --arch resnet18_multistage_uncertainty_fixs \
    --data nuscenes \
    --modality rgbd \
    --decoder upproj \
    -j 12 \
    --epochs 20 \
    -b 8 \
    --max-depth 80 \
    --sparsifier radar

Here we use a batch size of 8 (instead of 16). This allows us to train the model on cheaper GPUs such as the GTX 1080 Ti, RTX 2080 Ti, etc., and it makes the training process more stable.

Evaluation

After the training process has finished, you can evaluate the model as follows (replace PATH_TO_CHECKPOINT with the path to the checkpoint file you want to evaluate):

python main.py \
    --evaluate PATH_TO_CHECKPOINT \
    --data nuscenes
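
For example, to evaluate one of the provided checkpoints (assuming it was placed in the pretrained/ folder as described above):

python main.py \
    --evaluate pretrained/resnet18_multistage.pth.tar \
    --data nuscenes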

Code Borrowed From

Parts of this implementation are adapted from the sparse-to-dense repository mentioned above.

Citation

Please use the following citation if you want to reference our paper.

@InProceedings{radar:depth:20,
   author = {Lin, Juan-Ting and Dai, Dengxin and {Van Gool}, Luc},
   title = {Depth Estimation from Monocular Images and Sparse Radar Data},
   booktitle = {International Conference on Intelligent Robots and Systems (IROS)},
   year = {2020}
}

If you use the processed dataset, remember to also cite the official nuScenes dataset.

@article{nuscenes2019,
  title={nuScenes: A multimodal dataset for autonomous driving},
  author={Holger Caesar and Varun Bankiti and Alex H. Lang and Sourabh Vora and 
          Venice Erin Liong and Qiang Xu and Anush Krishnan and Yu Pan and 
          Giancarlo Baldan and Oscar Beijbom},
  journal={arXiv preprint arXiv:1903.11027},
  year={2019}
}


radar_depth's Issues

Without lidar

Hi @brade31919, in the paper it is mentioned that the two-stage architecture can be used to filter the radar instead of the lidar, but in the program, def filter_radar_points(self, input_data): always uses lidar data to filter the radar data. Can we generate the preprocessed data without using lidar data?
My English is not very good, sorry.

Own dataset make method

Hi @brade31919,
I am a rookie, forgive me for asking such a stupid question.
Can you give me a tutorial on how to train on my own data and apply the model to my own sensor? Or how do you process the image and radar data?
Thank you in advance.

Evaluation error args = checkpoint['args']

@brade31919,
I debugged the code and encountered an error at main.py line 199, "args = checkpoint['args']". The error message is KeyError: 'args'. I found that the checkpoint only has the key 'arch'.
Can you help me solve this problem?
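
One possible workaround (a sketch only, not an official fix from the repo; the checkpoint path below is a placeholder) is to overwrite the command-line args only when the checkpoint actually contains an 'args' entry:

import torch

# Sketch: keep the arguments parsed from the command line when the checkpoint
# was saved without an 'args' entry.
checkpoint = torch.load("pretrained/resnet18_latefusion.pth.tar", map_location="cpu")
if "args" in checkpoint:
    args = checkpoint["args"]
else:
    print("No 'args' in checkpoint; keeping the current command-line arguments.")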

Questions about hyperparameters and processed dataset

Hi @brade31919 ,

first of all, thanks for sharing your code again! I have a few questions about your code:

1. Learning rate and batch size:

I noticed that you wrote in your paper:

Unless stated otherwise, all the models are trained using a batch size of 16 and the SGD optimizer with a learning rate of 0.001 and a momentum of 0.9 for 20 epochs.

However, in your code, the default learning rate is 0.01:

parser.add_argument('--lr', '--learning-rate', default=0.01, type=float,

And the batch size indicated in the shell script is 8

Are these the hyperparameters used to achieve the results from your paper? Or did you train your recent models with different parameters?

2. Weight decay:

I think you did not mention weight decay in your paper. Did you train the model in your paper with weight decay of 1e-4 as indicated in your code?

parser.add_argument('--weight-decay', '--wd', default=1e-4, type=float,

3. Processed dataset:

Unfortunately, I cannot download your processed dataset at the moment. I know you plan to add some documentation for that, but perhaps you could answer this question beforehand: Did you make any alteration to the actual data? Or did you merely change the structure of the dataset?

4. Transform points:

What is the usage of the following function?

def transform_point(self, point_data):

Is this incorporated in the generation of your processed dataset? If yes, what is the intention behind it?

Thanks a lot in advance!

Best,
Patrick

About the pretrained argument

Hi @brade31919 ,

me again :) I'm still very glad you shared this, thanks again!

I have a remark and suggestion about args.pretrained. It is set to True by default:

parser.set_defaults(pretrained=True)

And it can be changed by running main.py with the option --no-pretrain. It is described as an option to disable the use of ImageNet pretrained weights:

radar_depth/utils.py, lines 61 to 62 (commit 5e6e757):

parser.add_argument('--no-pretrain', dest='pretrained', action='store_false',
                    help='not to use ImageNet pre-trained weights')

However, I think ImageNet pretrained weights are always used, as the stage-1 and stage-2 models are created with pretrained=True independently of the argument described above:

self.stage1 = ResNet_latefusion2(layers, decoder, output_size, in_channels=4, pretrained=True)

Instead, I think args.pretrained actually refers to loading a checkpoint of the latefusion model, because ResNet_multistage is created with args.pretrained as an input argument, and then inside ResNet_multistage:

if pretrained is True:
    # Get pretrained weights
    pretrained_path = os.path.join(cfg.PROJECT_ROOT, "pretrained/resnet18_latefusion.pth.tar")
    if not os.path.exists(pretrained_path):
        raise ValueError("[Error] Can't find pretrained latefusion model. "
                         "Please follow the instructions in README.md to download the weights!")
    checkpoint = torch.load(pretrained_path)

So, I think the name args.pretrained is misleading; perhaps it would be better to change it to args.load_checkpoint or something similar. The description in utils.py would need to be changed, too.
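
A minimal sketch of what the suggested rename could look like (the flag and destination names below are only this issue's proposal, not the repo's actual interface):

import argparse

parser = argparse.ArgumentParser()
# Hypothetical renamed flag illustrating the suggestion above.
parser.add_argument('--no-load-checkpoint', dest='load_checkpoint', action='store_false',
                    help='do not initialize the stages from the resnet18_latefusion checkpoint')
parser.set_defaults(load_checkpoint=True)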

Thanks and best regards,
Patrick

Evaluation of Pretrained Models

Hello,
I tried to evaluate the provided models, and I also followed the suggestion provided in #11.

But then errors are generated by the model creation in line 207:

model, loss_weights = create_model(args, output_size=train_loader.dataset.output_size)

In order to get the code running, I commented out some of the conditions that handle args.evaluate in the method:

def create_data_loaders(args):

Despite some remaining inconsistencies, I got the code to evaluate the available pretrained models from:
https://github.com/brade31919/radar_depth/blob/5e6e75772ff379aac65379a50d4042a7c64c869d/README.md#downlaod-the-pre-trained-models

The results are very different from the paper (the RMSE is on the order of 20).
Am I doing something wrong, or is there a problem with the available pretrained models?

Question about ref_chan in radar multisweep

Hi @brade31919 ,

another question from my side. It's not a bug report; I just want to understand how you are processing the data. In your dataset generation, you aggregate multiple radar sweeps like this:

point_clouds, times = RadarPointCloud.from_file_multisweep(self.dataset, sample_obj,
                                                            radar_record["channel"], "LIDAR_TOP",
                                                            nsweeps=1)

May I know why you set ref_chan to "LIDAR_TOP" here? Does it make any difference? Because in the same function, the radar points are then transformed from the lidar frame to the camera frame:

# First step: transform the point cloud to the ego vehicle frame for the timestamp of the sweep.
cs_record = self.get('calibrated_sensor', lidar_record['calibrated_sensor_token'])
point_clouds.rotate(Quaternion(cs_record['rotation']).rotation_matrix)
point_clouds.translate(np.array(cs_record['translation']))
# Second step: transform to the global frame.
poserecord = self.dataset.get('ego_pose', lidar_record['ego_pose_token'])
point_clouds.rotate(Quaternion(poserecord['rotation']).rotation_matrix)
point_clouds.translate(np.array(poserecord['translation']))
# Third step: transform into the ego vehicle frame for the timestamp of the image.
poserecord = self.dataset.get('ego_pose', camera_record['ego_pose_token'])
point_clouds.translate(-np.array(poserecord['translation']))
point_clouds.rotate(Quaternion(poserecord['rotation']).rotation_matrix.T)
# Fourth step: transform into the camera.
cs_record = self.dataset.get('calibrated_sensor', camera_record['calibrated_sensor_token'])
point_clouds.translate(-np.array(cs_record['translation']))
point_clouds.rotate(Quaternion(cs_record['rotation']).rotation_matrix.T)

You would get the same result if you used the radar as ref_chan and then transformed from radar to camera frame, right?
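
For reference, the alternative described above would look roughly like this (a sketch only; the variable names are taken from the snippet above, and the radar-to-camera transform chain is omitted):

# Sketch: aggregate the sweeps in the radar sensor's own frame instead of LIDAR_TOP
# (illustration of the question above, not the repo's actual code).
point_clouds, times = RadarPointCloud.from_file_multisweep(self.dataset, sample_obj,
                                                            radar_record["channel"],
                                                            radar_record["channel"],
                                                            nsweeps=1)
# The points would then be transformed radar -> ego -> global -> ego (camera timestamp)
# -> camera, using the radar's calibrated_sensor / ego_pose records instead of the lidar's.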

Thanks in advance and best regards,
Patrick

Visualization

Hi, thanks for your awesome work!
Can you provide the script for visualization?
Thank you in advance.

Best regards,

kuanchih

Release date

Hi @brade31919,

thanks for your contribution and interesting work. I've read your paper and aim to solve a similar task. It's great that you are willing to share your code. Do you already know when you are going to publish it?

Thanks in advance!

Best,
Patrick

Can not evaluate provided pretrained models

I have downloaded your pretrained models from Google Drive. Then I followed the Evaluation section and launched python main.py, but there is an error about missing args. There are no args in the provided checkpoints!

Bidirectional radar sweeps

Hi,

thanks again for sharing your work!

I have one more question regarding the radar sweeps:
I saw that you used radar sweeps t-1, t, t+1 for training. Did you also use radar sweeps from the future (t+1) to test your model as described in your paper or did you only use radar sweeps t-1, t for testing?

Thanks and best regards,
Patrick

Question on the RGB - Radar alignment issue

Hello! I have a small question on the RGB and Radar alignment.

I fused the RGB and radar images obtained from TensorBoard and got the following results:

[Image: fused RGB + radar projection results]

Some of the images show correct overlap (i.e., projection) results, but the bottom-right one looks strange.

Do you have any idea about this issue? (A calibration problem? Or synchronization?)

Thank you!

max_depth or max-depth?

Hello! It looks like the args.max-depth assigned in the commands is never used. In main.py, the attribute being read is args.max_depth. So I guess no matter what number you assign, e.g. --max-depth 80, max_depth will only become 0 or np.inf depending on the modality?

About train / val scene numbers

Hi @brade31919 ,

I have a quick question about scene numbers for train / val splits of the released preprocessed dataset.

  1. Does the dataset use the exact scene numbers for the train / val splits from the code below?
  2. The random seed is always fixed to 100. Did you also use the seed 100 for the dataset?

def get_train_val_table(self):
    seed = cfg.TRAIN_VAL_SEED
    val_num = int(cfg.VAL_RATIO * self.num_scenes)
    # Get train / val scenes
    all = set(list(range(self.num_scenes)))
    np.random.seed(seed)
    val = set(np.random.choice(np.arange(0, self.num_scenes, 1), val_num, replace=False))
    train = all - val
    # Split number set
    train_scenes = list(train)
    val_scenes = list(val)
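
For what it's worth, the split can be recomputed standalone like this (a sketch; the scene count and ratio below are my own assumptions, and the seed of 100 is the one mentioned above):

import numpy as np

num_scenes = 850     # nuScenes trainval scene count (assumption for illustration)
val_ratio = 0.15     # placeholder; the real value is cfg.VAL_RATIO in the repo
seed = 100           # cfg.TRAIN_VAL_SEED according to the question above

val_num = int(val_ratio * num_scenes)
np.random.seed(seed)
val = set(np.random.choice(np.arange(0, num_scenes, 1), val_num, replace=False))
train = set(range(num_scenes)) - val
print(len(train), "train scenes /", len(val), "val scenes")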

Thank you very much!

About ver2 in code

Thank you very much for releasing the code.
I noticed that the processed data I downloaded contains two folders, ver2_lidar1_radar1 and ver2_lidar1_radar3_radar_only. It seems to me that ver2_lidar1_radar1 contains the image, lidar, and some radar information (probably the front radar?), and ver2_lidar1_radar3_radar_only contains another set of radar information (the back radar)? It would be great if you could tell me more about this.

It seems to me that when "ver3" is used, the radar data in the ver2_lidar1_radar3_radar_only folder is used instead of the other folder, as seen here (keys like 'radar_points' are overwritten?).

I want to train a model that can do depth prediction on all 6 cameras, in addition to the front and back cameras handled now. It seems to me that 'ver2' of the code might do something like that. I noted that ver2 of the code uses only one folder for training, ver2_lidar1_radar1. When I changed version = "ver2" in the config file, I encountered an error:

[Image: error traceback screenshot]

(I updated the file nuscenes_day_night_info.pkl before running ver2)
Would you know how I could fix this error? I want to train a model on all 6 cameras of the dataset. If you could give some hints on how to do this, that would be very helpful. Thank you very much for your time!
