simipu's Introduction

SimIPU

SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations

Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang, Bolei Zhou, Hang Zhao

AAAI 2021 (arXiv pdf)

Notice

  • This is the redundant version of SimIPU. The main code is in SimIPU/project_cl.
  • You can find the MonoDepth code here. We provide detailed configs and results, including on an indoor depth dataset, which demonstrates the generalization of SimIPU. Because we enhanced the depth framework, model performance is stronger than reported in our paper.

Usage

Installation

This repo is tested on python=3.7, cuda=10.1, pytorch=1.6.0, mmcv-full=1.3.4, mmdetection=2.11.0, mmsegmentation=0.13.0 and mmdetection3D=0.13.0.

Note: mmdetection and mmdetection3D have made major compatibility-breaking changes in their latest versions, so those versions are not compatible with this repo. Make sure you install the versions listed above.

Follow instructions below to install:

  • Create a conda environment
conda create -n simipu python=3.7
conda activate simipu
git clone https://github.com/zhyever/SimIPU.git
cd SimIPU
  • Install Pytorch 1.6.0
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
  • Install mmcv-full=1.3.4
pip install mmcv-full==1.3.4 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html
  • Install mmdetection=2.11.0
git clone https://github.com/open-mmlab/mmdetection.git
cd ./mmdetection
git checkout v2.11.0
pip install -r requirements/build.txt
pip install -v -e .
cd ..
  • Install mmsegmentation=0.13.0
pip install mmsegmentation==0.13.0
  • Build SimIPU
# run this from the SimIPU repository root (after "cd SimIPU")
pip install -v -e .
  • Others: after building SimIPU, you may see a notice that the required future package is missing. Install it via conda.
conda install future
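
Because version mismatches are the main pitfall noted above, a small check can help after installation. The following is a minimal sketch, not part of the repo: the helper name `find_mismatches` and the pip distribution names (`torch`, `mmcv-full`, `mmdet`, `mmsegmentation`) are assumptions, and mmdetection3D is left out because it is built from this repo itself.

```python
# Pinned versions copied from the installation notes above.
# The distribution names used as keys are assumptions about how pip reports them.
REQUIRED = {
    "torch": "1.6.0",
    "mmcv-full": "1.3.4",
    "mmdet": "2.11.0",
    "mmsegmentation": "0.13.0",
}


def find_mismatches(installed, required=REQUIRED):
    """Return {package: (installed_version_or_None, required_version)}
    for every required package that is missing or has a different version."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        # Ignore local build tags such as "+cu101" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


# Example with a hand-written mapping of installed versions.
example = {"torch": "1.6.0+cu101", "mmcv-full": "1.3.4", "mmdet": "2.12.0"}
print(find_mismatches(example))
```

In a real environment, the `installed` mapping could be built from `pkg_resources.working_set`, for example `{d.project_name: d.version for d in pkg_resources.working_set}`.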

Data Preparation

Download KITTI dataset and organize data following the official instructions in mmdetection3D. Then generate data by running:

python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
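
Before running the command above, it may help to confirm that the dataset folders are in place. This is a hedged sketch: the directory list reflects my understanding of the mmdetection3D KITTI layout and should be checked against the official instructions, and `missing_kitti_dirs` is a hypothetical helper, not a function of this repo.

```python
from pathlib import Path

# Assumed layout under ./data/kitti; verify against the mmdetection3D docs.
EXPECTED_DIRS = [
    "ImageSets",
    "training/calib",
    "training/image_2",
    "training/label_2",
    "training/velodyne",
    "testing/calib",
    "testing/image_2",
    "testing/velodyne",
]


def missing_kitti_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_kitti_dirs("./data/kitti")
    print("missing:", missing if missing else "none")
```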

If you would like to run the monocular 3D detection experiments on nuScenes, follow the official instructions to prepare the nuScenes dataset.

For Waymo pre-training, we have no plan to release the corresponding data-preparation scripts in the short term. Some of the scripts are available in project_cl/tools/. I currently lack the time and resources to reproduce the Waymo pre-training process. Our paper describes how to prepare the Waymo dataset; if you have trouble doing so, feel free to contact me and I will be glad to help.

Pre-training on KITTI

bash tools/dist_train.sh project_cl/configs/simipu/simipu_kitti.py 8 --work-dir work_dir/your/work/dir

Downstream Evaluation

1. Camera-LiDAR fusion based 3D object detection on the KITTI dataset.

Remember to change the pre-trained model via changing the value of key load_from in the config.

bash tools/dist_train.sh project_cl/configs/kitti_det3d/moca_r50_kitti.py 8 --work-dir work_dir/your/work/dir
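
For illustration, the load_from change mentioned above amounts to a single assignment in the Python config file. The checkpoint path below is a placeholder, not a file shipped with the repo.

```python
# Fragment of the downstream config, e.g. project_cl/configs/kitti_det3d/moca_r50_kitti.py.
# Replace the placeholder with the path to your own pre-trained checkpoint.
load_from = "work_dir/your/work/dir/latest.pth"
```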

2. Monocular 3D object detection on the nuScenes dataset.

Remember to change the pre-trained model by changing the value of the key load_from in the config. Before training, you also need to align the key names in checkpoint['state_dict']. See project_cl/tools/convert_pretrain_imgbackbone.py for details.

bash tools/dist_train.sh project_cl/configs/fcos3d_mono3d/fcos3d_r50_nus.py 8 --work-dir work_dir/your/work/dir
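
The key alignment mentioned above generally means renaming the parameter keys so that they match what the downstream model expects. The sketch below only illustrates the idea on a plain dictionary; the prefixes `img_backbone.` and `backbone.` are hypothetical, and the actual logic lives in project_cl/tools/convert_pretrain_imgbackbone.py.

```python
def align_backbone_keys(state_dict, old_prefix, new_prefix):
    """Keep only the keys that start with old_prefix and rename them
    so that they start with new_prefix instead."""
    aligned = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            aligned[new_prefix + key[len(old_prefix):]] = value
    return aligned


# Toy checkpoint with placeholder values instead of tensors.
checkpoint = {
    "state_dict": {
        "img_backbone.conv1.weight": 1,  # hypothetical image-branch key
        "pts_backbone.fc.weight": 2,     # hypothetical point-branch key, dropped
    }
}
checkpoint["state_dict"] = align_backbone_keys(
    checkpoint["state_dict"], "img_backbone.", "backbone."
)
print(checkpoint["state_dict"])
```

With a real checkpoint, the dictionary would come from `torch.load` and be written back with `torch.save`.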

3. Monocular depth estimation on the KITTI/NYU datasets.

See Depth-Estimation-Toolbox.

Pre-trained Model and Results

We provide pre-trained models. By default, "Full Waymo" or "Waymo" refers to the Waymo dataset with load_interval=5. We use discrete frames to ensure training variety; previous experiments indicate that the improvement from load_interval=1 is slight. So "1/10 Waymo" actually means 1/5 (load_interval=5) times 1/10 (the first 1/10 of the scene data) = 1/50 of the Waymo data.
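
The data-fraction arithmetic above can be written out as a short sketch. The helper names are hypothetical, and the slicing assumes that load_interval keeps every load_interval-th frame.

```python
from fractions import Fraction


def effective_fraction(scene_fraction, load_interval):
    """Fraction of all frames actually used: scene fraction times 1/load_interval."""
    return Fraction(scene_fraction) / load_interval


def subsample(frames, load_interval):
    """Keep every load_interval-th frame (assumed behaviour of load_interval)."""
    return frames[::load_interval]


print(effective_fraction(Fraction(1, 10), 5))  # "1/10 Waymo"
print(effective_fraction(1, 5))                # "Full Waymo"
print(subsample(list(range(10)), 5))
```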

|  | Dataset | Model |
| --- | --- | --- |
| SimIPU | KITTI | link |
| SimIPU | Waymo | link |
| SimIPU | ImageNet Sup + Waymo SimIPU | link |

Fusion-based 3D object detection results.

|  | AP40@Easy | AP40@Mod. | AP40@Hard | Link |
| --- | --- | --- | --- | --- |
| Moca | 81.32 | 70.88 | 66.19 | Log |

Monocular 3D object detection results.

|  | Pre-train | mAP | Link |
| --- | --- | --- | --- |
| Fcos3D | Scratch | 17.9 | Log |
| Fcos3D | 1/10 Waymo SimIPU | 20.3 | Log |
| Fcos3D | 1/5 Waymo SimIPU | 22.5 | Log |
| Fcos3D | 1/2 Waymo SimIPU | 24.7 | Log |
| Fcos3D | Full Waymo SimIPU | 26.2 | Log |
| Fcos3D | ImageNet Sup | 27.7 | Log |
| Fcos3D | ImageNet Sup + Full Waymo SimIPU | 28.4 | Log |

Citation

If you find our work useful for your research, please consider citing the paper

@article{li2021simipu,
  title={SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations},
  author={Li, Zhenyu and Chen, Zehui and Li, Ang and Fang, Liangji and Jiang, Qinhong and Liu, Xianming and Jiang, Junjun and Zhou, Bolei and Zhao, Hang},
  journal={arXiv preprint arXiv:2112.04680},
  year={2021}
}

simipu's Issues

issues about create_data

Hi, thanks for sharing your great work. I ran into some issues when creating data with create_data.py.
First:
create reduced point cloud for training set
[ ] 0/3712, elapsed: 0s, ETA:
Traceback (most recent call last):
File "tools/create_data.py", line 247, in <module>
out_dir=args.out_dir)
File "tools/create_data.py", line 24, in kitti_data_prep
kitti.create_reduced_point_cloud(root_path, info_prefix)
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/kitti_converter.py", line 374, in create_reduced_point_cloud
_create_reduced_point_cloud(data_path, train_info_path, save_path)
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/kitti_converter.py", line 314, in _create_reduced_point_cloud
count=-1).reshape([-1, num_features])
ValueError: cannot reshape array of size 461536 into shape (6)
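
The array size in this error can be checked with simple arithmetic. KITTI velodyne files store 4 floats per point (x, y, z, intensity), and the size reported above divides evenly by 4 but not by 6. A small sketch, with `points_if_divisible` as a hypothetical helper:

```python
def points_if_divisible(num_values, num_features):
    """Return the number of points if the flat array splits evenly, else None."""
    quotient, remainder = divmod(num_values, num_features)
    return quotient if remainder == 0 else None


size = 461536  # array size taken from the ValueError above
print(points_if_divisible(size, 6))  # None, since 461536 = 6 * 76922 + 4
print(points_if_divisible(size, 4))  # 115384
```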

Should num_features=4 and front_camera_id=2 be set
in this line:

I assume this change solves the problem, but then I hit another problem at this step:
Create GT Database of KittiDataset
[ ] 0/3712, elapsed: 0s, ETA:
Traceback (most recent call last):
File "tools/create_data.py", line 247, in <module>
out_dir=args.out_dir)
File "tools/create_data.py", line 44, in kitti_data_prep
with_bbox=True) # for moca
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/create_gt_database.py", line 275, in create_groundtruth_database
P0 = np.array(example['P0']).reshape(4, 4)
KeyError: 'P0'

Can you help me figure out how to solve these issues?

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repo branches:

|  | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch |
| --- | --- | --- |
| MMEngine |  | 0.x |
| MMCV | 1.x | 2.x |
| MMDetection | 0.x, 1.x, 2.x | 3.x |
| MMAction2 | 0.x | 1.x |
| MMClassification | 0.x | 1.x |
| MMSegmentation | 0.x | 1.x |
| MMDetection3D | 0.x | 1.x |
| MMEditing | 0.x | 1.x |
| MMPose | 0.x | 1.x |
| MMDeploy | 0.x | 1.x |
| MMTracking | 0.x | 1.x |
| MMOCR | 0.x | 1.x |
| MMRazor | 0.x | 1.x |
| MMSelfSup | 0.x | 1.x |
| MMRotate | 1.x | 1.x |
| MMYOLO |  | 0.x |
Attention: please create a new virtual environment for OpenMMLab 2.0.

A question about Tab.5 in Ablation Study

Thanks for your excellent work! I have a question about Tab.5 in the Ablation Study. Why does "Scratch" equal "SimIPU w/o inter-module"? Does this mean that the intra-module is useless?

error for env setup:ImportError: cannot import name 'ball_query_ext' from 'mmdet3d.ops.ball_query'

Thanks for your insightful paper and clear code repo!

Hi, I ran into the ImportError: cannot import name 'ball_query_ext' from 'mmdet3d.ops.ball_query' when running the command bash tools/dist_train.sh project_cl/configs/simipu/simipu_kitti.py 1 --work_dir ./

Do you know how to solve it?

Traceback (most recent call last):
File "tools/train.py", line 16, in <module>
from mmdet3d.apis import train_model
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/apis/__init__.py", line 1, in <module>
from .inference import (convert_SyncBN, inference_detector,
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/apis/inference.py", line 10, in <module>
from mmdet3d.core import (Box3DMode, DepthInstance3DBoxes,
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/core/__init__.py", line 2, in <module>
from .bbox import * # noqa: F401, F403
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/core/bbox/__init__.py", line 4, in <module>
from .iou_calculators import (AxisAlignedBboxOverlaps3D, BboxOverlaps3D,
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/core/bbox/iou_calculators/__init__.py", line 1, in <module>
from .iou3d_calculator import (AxisAlignedBboxOverlaps3D, BboxOverlaps3D,
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/core/bbox/iou_calculators/iou3d_calculator.py", line 5, in <module>
from ..structures import get_box_type
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/core/bbox/structures/__init__.py", line 1, in <module>
from .base_box3d import BaseInstance3DBoxes
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/core/bbox/structures/base_box3d.py", line 5, in <module>
from mmdet3d.ops.iou3d import iou3d_cuda
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/ops/__init__.py", line 5, in <module>
from .ball_query import ball_query
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/ops/ball_query/__init__.py", line 1, in <module>
from .ball_query import ball_query
File "/mnt/lustre/xxh/SimIPU-main/mmdet3d/ops/ball_query/ball_query.py", line 4, in <module>
from . import ball_query_ext
ImportError: cannot import name 'ball_query_ext' from 'mmdet3d.ops.ball_query' (/mnt/lustre/xxh/SimIPU-main/mmdet3d/ops/ball_query/__init__.py)

I noticed that you once met with the same error.
open-mmlab/mmdetection3d#503 (comment)

So, I would like to ask for your help~ Hopefully you have a good solution. :)

Have you tried not to crop gradient of f^{\alpha} in eq7?

Hi, I like your good work!
I am wondering whether you have tried not cropping the gradient of $f^{\alpha}$ in eq7.
If you crop the gradient, it seems that the pre-training of the point branch cannot learn anything from the image branch.

Question about augmentation

Hi, I'm a little confused about the data augmentation.

  1. How did you set img_aug when img_moco=True? It seems that we need an 'img_pipeline' in 'simipu_kitti.py', right?
  2. For 3D augmentation, it seems that it is done in this line. So the 3D augmentation is applied to the point features instead of the raw points, right? If I want to try moco=True, how should I set the 3D augmentation? Should I do this in the dataset-building part?
    loc_to_ori = apply_3d_transformation(loc_t[i], 'LIDAR', img_metas[i], reverse=True, points_center=self.cl_cfg["points_center"])

Looking forward to your reply. Many thanks.

A question about eq5 and eq6

Thanks for your inspiring work.
I have a question about eq5 and eq6.
As far as I know, after eq5, f should be a global feature tensor with shape (batchsize * 2048 * 1 * 1). How can you sample the corresponding image features by projection location? After all, there is no spatial information left in f.
Or did you take the features from an earlier layer of ResNet?
Looking forward to your reply.

About intra-modal spatial perception module parameters update problem

Hello author, in the SimIPU paper, I find that the gradient of the inter-modal module loss is truncated at the point cloud features. During backpropagation, this loss therefore does not contribute to updating the intra-modal module parameters. Why does adding this loss improve point cloud 3D object detection performance?
