
UVTR's Introduction

UVTR


Unifying Voxel-based Representation with Transformer for 3D Object Detection

Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia

[arXiv] [BibTeX]


This project provides an implementation of the NeurIPS 2022 paper "Unifying Voxel-based Representation with Transformer for 3D Object Detection" based on MMDetection3D. UVTR aims to unify multi-modality representations in voxel space for accurate and robust single- or cross-modality 3D detection.

Preparation

This project is based on MMDetection3D and can be set up as follows.

  • Install MMDetection3D, then copy the UVTR modules into it:

cp -r projects mmdetection3d/
cp -r extra_tools mmdetection3d/
  • Prepare the nuScenes dataset following the standard structure (see the layout sketch below).
  • Generate the unified data info files and sampling database for the nuScenes dataset:
python3 extra_tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes_unified
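
For reference, a sketch of the expected layout after these steps (directory names follow the standard mmdet3d nuScenes convention; the nuscenes_unified_*.pkl info files are produced by the command above):

mmdetection3d
├── projects
├── extra_tools
└── data
    └── nuscenes
        ├── maps
        ├── samples
        ├── sweeps
        ├── v1.0-trainval
        ├── nuscenes_unified_infos_train.pkl
        └── nuscenes_unified_infos_val.pkl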

Training

You can train the model following the instructions. You can find the pretrained models here if you want to train the model from scratch. For example, to launch UVTR training on multiple GPUs, run:

cd /path/to/mmdetection3d
bash extra_tools/dist_train.sh ${CFG_FILE} ${NUM_GPUS}

or train with a single GPU:

python3 extra_tools/train.py ${CFG_FILE}
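
As a concrete example (the config path is one referenced elsewhere on this page; the GPU count of 8 is illustrative):

bash extra_tools/dist_train.sh projects/configs/uvtr/lidar_based/uvtr_l_v0075_h5.py 8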

Evaluation

You can evaluate the model following the instructions. For example, to launch UVTR evaluation with a pretrained checkpoint on multiple GPUs, run:

bash extra_tools/dist_test.sh ${CFG_FILE} ${CKPT} ${NUM_GPUS} --eval=bbox

or evaluate with a single GPU:

python3 extra_tools/test.py ${CFG_FILE} ${CKPT} --eval=bbox
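
As a concrete example, using the multi-modality config that appears later on this page (substitute your own checkpoint path for ${CKPT}):

bash extra_tools/dist_test.sh projects/configs/uvtr/multi_modality/uvtr_m_v0075_r101_h5.py ${CKPT} 8 --eval=bbox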

nuScenes 3D Object Detection Results

We provide results on the nuScenes val set with pretrained models.

Model               NDS(%)  mAP(%)  mATE↓  mASE↓  mAOE↓  mAVE↓  mAAE↓  Download
Camera-based
UVTR-C-R50-H5        40.1    31.3   0.810  0.281  0.486  0.793  0.187  GoogleDrive
UVTR-C-R50-H11       41.8    33.3   0.795  0.276  0.452  0.761  0.196  GoogleDrive
UVTR-C-R101          44.1    36.1   0.761  0.271  0.409  0.756  0.203  GoogleDrive
UVTR-CS-R50          47.2    36.2   0.756  0.276  0.399  0.467  0.189  GoogleDrive
UVTR-CS-R101         48.3    37.9   0.739  0.267  0.350  0.510  0.200  GoogleDrive
UVTR-L2C-R101        45.0    37.2   0.735  0.269  0.397  0.761  0.193  GoogleDrive
UVTR-L2CS3-R101      48.8    39.2   0.720  0.268  0.354  0.534  0.206  GoogleDrive
LiDAR-based
UVTR-L-V0075         67.6    60.8   0.335  0.257  0.303  0.206  0.183  GoogleDrive
Multi-modality
UVTR-M-V0075-R101    70.2    65.4   0.333  0.258  0.270  0.216  0.176  GoogleDrive

Acknowledgement

We would like to thank the authors of MMDetection3D and DETR3D for their open-source release.

License

UVTR is released under the Apache 2.0 license.

Citing UVTR

Consider citing UVTR in your publications if it helps your research.

@inproceedings{li2022uvtr,
  title={Unifying Voxel-based Representation with Transformer for 3D Object Detection},
  author={Li, Yanwei and Chen, Yilun and Qi, Xiaojuan and Li, Zeming and Sun, Jian and Jia, Jiaya},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

UVTR's People

Contributors

yanwei-li, yukang2017


UVTR's Issues

Train

Nice work! Could you release a guide to train UVTR?

RuntimeError: cusolver error: CUSOLVER_STATUS_INTERNAL_ERROR, when calling `cusolverDnCreate(handle)

My env: mmcv==1.5.0, mmdet==2.25.0, mmdet3d==v1.0.0rc0, cuda==11.3, pytorch==1.10.1

File "/root/UVTR/mmdetection3d/projects/mmdet3d_plugin/models/utils/uni3d_viewtrans.py", line 131, in forward
voxel_space = self.depth_proj(mlvl_feats, img_depth=kwargs.pop('img_depth'), **kwargs)
File "/root/miniconda/envs/uvtr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/root/UVTR/mmdetection3d/projects/mmdet3d_plugin/models/utils/uni3d_viewtrans.py", line 213, in forward
kwargs['img_metas'], img_depth, num_sweep, num_cam, fp16_enabled)
File "/root/UVTR/mmdetection3d/projects/mmdet3d_plugin/models/utils/uni3d_viewtrans.py", line 276, in feature_sampling
reference_voxel = reference_voxel @ torch.inverse(uni_rot_aug)[:,None,None]
RuntimeError: cusolver error: CUSOLVER_STATUS_INTERNAL_ERROR, when calling `cusolverDnCreate(handle)

How can I fix it? Thanks a lot.
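
A hedged workaround sketch (not an official fix): this cusolver error typically comes from torch.inverse running on the GPU under a CUDA/driver mismatch, and computing the small inverse on the CPU often sidesteps it.

# Hypothetical patch around uni3d_viewtrans.py line 276 (names taken from
# the traceback above): invert the augmentation matrix on CPU, move back.
inv_aug = torch.inverse(uni_rot_aug.cpu()).to(uni_rot_aug.device)
reference_voxel = reference_voxel @ inv_aug[:, None, None]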

RuntimeError: CUDA out of memory.

Hi, I'd like to ask what type of GPU this project was trained on. I am using a TITAN RTX, but when I train with projects/configs/uvtr/camera_based/knowledge_distill/uvtr_l2cs3_r101_h11.py, I run out of GPU memory.
RuntimeError: CUDA out of memory. Tried to allocate 408.00 MiB (GPU 0; 23.65 GiB total capacity; 5.46 GiB already allocated; 212.12 MiB free; 5.95 GiB reserved in total by PyTorch)
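
A hedged mitigation sketch (standard mmdet3d config fields, not a maintainer-confirmed fix): on a 24 GB card, lowering the per-GPU batch size in the config is usually the first thing to try.

# hypothetical override in the config; field names follow mmdet3d conventions
data = dict(samples_per_gpu=1, workers_per_gpu=2)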

Inquiry about 'Effect of Height in Voxel Space' in Sec. 4.2.

Hi Yanwei, thanks for your awesome work! @yanwei-li

You've shown the effect of height in Table 1, which demonstrates that larger height values along the Z axis contribute more to camera-based 3D detectors. I'm very curious about the performance of camera-based detectors when the height is set to 21, 41, or even larger. Have you conducted experiments with much larger heights? Do you have any insight into how the height value should be set?

Looking forward to your reply!

Prediction values become 'NaN' after 1 or 2 epochs

Hi,
while training, I get this error after one epoch (sometimes two). Can you suggest a solution? Thanks in advance.

bbox_pred:(tensor(nan, device='cuda:0', grad_fn=), tensor(nan, device='cuda:0', grad_fn=)), cls_score:(tensor(nan, device='cuda:0', grad_fn=), tensor(nan, device='cuda:0', grad_fn=)), gt_bboxes:(tensor(50.7982, device='cuda:0'), tensor(-51.4520, device='cuda:0')), gt_labels:tensor([8, 8, 8, 8, 8, 3, 8, 0, 8, 8, 0, 0, 0, 6, 8, 0, 8, 8, 1, 1, 2, 2, 2, 2, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 9, 9], device='cuda:0'), gt_bboxes_ignore:None

The error is probably raised from here:

mmdet3d_plugin/models/dense_heads/uvtr_head.py", line 265, in _get_target_single
    sampling_result = self.sampler.sample(assign_result, bbox_pred,
UnboundLocalError: local variable 'assign_result' referenced before assignment
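
A hedged note: the UnboundLocalError is a symptom rather than the cause, since once the losses go NaN the assigner fails before assign_result is ever set. Mitigations that commonly help here (a sketch using standard mmdet config fields; the values are illustrative, not confirmed) are a lower learning rate and tighter gradient clipping:

# hypothetical config adjustments; values are illustrative
optimizer = dict(type='AdamW', lr=1e-4, weight_decay=0.01)
optimizer_config = dict(grad_clip=dict(max_norm=10, norm_type=2))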

No module named 'mmdet3d.ops.iou3d'

Hi,
when I run the extra_tools/create_data.py file, I hit this module error: iou3d is not available in mmdet3d.ops. How can I fix this problem?
Thanks
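
In mmdet3d >= 1.0 the iou3d ops were moved out of mmdet3d.ops (the BEV NMS helpers now live in mmcv.ops under new names). A hedged compatibility shim, assuming mmcv >= 1.5, might look like:

# Hypothetical import shim for projects/mmdet3d_plugin/core/merge_all_augs.py;
# assumes mmcv >= 1.5, where nms_bev / nms_normal_bev replaced the old
# mmdet3d nms_gpu / nms_normal_gpu (argument conventions may differ slightly).
try:
    from mmdet3d.ops.iou3d.iou3d_utils import nms_gpu, nms_normal_gpu
except ImportError:
    from mmcv.ops import nms_bev as nms_gpu
    from mmcv.ops import nms_normal_bev as nms_normal_gpu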

Training Errors: "RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2' in call to _th_mm_out"

Hi, I ran into the following problem while training with this project. Hoping for help, thank you.

RUN:
python extra_tools/train.py projects/configs/uvtr/lidar_based/uvtr_l_v0075_h5.py

Environment:
TorchVision: 0.6.0a0+82fd1c8
OpenCV: 4.6.0
MMCV: 1.4.0
MMCV Compiler: GCC 5.4
MMCV CUDA Compiler: 10.1
MMDetection: 2.14.0
MMSegmentation: 0.14.1
MMDetection3D: 0.17.3+

Errors:
2022-08-12 11:21:31,523 - mmdet - INFO - Checkpoints will be saved to /0812/mmdetection3d-0.17.3/work_dirs/uvtr_l_v0075_h5 by HardDiskBackend.
Traceback (most recent call last):
  File "extra_tools/train.py", line 248, in <module>
    main()
  File "extra_tools/train.py", line 244, in main
    meta=meta)
  File "/0812/mmdetection3d-0.17.3/mmdet3d/apis/train.py", line 35, in train_model
    meta=meta)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 216, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/0812/mmdetection3d-0.17.3/projects/mmdet3d_plugin/models/detectors/uvtr.py", line 255, in forward
    return self.forward_train(**kwargs)
  File "/0812/mmdetection3d-0.17.3/projects/mmdet3d_plugin/models/detectors/uvtr.py", line 296, in forward_train
    pts_feat, img_feats, img_depth = self.extract_feat(points=points, img=img, img_metas=img_metas)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 130, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/0812/mmdetection3d-0.17.3/projects/mmdet3d_plugin/models/detectors/uvtr.py", line 187, in extract_feat
    pts_feats = self.extract_pts_feat(points, img_feats, img_metas)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 216, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/0812/mmdetection3d-0.17.3/projects/mmdet3d_plugin/models/detectors/uvtr.py", line 138, in extract_pts_feat
    x = self.pts_middle_encoder(voxel_features, coors, batch_size)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/0812/mmdetection3d-0.17.3/projects/mmdet3d_plugin/models/pts_encoder/sparse_encoder_hd.py", line 119, in forward
    x = self.conv_input(input_sp_tensor)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/0812/mmdetection3d-0.17.3/mmdet3d/ops/spconv/modules.py", line 130, in forward
    input = module(input)
  File "/.conda/envs/open-mmlab2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/0812/mmdetection3d-0.17.3/mmdet3d/ops/spconv/conv.py", line 186, in forward
    outids.shape[0])
  File "/0812/mmdetection3d-0.17.3/mmdet3d/ops/spconv/functional.py", line 65, in forward
    indice_pair_num, num_activate_out, False, True)
  File "/0812/mmdetection3d-0.17.3/mmdet3d/ops/spconv/ops.py", line 124, in indice_conv
    int(subm))
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2' in call to _th_mm_out
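
A hedged diagnosis: the traceback shows FP16 (half) features reaching the spconv kernels, which expect float32. Two common workarounds (sketches, not maintainer-confirmed): disable FP16 in the config, or force the sparse middle encoder to run in float32, roughly:

# Hypothetical sketch for sparse_encoder_hd.py: decorate the encoder's
# forward with mmcv's force_fp32 so spconv's matmul sees matching dtypes.
from mmcv.runner import force_fp32

@force_fp32()
def forward(self, voxel_features, coors, batch_size):
    voxel_features = voxel_features.float()  # ensure float32 before spconv
    ...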

Which mmdet3d version should I use?

Hello,

Thanks for your great work and for sharing the code.

In the README, you mention we should install MMDetection3D v0.17.3.
However, in issue #8 you mentioned that mmdet3d v1.x is supported and that the data converter was updated for the mmdet3d v1.x coordinate system.

I am confused about which mmdet3d version should be used. Could you clarify it?

I tried v1.x, but failed to process the nuScenes data for UVTR. The error is:

Traceback (most recent call last):
  File "extra_tools/create_data.py", line 5, in <module>
    from data_converter import nuscenes_converter as nuscenes_converter
  File "/dfs/data/open_mmlab/UVTR/mmdetection3d/extra_tools/data_converter/nuscenes_converter.py", line 16, in <module>
    from projects.mmdet3d_plugin.datasets import NuScenesSweepDataset
  File "/dfs/data/open_mmlab/UVTR/mmdetection3d/projects/mmdet3d_plugin/__init__.py", line 9, in <module>
    from .models.detectors import UVTR, UVTRKDCS, UVTRKDL, UVTRKDM
  File "/dfs/data/open_mmlab/UVTR/mmdetection3d/projects/mmdet3d_plugin/models/detectors/__init__.py", line 1, in <module>
    from .uvtr import UVTR
  File "/dfs/data/open_mmlab/UVTR/mmdetection3d/projects/mmdet3d_plugin/models/detectors/uvtr.py", line 13, in <module>
    from projects.mmdet3d_plugin.core.merge_all_augs import merge_all_aug_bboxes_3d
  File "/dfs/data/open_mmlab/UVTR/mmdetection3d/projects/mmdet3d_plugin/core/merge_all_augs.py", line 4, in <module>
    from mmdet3d.ops.iou3d.iou3d_utils import nms_gpu, nms_normal_gpu
ModuleNotFoundError: No module named 'mmdet3d.ops.iou3d'
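
For reference, the working environments reported elsewhere on this page pin roughly the following versions (a sketch, not an official compatibility matrix):

pip install mmcv-full==1.4.0 mmdet==2.14.0 mmsegmentation==0.14.1
# plus MMDetection3D v0.17.3 installed from source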

How to compute heights Z in voxel space?

Excuse me, I didn't understand the core of the following table in your paper.
[screenshot of the height-ablation table]

Could you please explain how the height Z of the voxel space is computed, e.g., h=5 or h=11?
Looking forward to your early reply. Thanks a lot!
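
A hedged sketch of the arithmetic (the Z range and voxel size below are assumptions based on common mmdet3d nuScenes settings, not the exact config values): the number of height levels is just the point-cloud Z extent divided by the Z voxel size.

# hypothetical arithmetic; the actual values come from the config files
z_min, z_max = -5.0, 3.0                    # assumed nuScenes Z range (8 m)
voxel_size_z = 1.6                          # assumed Z step of 1.6 m
h = round((z_max - z_min) / voxel_size_z)   # 8 / 1.6 = 5 height levels
# a Z step of about 0.73 m would give 11 levels instead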

AttributeError: 'list' object has no attribute 'new_zeros'

Sorry to bother you again. I have tried to reimplement your LiDAR-based model on a KITTI-like dataset, but after the first epoch the following error occurs:

2022-11-16 17:11:58,335 - mmdet - INFO - workflow: [('train', 1)], max: 40 epochs
/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/functional.py:3981: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn(
2022-11-16 17:13:23,481 - mmdet - INFO - Epoch [1][50/2521]	lr: 2.000e-05, eta: 1 day, 23:24:53, time: 1.694, data_time: 0.245, memory: 21946, loss_cls: 0.9841, loss_bbox: 2.1871, d0.loss_cls: 1.4273, d0.loss_bbox: 2.6728, d1.loss_cls: 1.1218, d1.loss_bbox: 2.3758, loss: 10.7690, grad_norm: 51.1293
2022-11-16 17:14:37,135 - mmdet - INFO - Epoch [1][100/2521]	lr: 2.000e-05, eta: 1 day, 20:18:22, time: 1.473, data_time: 0.014, memory: 21984, loss_cls: 0.4211, loss_bbox: 1.6809, d0.loss_cls: 0.4380, d0.loss_bbox: 2.1764, d1.loss_cls: 0.4186, d1.loss_bbox: 1.8243, loss: 6.9592, grad_norm: 42.8634
2022-11-16 17:15:51,275 - mmdet - INFO - Epoch [1][150/2521]	lr: 2.001e-05, eta: 1 day, 19:20:50, time: 1.483, data_time: 0.013, memory: 21984, loss_cls: 0.4054, loss_bbox: 1.5339, d0.loss_cls: 0.4055, d0.loss_bbox: 1.8876, d1.loss_cls: 0.4038, d1.loss_bbox: 1.6010, loss: 6.2371, grad_norm: 95.1544
2022-11-16 17:17:05,245 - mmdet - INFO - Epoch [1][200/2521]	lr: 2.001e-05, eta: 1 day, 18:50:00, time: 1.479, data_time: 0.011, memory: 21984, loss_cls: 0.3977, loss_bbox: 1.4369, d0.loss_cls: 0.4028, d0.loss_bbox: 1.7172, d1.loss_cls: 0.4005, d1.loss_bbox: 1.4918, loss: 5.8469, grad_norm: 119.0783
2022-11-16 17:18:19,204 - mmdet - INFO - Epoch [1][250/2521]	lr: 2.002e-05, eta: 1 day, 18:30:57, time: 1.479, data_time: 0.012, memory: 21984, loss_cls: 0.3829, loss_bbox: 1.3625, d0.loss_cls: 0.4055, d0.loss_bbox: 1.6472, d1.loss_cls: 0.3987, d1.loss_bbox: 1.4512, loss: 5.6480, grad_norm: 135.1175
2022-11-16 17:19:33,492 - mmdet - INFO - Epoch [1][300/2521]	lr: 2.002e-05, eta: 1 day, 18:19:40, time: 1.486, data_time: 0.013, memory: 21984, loss_cls: 0.3688, loss_bbox: 1.3312, d0.loss_cls: 0.4055, d0.loss_bbox: 1.6500, d1.loss_cls: 0.3922, d1.loss_bbox: 1.4440, loss: 5.5917, grad_norm: 139.5463
2022-11-16 17:20:47,706 - mmdet - INFO - Epoch [1][350/2521]	lr: 2.003e-05, eta: 1 day, 18:10:54, time: 1.484, data_time: 0.012, memory: 22019, loss_cls: 0.3540, loss_bbox: 1.2724, d0.loss_cls: 0.3996, d0.loss_bbox: 1.5833, d1.loss_cls: 0.3783, d1.loss_bbox: 1.3937, loss: 5.3813, grad_norm: 163.9016
2022-11-16 17:22:01,765 - mmdet - INFO - Epoch [1][400/2521]	lr: 2.004e-05, eta: 1 day, 18:03:22, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.3373, loss_bbox: 1.2172, d0.loss_cls: 0.3936, d0.loss_bbox: 1.5230, d1.loss_cls: 0.3601, d1.loss_bbox: 1.3354, loss: 5.1668, grad_norm: 174.4764
2022-11-16 17:23:15,944 - mmdet - INFO - Epoch [1][450/2521]	lr: 2.006e-05, eta: 1 day, 17:57:41, time: 1.484, data_time: 0.012, memory: 22019, loss_cls: 0.3214, loss_bbox: 1.2013, d0.loss_cls: 0.4001, d0.loss_bbox: 1.5164, d1.loss_cls: 0.3453, d1.loss_bbox: 1.3183, loss: 5.1028, grad_norm: 188.8877
2022-11-16 17:24:29,580 - mmdet - INFO - Epoch [1][500/2521]	lr: 2.007e-05, eta: 1 day, 17:51:05, time: 1.473, data_time: 0.012, memory: 22019, loss_cls: 0.3037, loss_bbox: 1.1773, d0.loss_cls: 0.3995, d0.loss_bbox: 1.5013, d1.loss_cls: 0.3298, d1.loss_bbox: 1.2953, loss: 5.0068, grad_norm: 197.3203
2022-11-16 17:25:43,725 - mmdet - INFO - Epoch [1][550/2521]	lr: 2.008e-05, eta: 1 day, 17:46:59, time: 1.483, data_time: 0.011, memory: 22019, loss_cls: 0.2847, loss_bbox: 1.1176, d0.loss_cls: 0.3936, d0.loss_bbox: 1.4716, d1.loss_cls: 0.3031, d1.loss_bbox: 1.2501, loss: 4.8207, grad_norm: 190.2340
2022-11-16 17:26:57,780 - mmdet - INFO - Epoch [1][600/2521]	lr: 2.010e-05, eta: 1 day, 17:43:08, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.2828, loss_bbox: 1.0936, d0.loss_cls: 0.3934, d0.loss_bbox: 1.4779, d1.loss_cls: 0.2948, d1.loss_bbox: 1.2363, loss: 4.7790, grad_norm: 198.6171
2022-11-16 17:28:11,650 - mmdet - INFO - Epoch [1][650/2521]	lr: 2.011e-05, eta: 1 day, 17:39:12, time: 1.477, data_time: 0.013, memory: 22019, loss_cls: 0.2745, loss_bbox: 1.0336, d0.loss_cls: 0.3889, d0.loss_bbox: 1.4437, d1.loss_cls: 0.2831, d1.loss_bbox: 1.1817, loss: 4.6055, grad_norm: 174.9819
2022-11-16 17:29:26,145 - mmdet - INFO - Epoch [1][700/2521]	lr: 2.013e-05, eta: 1 day, 17:37:08, time: 1.490, data_time: 0.013, memory: 22019, loss_cls: 0.2646, loss_bbox: 1.0278, d0.loss_cls: 0.3832, d0.loss_bbox: 1.4581, d1.loss_cls: 0.2719, d1.loss_bbox: 1.1750, loss: 4.5805, grad_norm: 196.6794
2022-11-16 17:30:40,141 - mmdet - INFO - Epoch [1][750/2521]	lr: 2.015e-05, eta: 1 day, 17:34:05, time: 1.480, data_time: 0.013, memory: 22019, loss_cls: 0.2662, loss_bbox: 0.9865, d0.loss_cls: 0.3821, d0.loss_bbox: 1.4307, d1.loss_cls: 0.2684, d1.loss_bbox: 1.1346, loss: 4.4686, grad_norm: 228.7645
2022-11-16 17:31:54,235 - mmdet - INFO - Epoch [1][800/2521]	lr: 2.017e-05, eta: 1 day, 17:31:27, time: 1.482, data_time: 0.012, memory: 22019, loss_cls: 0.2662, loss_bbox: 0.9459, d0.loss_cls: 0.3800, d0.loss_bbox: 1.4285, d1.loss_cls: 0.2684, d1.loss_bbox: 1.0973, loss: 4.3862, grad_norm: 208.7490
2022-11-16 17:33:08,368 - mmdet - INFO - Epoch [1][850/2521]	lr: 2.020e-05, eta: 1 day, 17:29:04, time: 1.483, data_time: 0.012, memory: 22019, loss_cls: 0.2593, loss_bbox: 0.9391, d0.loss_cls: 0.3730, d0.loss_bbox: 1.4255, d1.loss_cls: 0.2590, d1.loss_bbox: 1.0849, loss: 4.3407, grad_norm: 221.2812
2022-11-16 17:34:22,173 - mmdet - INFO - Epoch [1][900/2521]	lr: 2.022e-05, eta: 1 day, 17:26:12, time: 1.476, data_time: 0.014, memory: 22019, loss_cls: 0.2561, loss_bbox: 0.9201, d0.loss_cls: 0.3678, d0.loss_bbox: 1.4339, d1.loss_cls: 0.2527, d1.loss_bbox: 1.0674, loss: 4.2980, grad_norm: 232.0119
2022-11-16 17:35:36,218 - mmdet - INFO - Epoch [1][950/2521]	lr: 2.025e-05, eta: 1 day, 17:23:56, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.2566, loss_bbox: 0.8935, d0.loss_cls: 0.3706, d0.loss_bbox: 1.4161, d1.loss_cls: 0.2546, d1.loss_bbox: 1.0344, loss: 4.2258, grad_norm: 209.8056
2022-11-16 17:36:50,186 - mmdet - INFO - Exp name: uvtr_lidar_v005_h5_dair_base.py
2022-11-16 17:36:50,186 - mmdet - INFO - Epoch [1][1000/2521]	lr: 2.027e-05, eta: 1 day, 17:21:38, time: 1.479, data_time: 0.012, memory: 22019, loss_cls: 0.2621, loss_bbox: 0.8907, d0.loss_cls: 0.3714, d0.loss_bbox: 1.4253, d1.loss_cls: 0.2594, d1.loss_bbox: 1.0270, loss: 4.2359, grad_norm: 195.6460
2022-11-16 17:38:04,256 - mmdet - INFO - Epoch [1][1050/2521]	lr: 2.030e-05, eta: 1 day, 17:19:36, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.2440, loss_bbox: 0.8606, d0.loss_cls: 0.3578, d0.loss_bbox: 1.3989, d1.loss_cls: 0.2406, d1.loss_bbox: 0.9983, loss: 4.1002, grad_norm: 228.3762
2022-11-16 17:39:18,318 - mmdet - INFO - Epoch [1][1100/2521]	lr: 2.033e-05, eta: 1 day, 17:17:38, time: 1.481, data_time: 0.011, memory: 22019, loss_cls: 0.2528, loss_bbox: 0.8443, d0.loss_cls: 0.3680, d0.loss_bbox: 1.3890, d1.loss_cls: 0.2517, d1.loss_bbox: 0.9703, loss: 4.0760, grad_norm: 228.4887
2022-11-16 17:40:32,249 - mmdet - INFO - Epoch [1][1150/2521]	lr: 2.036e-05, eta: 1 day, 17:15:32, time: 1.479, data_time: 0.012, memory: 22019, loss_cls: 0.2454, loss_bbox: 0.8432, d0.loss_cls: 0.3524, d0.loss_bbox: 1.3916, d1.loss_cls: 0.2417, d1.loss_bbox: 0.9657, loss: 4.0400, grad_norm: 193.1862
2022-11-16 17:41:46,378 - mmdet - INFO - Epoch [1][1200/2521]	lr: 2.039e-05, eta: 1 day, 17:13:47, time: 1.483, data_time: 0.011, memory: 22019, loss_cls: 0.2477, loss_bbox: 0.8358, d0.loss_cls: 0.3528, d0.loss_bbox: 1.3768, d1.loss_cls: 0.2444, d1.loss_bbox: 0.9497, loss: 4.0072, grad_norm: 183.7507
2022-11-16 17:43:00,171 - mmdet - INFO - Epoch [1][1250/2521]	lr: 2.043e-05, eta: 1 day, 17:11:38, time: 1.476, data_time: 0.013, memory: 22019, loss_cls: 0.2449, loss_bbox: 0.8308, d0.loss_cls: 0.3465, d0.loss_bbox: 1.3713, d1.loss_cls: 0.2445, d1.loss_bbox: 0.9499, loss: 3.9879, grad_norm: 205.1622
2022-11-16 17:44:14,310 - mmdet - INFO - Epoch [1][1300/2521]	lr: 2.046e-05, eta: 1 day, 17:09:59, time: 1.483, data_time: 0.012, memory: 22019, loss_cls: 0.2431, loss_bbox: 0.8084, d0.loss_cls: 0.3487, d0.loss_bbox: 1.3483, d1.loss_cls: 0.2385, d1.loss_bbox: 0.9131, loss: 3.9001, grad_norm: 194.9144
2022-11-16 17:45:28,180 - mmdet - INFO - Epoch [1][1350/2521]	lr: 2.050e-05, eta: 1 day, 17:08:02, time: 1.477, data_time: 0.011, memory: 22019, loss_cls: 0.2390, loss_bbox: 0.8170, d0.loss_cls: 0.3428, d0.loss_bbox: 1.3517, d1.loss_cls: 0.2329, d1.loss_bbox: 0.9203, loss: 3.9038, grad_norm: 218.2914
2022-11-16 17:46:42,190 - mmdet - INFO - Epoch [1][1400/2521]	lr: 2.053e-05, eta: 1 day, 17:06:19, time: 1.480, data_time: 0.013, memory: 22019, loss_cls: 0.2407, loss_bbox: 0.8077, d0.loss_cls: 0.3378, d0.loss_bbox: 1.3599, d1.loss_cls: 0.2365, d1.loss_bbox: 0.9170, loss: 3.8997, grad_norm: 183.4504
2022-11-16 17:47:56,515 - mmdet - INFO - Epoch [1][1450/2521]	lr: 2.057e-05, eta: 1 day, 17:04:59, time: 1.486, data_time: 0.014, memory: 22019, loss_cls: 0.2392, loss_bbox: 0.8242, d0.loss_cls: 0.3317, d0.loss_bbox: 1.3597, d1.loss_cls: 0.2396, d1.loss_bbox: 0.9319, loss: 3.9264, grad_norm: 193.3301
2022-11-16 17:49:10,456 - mmdet - INFO - Epoch [1][1500/2521]	lr: 2.061e-05, eta: 1 day, 17:03:14, time: 1.479, data_time: 0.012, memory: 22019, loss_cls: 0.2360, loss_bbox: 0.8014, d0.loss_cls: 0.3198, d0.loss_bbox: 1.3446, d1.loss_cls: 0.2322, d1.loss_bbox: 0.9105, loss: 3.8444, grad_norm: 199.8085
2022-11-16 17:50:24,102 - mmdet - INFO - Epoch [1][1550/2521]	lr: 2.065e-05, eta: 1 day, 17:01:12, time: 1.473, data_time: 0.012, memory: 22019, loss_cls: 0.2332, loss_bbox: 0.7892, d0.loss_cls: 0.3172, d0.loss_bbox: 1.3103, d1.loss_cls: 0.2254, d1.loss_bbox: 0.8903, loss: 3.7657, grad_norm: 173.7445
2022-11-16 17:51:37,885 - mmdet - INFO - Epoch [1][1600/2521]	lr: 2.070e-05, eta: 1 day, 16:59:21, time: 1.476, data_time: 0.012, memory: 22019, loss_cls: 0.2333, loss_bbox: 0.8114, d0.loss_cls: 0.3123, d0.loss_bbox: 1.3273, d1.loss_cls: 0.2297, d1.loss_bbox: 0.9138, loss: 3.8278, grad_norm: 199.8943
2022-11-16 17:52:51,448 - mmdet - INFO - Epoch [1][1650/2521]	lr: 2.074e-05, eta: 1 day, 16:57:20, time: 1.471, data_time: 0.013, memory: 22019, loss_cls: 0.2292, loss_bbox: 0.7962, d0.loss_cls: 0.3172, d0.loss_bbox: 1.3213, d1.loss_cls: 0.2271, d1.loss_bbox: 0.8975, loss: 3.7886, grad_norm: 201.4746
2022-11-16 17:54:05,006 - mmdet - INFO - Epoch [1][1700/2521]	lr: 2.079e-05, eta: 1 day, 16:55:21, time: 1.471, data_time: 0.012, memory: 22019, loss_cls: 0.2382, loss_bbox: 0.8029, d0.loss_cls: 0.3163, d0.loss_bbox: 1.3239, d1.loss_cls: 0.2349, d1.loss_bbox: 0.9031, loss: 3.8194, grad_norm: 161.0498
2022-11-16 17:55:18,464 - mmdet - INFO - Epoch [1][1750/2521]	lr: 2.083e-05, eta: 1 day, 16:53:19, time: 1.469, data_time: 0.012, memory: 22019, loss_cls: 0.2330, loss_bbox: 0.8077, d0.loss_cls: 0.3111, d0.loss_bbox: 1.3382, d1.loss_cls: 0.2298, d1.loss_bbox: 0.9107, loss: 3.8306, grad_norm: 160.1327
2022-11-16 17:56:31,881 - mmdet - INFO - Epoch [1][1800/2521]	lr: 2.088e-05, eta: 1 day, 16:51:17, time: 1.468, data_time: 0.012, memory: 22019, loss_cls: 0.2326, loss_bbox: 0.7828, d0.loss_cls: 0.3100, d0.loss_bbox: 1.2945, d1.loss_cls: 0.2313, d1.loss_bbox: 0.8835, loss: 3.7346, grad_norm: 167.5468
2022-11-16 17:57:45,458 - mmdet - INFO - Epoch [1][1850/2521]	lr: 2.093e-05, eta: 1 day, 16:49:27, time: 1.472, data_time: 0.013, memory: 22019, loss_cls: 0.2278, loss_bbox: 0.7971, d0.loss_cls: 0.3011, d0.loss_bbox: 1.3097, d1.loss_cls: 0.2277, d1.loss_bbox: 0.8909, loss: 3.7543, grad_norm: 150.6342
2022-11-16 17:58:59,027 - mmdet - INFO - Epoch [1][1900/2521]	lr: 2.098e-05, eta: 1 day, 16:47:38, time: 1.471, data_time: 0.012, memory: 22019, loss_cls: 0.2273, loss_bbox: 0.7713, d0.loss_cls: 0.3019, d0.loss_bbox: 1.2775, d1.loss_cls: 0.2264, d1.loss_bbox: 0.8682, loss: 3.6727, grad_norm: 165.1924
2022-11-16 18:00:12,236 - mmdet - INFO - Epoch [1][1950/2521]	lr: 2.103e-05, eta: 1 day, 16:45:33, time: 1.464, data_time: 0.012, memory: 22019, loss_cls: 0.2252, loss_bbox: 0.7895, d0.loss_cls: 0.2985, d0.loss_bbox: 1.3028, d1.loss_cls: 0.2237, d1.loss_bbox: 0.8888, loss: 3.7285, grad_norm: 182.9384
2022-11-16 18:01:25,631 - mmdet - INFO - Exp name: uvtr_lidar_v005_h5_dair_base.py
2022-11-16 18:01:25,636 - mmdet - INFO - Epoch [1][2000/2521]	lr: 2.109e-05, eta: 1 day, 16:43:39, time: 1.468, data_time: 0.013, memory: 22019, loss_cls: 0.2275, loss_bbox: 0.7946, d0.loss_cls: 0.2998, d0.loss_bbox: 1.2840, d1.loss_cls: 0.2266, d1.loss_bbox: 0.8917, loss: 3.7242, grad_norm: 168.2174
2022-11-16 18:02:39,211 - mmdet - INFO - Epoch [1][2050/2521]	lr: 2.114e-05, eta: 1 day, 16:41:57, time: 1.472, data_time: 0.013, memory: 22019, loss_cls: 0.2159, loss_bbox: 0.7594, d0.loss_cls: 0.2909, d0.loss_bbox: 1.2688, d1.loss_cls: 0.2173, d1.loss_bbox: 0.8575, loss: 3.6099, grad_norm: 165.3231
2022-11-16 18:03:52,862 - mmdet - INFO - Epoch [1][2100/2521]	lr: 2.120e-05, eta: 1 day, 16:40:19, time: 1.473, data_time: 0.011, memory: 22019, loss_cls: 0.2234, loss_bbox: 0.7869, d0.loss_cls: 0.2921, d0.loss_bbox: 1.2852, d1.loss_cls: 0.2176, d1.loss_bbox: 0.8808, loss: 3.6860, grad_norm: 165.1741
2022-11-16 18:05:06,507 - mmdet - INFO - Epoch [1][2150/2521]	lr: 2.126e-05, eta: 1 day, 16:38:42, time: 1.473, data_time: 0.013, memory: 22019, loss_cls: 0.2214, loss_bbox: 0.7613, d0.loss_cls: 0.2893, d0.loss_bbox: 1.2568, d1.loss_cls: 0.2207, d1.loss_bbox: 0.8534, loss: 3.6028, grad_norm: 155.9333
2022-11-16 18:06:20,227 - mmdet - INFO - Epoch [1][2200/2521]	lr: 2.132e-05, eta: 1 day, 16:37:09, time: 1.474, data_time: 0.012, memory: 22034, loss_cls: 0.2240, loss_bbox: 0.7575, d0.loss_cls: 0.3003, d0.loss_bbox: 1.2418, d1.loss_cls: 0.2227, d1.loss_bbox: 0.8438, loss: 3.5903, grad_norm: 175.5790
2022-11-16 18:07:33,918 - mmdet - INFO - Epoch [1][2250/2521]	lr: 2.138e-05, eta: 1 day, 16:35:36, time: 1.474, data_time: 0.012, memory: 22034, loss_cls: 0.2296, loss_bbox: 0.7708, d0.loss_cls: 0.2934, d0.loss_bbox: 1.2698, d1.loss_cls: 0.2288, d1.loss_bbox: 0.8680, loss: 3.6603, grad_norm: 168.9581
2022-11-16 18:08:47,585 - mmdet - INFO - Epoch [1][2300/2521]	lr: 2.144e-05, eta: 1 day, 16:34:03, time: 1.473, data_time: 0.013, memory: 22037, loss_cls: 0.2158, loss_bbox: 0.7582, d0.loss_cls: 0.2880, d0.loss_bbox: 1.2530, d1.loss_cls: 0.2133, d1.loss_bbox: 0.8483, loss: 3.5765, grad_norm: 196.4898
2022-11-16 18:10:01,111 - mmdet - INFO - Epoch [1][2350/2521]	lr: 2.150e-05, eta: 1 day, 16:32:24, time: 1.471, data_time: 0.012, memory: 22037, loss_cls: 0.2122, loss_bbox: 0.7493, d0.loss_cls: 0.2837, d0.loss_bbox: 1.2204, d1.loss_cls: 0.2123, d1.loss_bbox: 0.8336, loss: 3.5114, grad_norm: 197.4392
2022-11-16 18:11:14,752 - mmdet - INFO - Epoch [1][2400/2521]	lr: 2.157e-05, eta: 1 day, 16:30:52, time: 1.473, data_time: 0.013, memory: 22037, loss_cls: 0.2184, loss_bbox: 0.7428, d0.loss_cls: 0.2821, d0.loss_bbox: 1.2388, d1.loss_cls: 0.2155, d1.loss_bbox: 0.8361, loss: 3.5336, grad_norm: 191.5447
2022-11-16 18:12:28,253 - mmdet - INFO - Epoch [1][2450/2521]	lr: 2.163e-05, eta: 1 day, 16:29:15, time: 1.470, data_time: 0.011, memory: 22037, loss_cls: 0.2189, loss_bbox: 0.7567, d0.loss_cls: 0.2813, d0.loss_bbox: 1.2494, d1.loss_cls: 0.2155, d1.loss_bbox: 0.8518, loss: 3.5737, grad_norm: 172.0291
2022-11-16 18:13:41,919 - mmdet - INFO - Epoch [1][2500/2521]	lr: 2.170e-05, eta: 1 day, 16:27:44, time: 1.473, data_time: 0.012, memory: 22037, loss_cls: 0.2170, loss_bbox: 0.7464, d0.loss_cls: 0.2834, d0.loss_bbox: 1.2192, d1.loss_cls: 0.2165, d1.loss_bbox: 0.8354, loss: 3.5180, grad_norm: 179.5014
2022-11-16 18:14:13,086 - mmdet - INFO - Saving checkpoint at 1 epochs
[                                                  ] 0/2016, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/train.py", line 248, in <module>
    main()
  File "tools/train.py", line 237, in main
    train_model(
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/apis/train.py", line 28, in train_model
    train_detector(
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/hooks/evaluation.py", line 237, in after_train_epoch
    self._do_evaluate(runner)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmdet/core/evaluation/eval_hooks.py", line 17, in _do_evaluate
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmdet/apis/test.py", line 27, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 42, in forward
    return super().forward(*inputs, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 257, in forward
    return self.forward_test(**kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 326, in forward_test
    results = self.simple_test(img_metas[0], points, img[0], **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 345, in simple_test
    pts_feat, img_feats, img_depth = self.extract_feat(points=points, img=img, img_metas=img_metas)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 187, in extract_feat
    pts_feats = self.extract_pts_feat(points, img_feats, img_metas)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 132, in extract_pts_feat
    voxels, num_points, coors = self.voxelize(pts)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/models/detectors/mvx_two_stage.py", line 225, in voxelize
    res_voxels, res_coors, res_num_points = self.pts_voxel_layer(res)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/ops/voxel/voxelize.py", line 136, in forward
    return voxelization(input, self.voxel_size, self.point_cloud_range,
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/ops/voxel/voxelize.py", line 57, in forward
    voxels = points.new_zeros(
AttributeError: 'list' object has no attribute 'new_zeros'
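
A hedged reading of this traceback: each element of the points list reaching voxelize is itself a list (the test pipeline wraps inputs one level more than mvx_two_stage.voxelize expects), so res.new_zeros fails on a list. A workaround sketch (names follow the traceback; not a maintainer fix):

# Hypothetical unwrap in extract_pts_feat before voxelization
if pts and isinstance(pts[0], list):
    pts = [p[0] for p in pts]
voxels, num_points, coors = self.voxelize(pts)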

mmdet3d version problem

Hi, thanks for your work. I found that this project is based on mmdet3d 0.17. I want to know: if I use mmdet3d > 1.0, will I need to modify the code? mmdet3d > 1.0 seems to have changed the coordinate system.

Results based on distance

I see you report performance at different distances. How did you get the results at different distances?

'LoadPointsFromFile is not in the pipeline registry'

Thank you for your excellent work. When training the multi-modal model, I met the following error:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/workspace/mmdetection3d/projects/mmdet3d_plugin/datasets/pipelines/transform_3d.py", line 608, in __init__
    self.db_sampler = build_from_cfg(db_sampler, OBJECTSAMPLERS)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "UnifiedDataBaseSampler: 'LoadPointsFromFile is not in the pipeline registry'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/workspace/mmdetection3d/projects/mmdet3d_plugin/datasets/nuscenes_dataset.py", line 134, in __init__
    super().__init__(
  File "/workspace/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 99, in __init__
    self.pipeline = Compose(pipeline)
  File "/workspace/mmdetection3d/mmdet3d/datasets/pipelines/compose.py", line 31, in __init__
    transform = build_from_cfg(transform, MMDET_PIPELINES)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: 'UnifiedObjectSample: "UnifiedDataBaseSampler: \'LoadPointsFromFile is not in the pipeline registry\'"'

I think it is caused by your mmdet3d plugin code. Do you have any ideas about that? Thank you in advance.
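
A hedged guess at the cause: UnifiedDataBaseSampler builds its inner pipeline before mmdet3d's pipeline classes (LoadPointsFromFile among them) have been registered. One thing to check, sketched with mmcv's standard custom_imports mechanism (the repo's own plugin-loading mechanism may differ):

# hypothetical config snippet; forces registration of the mmdet3d pipelines
# and the UVTR plugin before any dataset or sampler is built
custom_imports = dict(
    imports=['mmdet3d.datasets', 'projects.mmdet3d_plugin'],
    allow_failed_imports=False)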

About the version of mmdet3d

Dear Author,
I wonder whether the provided weights were trained with MMDetection3D v0.17.3, since when I evaluate them on nuScenes with v1.0.0rc5 the performance collapses to mAP: 0.1344. I guess new weights trained with mmdet3d v1.0 are required; do you have weights for mmdet3d v1.0? Thank you in advance.

Image GT-aug difference with PointAugmenting

Hi, I see the paper says: 'In this work, we follow previous studies [50, 7] and use a unified approach for GT-sampling'. But after reading the PointAugmenting code and your code, I still find some differences between your unified approach and the PointAugmenting approach: for example, when generating image GT-aug samples, you keep only one image patch per 3D box, while PointAugmenting keeps 6 per box on nuScenes. May I ask whether your approach works better than PointAugmenting's?

TypeError: __init__() got an unexpected keyword argument 'return_gt_info'

@yanwei-li @yukang2017
First of all, I appreciate your work very much. But when generating the ground-truth database, I encountered the following problem:

Create GT Database of NuScenesDataset
Traceback (most recent call last):
  File "anaconda3/envs/uvtr_mmdet3d/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'return_gt_info'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "extra_tools/create_data.py", line 85, in <module>
    max_sweeps=args.max_sweeps)
  File "extra_tools/create_data.py", line 43, in nuscenes_data_prep
    f'{out_dir}/{info_prefix}_infos_train.pkl')
  File "3D_Det/UVTR_mmdetection3d/extra_tools/data_converter/create_unified_gt_database.py", line 70, in create_groundtruth_database
    dataset = build_dataset(dataset_cfg)
  File "3D_Det/UVTR_mmdetection3d/mmdet3d/datasets/builder.py", line 41, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "anaconda3/envs/uvtr_mmdet3d/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: NuScenesDataset: __init__() got an unexpected keyword argument 'return_gt_info'

In addition, in create_unified_gt_database.py, how do I obtain the nuscenes_img_pro_infos_train.pkl file?
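
A hedged note: return_gt_info is accepted by the plugin's NuScenesSweepDataset, not by the stock NuScenesDataset that the traceback shows being built, so the plugin was probably not picked up. A sketch of the usual setup (paths follow the Preparation section above):

cd /path/to/mmdetection3d    # the root that contains the copied projects/ and extra_tools/
python3 extra_tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes_unified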

About checkpoints

Hi, I find that the fcos3d and uvtr_l_v01_h5 checkpoints referenced by uvtr_l2c_r101_h11.py are missing.
Could you fix it?

nuScenes preprocessing

Preprocessing the nuScenes dataset takes a long time on my device. Could you make the preprocessing results available for direct download?

Thank you so much!

Excuse me, create data error

environment:
mmcv-full 1.3.8
mmdet 2.14.0
mmdet3d 0.17.3
mmsegmentation 0.14.1
torch 1.7.1
cuda 11.0

UVTR/mmdetection3d/mmdet3d/core/bbox/structures/utils.py", line 139, in points_cam2img
    4, device=proj_mat.device, dtype=proj_mat.dtype)
AttributeError: 'numpy.ndarray' object has no attribute 'device'

GPU memory exceeded when training a model

Hi, thanks for the great work. I was experimenting with an adversarial training method using your multi-modality model, but the GPU memory ran out very quickly. Can you share roughly how much GPU memory is required to train the model, assuming a batch size of one?

Low results on nuScenes-mini

Hi,
I was trying to evaluate the pretrained models on the nuScenes-mini data, but I am getting very bad results, which is unexpected.
I have mmdet3d v1.0.0rc4 installed.

Do you maybe know what the problem is?

Thanks :)

here is the log

user@gpu-cluster:~/src/mmdet$ python3 extra_tools/test.py configs/uvtr/multi_modality/uvtr_m_v0075_r101_h5.py models/uvtr/uvtr_m_v0075_r101_h5.pth --eval=bbox --out mini_test.pkl
/home/user/.local/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
/home/user/.local/lib/python3.8/site-packages/mmdet/utils/setup_env.py:38: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/home/user/.local/lib/python3.8/site-packages/mmdet/utils/setup_env.py:48: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
load checkpoint from local path: models/uvtr/uvtr_m_v0075_r101_h5.pth
2023-04-25 16:42:17,693 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.0.conv2 is upgraded to version 2.
2023-04-25 16:42:17,695 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.1.conv2 is upgraded to version 2.
2023-04-25 16:42:17,697 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.2.conv2 is upgraded to version 2.
2023-04-25 16:42:17,699 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.3.conv2 is upgraded to version 2.
2023-04-25 16:42:17,701 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.4.conv2 is upgraded to version 2.
2023-04-25 16:42:17,703 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.5.conv2 is upgraded to version 2.
2023-04-25 16:42:17,704 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.6.conv2 is upgraded to version 2.
2023-04-25 16:42:17,706 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.7.conv2 is upgraded to version 2.
2023-04-25 16:42:17,708 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.8.conv2 is upgraded to version 2.
2023-04-25 16:42:17,710 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.9.conv2 is upgraded to version 2.
2023-04-25 16:42:17,712 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.10.conv2 is upgraded to version 2.
2023-04-25 16:42:17,714 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.11.conv2 is upgraded to version 2.
2023-04-25 16:42:17,716 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.12.conv2 is upgraded to version 2.
2023-04-25 16:42:17,717 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.13.conv2 is upgraded to version 2.
2023-04-25 16:42:17,719 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.14.conv2 is upgraded to version 2.
2023-04-25 16:42:17,721 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.15.conv2 is upgraded to version 2.
2023-04-25 16:42:17,723 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.16.conv2 is upgraded to version 2.
2023-04-25 16:42:17,725 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.17.conv2 is upgraded to version 2.
2023-04-25 16:42:17,728 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.18.conv2 is upgraded to version 2.
2023-04-25 16:42:17,730 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.19.conv2 is upgraded to version 2.
2023-04-25 16:42:17,732 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.20.conv2 is upgraded to version 2.
2023-04-25 16:42:17,733 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.21.conv2 is upgraded to version 2.
2023-04-25 16:42:17,735 - root - INFO - ModulatedDeformConvPack img_backbone.layer3.22.conv2 is upgraded to version 2.
2023-04-25 16:42:17,737 - root - INFO - ModulatedDeformConvPack img_backbone.layer4.0.conv2 is upgraded to version 2.
2023-04-25 16:42:17,741 - root - INFO - ModulatedDeformConvPack img_backbone.layer4.1.conv2 is upgraded to version 2.
2023-04-25 16:42:17,744 - root - INFO - ModulatedDeformConvPack img_backbone.layer4.2.conv2 is upgraded to version 2.
[                                                  ] 0/81, elapsed: 0s, ETA:/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:4227: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn(
/src/mmdet3d/core/bbox/coders/nms_free_coder.py:67: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  self.post_center_range = torch.tensor(
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 81/81, 2.6 task/s, elapsed: 32s, ETA:     0s
writing results to mini_test.pkl

Formating bboxes of pts_bbox
Start to convert detection format...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 81/81, 23.1 task/s, elapsed: 3s, ETA:     0s
Results writes to /tmp/tmps9zh56ms/results/pts_bbox/results_nusc.json
Evaluating bboxes of pts_bbox
mAP: 0.1188                                                                                                                                             
mATE: 0.8707
mASE: 0.5113
mAOE: 0.6999
mAVE: 1.0449
mAAE: 0.4827
NDS: 0.2029
Eval time: 4.8s

Per-class results:
Object Class    AP      ATE     ASE     AOE     AVE     AAE
car     0.204   0.872   0.186   0.334   0.212   0.065
truck   0.075   0.673   0.290   0.355   0.216   0.118
bus     0.017   1.250   0.216   0.406   3.521   1.000
trailer 0.000   1.000   1.000   1.000   1.000   1.000
construction_vehicle    0.000   1.000   1.000   1.000   1.000   1.000
pedestrian      0.319   0.582   0.308   0.673   0.572   0.367
motorcycle      0.212   0.869   0.317   0.922   0.070   0.011
bicycle 0.095   0.636   0.372   0.609   1.767   0.301
traffic_cone    0.265   0.825   0.425   nan     nan     nan
barrier 0.000   1.000   1.000   1.000   nan     nan
{'pts_bbox_NuScenes/car_AP_dist_0.5': 0.006, 'pts_bbox_NuScenes/car_AP_dist_1.0': 0.1039, 'pts_bbox_NuScenes/car_AP_dist_2.0': 0.2847, 'pts_bbox_NuScenes/car_AP_dist_4.0': 0.4234, 'pts_bbox_NuScenes/car_trans_err': 0.8717, 'pts_bbox_NuScenes/car_scale_err': 0.1856, 'pts_bbox_NuScenes/car_orient_err': 0.3339, 'pts_bbox_NuScenes/car_vel_err': 0.2123, 'pts_bbox_NuScenes/car_attr_err': 0.0646, 'pts_bbox_NuScenes/mATE': 0.8707, 'pts_bbox_NuScenes/mASE': 0.5113, 'pts_bbox_NuScenes/mAOE': 0.6999, 'pts_bbox_NuScenes/mAVE': 1.0449, 'pts_bbox_NuScenes/mAAE': 0.4827, 'pts_bbox_NuScenes/truck_AP_dist_0.5': 0.0037, 'pts_bbox_NuScenes/truck_AP_dist_1.0': 0.0371, 'pts_bbox_NuScenes/truck_AP_dist_2.0': 0.0762, 'pts_bbox_NuScenes/truck_AP_dist_4.0': 0.1825, 'pts_bbox_NuScenes/truck_trans_err': 0.673, 'pts_bbox_NuScenes/truck_scale_err': 0.2899, 'pts_bbox_NuScenes/truck_orient_err': 0.3548, 'pts_bbox_NuScenes/truck_vel_err': 0.2161, 'pts_bbox_NuScenes/truck_attr_err': 0.118, 'pts_bbox_NuScenes/construction_vehicle_AP_dist_0.5': 0.0, 'pts_bbox_NuScenes/construction_vehicle_AP_dist_1.0': 0.0, 'pts_bbox_NuScenes/construction_vehicle_AP_dist_2.0': 0.0, 'pts_bbox_NuScenes/construction_vehicle_AP_dist_4.0': 0.0, 'pts_bbox_NuScenes/construction_vehicle_trans_err': 1.0, 'pts_bbox_NuScenes/construction_vehicle_scale_err': 1.0, 'pts_bbox_NuScenes/construction_vehicle_orient_err': 1.0, 'pts_bbox_NuScenes/construction_vehicle_vel_err': 1.0, 'pts_bbox_NuScenes/construction_vehicle_attr_err': 1.0, 'pts_bbox_NuScenes/bus_AP_dist_0.5': 0.0, 'pts_bbox_NuScenes/bus_AP_dist_1.0': 0.0, 'pts_bbox_NuScenes/bus_AP_dist_2.0': 0.0003, 'pts_bbox_NuScenes/bus_AP_dist_4.0': 0.0676, 'pts_bbox_NuScenes/bus_trans_err': 1.2499, 'pts_bbox_NuScenes/bus_scale_err': 0.216, 'pts_bbox_NuScenes/bus_orient_err': 0.4058, 'pts_bbox_NuScenes/bus_vel_err': 3.5214, 'pts_bbox_NuScenes/bus_attr_err': 1.0, 'pts_bbox_NuScenes/trailer_AP_dist_0.5': 0.0, 'pts_bbox_NuScenes/trailer_AP_dist_1.0': 0.0, 'pts_bbox_NuScenes/trailer_AP_dist_2.0': 0.0, 'pts_bbox_NuScenes/trailer_AP_dist_4.0': 0.0, 'pts_bbox_NuScenes/trailer_trans_err': 1.0, 'pts_bbox_NuScenes/trailer_scale_err': 1.0, 'pts_bbox_NuScenes/trailer_orient_err': 1.0, 'pts_bbox_NuScenes/trailer_vel_err': 1.0, 'pts_bbox_NuScenes/trailer_attr_err': 1.0, 'pts_bbox_NuScenes/barrier_AP_dist_0.5': 0.0, 'pts_bbox_NuScenes/barrier_AP_dist_1.0': 0.0, 'pts_bbox_NuScenes/barrier_AP_dist_2.0': 0.0, 'pts_bbox_NuScenes/barrier_AP_dist_4.0': 0.0, 'pts_bbox_NuScenes/barrier_trans_err': 1.0, 'pts_bbox_NuScenes/barrier_scale_err': 1.0, 'pts_bbox_NuScenes/barrier_orient_err': 1.0, 'pts_bbox_NuScenes/barrier_vel_err': nan, 'pts_bbox_NuScenes/barrier_attr_err': nan, 'pts_bbox_NuScenes/motorcycle_AP_dist_0.5': 0.0019, 'pts_bbox_NuScenes/motorcycle_AP_dist_1.0': 0.1057, 'pts_bbox_NuScenes/motorcycle_AP_dist_2.0': 0.2772, 'pts_bbox_NuScenes/motorcycle_AP_dist_4.0': 0.4626, 'pts_bbox_NuScenes/motorcycle_trans_err': 0.8689, 'pts_bbox_NuScenes/motorcycle_scale_err': 0.3167, 'pts_bbox_NuScenes/motorcycle_orient_err': 0.9223, 'pts_bbox_NuScenes/motorcycle_vel_err': 0.0699, 'pts_bbox_NuScenes/motorcycle_attr_err': 0.0111, 'pts_bbox_NuScenes/bicycle_AP_dist_0.5': 0.0086, 'pts_bbox_NuScenes/bicycle_AP_dist_1.0': 0.0796, 'pts_bbox_NuScenes/bicycle_AP_dist_2.0': 0.1092, 'pts_bbox_NuScenes/bicycle_AP_dist_4.0': 0.1837, 'pts_bbox_NuScenes/bicycle_trans_err': 0.636, 'pts_bbox_NuScenes/bicycle_scale_err': 0.3723, 'pts_bbox_NuScenes/bicycle_orient_err': 0.6093, 'pts_bbox_NuScenes/bicycle_vel_err': 1.7671, 'pts_bbox_NuScenes/bicycle_attr_err': 
0.3007, 'pts_bbox_NuScenes/pedestrian_AP_dist_0.5': 0.1008, 'pts_bbox_NuScenes/pedestrian_AP_dist_1.0': 0.2702, 'pts_bbox_NuScenes/pedestrian_AP_dist_2.0': 0.3986, 'pts_bbox_NuScenes/pedestrian_AP_dist_4.0': 0.5082, 'pts_bbox_NuScenes/pedestrian_trans_err': 0.5822, 'pts_bbox_NuScenes/pedestrian_scale_err': 0.3078, 'pts_bbox_NuScenes/pedestrian_orient_err': 0.6728, 'pts_bbox_NuScenes/pedestrian_vel_err': 0.572, 'pts_bbox_NuScenes/pedestrian_attr_err': 0.3673, 'pts_bbox_NuScenes/traffic_cone_AP_dist_0.5': 0.0334, 'pts_bbox_NuScenes/traffic_cone_AP_dist_1.0': 0.2129, 'pts_bbox_NuScenes/traffic_cone_AP_dist_2.0': 0.3517, 'pts_bbox_NuScenes/traffic_cone_AP_dist_4.0': 0.4602, 'pts_bbox_NuScenes/traffic_cone_trans_err': 0.8253, 'pts_bbox_NuScenes/traffic_cone_scale_err': 0.4245, 'pts_bbox_NuScenes/traffic_cone_orient_err': nan, 'pts_bbox_NuScenes/traffic_cone_vel_err': nan, 'pts_bbox_NuScenes/traffic_cone_attr_err': nan, 'pts_bbox_NuScenes/NDS': 0.20291704194649415, 'pts_bbox_NuScenes/mAP': 0.1187504058313846}
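
A hedged observation, not a confirmed diagnosis: the mAP here is in the same range as the v1.0 coordinate-system mismatch described in "About the version of mmdet3d" above, and the mini split also needs its own info files. If extra_tools/create_data.py mirrors the stock mmdet3d converter, regenerating them would look roughly like:

# hypothetical invocation; assumes the converter supports the stock
# mmdet3d --version flag for the v1.0-mini split
python3 extra_tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes_unified --version v1.0-mini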

problem of registry

Hello, I ran the code according to the guidance, but it reports an error:
[error screenshot]
What should I do about it?

UnboundLocalError: local variable 'assign_result' referenced before assignment

I was reproducing the results, but I got "UnboundLocalError: local variable 'assign_result' referenced before assignment". The problem happens after 10 epochs.
Here is the error:

.../projects/mmdet3d_plugin/models/dense_heads/uvtr_head.py", line 263, in _get_target_single
    sampling_result = self.sampler.sample(assign_result, bbox_pred,
UnboundLocalError: local variable 'assign_result' referenced before assignment.
