openrobotlab / embodiedscan


[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

Home Page: https://tai-wang.github.io/embodiedscan/

License: Apache License 2.0

Shell 0.16% Python 64.90% Jupyter Notebook 34.93%

3d-vision computer-vision multi-modal-learning robotics

embodiedscan's Issues

[Docs] What type of data in Matterport3D do we need to download?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

The full Matterport3D dataset is very large. Which types of data do we need to download?

Suggest a potential alternative/fix

No response

[Bug] OSError: Can't load tokenizer for 'roberta-base'.

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

System environment:
sys.platform: linux
Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 1551893665
GPU 0,1: NVIDIA A100-SXM4-80GB
CUDA_HOME: /mnt/lustre/share/cuda-11.0
NVCC: Cuda compilation tools, release 11.0, V11.0.221
GCC: gcc (GCC) 5.4.0
PyTorch: 1.12.1
PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.3.2 (built against CUDA 11.5)
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_$
BGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unuse$
-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostic$
-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.$
, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.13.1
OpenCV: 4.9.0
MMEngine: 0.10.3

Reproduces the problem - code sample

In embodiedscan/models/detectors/sparse_featfusion_grounder.py line 100:
self.tokenizer = RobertaTokenizerFast.from_pretrained(t_type)

Reproduces the problem - command or script

sh tools/mv-grounding.sh

Reproduces the problem - error message

Traceback (most recent call last):
File "tools/train.py", line 133, in <module>
main()
File "tools/train.py", line 122, in main
runner = Runner.from_cfg(cfg)
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/runner.py", line 462, in from_cfg
runner = cls(
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/runner.py", line 429, in __init__
self.model = self.build_model(model)
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/runner.py", line 836, in build_model
model = MODELS.build(model)
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 94, in __init__
self._init_layers()
File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 100, in _init_layers
self.tokenizer = RobertaTokenizerFast.from_pretrained(t_type)
File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2070, in from_pretrained
raise EnvironmentError(
OSError: Can't load tokenizer for 'roberta-base'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'roberta-base' is the correct path to a directory containing all relevant files for a RobertaTokenizerFast tokenizer.

Additional information

No response
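
The error message above names two possibilities: a local directory called roberta-base shadowing the Hub identifier, or a path that does not contain the tokenizer files. A minimal sketch of a pre-flight check that tells these cases apart is given here. The helper names (has_tokenizer_files, diagnose, load_tokenizer) and the list of expected file names are assumptions made for illustration and are not part of EmbodiedScan or transformers; only RobertaTokenizerFast.from_pretrained is the call taken from the issue.

```python
from pathlib import Path

# Assumption: a RoBERTa tokenizer directory is usable if it holds either the
# single fast-tokenizer file or the vocab/merges pair.
FAST_FILE = "tokenizer.json"
SLOW_FILES = ("vocab.json", "merges.txt")


def has_tokenizer_files(directory):
    """Return True if `directory` looks like a usable tokenizer directory."""
    d = Path(directory)
    if not d.is_dir():
        return False
    if (d / FAST_FILE).is_file():
        return True
    return all((d / name).is_file() for name in SLOW_FILES)


def diagnose(name="roberta-base", cwd="."):
    """Classify the situation described by the OSError message.

    "local-ok"     : a directory `name` exists under `cwd` and has the files.
    "local-shadow" : a directory `name` exists but lacks the files, so it
                     hides the Hub model of the same name.
    "hub"          : no such directory; the name is resolved via the Hub
                     (this needs network access or a populated cache).
    """
    candidate = Path(cwd) / name
    if candidate.is_dir():
        return "local-ok" if has_tokenizer_files(candidate) else "local-shadow"
    return "hub"


def load_tokenizer(source):
    """Load the tokenizer; `source` is a Hub name or a local directory."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import RobertaTokenizerFast
    return RobertaTokenizerFast.from_pretrained(source)
```

If diagnose returns "hub" on a machine without internet access, one option is to download the tokenizer files on another machine, copy them into a directory, and pass that directory to load_tokenizer instead of the bare name.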

[Bug] Low reproducibility? Limit gpus?

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

GCC 9.3

C++ Version: 201402

Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications

Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)

OpenMP 201511 (a.k.a. OpenMP 4.5)

LAPACK is enabled (usually provided by MKL)

NNPACK is enabled

CPU capability usage: AVX2

CUDA Runtime 11.3

NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37

CuDNN 8.3.2 (built against CUDA 11.5)

Magma 2.5.2

Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_$
BGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unuse$
-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostic$
-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.$
, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.13.1
OpenCV: 4.9.0
MMEngine: 0.10.3

Reproduces the problem - code sample

N/A

Reproduces the problem - command or script

sh tools/mv-grounding.sh

Reproduces the problem - error message

The reproduced results are:
AP25:
| Type | Easy | Hard | View-Dep | View-Indep | Unique | Multi | Overall |
| results | 0.2093 | 0.1840 | 0.1966 | 0.2129 | 0.0000 | 0.2073 | 0.2073 |

AP50:
| Type | Easy | Hard | View-Dep | View-Indep | Unique | Multi | Overall |
| results | 0.0535 | 0.0452 | 0.0581 | 0.0501 | 0.0000 | 0.0528 | 0.0528 |

In addition, training only runs to completion with 8 GPUs.
With 2 or 4 GPUs, training sometimes fails with issue 30 and sometimes with issue 26.

Additional information

Is there a constraint on the number of GPUs, or is the failure random and training just happens to work out with 8 GPUs?
Were the visual grounding results reported in the paper obtained with the default config in tools/mv_grounding.sh, or with fcaf_coder added or other parameters modified?

[Docs] How to run the demo ?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

I used 'configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py', the 'mv-3ddet.pth' checkpoint you provided, and your simple office dataset to test demo.ipynb, but I only get one result image: the length of results in results = model.test_step(collate_data) is 1.

Suggest a potential alternative/fix

No response

[Feature] Any plan to support 3D instance segmentation

What is the feature?

Hi, authors. Thanks for your great work and for making it available. In 3D scene understanding, instance segmentation is also a common task for embodied agents, so I'd like to ask whether there are plans for EmbodiedScan to support this task.

Any other context?

No response

about training time

What is the feature?

Hi, thanks for your nice work and open code!
How many hours did training take, and on what GPU configuration?
Thanks!

Any other context?

No response

[Bug] UPD - ValueError: Plane vertices are not coplanar.

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

System environment:
sys.platform: linux
Python: 3.8.19 | packaged by conda-forge | (default, Mar 20 2024, 12:47:35) [GCC 12.3.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 545726448
GPU 0,1,2,3,4,5,6,7: NVIDIA RTX A6000
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.9.0
MMEngine: 0.10.3

Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 545726448
Distributed launcher: pytorch
Distributed training: True
GPU number: 8

Reproduces the problem - code sample

Reproduces the problem - command or script

3D mv-Det:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --work-dir=work_dirs/mv-3ddet --launcher="pytorch"

3D mv-VG:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py --work-dir=work_dirs/mv-3dground --launcher="pytorch"

Reproduces the problem - error message

04/15 13:56:37 - mmengine - INFO - Checkpoints will be saved to /data/zyp/code/EmbodiedScan/work_dirs/mv-3dground.

/data/zyp/code/EmbodiedScan/embodiedscan/models/layers/fusion_layers/point_fusion.py:48: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
pcd_rotate_mat = (torch.tensor(img_meta['pcd_rotation'],
/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmcv/cnn/bricks/transformer.py:524: UserWarning: position encoding of key ismissing in MultiheadAttention.
warnings.warn(f'position encoding of key is'
Traceback (most recent call last):
File "tools/train.py", line 133, in <module>
main()
File "tools/train.py", line 129, in main
runner.train()
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
model = self.train_loop.run() # type: ignore
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
self.run_epoch()
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
self.run_iter(idx, data_batch)
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
outputs = self.runner.model.train_step(
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 121, in train_step
losses = self._run_forward(data, mode='loss')
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 161, in _run_forward
results = self(**data, mode=mode)
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 666, in forward
return self.loss(inputs, data_samples, **kwargs)
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 507, in loss
losses = self.bbox_head.loss(**head_inputs_dict,
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 637, in loss
losses = self.loss_by_feat(*loss_inputs)
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 668, in loss_by_feat
losses_cls, losses_bbox = multi_apply(
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
return tuple(map(list, zip(*map_results)))
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 711, in loss_by_feat_single
cls_reg_targets = self.get_targets(cls_scores_list,
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 258, in get_targets
pos_inds_list, neg_inds_list) = multi_apply(self._get_targets_single,
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
return tuple(map(list, zip(*map_results)))
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 398, in _get_targets_single
assign_result = self.assigner.assign(
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/task_modules/assigners/hungarian_assigner.py", line 113, in assign
cost = match_cost(pred_instances=pred_instances_3d,
File "/data/zyp/code/EmbodiedScan/embodiedscan/models/losses/match_cost.py", line 108, in __call__
overlaps = pred_bboxes.overlaps(pred_bboxes, gt_bboxes)
File "/data/zyp/code/EmbodiedScan/embodiedscan/structures/bbox_3d/euler_box3d.py", line 134, in overlaps
_, iou3d = box3d_overlap(corners1, corners2, eps=eps)
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/pytorch3d/ops/iou_box3d.py", line 160, in box3d_overlap
_check_coplanar(boxes2, eps)
File "/data/zyp/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/pytorch3d/ops/iou_box3d.py", line 66, in _check_coplanar
raise ValueError(msg)
ValueError: Plane vertices are not coplanar

Additional information

I can run the 3D mv-det task very smoothly in both training and testing. However, when I run the 3D mv-VG task in the same environment with 8*A6000 (48G), it always encounters a ValueError: Plane vertices are not coplanar in the first epoch.

I have checked the related issues #22, #32, #30, facebookresearch/pytorch3d/issues/992, and facebookresearch/pytorch3d/issues/1771.

I have also tried the following solutions:

Modifying eps in box3d_overlap with values like 1e-2, 1e-3, 1e-4, and 1e-5.
Changing the learning rate (lr) in the training script to values like 5e-2 and 5e-4.
Training with detection checkpoint and without detection checkpoint.
Using 2xA6000, 4xA6000, and 8xA6000.
Using --resume and --resume auto

However, none of these solutions have worked so far. Could anyone please share how to solve this issue or provide a successful environment setup? Will the team look into this matter? Many thanks.
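
One way to narrow this down is to test the box corners yourself before they reach box3d_overlap. The sketch below is a plain-Python coplanarity test for the four vertices of one face. It is written for illustration and is not PyTorch3D's implementation; the function names and the default eps are assumptions. It relies on one fact that is independent of any library: a comparison with NaN is always false, so a face containing a NaN coordinate fails the test for every eps, which would be consistent with changing eps having no effect.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v):
    """Normalize v; zero-length or NaN vectors yield NaN components."""
    n = math.sqrt(_dot(v, v))
    if n > 0:  # False for 0 and for NaN
        return (v[0] / n, v[1] / n, v[2] / n)
    return (math.nan, math.nan, math.nan)


def is_coplanar(p0, p1, p2, p3, eps=1e-4):
    """True if p3 lies within eps of the plane through p0, p1, p2."""
    normal = _unit(_cross(_unit(_sub(p1, p0)), _unit(_sub(p2, p0))))
    distance = abs(_dot(_sub(p3, p0), normal))
    return distance < eps  # NaN compares False, so bad inputs are rejected


def has_non_finite(points):
    """True if any coordinate of any point is NaN or infinite."""
    return any(not math.isfinite(c) for p in points for c in p)
```

Running has_non_finite on the predicted corners of a failing batch would show whether the error comes from diverged predictions (NaN or inf) rather than from a geometric tolerance. Treating zero-length edges as failures is a choice of this sketch.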

OSError when loading roberta-base tokenizer

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.9.0
MMEngine: 0.10.3
MMDetection: 3.3.0+a69213d

Reproduces the problem - code sample

from transformers import RobertaTokenizerFast
tokenizer = RobertaTokenizerFast.from_pretrained('roberta-base')

Reproduces the problem - command or script

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7;export MASTER_ADDR=127.0.0.1;export RANK=0;export WORLD_SIZE=1;export MASTER_PORT=29320;python tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py --work-dir=work_dirs/mv-3dvg --launcher="pytorch"

Reproduces the problem - error message

Traceback (most recent call last):
File "tools/train.py", line 133, in <module>
main()
File "tools/train.py", line 122, in main
runner = Runner.from_cfg(cfg)
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 462, in from_cfg
runner = cls(
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 429, in __init__
self.model = self.build_model(model)
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 836, in build_model
model = MODELS.build(model)
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "/data/home/kye/projects/cvpr2024challenge/visual_grounding/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 145, in __init__
self._init_layers()
File "/data/home/kye/projects/cvpr2024challenge/visual_grounding/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 151, in _init_layers
self.tokenizer = RobertaTokenizerFast.from_pretrained(t_type)
File "/data/home/kye/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2032, in from_pretrained
raise EnvironmentError(
OSError: Can't load tokenizer for 'roberta-base'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'roberta-base' is the correct path to a directory containing all relevant files for a RobertaTokenizerFast tokenizer.

Additional information

No response

[Bug] 'SparseFeatureFusion3DGrounder is not in the mmengine

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

Reproduces the problem - code sample

python -m torch.distributed.launch tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof-full.py --work-dir=work_dirs/mv-3ddet

Reproduces the problem - command or script

When I started training with python -m torch.distributed.launch tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof-full.py --work-dir=work_dirs/mv-3ddet, it showed:

Reproduces the problem - error message

07/31 16:03:31 - mmengine - WARNING - Failed to import None.registry make sure the registry.py exists in None package.
07/31 16:03:31 - mmengine - WARNING - Failed to search registry with scope "embodiedscan" in the "vis_backend" registry tree. As a workaround, the current "vis_backend" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "embodiedscan" is a correct scope, or whether the registry is initialized.
07/31 16:03:33 - mmengine - WARNING - Failed to search registry with scope "embodiedscan" in the "model" registry tree. As a workaround, the current "model" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "embodiedscan" is a correct scope, or whether the registry is initialized.
Traceback (most recent call last):
File "tools/train.py", line 142, in <module>
main()
File "tools/train.py", line 131, in main
runner = Runner.from_cfg(cfg)
File "/slurm-files/lfan/env/embo/lib/python3.8/site-packages/mmengine/runner/runner.py", line 462, in from_cfg
runner = cls(
File "/slurm-files/lfan/env/embo/lib/python3.8/site-packages/mmengine/runner/runner.py", line 429, in __init__
self.model = self.build_model(model)
File "/slurm-files/lfan/env/embo/lib/python3.8/site-packages/mmengine/runner/runner.py", line 836, in build_model
model = MODELS.build(model)
File "/slurm-files/lfan/env/embo/lib/python3.8/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/slurm-files/lfan/env/embo/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/slurm-files/lfan/env/embo/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 100, in build_from_cfg
raise KeyError(
KeyError: 'SparseFeatureFusion3DGrounder is not in the mmengine::model registry. Please check whether the value of SparseFeatureFusion3DGrounder is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'

Additional information

No response
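
The KeyError above asks whether the class "was registered as expected". In a decorator-based registry, a class only becomes known once the module that defines it has been imported, because the registering decorator runs at import time. The toy registry below illustrates that mechanism. It is a simplified stand-in written for this note and not MMEngine's Registry; the names Registry, MODELS, ToyGrounder and import_grounder_module are all hypothetical.

```python
class Registry:
    """Toy name-to-class registry (not MMEngine's implementation)."""

    def __init__(self, name):
        self.name = name
        self._classes = {}

    def register(self, cls):
        # Runs when the decorated class statement is executed, which for a
        # real project means when its module is imported.
        self._classes[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)  # do not mutate the caller's config
        type_name = cfg.pop("type")
        if type_name not in self._classes:
            raise KeyError(f"{type_name} is not in the {self.name} registry")
        return self._classes[type_name](**cfg)


MODELS = Registry("toy::model")


def import_grounder_module():
    """Stands in for importing the module that defines the model class."""

    @MODELS.register
    class ToyGrounder:
        def __init__(self, hidden=8):
            self.hidden = hidden

    return ToyGrounder
```

Building dict(type="ToyGrounder") before import_grounder_module() has run raises a KeyError of the same shape as the one in this issue, and the same build succeeds afterwards. For the real error this suggests checking that the embodiedscan package is installed and importable in the environment that runs tools/train.py, which the preceding "Failed to search registry with scope embodiedscan" warnings also point to.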

[Bug] Use --amp for mixed precision accelerated training

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

python 3.9
torch 1.11.0
torchaudio 0.11.0
torchvision 0.12.0

Reproduces the problem - code sample

From EmbodiedScan/embodiedscan/structures/bbox_3d/euler_box3d.py:line 108:

 _, iou3d = box3d_overlap(corners1, corners2, eps=eps)

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py --work-dir=work_dirs/mv-grounding --launcher="pytorch" --amp

Reproduces the problem - error message

  File "/data1/luojingzhou/projects/EmbodyAI/EmbodiedScan/embodiedscan/models/losses/match_cost.py", line 108, in __call__
    overlaps = pred_bboxes.overlaps(pred_bboxes, gt_bboxes)
  File "/data1/luojingzhou/projects/EmbodyAI/EmbodiedScan/embodiedscan/structures/bbox_3d/euler_box3d.py", line 134, in overlaps
    _, iou3d = box3d_overlap(corners1, corners2, eps=eps)
  File "/data1/luojingzhou/anaconda3/envs/3DVQA/lib/python3.9/site-packages/pytorch3d/ops/iou_box3d.py", line 164, in box3d_overlap
    vol, iou = _box3d_overlap.apply(boxes1, boxes2)
  File "/data1/luojingzhou/anaconda3/envs/3DVQA/lib/python3.9/site-packages/pytorch3d/ops/iou_box3d.py", line 103, in forward
    vol, iou = _C.iou_box3d(boxes1, boxes2)
RuntimeError: expected scalar type Float but found Half

Additional information

With the following workaround the run no longer crashes, but the loss becomes NaN during training.

_, iou3d = box3d_overlap(corners1.float(), corners2.float(), eps=eps)
if corners1.dtype == torch.float16:
    iou3d = iou3d.half()

log:

2024/03/30 17:08:51 - mmengine - INFO - Epoch(train)  [1][1000/1501]  base_lr: 5.0000e-04 lr: 5.0000e-04  eta: 1 day, 16:41:29  time: 8.5409  data_time: 0.6598  memory: 16470  grad_norm: nan  loss: nan  loss_cls: nan  loss_bbox: nan  d0.loss_cls: nan  d0.loss_bbox: nan  d1.loss_cls: nan  d1.loss_bbox: nan  d2.loss_cls: nan  d2.loss_bbox: nan  d3.loss_cls: nan  d3.loss_bbox: nan  d4.loss_cls: nan  d4.loss_bbox: nan
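The workaround only changes the dtype of the IoU, whose values lie in [0, 1], so the NaN may originate elsewhere in the mixed-precision pass. As background only, and not as a diagnosis of this run, the sketch below uses Python's `struct` module to show how narrow the half-precision range is; `to_half` is a hypothetical helper introduced for illustration.

```python
import struct


def to_half(x):
    """Round-trip a Python float through IEEE 754 binary16 (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


largest = to_half(65504.0)  # the largest finite half-precision value survives
tiny = to_half(1e-8)        # below the smallest subnormal, so it becomes 0.0

try:
    struct.pack("<e", 1e5)  # outside the half-precision range
    overflowed = False
except OverflowError:
    overflowed = True       # struct raises; a tensor cast would yield inf

print(largest, tiny, overflowed)
```

Small epsilons that silently become zero, or intermediate values that overflow, are typical ways for inf and NaN to appear under fp16, so checking which tensors are cast to half is a reasonable place to start.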

[Bug] SurroundOcc bug: error in ms_deformable_col2im_cuda: invalid configuration argument

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 1.11.0+cu113
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0+cu113
OpenCV: 4.9.0
MMEngine: 0.8.0

Reproduces the problem - code sample

python -m torch.distributed.launch --nproc_per_node=1 tools/train.py configs/occupancy/mv-occ_surroundocc.py --work-dir=work_dirs/mv_occ --launcher="pytorch"

Reproduces the problem - command or script

python -m torch.distributed.launch --nproc_per_node=1 tools/train.py configs/occupancy/mv-occ_surroundocc.py --work-dir=work_dirs/mv_occ --launcher="pytorch"

Reproduces the problem - error message

error in ms_deformable_col2im_cuda: invalid configuration argument
error in ms_deformable_im2col_cuda: invalid configuration argument

Additional information

I found that num_query is 0 just before the deformable attention in SurroundOcc.

Could you release your SurroundOcc config or code?

Thanks a lot!
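On the num_query = 0 observation: a CUDA kernel launch with a grid dimension of zero is rejected with "invalid configuration argument", which would be consistent with the error above. A minimal sketch, assuming the kernel sizes its 1-D grid by ceil-division over the number of work items; the thread count, the head and point counts, and the helper names are all hypothetical.

```python
THREADS_PER_BLOCK = 1024  # assumed value, for illustration only


def num_blocks(num_items, threads=THREADS_PER_BLOCK):
    """Ceil-division commonly used to size a 1-D CUDA grid."""
    return (num_items + threads - 1) // threads


def safe_to_launch(num_query, num_heads=8, num_points=4):
    # Hypothetical guard: an empty query set yields a zero-sized grid,
    # so the attention call should be skipped instead of launched.
    return num_blocks(num_query * num_heads * num_points) > 0


empty = num_blocks(0)  # zero work items give a zero-sized grid
one = num_blocks(1)    # a single work item still needs one block

print(empty, one, safe_to_launch(0), safe_to_launch(10))
```

If this is what happens here, the underlying question is why the query set is empty for this config, not the deformable attention kernel itself.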

[Docs] Correct number of categories?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Hi! Thanks for the awesome work and the extensive annotations! :)

I have a question about the number of categories. I read in the paper that the total number of categories is over 760, but I only see 288 categories in the .pkl files for both train and validation. I assume that's why some objects have bounding boxes but are labeled as "object"? Are you planning to release annotations for more categories, or is this all?

Thanks!

-Luke

Suggest a potential alternative/fix

No response

[Bug] RuntimeError: CUDA out of memory. Tried to allocate 1048475.67 GiB

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

System environment:
sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 1591519926
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr
NVCC: Cuda compilation tools, release 10.1, V10.1.24
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.9.0
MMEngine: 0.10.3

Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 1591519926
Distributed launcher: none
Distributed training: False
GPU number: 1

Reproduces the problem - code sample

Traceback (most recent call last):
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 283, in optim_context
yield
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
losses = self._run_forward(data, mode='loss') # type: ignore
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
results = self(**data, mode=mode)
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/storage1/Fudongyi/AutoDrive/code/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_single_stage.py", line 325, in forward
return self.loss(inputs, data_samples, **kwargs)
File "/storage1/Fudongyi/AutoDrive/code/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_single_stage.py", line 242, in loss
losses = self.bbox_head.loss(x, batch_data_samples, **kwargs)
File "/storage1/Fudongyi/AutoDrive/code/EmbodiedScan/embodiedscan/models/dense_heads/fcaf3d_head.py", line 1037, in loss
outs = self(x)
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/storage1/Fudongyi/AutoDrive/code/EmbodiedScan/embodiedscan/models/dense_heads/fcaf3d_head.py", line 1010, in forward
x = self._prune(x, prune_score)
File "/storage1/Fudongyi/AutoDrive/code/EmbodiedScan/embodiedscan/models/dense_heads/fcaf3d_head.py", line 1103, in _prune
interpolated_scores = scores.features_at_coordinates(coordinates)
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/MinkowskiSparseTensor.py", line 713, in features_at_coordinates
return MinkowskiInterpolationFunction().apply(
File "/home/fudongyi/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/MinkowskiInterpolation.py", line 52, in forward
out_feat, in_map, out_map, weights = fw_fn(
RuntimeError: CUDA out of memory. Tried to allocate 1048475.67 GiB (GPU 0; 23.70 GiB total capacity; 1.47 GiB already allocated; 20.20 GiB free; 1.50 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Reproduces the problem - command or script

python tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --work-dir=work_dirs/mv-3ddet

Reproduces the problem - error message

The traceback shows that fw_fn() tried to allocate 1048475.67 GiB on the GPU. I think this is a bug. How can I solve this problem?
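For scale, the requested size is close to 1 PiB, tens of thousands of times the card's capacity, which suggests a corrupted or overflowed size and not a genuine memory need. The arithmetic below uses only the two numbers from the error message; the variable names are illustrative.

```python
GIB = 2 ** 30

requested_gib = 1048475.67  # "Tried to allocate" value from the error message
capacity_gib = 23.70        # "total capacity" value from the error message

requested_bytes = requested_gib * GIB
ratio = requested_gib / capacity_gib
gap_to_pib_gib = 2 ** 20 - requested_gib  # 2**20 GiB equals 1 PiB

print(f"{requested_bytes:.3e} bytes, about {ratio:.0f}x the GPU capacity")
print(f"{gap_to_pib_gib:.2f} GiB short of 1 PiB")
```

A request this far beyond the hardware cannot be fixed by lowering the batch size, so the inputs reaching the interpolation call are worth inspecting first.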

Additional information

No response

Why can't the 3RScan dataset be downloaded?

Model/Dataset/Scheduler description

Hi,

Thanks for your excellent work!

I'm trying to re-implement your project for further research, but I cannot access the 3RScan dataset through the official download link. Specifically, I'm using the script at "https://gist.github.com/WaldJohannaU/55f5e35992ea91157b789b15eac4d432", and no matter how we switch VPNs, we still cannot fetch any data from "http://campar.in.tum.de/public_datasets/3RScan/xxxx".

I don't know if anyone else has this problem. If so, would you mind sharing the 3RScan dataset on Google Drive or Baidu cloud disk? This may allow more people to try your work.

Best regards,

Open source status

The model implementation is available
The model weights are available.

Provide useful links for the implementation

No response

Is work_dirs/mv-3ddet/epoch_12.pth == https://download.openxlab.org.cn/models/wangtai/EmbodiedScan/weight/mv-3ddet.pth?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

I can't find information on which model work_dirs/mv-3ddet/epoch_12.pth corresponds to.

Suggest a potential alternative/fix

No response

[Bug] CUDA error: an illegal memory access was encountered

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA H100 80GB HBM3
CUDA_HOME: /fs/applications/cuda/12.1.1
NVCC: Cuda compilation tools, release 12.1, V12.1.105
GCC: gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)
PyTorch: 2.2.1+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

TorchVision: 0.17.1+cu121
OpenCV: 4.9.0
MMEngine: 0.10.3
MMDetection: 3.3.0
MMDetection3D: 1.4.0+
spconv2.0: False

Reproduces the problem - code sample

Reproduces the problem - command or script

python tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py

Reproduces the problem - error message

[rank2]:[E ProcessGroupNCCL.cpp:1182] [Rank 2] NCCL watchdog thread terminated with exception: CUDA error: an illegal memory access was encountered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:44 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x14a547a35d87 in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x14a5479e675f in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x14a547b068a8 in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10d::ProcessGroupNCCL::WorkNCCL::finishedGPUExecutionInternal() const + 0x6c (0x14a548bd93ac in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: c10d::ProcessGroupNCCL::WorkNCCL::isCompleted() + 0x58 (0x14a548bdd4c8 in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10d::ProcessGroupNCCL::workCleanupLoop() + 0x15a (0x14a548be0bfa in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #6: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x119 (0x14a548be1839 in /home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #7: <unknown function> + 0xdbbf4 (0x14a5928edbf4 in /home/user/cache/conda/envs/embodiedscan/bin/../lib/libstdc++.so.6)
frame #8: <unknown function> + 0x81ca (0x14a59edab1ca in /lib64/libpthread.so.0)
frame #9: clone + 0x43 (0x14a59e28de73 in /lib64/libc.so.6)

Additional information

I sometimes run into "CUDA error: an illegal memory access was encountered". Do you happen to know what might be the cause?
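CUDA errors are reported asynchronously, so the stack trace above may not point at the kernel that actually faulted. One common first step is to rerun with the `CUDA_LAUNCH_BLOCKING=1` environment variable so that each kernel launch is synchronous. The sketch below only prepares such a rerun; `blocking_env` is a hypothetical helper, and the config path is the first one from the command above.

```python
import os


def blocking_env(base=None):
    """Copy an environment mapping and enable synchronous CUDA launches."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


# Config path taken from the command in this report; adjust to the failing run.
cmd = [
    "python",
    "tools/train.py",
    "configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py",
]
env = blocking_env()

# To actually launch the run, uncomment the next two lines:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

Synchronous launches slow training down noticeably, so this is meant for locating the failing operation and not for regular runs.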

[Docs] Annotations for Monocular 3D Perception

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Hi, are the annotations for Monocular 3D Perception referred to in the paper publicly available? I don't see them in the data currently provided. Thanks!

-Luke

Suggest a potential alternative/fix

No response

[Docs] The ckpt and log links on the README seem to be unavailable

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

The ckpt and log links on the README seem to be unavailable; they were still available last week.

I am not sure whether the problem is on my side, but I can't open them even after switching to another network.

Suggest a potential alternative/fix

No response

[Docs] ckpt `mv-3ddet.pth` doesn't work for grounding task

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Thanks for your awesome work!

To run the 3D grounding task with train.py, I replaced work_dirs/mv-3ddet/epoch_12.pth in configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py with the downloaded 3D detection checkpoint mv-3ddet.pth mentioned in the README, but it does not load correctly.

Did I miss something? I look forward to your reply. Thanks in advance.

I paste the error log:

04/03 01:06:47 - mmengine - WARNING - The model and loaded state dict do not match exactly

size mismatch for conv1.weight: copying a param with shape torch.Size([64, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([16, 3, 7, 7]).
size mismatch for bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 1, 1]).
size mismatch for layer1.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.conv2.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for layer1.0.bn2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.bn2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.bn2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.bn2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.0.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
size mismatch for layer1.0.bn3.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.bn3.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.bn3.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.bn3.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.downsample.0.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
size mismatch for layer1.0.downsample.1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.downsample.1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.downsample.1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.0.downsample.1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
size mismatch for layer1.1.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.conv2.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for layer1.1.bn2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.bn2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.bn2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.bn2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.1.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
size mismatch for layer1.1.bn3.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.1.bn3.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.1.bn3.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.1.bn3.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
size mismatch for layer1.2.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.conv2.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for layer1.2.bn2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.bn2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.bn2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.bn2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for layer1.2.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
size mismatch for layer1.2.bn3.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.2.bn3.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.2.bn3.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer1.2.bn3.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 64, 1, 1]).
size mismatch for layer2.0.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.conv2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for layer2.0.bn2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.bn2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.bn2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.bn2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.0.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 32, 1, 1]).
size mismatch for layer2.0.bn3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.bn3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.bn3.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.bn3.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 1, 1]).
size mismatch for layer2.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.downsample.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.downsample.1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.0.downsample.1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.1.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 128, 1, 1]).
size mismatch for layer2.1.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.conv2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for layer2.1.bn2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.bn2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.bn2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.bn2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.1.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 32, 1, 1]).
size mismatch for layer2.1.bn3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.1.bn3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.1.bn3.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.1.bn3.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.2.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 128, 1, 1]).
size mismatch for layer2.2.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.conv2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for layer2.2.bn2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.bn2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.bn2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.bn2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.2.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 32, 1, 1]).
size mismatch for layer2.2.bn3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.2.bn3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.2.bn3.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.2.bn3.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.3.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 128, 1, 1]).
size mismatch for layer2.3.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.conv2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for layer2.3.bn2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.bn2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.bn2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.bn2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for layer2.3.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 32, 1, 1]).
size mismatch for layer2.3.bn3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.3.bn3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.3.bn3.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer2.3.bn3.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer3.0.conv1.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]).
size mismatch for layer3.0.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.conv2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer3.0.bn2.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.bn2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.bn2.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.bn2.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.0.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).
size mismatch for layer3.0.bn3.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.bn3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.bn3.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.bn3.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.downsample.0.weight: copying a param with shape torch.Size([1024, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]).
size mismatch for layer3.0.downsample.1.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.downsample.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.downsample.1.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.0.downsample.1.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.1.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layer3.1.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.conv2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer3.1.bn2.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.bn2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.bn2.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.bn2.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.1.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).
size mismatch for layer3.1.bn3.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.1.bn3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.1.bn3.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.1.bn3.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.2.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layer3.2.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.conv2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer3.2.bn2.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.bn2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.bn2.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.bn2.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.2.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).
size mismatch for layer3.2.bn3.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.2.bn3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.2.bn3.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.2.bn3.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.3.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layer3.3.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.conv2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer3.3.bn2.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.bn2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.bn2.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.bn2.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.3.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).
size mismatch for layer3.3.bn3.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.3.bn3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.3.bn3.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.3.bn3.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.4.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layer3.4.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.conv2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer3.4.bn2.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.bn2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.bn2.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.bn2.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.4.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).
size mismatch for layer3.4.bn3.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.4.bn3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.4.bn3.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.4.bn3.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.5.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layer3.5.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.conv2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer3.5.bn2.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.bn2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.bn2.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.bn2.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layer3.5.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).
size mismatch for layer3.5.bn3.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.5.bn3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.5.bn3.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer3.5.bn3.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer4.0.conv1.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]).
size mismatch for layer4.0.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.conv2.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for layer4.0.bn2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.bn2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.bn2.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.bn2.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.0.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 128, 1, 1]).
size mismatch for layer4.0.bn3.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.bn3.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.bn3.running_mean: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.bn3.running_var: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.downsample.0.weight: copying a param with shape torch.Size([2048, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]).
size mismatch for layer4.0.downsample.1.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.downsample.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.downsample.1.running_mean: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.0.downsample.1.running_var: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.1.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for layer4.1.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.conv2.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for layer4.1.bn2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.bn2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.bn2.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.bn2.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.1.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 128, 1, 1]).
size mismatch for layer4.1.bn3.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.1.bn3.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.1.bn3.running_mean: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.1.bn3.running_var: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.2.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for layer4.2.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.conv2.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
size mismatch for layer4.2.bn2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.bn2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.bn2.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.bn2.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for layer4.2.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 128, 1, 1]).
size mismatch for layer4.2.bn3.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.2.bn3.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.2.bn3.running_mean: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer4.2.bn3.running_var: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
unexpected key in source state_dict: fc.weight, fc.bias
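
In every mismatch listed above, the checkpoint dimension is exactly four times the model dimension (for example 128 vs 32, or 2048 vs 512). A minimal sketch of how such a log could be summarized is shown below. It works on plain name-to-shape dictionaries rather than real state dicts, and the helper names `channel_ratios` and `unexpected_keys` are hypothetical, not part of mmengine or PyTorch. Three shapes are copied from the log; the `fc.weight` shape is a placeholder, since the log only names that key.

```python
def channel_ratios(checkpoint_shapes, model_shapes):
    """Collect the ratio of leading dimensions for every mismatched parameter."""
    ratios = set()
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is None or model_shape == ckpt_shape:
            continue  # key missing in the model, or shapes already agree
        ratios.add(ckpt_shape[0] / model_shape[0])
    return ratios


def unexpected_keys(checkpoint_shapes, model_shapes):
    """Return checkpoint keys that the model does not define."""
    return sorted(set(checkpoint_shapes) - set(model_shapes))


# Shapes copied from the log, except fc.weight (placeholder shape).
checkpoint_shapes = {
    "layer2.0.bn1.running_mean": (128,),
    "layer2.0.conv3.weight": (512, 128, 1, 1),
    "layer4.0.downsample.0.weight": (2048, 1024, 1, 1),
    "fc.weight": (1000, 2048),
}
model_shapes = {
    "layer2.0.bn1.running_mean": (32,),
    "layer2.0.conv3.weight": (128, 32, 1, 1),
    "layer4.0.downsample.0.weight": (512, 256, 1, 1),
}

ratios = channel_ratios(checkpoint_shapes, model_shapes)
extra = unexpected_keys(checkpoint_shapes, model_shapes)
```

A single constant ratio suggests the backbone was built with a different base width than the checkpoint, rather than the checkpoint being corrupted. That reading is an assumption based on the log alone.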

Traceback (most recent call last):
  File "tools/train.py", line 158, in <module>
    main()
  File "tools/train.py", line 154, in main
    runner.train()
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1765, in train
    self.load_or_resume()
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1696, in load_or_resume
    self.resume(resume_from)
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 2017, in resume
    checkpoint = self.load_checkpoint(filename, map_location=device)
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 2127, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location=map_location)
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/checkpoint.py", line 548, in _load_checkpoint
    return CheckpointLoader.load_checkpoint(filename, map_location, logger)
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/checkpoint.py", line 324, in load_checkpoint
    checkpoint_loader = cls._get_checkpoint_loader(filename)
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/checkpoint.py", line 307, in _get_checkpoint_loader
    if re.match(p, path) is not None:
  File "/home/debug/anaconda3/envs/embodiedscan/lib/python3.8/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or bytes-like object

Suggest a potential alternative/fix

No response
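
The last frame of the traceback shows `re.match` receiving a value that is not a string. Below is a minimal sketch of that failure and of a defensive check. It assumes the value passed as the checkpoint path was something like `None`; the issue does not say what it actually was. `validate_checkpoint_path` and `reproduces_type_error` are hypothetical helpers for illustration, not part of MMEngine.

```python
import re


def validate_checkpoint_path(path):
    """Return `path` unchanged if it is a string, else raise a clear error.

    Hypothetical helper for illustration; not an MMEngine API.
    """
    if not isinstance(path, str):
        raise ValueError(
            f"checkpoint path must be a string, got {type(path).__name__}")
    return path


def reproduces_type_error(path):
    """Return True if re.match rejects `path` with a TypeError."""
    try:
        re.match(r"^https?://", path)
    except TypeError:
        return True
    return False
```

Calling `reproduces_type_error(None)` triggers the same kind of `TypeError` as in the log, while a plain string path does not. Checking the type of the resume/load path in the config before launching training may therefore be a useful first step.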

How can I find the corresponding image and camera parameters for a given 3D bbox?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

How can I find the corresponding image and camera parameters for a given 3D bbox? Thanks.

Suggest a potential alternative/fix

No response
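
One common approach, sketched here under stated assumptions, is to project the box center into each frame using that frame's camera parameters and keep the frames where it lands inside the image. The sketch assumes a pinhole camera, a 4x4 camera-to-world extrinsic and a 3x3 intrinsic matrix. The function name and argument layout are hypothetical, and where EmbodiedScan stores these matrices is not stated in this issue.

```python
import numpy as np


def box_center_visible(center_world, cam2world, intrinsic, image_size):
    """Check whether a 3D point projects inside an image.

    center_world: (3,) box center in world coordinates.
    cam2world:    (4, 4) camera-to-world extrinsic (assumed convention).
    intrinsic:    (3, 3) pinhole intrinsic matrix.
    image_size:   (width, height) in pixels.
    Returns (visible, (u, v)); (u, v) is None if the point is behind the camera.
    """
    world2cam = np.linalg.inv(cam2world)
    point_cam = world2cam @ np.append(center_world, 1.0)
    depth = point_cam[2]
    if depth <= 0:
        return False, None
    uvw = intrinsic @ point_cam[:3]
    u, v = uvw[0] / depth, uvw[1] / depth
    width, height = image_size
    visible = bool(0 <= u < width and 0 <= v < height)
    return visible, (float(u), float(v))
```

Looping this over all frames of a scan gives the subset of images (and their camera matrices) in which the box center is in view. Testing the eight corners instead of the center would be a stricter variant.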

[Docs] Full log on the multiview grounding full setting

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

The full log for the multiview grounding full setting seems to include only logs from resumption.
https://download.openmmlab.com/mim-example/embodiedscan/mv-grounding-full.log

Suggest a potential alternative/fix

Would it be possible to provide the full log from the start?

How to run demo.ipynb on custom data?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Thank you for your work. I have successfully run demo.ipynb. Now I would like to know how to run it on custom data for inference, for example how to obtain the "poses.txt" and "axis_align_matrix.txt" files for my own data.

Suggest a potential alternative/fix

No response

[Bug] Why does mvdet use so much RAM during evaluation? Usage shoots above 500 GB and causes the program to crash

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA H100 80GB HBM3
CUDA_HOME: /fs/applications/cuda/12.1.1
NVCC: Cuda compilation tools, release 12.1, V12.1.105
GCC: gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)
PyTorch: 2.2.1+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

TorchVision: 0.17.1+cu121
OpenCV: 4.9.0
MMEngine: 0.10.3
MMDetection: 3.3.0
MMDetection3D: 1.4.0+
spconv2.0: False

Reproduces the problem - code sample

Reproduces the problem - command or script

python tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py

Reproduces the problem - error message

The job scheduler reports TERM_MEMLIMIT: job killed after reaching LSF memory usage limit.

Additional information

Evaluation shouldn't use that much memory; 500+ GB is crazy!

[Docs] Code to process data into a format usable by the demo

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Hi, can you please post the code that processes the data into a usable format for the demo?

Suggest a potential alternative/fix

No response

[Docs] Possible bug in "nms_filter" when running the demo

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

When I ran the demo as the README describes, I hit a bug.

And my configs are as follows:
config file: cont-det3d_8xb1_embodiedscan-3d-284class-9dof.py (from your project)
weights: cont-3ddet.pth (from your download link)
data: openscan.zip (from your download link)
scene: restroom

The problem seems to be that small 3D boxes exist and are not filtered out.
Could you help me check whether this is a bug or my config is wrong? Thanks a lot, I really appreciate your work.

Suggest a potential alternative/fix

No response
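
If very small boxes are indeed the cause, one possible workaround is to drop them before the NMS step. The sketch below is not the project's `nms_filter`. `drop_tiny_boxes`, the `min_size` threshold of 0.05 and the 9-DoF layout `[x, y, z, dx, dy, dz, alpha, beta, gamma]` are assumptions made for illustration.

```python
import numpy as np


def drop_tiny_boxes(boxes, scores, min_size=0.05):
    """Remove boxes whose smallest side is below `min_size`.

    boxes:  (N, 9) array, assumed layout
            [x, y, z, dx, dy, dz, alpha, beta, gamma].
    scores: (N,) confidence scores.
    Returns the filtered (boxes, scores).
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Keep a box only if all three side lengths reach the threshold.
    keep = boxes[:, 3:6].min(axis=1) >= min_size
    return boxes[keep], scores[keep]
```

Running the demo with and without such a filter would show whether the failure is tied to degenerate boxes or to something else in the configuration.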

The testing process is frozen

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

------------------------------------------------------------                                                                                                               
System environment:                                                                                                                                                        
    sys.platform: linux                                                                                                                                                    
    Python: 3.8.16 (default, Mar  2 2023, 03:21:46) [GCC 11.2.0]                                                                                                           
    CUDA available: True                                                                                                                                                   
    MUSA available: False                                                                                                                                                  
    numpy_random_seed: 1731274824                                                                                                                                          
    GPU 0,1,2,3: Tesla V100S-PCIE-32GB                                                                                                                                     
    CUDA_HOME: /usr/local/cuda-11.0                                                                                                                                        
    NVCC: Cuda compilation tools, release 11.0, V11.0.221                                                                                                                  
    GCC: gcc (Ubuntu 7.5.0-6ubuntu2) 7.5.0                                                                                                                                 
    PyTorch: 1.11.0                                                                                                                                                        
    PyTorch compiling details: PyTorch built with:                                                                                                                         
  - GCC 7.3                                                                                                                                                                
  - C++ Version: 201402                                                                                                                                                    
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
    TorchVision: 0.12.0
    OpenCV: 4.7.0
    MMEngine: 0.10.4

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: 1731274824
    Distributed launcher: pytorch
    Distributed training: True
    GPU number: 2

Reproduces the problem - code sample

bash tools/dist_test.sh configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py ckpts/mv-grounding.pth 2

Reproduces the problem - command or script

Using the command you provide, python tools/test.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py work_dirs/mv-3ddet/epoch_12.pth --launcher="pytorch", causes an error.

So I used the following command to run the test:

bash tools/dist_test.sh configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py ckpts/mv-grounding.pth 2

Reproduces the problem - error message

It always gets stuck at the following output.

Additional information

I want to test the officially provided baseline results.

[Feature] How to access post-processed camera poses?

What is the feature?

API for accessing camera poses.
The paper introduces some post-processing steps for the camera poses. How do I get access to those camera poses?

Any other context?

No response

[Bug] ValueError: Plane vertices are not coplanar

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA H100 80GB HBM3
CUDA_HOME: /fs/applications/cuda/12.1.1
NVCC: Cuda compilation tools, release 12.1, V12.1.105
GCC: gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)
PyTorch: 2.2.1+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

TorchVision: 0.17.1+cu121
OpenCV: 4.9.0
MMEngine: 0.10.3
MMDetection: 3.3.0
MMDetection3D: 1.4.0+
spconv2.0: False

Reproduces the problem - code sample

Reproduces the problem - command or script

python tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py

Reproduces the problem - error message

Traceback (most recent call last):
  File "/home/user/EmbodiedScan/./tools/train.py", line 157, in <module>
    main()
  File "/home/user/EmbodiedScan/./tools/train.py", line 153, in main
    runner.train()
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmengine/model/wrappers/distributed.py", line 121, in train_step
    losses = self._run_forward(data, mode='loss')
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmengine/model/wrappers/distributed.py", line 161, in _run_forward
    results = self(**data, mode=mode)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1523, in forward
    else self._run_ddp_forward(*inputs, **kwargs)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1359, in _run_ddp_forward
    return self.module(*inputs, **kwargs)  # type: ignore[index]
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 729, in forward
    return self.loss(inputs, data_samples, **kwargs)
  File "/home/user/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 572, in loss
    losses = self.bbox_head.loss(**head_inputs_dict,
  File "/home/user/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 643, in loss
    losses = self.loss_by_feat(*loss_inputs)
  File "/home/user/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 674, in loss_by_feat
    losses_cls, losses_bbox = multi_apply(
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/user/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 717, in loss_by_feat_single
    cls_reg_targets = self.get_targets(cls_scores_list,
  File "/home/user/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 258, in get_targets
    pos_inds_list, neg_inds_list) = multi_apply(self._get_targets_single,
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/user/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 398, in _get_targets_single
    assign_result = self.assigner.assign(
  File "/home/user/EmbodiedScan/embodiedscan/models/task_modules/assigners/hungarian_assigner.py", line 113, in assign
    cost = match_cost(pred_instances=pred_instances_3d,
  File "/home/user/EmbodiedScan/embodiedscan/models/losses/match_cost.py", line 108, in __call__
    overlaps = pred_bboxes.overlaps(pred_bboxes, gt_bboxes)
  File "/home/user/EmbodiedScan/embodiedscan/structures/bbox_3d/euler_box3d.py", line 134, in overlaps
    _, iou3d = box3d_overlap(corners1, corners2, eps=eps)
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/pytorch3d/ops/iou_box3d.py", line 160, in box3d_overlap
    if not all((8, 3) == box.shape[1:] for box in [boxes1, boxes2]):
  File "/home/user/cache/conda/envs/embodiedscan/lib/python3.10/site-packages/pytorch3d/ops/iou_box3d.py", line 67, in _check_coplanar
ValueError: Plane vertices are not coplanar

Additional information

I keep running into ValueError: Plane vertices are not coplanar. Is this expected? How can I avoid this problem?
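
The check that fails tests whether the four corners of each box face lie in one plane. Below is a NumPy sketch of that idea for a single face. It is not pytorch3d's code; `quad_is_coplanar` and the default `eps` are assumptions for illustration. In this sketch, corners containing NaN or a face collapsed to zero area cannot pass the test, which is one possible (unconfirmed) cause of the error during training.

```python
import numpy as np


def quad_is_coplanar(v0, v1, v2, v3, eps=1e-4):
    """Return True if the fourth vertex lies on the plane of the first three."""
    v0, v1, v2, v3 = (np.asarray(v, dtype=float) for v in (v0, v1, v2, v3))
    normal = np.cross(v1 - v0, v2 - v0)
    norm = np.linalg.norm(normal)
    if not np.isfinite(norm) or norm == 0.0:
        # Degenerate face (zero area) or NaN/inf coordinates.
        return False
    # Distance of v3 from the plane through v0 with unit normal.
    distance = abs(np.dot(v3 - v0, normal / norm))
    return bool(distance < eps)
```

Running such a check on the predicted and ground-truth corners just before the IoU call could help identify which boxes trigger the exception.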

[Bug] Data loading time (data_time) differs greatly between datasets

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

/opt/conda/lib/python3.7/site-packages/MinkowskiEngine/__init__.py:42: UserWarning: The environment variable `OMP_NUM_THREADS` not set. MinkowskiEngine will automatically set `OMP_NUM_THREADS=16`. If you want to set `OMP_NUM_THREADS` manually, please export it on the command line before running a python script. e.g. `export OMP_NUM_THREADS=12; python your_program.py`. It is recommended to set it below 24.
  "It is recommended to set it below 24.",
sys.platform: linux
Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.109
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.12.1
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.3.2  (built against CUDA 11.5)
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

TorchVision: 0.13.1
OpenCV: 4.6.0
MMEngine: 0.10.3
MMDetection: 3.3.0
MMDetection3D: 1.4.0+962f093
spconv2.0: True

Reproduces the problem - code sample

Hi, I found that running a test is extremely slow. Most of the time is spent on data loading.

My environment:
4 x RTX 4090
datasets stored on a local SSD

Test dataloader config:

test_dataloader = dict(batch_size=8,
                       num_workers=8,
                       persistent_workers=True,
                       drop_last=False,
                       sampler=dict(type='DefaultSampler', shuffle=False),
                       dataset=dict(type=dataset_type,
                                    data_root=data_root,
                                    ann_file='embodiedscan_infos_test.pkl',
                                    vg_file='embodiedscan_test_vg.json',
                                    metainfo=metainfo,
                                    pipeline=test_pipeline,
                                    test_mode=True,
                                    filter_empty_gt=True,
                                    box_type_3d='Euler-Depth'))

The data_time for loading differs greatly between datasets:
data_time for ScanNet is about 3.4
data_time for 3RScan is about 0.4
data_time for Matterport3D is about 14

This makes the test take very long. Is it normal for a test to take more than 3 hours?
And how can I reduce data_time, especially when loading ScanNet and Matterport3D data?

Reproduces the problem - command or script

Start test.py:

python3 -m torch.distributed.run --nproc_per_node=4 tools/test.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py work_dirs/mv-grounding/epoch_12.pth --launcher="pytorch"
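
To see where the time goes, one generic option is to time the loading step per dataset outside the full test run. The sketch below is not tied to EmbodiedScan. `time_by_dataset` is a hypothetical helper, and `load_fn` stands in for whatever loads one sample (for example, reading the images and depth maps of a scan).

```python
import time
from collections import defaultdict


def time_by_dataset(samples, load_fn):
    """Average load time in seconds per dataset name.

    samples: iterable of (dataset_name, item) pairs.
    load_fn: callable that loads one item (stand-in for the real pipeline).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, item in samples:
        start = time.perf_counter()
        load_fn(item)
        totals[name] += time.perf_counter() - start
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}
```

Comparing the averages for ScanNet, 3RScan and Matterport3D samples would indicate whether the gap comes from file I/O or from later pipeline steps.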

Could you make it clearer which parts of each dataset we need to prepare?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

When you mention the datasets used, such as ScanNet, MP3D and so on, it would be better to provide a clear guide that tells us which parts of each dataset we need. You could give the download command, such as --task habitat, so that we know.
Or do we need to download them entirely?

Suggest a potential alternative/fix

No response

A Docker image would help us a lot

What is the feature?

It would be more convenient if you could provide a Docker image that we can pull directly from Docker Hub,
just like Habitat.

Any other context?

No response

Running demo.py fails at if 'img' in data['inputs']: with TypeError: 'NoneType' object is not subscriptable

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

python demo/demo.py

Reproduces the problem - code sample

if 'img' in data['inputs']:

TypeError: 'NoneType' object is not subscriptable

Reproduces the problem - command or script

Running demo.py fails at: if 'img' in data['inputs']:
TypeError: 'NoneType' object is not subscriptable

Reproduces the problem - error message

if 'img' in data['inputs']:
TypeError: 'NoneType' object is not subscriptable

Additional information


[Feature] Could the SAM + SUSTech annotation tool mentioned in the paper be made available on GitHub?

What is the feature?

Could the SAM + SUSTech annotation tool mentioned in the paper be made available on GitHub?

Any other context?

No response

[Docs] Request more detailed dataset structure.

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Awesome work!

Is it possible for the team to provide more detail on the structure of the external datasets (ScanNet, 3RScan and Matterport3D)? Since the full versions of the individual datasets are very large, we would like to download only the parts that EmbodiedScan actually needs.

Thanks!

Suggest a potential alternative/fix

No response

[Bug] ValueError: Plane vertices are not coplanar. (box3d_overlap)

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

GCC 9.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=
sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.3.2 (built against CUDA 11.5)
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_$
BGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unuse$
-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostic$
-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.$
, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.13.1
OpenCV: 4.9.0
MMEngine: 0.10.3

Reproduces the problem - code sample

In embodiedscan/structures/bbox_3d/euler_box3d.py#L134

_, iou3d = box3d_overlap(corners1, corners2, eps=eps)

Reproduces the problem - command or script

sh tools/mv-grounding.sh

Reproduces the problem - error message

04/01 21:32:20 - mmengine - INFO - Epoch(train)  [5][1300/2001]  base_lr: 5.0000e-04 lr: 5.0000e-04  eta: 10:10:27  time: 2.5333  data_time: 0.1731  memory: 29459  grad_norm: 35.8510  loss: 8.8971  loss_cls: 1.0159  loss_bbox: 0.4671  d0.loss_cls: 1.0471  d0.loss_bbox: 0.46$
5  d1.loss_cls: 1.0285  d1.loss_bbox: 0.4565  d2.loss_cls: 1.0182  d2.loss_bbox: 0.4596  d3.loss_cls: 1.0093  d3.loss_bbox: 0.4628  d4.loss_cls: 1.0077  d4.loss_bbox: 0.4638
Traceback (most recent call last):
  File "tools/train.py", line 133, in <module>
    main()
  File "tools/train.py", line 129, in main
    runner.train()
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 121, in train_step
    losses = self._run_forward(data, mode='loss')
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 161, in _run_forward
    results = self(**data, mode=mode)
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1008, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 969, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 666, in forward
    return self.loss(inputs, data_samples, **kwargs)
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 507, in loss
    losses = self.bbox_head.loss(**head_inputs_dict,
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 637, in loss
    losses = self.loss_by_feat(*loss_inputs)
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 668, in loss_by_feat
    losses_cls, losses_bbox = multi_apply(
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 711, in loss_by_feat_single
    cls_reg_targets = self.get_targets(cls_scores_list,
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 258, in get_targets
    pos_inds_list, neg_inds_list) = multi_apply(self._get_targets_single,
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/dense_heads/grounding_head.py", line 398, in _get_targets_single
    assign_result = self.assigner.assign(
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/task_modules/assigners/hungarian_assigner.py", line 113, in assign
    cost = match_cost(pred_instances=pred_instances_3d,
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/models/losses/match_cost.py", line 108, in __call__
    overlaps = pred_bboxes.overlaps(pred_bboxes, gt_bboxes)
  File "/mnt/petrelfs/huangchenxi/EmbodiedScan/embodiedscan/structures/bbox_3d/euler_box3d.py", line 134, in overlaps
    _, iou3d = box3d_overlap(corners1, corners2, eps=eps)
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/pytorch3d/ops/iou_box3d.py", line 159, in box3d_overlap
    _check_coplanar(boxes1, eps)
  File "/mnt/lustre/huangchenxi/anaconda3/envs/visual/lib/python3.8/site-packages/pytorch3d/ops/iou_box3d.py", line 66, in _check_coplanar
    raise ValueError(msg)
ValueError: Plane vertices are not coplanar
srun: error: SH-IDC1-10-140-24-25: task 1: Exited with exit code 1
srun: launch/slurm: _step_signal: Terminating StepId=3342784.0
slurmstepd: error: *** STEP 3342784.0 ON SH-IDC1-10-140-24-25 CANCELLED AT 2024-04-01T21:32:52 ***

Additional information

In facebookresearch/pytorch3d#992, the suggested fix is to increase EPS. Can this problem occur with your default setting of 1e-4? If so, how should I adjust the EPS value? The error appeared in my 5th epoch and does not occur deterministically; what could be the reason?
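
To make the role of EPS concrete, here is a minimal pure-Python sketch of a coplanarity test for one box face. It is an illustration only, not pytorch3d's implementation: the helper names are hypothetical, and it only assumes that the check compares the distance of a fourth vertex from the plane of the other three against a tolerance.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    # A zero-length vector (degenerate face) raises ZeroDivisionError here.
    norm = math.sqrt(_dot(v, v))
    return tuple(x / norm for x in v)


def coplanarity_residual(v0, v1, v2, v3):
    """Distance of v3 from the plane spanned by v0, v1, v2 (hypothetical helper)."""
    e0 = _normalize(_sub(v1, v0))
    e1 = _normalize(_sub(v2, v0))
    normal = _normalize(_cross(e0, e1))
    return abs(_dot(_sub(v3, v0), normal))


flat_face = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
bent_face = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 1e-3)]

eps = 1e-4
print(coplanarity_residual(*flat_face) < eps)  # planar face passes
print(coplanarity_residual(*bent_face) < eps)  # slightly bent face fails
```

With a test of this shape, a face whose fourth corner is off-plane by more than eps fails, so a larger eps tolerates more numerical noise in the predicted corners.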

[Feature] how to visualize the 3d bbox on the point cloud 3d scene

What is the feature?

Hi, I am trying to visualize the 3D bboxes provided in this dataset on the _vh_clean_2.ply mesh of ScanNet,
but the coordinate system used by your provided bboxes seems to differ from the one used in _vh_clean_2.ply.
What rotation matrix or other operation do I need to apply to align them?

Thanks!
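
For context, aligning two coordinate systems like this usually comes down to applying one 4x4 homogeneous transform to every point. The sketch below shows only that mechanical step in pure Python. The matrix `ALIGN` is a made-up example, not a value from the dataset, and which matrix to use and in which direction (mesh into the box frame, or boxes into the mesh frame) are assumptions that would need to be confirmed against the annotations.

```python
def transform_points(points, matrix):
    """Apply a 4x4 homogeneous transform to a list of (x, y, z) points."""
    out = []
    for x, y, z in points:
        h = (x, y, z, 1.0)  # homogeneous coordinates
        out.append(tuple(
            sum(m * c for m, c in zip(matrix[r], h)) for r in range(3)
        ))
    return out


# Hypothetical alignment: rotate 90 degrees about z, then translate by (1, 2, 0).
ALIGN = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

mesh_vertices = [(1.0, 0.0, 0.0)]
aligned = transform_points(mesh_vertices, ALIGN)
```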

Any other context?

No response

[Bug] Why is there no generate_image_matterport3d.py? Is the post-processed data of matterport3d released?

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

Reproduces the problem - code sample

explorer = EmbodiedScanExplorer(
	data_root={
        'scannet' : '/Data/Datasets/embodiedscan/scannet',
        '3rscan' : '/Data/Datasets/embodiedscan/3rscan',
        'matterport3d': '/Data/Datasets/embodiedscan/matterport3d',
        },
	ann_file=['/Data/Datasets/embodiedscan/embodiedscan_infos_train.pkl', '/Data/Datasets/embodiedscan/embodiedscan_infos_val.pkl'],
	verbose=True,	# print log or not
)

Reproduces the problem - command or script

I ran explorer.list_scenes() but no matterport3d scenes appeared.

Reproduces the problem - error message

There are only scannet scenes and 3rscan scenes

Additional information

No response

[Docs] Will the [email protected] >1. in 3DVG?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

  iou = top_bbox.overlaps(top_bbox, gt_bboxes)  # (num_query, 1)

  for t in self.iou_thr:
      threshold = iou > t
      found = int(threshold.any())
      if view_dep:
          gt["View-Dep@" + str(t)] += 1
          pred["View-Dep@" + str(t)] += found
      else:
          gt["View-Indep@" + str(t)] += 1
          pred["View-Indep@" + str(t)] += found
      if hard:
          gt["Hard@" + str(t)] += 1
          pred["Hard@" + str(t)] += found
      else:
          gt["Easy@" + str(t)] += 1
          pred["Easy@" + str(t)] += found
      if unique:
          gt["Unique@" + str(t)] += 1
          pred["Unique@" + str(t)] += found
      else:
          gt["Multi@" + str(t)] += 1
          pred["Multi@" + str(t)] += found

      gt["Overall@" + str(t)] += 1
      pred["Overall@" + str(t)] += found

header = ["Type"]
header.extend(object_types)
ret_dict = {}

for t in self.iou_thr:
  table_columns = [["results"]]
  for object_type in object_types:
      metric = object_type + "@" + str(t)
      value = pred[metric] / max(gt[metric], 1)
      ret_dict[metric] = value
      table_columns.append([f"{value:.4f}"])

  table_data = [header]
  table_rows = list(zip(*table_columns))
  table_data += table_rows
  table = AsciiTable(table_data)
  table.inner_footing_row_border = True
  print_log("\n" + table.table, logger=logger)

I printed the shapes of top_bbox and gt_bboxes:

 top_bbox.shape      torch.Size([10, 9])
 gt_bboxes.shape     torch.Size([1, 9])

From what I understand, each sample increases gt by one, while pred increases by found, which I believe could be as large as num_query. If so, pred could become much larger than gt, and value = pred[metric] / max(gt[metric], 1) could exceed 1.

I look forward to your reply.
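
One way to check this concern is to replay the counting logic from the excerpt on toy numbers. The sketch below is a pure-Python stand-in, with the built-in `any` replacing the tensor `.any()` call and a hypothetical function name. In the excerpt, found is computed as int(threshold.any()), so under that reading each sample adds either 0 or 1 to pred.

```python
def accuracy_at_threshold(iou_per_sample, thr):
    """Replay the gt/pred counters for a single metric (hypothetical helper)."""
    gt = 0
    pred = 0
    for ious in iou_per_sample:
        # int(any(...)) is 0 or 1, however many queries exceed the threshold.
        found = int(any(iou > thr for iou in ious))
        gt += 1
        pred += found
    return pred / max(gt, 1)


# Two samples with 10 queries each: all queries hit in the first, none in the second.
samples = [[0.9] * 10, [0.1] * 10]
ratio = accuracy_at_threshold(samples, 0.25)
```

Here ratio is 0.5 even though ten queries exceed the threshold in the first sample.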

Cannot work with 8 GPUs, but works with 4 GPUs

issue

Hi, thanks for your work!

When running the code with the following command, I hit a strange error: training fails with 8 GPUs even after reducing the batch size to 1 per GPU, but works with 4 GPUs.

python -m torch.distributed.launch --nproc_per_node=4 --master_port=25622 tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --launcher='pytorch' --work-dir='logs/det'

The machine uses NVIDIA RTX A6000 GPUs with 48 GB of memory each.

Here are the logs:

04/12 14:46:55 - mmengine - INFO - Epoch(train)  [1][  50/1946]  lr: 1.0000e-03  eta: 21:08:51  time: 3.2672  data_time: 0.3987  memory: 10248  grad_norm: 0.9408  loss: 2.3556  loss_center: 0.6253  loss_bbox: 0.7773  loss_cls: 0.9531
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119878 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119879 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119880 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119881 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119882 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119883 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 119885 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 119877) of binary: /root/anaconda3/envs/embodiedscan/bin/python
Traceback (most recent call last):
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/distributed/run.py", line 715, in run
    elastic_launch(
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/anaconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
=======================================================
tools/train.py FAILED
-------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
-------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-04-12_14:48:53
  host      : b70316d392fe
  rank      : 0 (local_rank: 0)
  exitcode  : -9 (pid: 119877)
  error_file: <N/A>
  traceback : Signal 9 (SIGKILL) received by PID 119877
=======================================================

environment

System environment:
    sys.platform: linux
    Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
    CUDA available: True
    MUSA available: False
    numpy_random_seed: 1791069987
    GPU 0,1,2,3,4,5,6,7: NVIDIA RTX A6000
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.8, V11.8.89
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    PyTorch: 1.11.0
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

    TorchVision: 0.12.0
    OpenCV: 4.9.0
    MMEngine: 0.10.3

Do you encounter similar errors, or could you give me some ideas about this one?
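
As a reading aid for the log above: a negative exit code reported for a child process means it was terminated by the signal with that number. The short sketch below decodes such a code with Python's standard `signal` module; `describe_exit` is a hypothetical helper name. One common source of SIGKILL is the kernel out-of-memory killer when host RAM (not GPU memory) runs out, which would be consistent with more ranks loading data at once, but the log does not confirm this.

```python
import signal


def describe_exit(code):
    """Turn a process exit code into a readable description (hypothetical helper)."""
    if code < 0:
        # Negative values mean "terminated by signal number -code".
        return f"killed by {signal.Signals(-code).name}"
    return f"exited with status {code}"


print(describe_exit(-9))
```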

How to obtain the latest data?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

The trainval.zip file previously provided by email does not contain the latest complex prompts or the test.pkl file. Should I fill out the application form again, or is there another way to obtain them?

Suggest a potential alternative/fix

No response

[Docs] How to test the demo?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Your demo uses config_path = '../config/detection/embodied-det3d_8xb1_embodiedscan-3d-284class-9dof-mlvl.py' and
checkpoint_path = '../ckpt/continuous.pth'; however, I could not find these two files in your repository.

I then used 'configs/detection/cont-det3d_8xb1_embodiedscan-3d-284class-9dof.py' with the 'cont-3ddet.pth' checkpoint you provide, but it crashed in the nms_filter function. Could you give more detailed guidance on how to run the demo? Thanks a lot!

Suggest a potential alternative/fix

No response

[Docs] Calculation of 3D IoU-based average precision (AP)

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Hi, I noticed that when calculating the 3D IoU-based average precision (AP) with thresholds of 0.25 and 0.5, the boxes with the top-10 scores are selected as the prediction boxes to be measured. Why is 10 chosen instead of 1?

The code from embodiedscan/eval/metrics/grounding_metric.py: Line 103-106:

box_index = target_scores.argsort(dim=-1, descending=True)[:10]
top_bbox = bboxes[box_index]

iou = top_bbox.overlaps(top_bbox, gt_bboxes)  # (num_query, 1)
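
To illustrate how the choice of k affects such a metric, here is a pure-Python sketch of a top-k hit test. It mirrors the argsort-and-slice pattern in the excerpt using plain lists; `hit_at_k` is a hypothetical name and the numbers are invented.

```python
def hit_at_k(scores, ious, thr, k):
    """Return 1 if any of the k highest-scoring boxes has IoU above thr, else 0."""
    # Indices sorted by descending score, truncated to the top k.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return int(any(ious[i] > thr for i in order))


scores = [0.9, 0.8, 0.1]  # confidence per predicted box
ious = [0.1, 0.6, 0.0]    # IoU of each box with the ground truth

top1 = hit_at_k(scores, ious, 0.25, 1)
top10 = hit_at_k(scores, ious, 0.25, 10)
```

In this toy case the best-scoring box misses while the second-best hits, so the sample counts as found for k=10 but not for k=1.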

Suggest a potential alternative/fix

No response

[Docs] Thanks for your awesome work! After filling out the questionnaire, how long does it take to receive the data download link?

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Thanks for your awesome work! After filling out the questionnaire, how long does it take to receive the data download link?

Suggest a potential alternative/fix

No response

[Feature] Support ScanRefer or nr3d/sr3d data set in Multi-View 3D Visual Grounding Task.

What is the feature?

Hi, thank you very much for your great work. Have you tried training and evaluating this benchmark on the ScanRefer or Nr3D/Sr3D datasets? If so, could you publish the code that supports training and evaluation on other 3D grounding datasets? This would make it much easier for researchers to build on and evaluate against your benchmark in the Multi-View 3D Visual Grounding task.

Any other context?

No response

[Bug] RuntimeError: CUDA out of memory. Tried to allocate 1048475.67 GiB

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.9.0
MMEngine: 0.10.3

Reproduces the problem - code sample

Reproduces the problem - command or script

python tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --work-dir=work_dirs/mv-3ddet

Reproduces the problem - error message

The error shows that the fw_fn() function tried to allocate 1048475.67 GiB on the GPU. I think this is a bug. How can I solve this problem?

Additional information

No response

[Docs] Relation between bbox_label_3d in embodiedscan_infos_val.pkl and target_id in embodiedscan_val_full_vg.json

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

I want to extract the bboxes of the objects in embodiedscan_val_full_vg.json from embodiedscan_infos_val.pkl. However, I run into an error: the extracted info is incorrect.

Here is my code:
`import json
import pickle

# Load the detection infos and the visual-grounding annotations.
with open("data/embodiedscan_infos_val.pkl", "rb") as f:
    vg_data = pickle.load(f)
metainfo = vg_data['metainfo']
bbox_info = vg_data['data_list']  # list
with open("data/embodiedscan_val_full_vg.json", "r") as f:
    annotation = json.load(f)

# Extract object mapping info
if isinstance(metainfo['categories'], list):
    classes = metainfo['categories']
    id_to_index = {i: i for i in range(len(classes))}
elif isinstance(metainfo['categories'], dict):
    classes = list(metainfo['categories'].keys())
    # Use a separate loop variable so that `classes` is not shadowed.
    id_to_index = {
        i: classes.index(name)
        for name, i in metainfo['categories'].items()
    }

# Extract object bbox info, keyed by bbox_label_3d within each scan
bbox_info_dict = {}
for info in bbox_info:
    instances = {ins['bbox_label_3d']: ins['bbox_3d'] for ins in info['instances']}
    bbox_info_dict[info['sample_idx']] = instances

# Extract annotation info
for anno in annotation:
    if 'scan' not in anno['scan_id']:
        continue
    scan_id = anno['scan_id']
    tgt_ids = anno['target_id']
    tgt_obj_name = anno['target']
    tgt_bbox = bbox_info_dict[scan_id][tgt_ids]
`
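
One thing the snippet above makes easy to miss is that a dict keyed by a class label keeps only one box per class in each scan. The sketch below shows that effect on invented data. The `bbox_id` field is an assumption for illustration: whether the annotation files carry such a per-instance id, and whether target_id refers to it, would need to be checked against the actual data.

```python
def index_instances(instances, key):
    """Build {instances[i][key]: bbox}; later duplicates overwrite earlier ones."""
    return {ins[key]: ins["bbox_3d"] for ins in instances}


# Invented example: two objects of the same class (label 5) in one scan.
instances = [
    {"bbox_id": 0, "bbox_label_3d": 5, "bbox_3d": [0.0] * 9},
    {"bbox_id": 1, "bbox_label_3d": 5, "bbox_3d": [1.0] * 9},
]

by_label = index_instances(instances, "bbox_label_3d")  # one entry survives
by_id = index_instances(instances, "bbox_id")           # both entries kept
```

Keying by a per-instance identifier keeps every object addressable, whereas keying by class label silently drops all but the last instance of each class.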

Suggest a potential alternative/fix

No response

[Docs] The training command for MVDET

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

The README.md provides this command for training:
python tools/train.py configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --work-dir=work_dirs/mv-3ddet --launcher="pytorch"

But this command fails, because --launcher="pytorch" is meant for distributed training. The error is below:

"""
anaconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/dist/utils.py", line 102, in _init_dist_pytorch
rank = int(os.environ['RANK'])
File "/anaconda3/envs/embodiedscan/lib/python3.8/os.py", line 675, in __getitem__
raise KeyError(key) from None
KeyError: 'RANK'
"""

It seems that the distributed training environment variables are not set when the script is run this way. Could you please share the exact command for running it? Thank you very much.
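
The traceback shows the failure comes from reading os.environ['RANK'], a variable that distributed launchers normally export before starting each worker. As a hedged illustration, the helper below picks a launcher value based on whether such variables are present. `pick_launcher` is a hypothetical name, and checking only RANK and WORLD_SIZE is an assumption about what the launcher sets.

```python
import os


def pick_launcher(environ=None):
    """Return 'pytorch' when distributed env vars are present, else 'none'."""
    env = os.environ if environ is None else environ
    required = ("RANK", "WORLD_SIZE")  # assumed to be exported by the launcher
    return "pytorch" if all(name in env for name in required) else "none"


# A plain `python tools/train.py ...` run has no RANK, so this yields 'none'.
print(pick_launcher({}))
print(pick_launcher({"RANK": "0", "WORLD_SIZE": "1"}))
```

Under this reading, the README command would need either a launcher wrapper that sets these variables or a non-distributed launcher setting for a single-process run.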

Suggest a potential alternative/fix

No response

[Docs] Wondering about which part of ScanNet v2 is needed

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

I am preparing the data for EmbodiedScan. As ScanNet v2 takes about 1.5 TB of storage, I am wondering whether all of the files are needed for data preparation. Some other works note that only a subset of ScanNet is needed.

Suggest a potential alternative/fix

like Vote2Cap-DETR: https://github.com/ch3cook-fdu/Vote2Cap-DETR/tree/master/data/scannet
