sense-x / co-detr

[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training

License: MIT License

Python 99.91% Dockerfile 0.05% Shell 0.04%

co-detr's Issues

Why the loss weight is set as 10.0*num_dec_layer*lambda_2
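
For orientation, the weight in the title is a product of three factors. Below is a minimal sketch of the arithmetic, assuming `num_dec_layer = 6` and `lambda_2 = 2.0` as in the config quoted later on this page. The helper name `aux_loss_weight` is hypothetical, and the source does not state why the base value is 10.0.

```python
# Values taken from the LVIS config quoted later on this page.
num_dec_layer = 6
lambda_2 = 2.0


def aux_loss_weight(base, num_layers=num_dec_layer, lam=lambda_2):
    """Hypothetical helper: effective weight = base * num_dec_layer * lambda_2."""
    return base * num_layers * lam


# Base factors as they appear in the quoted config.
weights = {
    "roi_head.loss_cls": aux_loss_weight(1.0),
    "roi_head.loss_bbox": aux_loss_weight(10.0),
    "atss_head.loss_bbox": aux_loss_weight(2.0),
}
for name, value in weights.items():
    print(f"{name}: {value}")
```

The same scaling factor, `num_dec_layer * lambda_2`, multiplies every auxiliary-head loss in that config; only the base factor differs.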

build_dataset error

Hello, I encountered this error during training. It indicates that the dataset class does not accept this keyword argument:

Traceback (most recent call last):
  File "tools/train.py", line 245, in <module>
    main()
  File "tools/train.py", line 219, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/home/xxxxxx/Co-DETR/mmdet/datasets/builder.py", line 78, in build_dataset
    dataset = MultiImageMixDataset(**cp_cfg)
TypeError: __init__() got an unexpected keyword argument 'ann_file'
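
One possible cause (an assumption, since the issue does not include the config) is config inheritance: when a child config changes `data.train` to `MultiImageMixDataset`, keys such as `ann_file` from the base config's `data.train` are merged in unless the child sets `_delete_=True`. The sketch below uses a simplified stand-in for that merge, not MMCV's actual implementation; `merge_cfg` and the file names are hypothetical.

```python
import copy


def merge_cfg(base, child):
    """Simplified stand-in for config inheritance.

    Child keys override base keys, nested dicts are merged recursively,
    and `_delete_=True` in a child dict discards the base dict entirely.
    """
    child = copy.deepcopy(child)
    if child.pop("_delete_", False):
        return child
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_cfg(merged[key], value)
        else:
            merged[key] = value
    return merged


base = {"train": {"type": "CocoDataset", "ann_file": "train.json", "pipeline": []}}
wrapped = {"type": "CocoDataset", "ann_file": "train.json"}

# Without _delete_, the base 'ann_file' leaks into the wrapper's kwargs.
leaky = merge_cfg(base, {"train": {"type": "MultiImageMixDataset", "dataset": wrapped}})
# With _delete_, only the keys written in the child survive.
clean = merge_cfg(
    base,
    {"train": {"_delete_": True, "type": "MultiImageMixDataset", "dataset": wrapped}},
)

print(sorted(leaky["train"]))
print(sorted(clean["train"]))
```

If the merged `data.train` still carries `ann_file` at the top level, the wrapper dataset receives it as an unexpected keyword argument, which matches the error above.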

Did you use Faster-rcnn head as aux head in this repo?

The best auxiliary head setting in the paper is ATSS + Faster R-CNN, with which Co-Deformable-DETR achieves 49.5 AP on COCO with the 1x schedule. The config in this repo seems to contain only the ATSS head and still achieves 49.5 AP (in the paper, the ATSS-only setting reaches 48.7 AP). Which setting did you use for the final version (i.e. the setting used to train Co-DINO to 66.0 AP)?

ViT-L (66.0 AP) pre-training & config file

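Does the backbone of the ViT-L (66.0 AP) model use the pre-trained model "eva02_L_pt_m38m_p14to16 | 304M | Merged-38M | 56" listed for the EVA-02 detection task? If not, could you specify which model was used? Also, how are the individual parts of the Co-DINO-Deformable-DETR mentioned in the paper put together? Thanks!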

discriminability score

Thanks for your great work and insightful analysis! I'm unsure which feature map you used to compute the discriminability score, since there are at least four feature pyramid levels. Looking forward to your reply.

Custom dataset training problem

problem: AssertionError: The num_classes (4) in CoDeformDETRHead of MMDataParallel does not matches the length of CLASSES 80) in CocoDataset
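
The message compares the head's `num_classes` with the length of the dataset's `CLASSES`. In MMDetection 2.x a dataset keeps its default `CLASSES` unless `classes=` is passed in the data config, so a length of 80 suggests the COCO defaults are still in use (an assumption about this particular setup). Below is a small sketch of a consistency check over plain config dicts; the helper names and the class names are hypothetical.

```python
def find_num_classes(cfg, path="model"):
    """Yield (path, value) for every `num_classes` entry in a nested config."""
    if isinstance(cfg, dict):
        for key, value in cfg.items():
            sub = f"{path}.{key}"
            if key == "num_classes":
                yield sub, value
            else:
                yield from find_num_classes(value, sub)
    elif isinstance(cfg, (list, tuple)):
        for i, item in enumerate(cfg):
            yield from find_num_classes(item, f"{path}[{i}]")


def mismatched_heads(model_cfg, classes):
    """Return config paths whose num_classes differs from len(classes)."""
    return [p for p, n in find_num_classes(model_cfg) if n != len(classes)]


def splits_missing_classes(data_cfg, classes):
    """Return the data splits that do not set `classes` to the expected tuple."""
    return [s for s in ("train", "val", "test")
            if data_cfg.get(s, {}).get("classes") != classes]


classes = ("cat", "dog", "bird", "fish")  # hypothetical 4-class dataset
model = dict(
    query_head=dict(num_classes=4),
    roi_head=[dict(bbox_head=dict(num_classes=80))],  # forgotten head
    bbox_head=[dict(num_classes=4)],
)
data = dict(train=dict(classes=classes), val=dict(), test=dict(classes=classes))

print(mismatched_heads(model, classes))
print(splits_missing_classes(data, classes))
```

Co-DETR configs carry `num_classes` in several heads (`query_head`, `roi_head`, `bbox_head`), so a check like this covers all of them at once. Datasets wrapped in another dataset type would need the inner `dataset` dict checked instead.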

Object detection on small objects

I would like to compare co_deformable_detr_swin_large_1x_coco and co_deformable_detr_swin_base_3x_coco: which one performs better on small objects?
Thanks!

pytorch2onnx with_mask

Traceback (most recent call last):
  File "tools/deployment/pytorch2onnx.py", line 337, in <module>
    skip_postprocess=args.skip_postprocess)
  File "tools/deployment/pytorch2onnx.py", line 72, in pytorch2onnx
    if model.with_mask:
  File "/root/soft_file/anaconda3/envs/co_detr0904/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1186, in __getattr__
    type(self).__name__, name))
AttributeError: 'CoDETR' object has no attribute 'with_mask'

ModuleNotFoundError: No module named 'mmcv._ext'

# test.py
from mmdet.apis import init_detector, inference_detector, show_result_pyplot
import mmcv

config_file = '../configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
# download the checkpoint from model zoo and put it in `checkpoints/`
# url: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
checkpoint_file = '../checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a single image
img = 'demo.jpg'
result = inference_detector(model, img)

Hello, I attempted to run your model using the code from the demo/inference_demo.ipynb file (test.py above). However, upon execution, I encountered the following issue:

Traceback (most recent call last):
  File "test.py", line 1, in <module>
    from mmdet.apis import init_detector, inference_detector, show_result_pyplot
  File "/Users/thaophan/Work/RnD/Co-DETR/mmdet/apis/__init__.py", line 2, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/Users/thaophan/Work/RnD/Co-DETR/mmdet/apis/inference.py", line 8, in <module>
    from mmcv.ops import RoIPool
  File "/Users/thaophan/Work/RnD/Co-DETR/myenv/lib/python3.7/site-packages/mmcv/ops/__init__.py", line 2, in <module>
    from .active_rotated_filter import active_rotated_filter
  File "/Users/thaophan/Work/RnD/Co-DETR/myenv/lib/python3.7/site-packages/mmcv/ops/active_rotated_filter.py", line 10, in <module>
    ['active_rotated_filter_forward', 'active_rotated_filter_backward'])
  File "/Users/thaophan/Work/RnD/Co-DETR/myenv/lib/python3.7/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "/Users/thaophan/.pyenv/versions/3.7.11/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'mmcv._ext'

python3 -c "import torch; print(torch.__version__)"
1.11.0
================================
pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.11.0/index.html
pip show mmcv-full

Name: mmcv-full
Version: 1.5.0
Summary: OpenMMLab Computer Vision Foundation
Home-page: https://github.com/open-mmlab/mmcv
Author: MMCV Contributors
Author-email: [email protected]
License:
Location: /Users/thaophan/Work/RnD/Co-DETR/myenv/lib/python3.7/site-packages
Requires: addict, numpy, opencv-python, packaging, Pillow, pyyaml, yapf
Required-by:
================================
macOS Monterey
Version 12.3.1
MacBook Pro (13-inch, 2019, Two Thunderbolt 3 ports)
Processor 1,4 GHz Quad-Core Intel Core i5
Memory 16 GB 2133 MHz LPDDR3
Graphics Intel Iris Plus Graphics 645 1536 MB

Can you help me fix it?
Thanks
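
The traceback shows `mmcv/utils/ext_loader.py` importing `'mmcv.' + name`, i.e. the compiled extension module `mmcv._ext`. Below is a small diagnostic sketch, using only the standard library, that reports which of the relevant modules can be located in the active environment. The helper `module_available` is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located in the current environment.

    For dotted names, the parent package is imported first; a missing
    parent raises ModuleNotFoundError, which is treated as "not available".
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


for mod in ("mmcv", "mmcv._ext", "mmdet"):
    status = "found" if module_available(mod) else "missing"
    print(f"{mod}: {status}")
```

If `mmcv` is found but `mmcv._ext` is missing, the installed package may lack the compiled ops for this platform, which would be consistent with the error above; that reading is an assumption, not something the issue confirms.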

The weights of the pretrained model swin_large_patch4_window12_384_22k.pth do not exist

Hello,

I wanted to test the latest Objects365-pretrained Co-DETR model, but the weights of the pretrained model swin_large_patch4_window12_384_22k.pth do not exist at the location given in the config file:

How to finetune on LVIS?

Hi, thanks for your brilliant work!

It is amazing that you got 71.9 box AP and 59.7 mask AP on LVIS minival. But what is the difference between the 71.9 box AP model and the "Co-DINO Swin-L 36 LSJ LVIS 56.9" config? 56.9 is much lower than 71.9.

Can you share some details on how to fine-tune on LVIS?

Performance of Swin-L

Hi @TempleX98, thanks for your awesome work.
Co-DINO with ViT-L achieves 66.0 AP. I would like to know the performance of Co-DINO with Swin-L. Thank you.

ViT-L checkpoint

Hi @TempleX98 , thanks for your awesome work.

[07/03/2023] Co-DETR with ViT-L (304M parameters) sets a new record of 65.6 AP on COCO test-dev, surpassing the previous best model InternImage-G (~3000M parameters).

Will you release the checkpoint of this ViT-L model? If yes, may I know the schedule? Thanks.

About the number of positive queries in the auxiliary heads

Hi, thanks for your great work! I'm confused about the number of positive queries in the auxiliary heads. In your paper, you propose to regard all the proposals assigned by an auxiliary head with one-to-many assignment as positive queries, and there are about 18.7 positive samples for the Faster R-CNN style assignment and 8.8 positive samples for the ATSS style assignment. How did you deal with these varying numbers of positive queries when feeding them into the decoder? Did you just pad the queries to the maximum number of positive queries within a batch? I hope you can help me with this small question. Thanks in advance!
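
The padding scheme the question describes can be sketched as follows. This only illustrates the idea being asked about; the source does not confirm that the repository handles positive queries this way, and `pad_positive_queries` is a hypothetical helper written in plain Python.

```python
def pad_positive_queries(batch, pad_value=0.0):
    """Pad per-image lists of positive queries to the batch maximum.

    batch: one entry per image, each a list of query vectors (lists of floats).
    Returns (padded, mask), where mask[i][j] is True for real queries and
    False for padding, so padded slots can be ignored downstream.
    """
    max_len = max((len(queries) for queries in batch), default=0)
    dim = next((len(queries[0]) for queries in batch if queries), 0)
    padded, mask = [], []
    for queries in batch:
        n_pad = max_len - len(queries)
        padded.append([list(v) for v in queries]
                      + [[pad_value] * dim for _ in range(n_pad)])
        mask.append([True] * len(queries) + [False] * n_pad)
    return padded, mask


# Two images with 3 and 1 positive queries of dimension 2.
batch = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0]],
]
padded, mask = pad_positive_queries(batch)
print(padded)
print(mask)
```

With a mask like this, losses and attention can skip the padded slots, so the number of positives may differ per image while tensor shapes stay fixed within a batch.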

Error when training the co_dino_5scale_r50_1x_coco model on my own dataset

Hello! With my own dataset, training co_deformable_detr_r50_1x_coco raised no error, but training co_dino_5scale_r50_1x_coco produced the following error:

`During handling of the above exception, another exception occurred:

Traceback (most recent call last):
return DETECTORS.build( File "tools/train.py", line 245, in

File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 215, in build
main()
File "tools/train.py", line 213, in main
return self.build_func(*args, **kwargs, registry=self)
File "/opt/conda/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
main()
File "tools/train.py", line 213, in main
main()
File "tools/train.py", line 213, in main
main()return build_from_cfg(cfg, registry, default_args)model = build_detector(

File "tools/train.py", line 213, in main
File "/opt/conda/lib/python3.8/site-packages/mmdet/models/builder.py", line 58, in build_detector
File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
model = build_detector(
File "/opt/conda/lib/python3.8/site-packages/mmdet/models/builder.py", line 58, in build_detector
return DETECTORS.build(raise type(e)(f'{obj_cls.name}: {e}')

File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 215, in build
AssertionError: CoDETR: CoDINOHead: The classification weight for loss and matcher should beexactly the same.
`

The key line is "AssertionError: CoDETR: CoDINOHead: The classification weight for loss and matcher should beexactly the same." I don't know what causes it.

I hope the author and others can help me find where the problem is. Thanks!
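
The assertion message indicates that the head compares the classification loss weight with the matcher's classification cost weight. Below is a sketch of such a check over plain dicts, assuming the key names used in the config quoted later on this page (`loss_cls.loss_weight` and `assigner.cls_cost.weight`). Whether the installed head reads exactly these keys is an assumption, and `cls_weights_match` is a hypothetical helper.

```python
def cls_weights_match(head_cfg, assigner_train_cfg):
    """Return (ok, loss_weight, cost_weight) for a head and its matcher config."""
    loss_weight = head_cfg["loss_cls"]["loss_weight"]
    cost_weight = assigner_train_cfg["assigner"]["cls_cost"]["weight"]
    return loss_weight == cost_weight, loss_weight, cost_weight


# Values as they appear in the Co-DINO config quoted later on this page.
query_head = dict(
    loss_cls=dict(type="QualityFocalLoss", use_sigmoid=True, beta=2.0,
                  loss_weight=1.0))
train_cfg0 = dict(
    assigner=dict(type="HungarianAssigner",
                  cls_cost=dict(type="FocalLossCost", weight=2.0)))

ok, loss_w, cost_w = cls_weights_match(query_head, train_cfg0)
print(f"match={ok} loss_weight={loss_w} cls_cost.weight={cost_w}")
```

In the quoted config the two values differ (1.0 and 2.0), so a head that enforces equality would reject it. The traceback above loads mmdet from site-packages, which may not be the copy bundled with the repository; comparing the two installations is one thing to try.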

Code without the mmdetection dependency

Hi,
Is this code available without the mmdetection dependency, for example in a format like https://github.com/fundamentalvision/Deformable-DETR?

torch.distributed.elastic.multiprocessing.api:failed

When I ran it, I got this error.
I only changed the config: bash tools/dist_train.sh projects/configs/co_dino/co_dino_5scale_swin_large_16e_o365tococo.py 4 run

Training on custom dataset

Thank you for your work. I'm training on my custom dataset, but I got this error:

KeyError: 'CoDETR is not in the models registry'

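High false-positive rate when training Co-DETR on a custom dataset

Hello, I trained both co_deformable_detr and co_dino on my own dataset, and in both cases the false-positive rate is very high: the number of detections is roughly 50 times the number of ground-truth boxes. Is this a problem with my configuration parameters? Can the parameters be adjusted?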

Training co_deformable_detr_r50_1x_coco gives a lower AP

Hi, thank you for your great work!
I trained co_deformable_detr_r50_1x_coco on COCO 2017 with "tools/dist_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py 1 ./weight/", without modifying any parameters and without using a pretrained model, and the AP is lower than reported. Should I use the pretrained model for training, or does something else need to be modified? I am looking forward to your reply.
COCO evaluation results:
{ Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.461
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.633
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.502
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.280
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.499
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.604
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.649
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.649
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.649
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.433
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.697
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.826 }
Part of the training log (the loss is larger than yours):
{"mode": "train", "epoch": 1, "iter": 50, "lr": 0.0002, "memory": 10678, "data_time": 0.07217, "enc_loss_cls": 2.1538, "enc_loss_bbox": 1.30198, "enc_loss_iou": 1.16616, "loss_cls": 2.14317, "loss_bbox": 2.60367, "loss_iou": 2.14343, "d0.loss_cls": 1.96131, "d0.loss_bbox": 2.63388, "d0.loss_iou": 2.16948, "d1.loss_cls": 2.15417, "d1.loss_bbox": 2.60833, "d1.loss_iou": 2.15319, "d2.loss_cls": 2.23642, "d2.loss_bbox": 2.60134, "d2.loss_iou": 2.14901, "d3.loss_cls": 2.1437, "d3.loss_bbox": 2.60013, "d3.loss_iou": 2.14763, "d4.loss_cls": 2.11218, "d4.loss_bbox": 2.60227, "d4.loss_iou": 2.14508, "loss_rpn_cls": 3.50659, "loss_rpn_bbox": 0.47681, "loss_cls0": 5.12987, "acc0": 94.27344, "loss_bbox0": 1.34195, "loss_cls1": 14.23468, "loss_bbox1": 13.64706, "loss_centerness1": 7.91569, "loss_cls_aux0": 1.76192, "loss_bbox_aux0": 1.0423, "loss_iou_aux0": 0.56317, "d0.loss_cls_aux0": 1.65817, "d0.loss_bbox_aux0": 1.03151, "d0.loss_iou_aux0": 0.51077, "d1.loss_cls_aux0": 1.65367, "d1.loss_bbox_aux0": 1.03382, "d1.loss_iou_aux0": 0.52373, "d2.loss_cls_aux0": 1.72817, "d2.loss_bbox_aux0": 1.03599, "d2.loss_iou_aux0": 0.53485, "d3.loss_cls_aux0": 1.75864, "d3.loss_bbox_aux0": 1.0379, "d3.loss_iou_aux0": 0.5451, "d4.loss_cls_aux0": 1.75284, "d4.loss_bbox_aux0": 1.03995, "d4.loss_iou_aux0": 0.55428, "loss_cls_aux1": 1.79561, "loss_bbox_aux1": 1.18594, "loss_iou_aux1": 1.14411, "d0.loss_cls_aux1": 1.69971, "d0.loss_bbox_aux1": 1.18362, "d0.loss_iou_aux1": 1.14339, "d1.loss_cls_aux1": 1.71459, "d1.loss_bbox_aux1": 1.18391, "d1.loss_iou_aux1": 1.14331, "d2.loss_cls_aux1": 1.76991, "d2.loss_bbox_aux1": 1.18443, "d2.loss_iou_aux1": 1.14341, "d3.loss_cls_aux1": 1.80116, "d3.loss_bbox_aux1": 1.18488, "d3.loss_iou_aux1": 1.14361, "d4.loss_cls_aux1": 1.78536, "d4.loss_bbox_aux1": 1.18529, "d4.loss_iou_aux1": 1.14385, "loss": 136.48589, "grad_norm": 158.88891, "time": 0.77338}
{"mode": "train", "epoch": 1, "iter": 100, "lr": 0.0002, "memory": 10678, "data_time": 0.00637, "enc_loss_cls": 1.76474, "enc_loss_bbox": 1.1683, "enc_loss_iou": 1.16472, "loss_cls": 2.05801, "loss_bbox": 2.06883, "loss_iou": 1.80247, "d0.loss_cls": 1.88228, "d0.loss_bbox": 2.09778, "d0.loss_iou": 1.81527, "d1.loss_cls": 1.93532, "d1.loss_bbox": 2.0773, "d1.loss_iou": 1.79806, "d2.loss_cls": 1.96658, "d2.loss_bbox": 2.06948, "d2.loss_iou": 1.80279, "d3.loss_cls": 2.03127, "d3.loss_bbox": 2.06957, "d3.loss_iou": 1.80236, "d4.loss_cls": 2.05364, "d4.loss_bbox": 2.06872, "d4.loss_iou": 1.80287, "loss_rpn_cls": 2.22799, "loss_rpn_bbox": 0.37238, "loss_cls0": 3.29344, "acc0": 96.75781, "loss_bbox0": 1.24619, "loss_cls1": 13.36668, "loss_bbox1": 13.64313, "loss_centerness1": 7.90339, "loss_cls_aux0": 1.54687, "loss_bbox_aux0": 1.09403, "loss_iou_aux0": 0.55488, "d0.loss_cls_aux0": 1.37439, "d0.loss_bbox_aux0": 1.09063, "d0.loss_iou_aux0": 0.53468, "d1.loss_cls_aux0": 1.39558, "d1.loss_bbox_aux0": 1.0916, "d1.loss_iou_aux0": 0.54084, "d2.loss_cls_aux0": 1.44337, "d2.loss_bbox_aux0": 1.09214, "d2.loss_iou_aux0": 0.54455, "d3.loss_cls_aux0": 1.49814, "d3.loss_bbox_aux0": 1.09262, "d3.loss_iou_aux0": 0.54755, "d4.loss_cls_aux0": 1.52944, "d4.loss_bbox_aux0": 1.09319, "d4.loss_iou_aux0": 0.55106, "loss_cls_aux1": 1.59381, "loss_bbox_aux1": 1.14691, "loss_iou_aux1": 1.17268, "d0.loss_cls_aux1": 1.40565, "d0.loss_bbox_aux1": 1.14653, "d0.loss_iou_aux1": 1.17178, "d1.loss_cls_aux1": 1.44506, "d1.loss_bbox_aux1": 1.14654, "d1.loss_iou_aux1": 1.17201, "d2.loss_cls_aux1": 1.4978, "d2.loss_bbox_aux1": 1.14658, "d2.loss_iou_aux1": 1.17212, "d3.loss_cls_aux1": 1.55235, "d3.loss_bbox_aux1": 1.14668, "d3.loss_iou_aux1": 1.17229, "d4.loss_cls_aux1": 1.57807, "d4.loss_bbox_aux1": 1.14692, "d4.loss_iou_aux1": 1.1726, "loss": 122.95549, "grad_norm": 107.07448, "time": 0.70219}

Is there any other code released that uses other methods as the auxiliary head (besides ATSS)?

Is there any other code released that uses other methods as the auxiliary head (besides ATSS)?
Thanks for your reply!

Can't find Co-DINO Swin-L LVIS config

Hi, I noticed that you shared the following model,

Co-DINO Swin-L 36 LSJ LVIS 56.9 config

But I want to use this model to detect classes in LVIS, is there any way that I could use it?

Thanks~~

Training with a custom dataset: I have changed the classes and num_classes, but got this error

AssertionError: The num_classes (10) in CoDeformDETRHead of MMDistributedDataParallel does not matches the length of CLASSES 1) in CocoDataset

code release

Please release the code!
This is excellent work!

YOLO head as an auxiliary head

Have you tried using a YOLO head as the auxiliary head?

Since mmdet is already included in this project, do I need to run pip install mmdet==2.25.3?

If I do not run pip install mmdet==2.25.3, the project fails with: No module named mmdet. When I do run pip install mmdet==2.25.3 and then try to train on my dataset, I get: KeyError: 'CoDETR is not in the models registry'
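
A registry lookup of this kind can only succeed after the module that defines and registers the class has been imported. The sketch below uses a simplified stand-in registry, not MMCV's actual `Registry`, to show that behaviour; all names in it are hypothetical.

```python
class Registry:
    """Simplified stand-in for a name -> class registry."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        # Runs at class-definition time, i.e. when the defining module is imported.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)
        type_name = cfg.pop("type")
        if type_name not in self._modules:
            raise KeyError(f"{type_name} is not in the {self.name} registry")
        return self._modules[type_name](**cfg)


MODELS = Registry("models")

try:
    MODELS.build(dict(type="CoDETR"))
except KeyError as err:
    print("before the class is defined:", err)


@MODELS.register_module
class CoDETR:
    pass


print("after the class is defined:", type(MODELS.build(dict(type="CoDETR"))).__name__)
```

If a pip-installed mmdet takes precedence over the copy bundled with the repository, or the project's model modules are never imported, the detector class may never be registered; that would be consistent with the KeyError above, though the issue does not confirm it.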

Unable to reproduce Co-DINO results on LVIS benchmark - Config guidance needed

Hello, thanks for your awesome work!

I am trying to reproduce the results of Co-DINO on the LVIS benchmark. Unfortunately, the config file is not released, and I am unable to achieve the claimed AP of 56.9. With my config, I can only reach an AP of 55.9.

I have added the following 3 files to projects/configs/co_dino/, and use the projects/configs/co_dino/co_dino_5scale_lsj_swin_large_1x_lvis.py as the config file.

Here are the config files I added:

projects/configs/co_dino/co_dino_5scale_r50_1x_lvis.py

_base_ = [
    '../_base_/datasets/lvis_v1_instance.py',
    '../_base_/default_runtime.py'
]
# model settings
num_dec_layer = 6
lambda_2 = 2.0

model = dict(
    type='CoDETR',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='ChannelMapper',
        in_channels=[256, 512, 1024, 2048],
        kernel_size=1,
        out_channels=256,
        act_cfg=None,
        norm_cfg=dict(type='GN', num_groups=32),
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            octave_base_scale=4,
            scales_per_octave=3,
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0*num_dec_layer*lambda_2),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0*num_dec_layer*lambda_2)),
    query_head=dict(
        type='CoDINOHead',
        num_query=900,
        num_classes=1203,
        num_feature_levels=5,
        in_channels=2048,
        sync_cls_avg_factor=True,
        as_two_stage=True,
        with_box_refine=True,
        mixed_selection=True,
        dn_cfg=dict(
            type='CdnQueryGenerator',
            noise_scale=dict(label=0.5, box=1.0),  # 0.5, 0.4 for DN-DETR
            group_cfg=dict(dynamic=True, num_groups=None, num_dn_queries=100)),
        transformer=dict(
            type='CoDinoTransformer',
            with_pos_coord=True,
            with_coord_feat=False,
            num_co_heads=2,
            num_feature_levels=5,
            encoder=dict(
                type='DetrTransformerEncoder',
                num_layers=6,
                with_cp=4, # number of layers that use checkpoint
                transformerlayers=dict(
                    type='BaseTransformerLayer',
                    attn_cfgs=dict(
                        type='MultiScaleDeformableAttention', embed_dims=256, num_levels=5, dropout=0.0),
                    feedforward_channels=2048,
                    ffn_dropout=0.0,
                    operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
            decoder=dict(
                type='DinoTransformerDecoder',
                num_layers=6,
                return_intermediate=True,
                transformerlayers=dict(
                    type='DetrTransformerDecoderLayer',
                    attn_cfgs=[
                        dict(
                            type='MultiheadAttention',
                            embed_dims=256,
                            num_heads=8,
                            dropout=0.0),
                        dict(
                            type='MultiScaleDeformableAttention',
                            embed_dims=256,
                            num_levels=5,
                            dropout=0.0),
                    ],
                    feedforward_channels=2048,
                    ffn_dropout=0.0,
                    operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                                     'ffn', 'norm')))),
        positional_encoding=dict(
            type='SinePositionalEncoding',
            num_feats=128,
            temperature=20,
            normalize=True),
        loss_cls=dict(
            type='QualityFocalLoss',
            use_sigmoid=True,
            beta=2.0,
            loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=5.0),
        loss_iou=dict(type='GIoULoss', loss_weight=2.0)),
    roi_head=[dict(
        type='CoStandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32, 64],
            finest_scale=56),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=1203,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            reg_decoded_bbox=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0*num_dec_layer*lambda_2),
            loss_bbox=dict(type='GIoULoss', loss_weight=10.0*num_dec_layer*lambda_2)))],
    bbox_head=[dict(
        type='CoATSSHead',
        num_classes=1203,
        in_channels=256,
        stacked_convs=1,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            ratios=[1.0],
            octave_base_scale=8,
            scales_per_octave=1,
            strides=[4, 8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[0.1, 0.1, 0.2, 0.2]),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0*num_dec_layer*lambda_2),
        loss_bbox=dict(type='GIoULoss', loss_weight=2.0*num_dec_layer*lambda_2),
        loss_centerness=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0*num_dec_layer*lambda_2)),],
    # model training and testing settings
    train_cfg=[
        dict(
            assigner=dict(
                type='HungarianAssigner',
                cls_cost=dict(type='FocalLossCost', weight=2.0),
                reg_cost=dict(type='BBoxL1Cost', weight=5.0, box_format='xywh'),
                iou_cost=dict(type='IoUCost', iou_mode='giou', weight=2.0))),
        dict(
            rpn=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.7,
                    neg_iou_thr=0.3,
                    min_pos_iou=0.3,
                    match_low_quality=True,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=256,
                    pos_fraction=0.5,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=False),
                allowed_border=-1,
                pos_weight=-1,
                debug=False),
            rpn_proposal=dict(
                nms_pre=4000,
                max_per_img=1000,
                nms=dict(type='nms', iou_threshold=0.7),
                min_bbox_size=0),
            rcnn=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)),
        dict(
            assigner=dict(type='ATSSAssigner', topk=9),
            allowed_border=-1,
            pos_weight=-1,
            debug=False),],
    test_cfg=[
        dict(
            max_per_img=1000,
            nms=dict(type='soft_nms', iou_threshold=0.8)
        ),
        dict(
            rpn=dict(
                nms_pre=1000,
                max_per_img=1000,
                nms=dict(type='nms', iou_threshold=0.7),
                min_bbox_size=0),
            rcnn=dict(
                score_thr=0.0,
                nms=dict(type='nms', iou_threshold=0.5),
                max_per_img=100)),
        dict(
            nms_pre=1000,
            min_bbox_size=0,
            score_thr=0.0,
            nms=dict(type='nms', iou_threshold=0.6),
            max_per_img=100),
        # soft-nms is also supported for rcnn testing
        # e.g., nms=dict(type='soft_nms', iou_threshold=0.5, min_score=0.05)
    ])
#find_unused_parameters = True
#fp16 = dict(loss_scale=dict(init_scale=512))
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
# train_pipeline, NOTE the img_scale and the Pad's size_divisor is different
# from the default setting in mmdet.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='AutoAugment',
        policies=[
            [
                dict(
                    type='Resize',
                    img_scale=[(480, 1333), (512, 1333), (544, 1333),
                               (576, 1333), (608, 1333), (640, 1333),
                               (672, 1333), (704, 1333), (736, 1333),
                               (768, 1333), (800, 1333)],
                    multiscale_mode='value',
                    keep_ratio=True)
            ],
            [
                dict(
                    type='Resize',
                    # The aspect ratio of all images in the train dataset is < 7
                    # follow the original impl
                    img_scale=[(400, 4200), (500, 4200), (600, 4200)],
                    multiscale_mode='value',
                    keep_ratio=True),
                dict(
                    type='RandomCrop',
                    crop_type='absolute_range',
                    crop_size=(384, 600),
                    allow_negative_crop=True),
                dict(
                    type='Resize',
                    img_scale=[(480, 1333), (512, 1333), (544, 1333),
                               (576, 1333), (608, 1333), (640, 1333),
                               (672, 1333), (704, 1333), (736, 1333),
                               (768, 1333), (800, 1333)],
                    multiscale_mode='value',
                    override=True,
                    keep_ratio=True)
            ]
        ]),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=1),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
# test_pipeline, NOTE the Pad's size_divisor is different from the default
# setting (size_divisor=32). While there is little effect on the performance
# whether we use the default setting or use size_divisor=1.
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=1),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(filter_empty_gt=False, pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))
# optimizer
optimizer = dict(
    type='AdamW',
    lr=2e-4,
    weight_decay=0.0001,
    # custom_keys of sampling_offsets and reference_points in DeformDETR
    paramwise_cfg=dict(custom_keys={'backbone': dict(lr_mult=0.1)}))

optimizer_config = dict(grad_clip=dict(max_norm=0.1, norm_type=2))
# learning policy
lr_config = dict(policy='step', step=[11])
runner = dict(type='EpochBasedRunner', max_epochs=12)

# NOTE: `auto_scale_lr` is for automatically scaling LR,
# USER SHOULD NOT CHANGE ITS VALUES.
# base_batch_size = (8 GPUs) x (2 samples per GPU)
auto_scale_lr = dict(base_batch_size=16)

projects/configs/co_dino/co_dino_5scale_lsj_r50_1x_lvis.py

_base_ = [
    'co_dino_5scale_r50_1x_lvis.py'
]

model = dict(with_attn_mask=False)

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)

image_size = (1536, 1536)
load_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='Resize',
        img_scale=image_size,
        ratio_range=(0.1, 2.0),
        multiscale_mode='range',
        keep_ratio=True),
    dict(
        type='RandomCrop',
        crop_type='absolute_range',
        crop_size=image_size,
        recompute_bbox=True,
        allow_negative_crop=True),
    dict(type='FilterAnnotations', min_gt_bbox_wh=(1e-2, 1e-2)),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Pad', size=image_size, pad_val=dict(img=(114, 114, 114))),
]
train_pipeline = [
    dict(type='CopyPaste', max_num_pasted=100),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=image_size,
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Pad', size=image_size, pad_val=dict(img=(114, 114, 114))),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
dataset_type = 'LVISV1Dataset'
data_root = 'data/lvis_v1/'
img_data_root = 'data/coco/'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='MultiImageMixDataset',
        dataset=dict(
            type=dataset_type,
            ann_file=data_root + 'lvis_v1_train.json',
            img_prefix=img_data_root,
            filter_empty_gt=False,
            pipeline=load_pipeline),
        pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))

projects/configs/co_dino/co_dino_5scale_lsj_swin_large_1x_lvis.py

_base_ = [
    'co_dino_5scale_lsj_r50_1x_lvis.py'
]
pretrained = 'models/swin_large_patch4_window12_384_22k.pth'
# model settings
model = dict(
    backbone=dict(
        _delete_=True,
        type='SwinTransformerV1',
        embed_dim=192,
        depths=[2, 2, 18, 2],
        num_heads=[6, 12, 24, 48],
        out_indices=(0, 1, 2, 3),
        window_size=12,
        ape=False,
        drop_path_rate=0.3,
        patch_norm=True,
        use_checkpoint=False,
        pretrained=pretrained),
    neck=dict(in_channels=[192, 192*2, 192*4, 192*8]),
    query_head=dict(
        transformer=dict(
            encoder=dict(
                # number of layers that use checkpoint
                with_cp=6))))


img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
image_size = (1536, 1536)
load_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='Resize',
        img_scale=image_size,
        ratio_range=(0.1, 2.0),
        multiscale_mode='range',
        keep_ratio=True),
    dict(
        type='RandomCrop',
        crop_type='absolute_range',
        crop_size=image_size,
        recompute_bbox=True,
        allow_negative_crop=True),
    dict(type='FilterAnnotations', min_gt_bbox_wh=(1e-2, 1e-2)),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Pad', size=image_size, pad_val=dict(img=(114, 114, 114))),
]
train_pipeline = [
    dict(type='CopyPaste', max_num_pasted=100),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=image_size,
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Pad', size=image_size, pad_val=dict(img=(114, 114, 114))),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
dataset_type = 'LVISV1Dataset'
data_root = 'data/lvis_v1/'
img_data_root = 'data/coco/'
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=1,
    train=dict(
        type='MultiImageMixDataset',
        dataset=dict(
            type=dataset_type,
            ann_file=data_root + 'lvis_v1_train.json',
            img_prefix=img_data_root,
            filter_empty_gt=False,
            pipeline=load_pipeline),
        pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))

I would be grateful if the author could provide some insights or guidance on how to achieve the claimed result of AP=56.9. Any help would be highly appreciated!

Thank you!

pytorch to onnx

/home/aigroup/chenzx/ws_internImage/bin/python3.8 /home/aigroup/chenzx/ws_internImage/code/Co-DETR/tools/deployment/pytorch2onnx.py
/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/mmcv/init.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/tools/deployment/pytorch2onnx.py:284: UserWarning: Arguments like --mean, --std, --dataset would be parsed directly from config file and are deprecated and will be removed in future releases.
warnings.warn('Arguments like --mean, --std, --dataset would be
/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/mmcv/onnx/symbolic.py:481: UserWarning: DeprecationWarning: This function will be deprecated in future. Welcome to use the unified model deployment toolbox MMDeploy: https://github.com/open-mmlab/mmdeploy
warnings.warn(msg)
/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2023-08-08 16:30:39,106 - mmcv - INFO - initialize RPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01}
2023-08-08 16:30:39,108 - mmcv - INFO -
rpn_conv.weight - torch.Size([256, 256, 3, 3]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,108 - mmcv - INFO -
rpn_conv.bias - torch.Size([256]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,108 - mmcv - INFO -
rpn_cls.weight - torch.Size([9, 256, 1, 1]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,108 - mmcv - INFO -
rpn_cls.bias - torch.Size([9]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,108 - mmcv - INFO -
rpn_reg.weight - torch.Size([36, 256, 1, 1]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,108 - mmcv - INFO -
rpn_reg.bias - torch.Size([36]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,185 - mmcv - INFO - initialize Shared2FCBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'distribution': 'uniform', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}]
2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.fc_cls.weight - torch.Size([81, 1024]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.fc_cls.bias - torch.Size([81]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.fc_reg.weight - torch.Size([320, 1024]):
NormalInit: mean=0, std=0.001, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.fc_reg.bias - torch.Size([320]):
NormalInit: mean=0, std=0.001, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.shared_fcs.0.weight - torch.Size([1024, 12544]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.shared_fcs.0.bias - torch.Size([1024]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.shared_fcs.1.weight - torch.Size([1024, 1024]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-08-08 16:30:39,236 - mmcv - INFO -
bbox_head.shared_fcs.1.bias - torch.Size([1024]):
XavierInit: gain=1, distribution=uniform, bias=0

/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/anchor_head.py:116: UserWarning: DeprecationWarning: num_anchors is deprecated, for consistency or also use num_base_priors instead
warnings.warn('DeprecationWarning: num_anchors is deprecated, '
2023-08-08 16:30:39,248 - mmcv - INFO - initialize CoATSSHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01, 'override': {'type': 'Normal', 'name': 'atss_cls', 'std': 0.01, 'bias_prob': 0.01}}
2023-08-08 16:30:39,255 - mmcv - INFO -
cls_convs.0.conv.weight - torch.Size([256, 256, 3, 3]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,255 - mmcv - INFO -
cls_convs.0.gn.weight - torch.Size([256]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
cls_convs.0.gn.bias - torch.Size([256]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
reg_convs.0.conv.weight - torch.Size([256, 256, 3, 3]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,255 - mmcv - INFO -
reg_convs.0.gn.weight - torch.Size([256]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
reg_convs.0.gn.bias - torch.Size([256]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
atss_cls.weight - torch.Size([80, 256, 3, 3]):
NormalInit: mean=0, std=0.01, bias=-4.59511985013459

2023-08-08 16:30:39,255 - mmcv - INFO -
atss_cls.bias - torch.Size([80]):
NormalInit: mean=0, std=0.01, bias=-4.59511985013459

2023-08-08 16:30:39,255 - mmcv - INFO -
atss_reg.weight - torch.Size([4, 256, 3, 3]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,255 - mmcv - INFO -
atss_reg.bias - torch.Size([4]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,255 - mmcv - INFO -
atss_centerness.weight - torch.Size([1, 256, 3, 3]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,255 - mmcv - INFO -
atss_centerness.bias - torch.Size([1]):
NormalInit: mean=0, std=0.01, bias=0

2023-08-08 16:30:39,255 - mmcv - INFO -
scales.0.scale - torch.Size([]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
scales.1.scale - torch.Size([]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
scales.2.scale - torch.Size([]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
scales.3.scale - torch.Size([]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
scales.4.scale - torch.Size([]):
The value is the same before and after calling init_weights of CoATSSHead

2023-08-08 16:30:39,255 - mmcv - INFO -
scales.5.scale - torch.Size([]):
The value is the same before and after calling init_weights of CoATSSHead

load checkpoint from local path: /home/aigroup/chenzx/ws_internImage/code/Co-DETR/model/co_dino_5scale_swin_large_3x_coco.pth
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:423: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if W % self.patch_size[1] != 0:
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:425: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if H % self.patch_size[0] != 0:
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:362: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
Hp = int(np.ceil(H / self.window_size)) * self.window_size
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:363: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
Wp = int(np.ceil(W / self.window_size)) * self.window_size
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:203: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert L == H * W, "input feature has wrong size"
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:66: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
B = int(windows.shape[0] / (H * W / window_size / window_size))
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:241: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_r > 0 or pad_b > 0:
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:272: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert L == H * W, "input feature has wrong size"
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:277: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
pad_input = (H % 2 == 1) or (W % 2 == 1)
/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/dense_heads/swin_transformer.py:278: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_input:
============= Diagnostic Run torch.onnx.export version 2.0.0+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Traceback (most recent call last):
File "/home/aigroup/chenzx/ws_internImage/code/Co-DETR/tools/deployment/pytorch2onnx.py", line 320, in
pytorch2onnx(
File "/home/aigroup/chenzx/ws_internImage/code/Co-DETR/tools/deployment/pytorch2onnx.py", line 90, in pytorch2onnx
torch.onnx.export(
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/onnx/utils.py", line 506, in export
_export(
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/onnx/utils.py", line 1548, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/onnx/utils.py", line 1113, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/onnx/utils.py", line 989, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/onnx/utils.py", line 893, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/jit/_trace.py", line 1274, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/jit/_trace.py", line 133, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/jit/_trace.py", line 124, in wrapper
outs.append(self.inner(*trace_inputs))
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
return old_func(*args, **kwargs)
File "/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/detectors/base.py", line 169, in forward
return self.onnx_export(img[0], img_metas[0])
File "/home/aigroup/chenzx/ws_internImage/code/Co-DETR/mmdet/models/detectors/co_detr.py", line 382, in onnx_export
outs = self.query_head(x)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/aigroup/chenzx/ws_internImage/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'img_metas'

How to visualize the model outputs on the val dataset

The following command seems to have no effect: sh tools/dist_test.sh projects/configs/co_dino/co_dino_5scale_swin_large_16e_o365tococo.py results/co_dino_5scale_swin_large_16e_o365tococo/epoch_16.pth 8 --eval bbox --show-dir results/co_dino_5scale_swin_large_16e_o365tococo/visualize

Training Co-DINO, The classification weight for loss and matcher should beexactly the same

AssertionError: The classification weight for loss and matcher should beexactly the same.

AssertionError: CoDINOHead: The classification weight for loss and matcher should beexactly the same.

AssertionError: CoDETR: CoDINOHead: The classification weight for loss and matcher should beexactly the same.
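
The assertion compares two numbers from the head config: the `loss_weight` of `loss_cls` and the `weight` of the assigner's `cls_cost`. A minimal sketch of that comparison, assuming the assigner is nested under `train_cfg` as in DETR-style heads; the helper `cls_weights_match` and both sample dictionaries are hypothetical and not taken from the repository:

```python
def cls_weights_match(head_cfg):
    """Return (loss_weight, cost_weight, equal) for a DETR-style head config."""
    loss_weight = head_cfg["loss_cls"]["loss_weight"]
    cost_weight = head_cfg["train_cfg"]["assigner"]["cls_cost"]["weight"]
    return loss_weight, cost_weight, loss_weight == cost_weight


# Hypothetical config fragments, for illustration only.
mismatched = dict(
    loss_cls=dict(type="FocalLoss", loss_weight=1.0),
    train_cfg=dict(assigner=dict(cls_cost=dict(type="FocalLossCost", weight=2.0))),
)
matched = dict(
    loss_cls=dict(type="FocalLoss", loss_weight=2.0),
    train_cfg=dict(assigner=dict(cls_cost=dict(type="FocalLossCost", weight=2.0))),
)

print(cls_weights_match(mismatched))
print(cls_weights_match(matched))
```

If the two values differ in the config being trained, aligning them (or checking which head class the config actually instantiates) is a reasonable first thing to try.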

pytorch2onnx, co_detr.py function onnx_export error

WARNING: The shape inference of mmcv::grid_sampler type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.

onnx infer error
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /data/co_detr_train/Co-DETR/tools/deployment/co_detr_epoch_16.onnx failed:Fatal error: mmcv:grid_sampler(-1) is not a registered function/op

About the instance segmentation part

First of all, thanks for your great work. When will the instance segmentation code be released?

ModuleNotFoundError: No module named 'mmengine'

When I build and run a Docker image using a Dockerfile, the following error occurs.

ModuleNotFoundError: No module named 'mmengine'

After that, when I install mmengine with pip and run again, the following error occurs.

AssertionError: MMCV==1.3.17 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.1.0.

LVIS Checkpoints

Hi Authors, thank you for the great work.

We finetune Co-DETR on LVIS and achieve the best results without TTA: 71.9 box AP and 59.7 mask AP on LVIS minival, 67.9 box AP and 56.0 mask AP on LVIS val. For instance segmentation, we report the performance of the auxiliary mask branch.

Where can I find these model checkpoints?

Max_epochs in Co-DINO

The max_epochs is 24 in Co-DINO/Swin-L/36/DETR/COCO. Is this a mistake? I would expect max_epochs to be 36.

Inference code demo

Could you please post the code for inference on images and videos?

After switching to an A800 GPU, training on a custom dataset gives loss=0

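Dear author, I previously trained on 8 P40 GPUs without any problem, and the loss decreased normally. After switching to a single A800 GPU, with the same project code and the same Docker image, all losses are 0 during training: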
2023-08-08 21:53:37,574 - mmdet - INFO - Epoch [1][800/818] lr: 2.000e-04, eta: 1:15:35, time: 0.496, data_time: 0.002, memory: 8800, enc_loss_cls: 0.0000, enc_loss_bbox: 0.0000, enc_loss_iou: 0.0000, loss_cls: 0.0000, loss_bbox: 0.0000, loss_iou: 0.0000, d0.loss_cls: 0.0000, d0.loss_bbox: 0.0000, d0.loss_iou: 0.0000, d1.loss_cls: 0.0000, d1.loss_bbox: 0.0000, d1.loss_iou: 0.0000, d2.loss_cls: 0.0000, d2.loss_bbox: 0.0000, d2.loss_iou: 0.0000, d3.loss_cls: 0.0000, d3.loss_bbox: 0.0000, d3.loss_iou: 0.0000, d4.loss_cls: 0.0000, d4.loss_bbox: 0.0000, d4.loss_iou: 0.0000, loss_rpn_cls: 0.0000, loss_rpn_bbox: 0.0000, loss_cls0: 0.0000, acc0: 100.0000, loss_bbox0: 0.0000, loss_cls1: 0.0000, loss_bbox1: 0.0000, loss_centerness1: 0.0000, loss_cls_aux0: 0.0000, loss_bbox_aux0: 0.0000, loss_iou_aux0: 0.0000, d0.loss_cls_aux0: 0.0000, d0.loss_bbox_aux0: 0.0000, d0.loss_iou_aux0: 0.0000, d1.loss_cls_aux0: 0.0000, d1.loss_bbox_aux0: 0.0000, d1.loss_iou_aux0: 0.0000, d2.loss_cls_aux0: 0.0000, d2.loss_bbox_aux0: 0.0000, d2.loss_iou_aux0: 0.0000, d3.loss_cls_aux0: 0.0000, d3.loss_bbox_aux0: 0.0000, d3.loss_iou_aux0: 0.0000, d4.loss_cls_aux0: 0.0000, d4.loss_bbox_aux0: 0.0000, d4.loss_iou_aux0: 0.0000, loss_cls_aux1: 0.0000, loss_bbox_aux1: 0.0000, loss_iou_aux1: 0.0000, d0.loss_cls_aux1: 0.0000, d0.loss_bbox_aux1: 0.0000, d0.loss_iou_aux1: 0.0000, d1.loss_cls_aux1: 0.0000, d1.loss_bbox_aux1: 0.0000, d1.loss_iou_aux1: 0.0000, d2.loss_cls_aux1: 0.0000, d2.loss_bbox_aux1: 0.0000, d2.loss_iou_aux1: 0.0000, d3.loss_cls_aux1: 0.0000, d3.loss_bbox_aux1: 0.0000, d3.loss_iou_aux1: 0.0000, d4.loss_cls_aux1: 0.0000, d4.loss_bbox_aux1: 0.0000, d4.loss_iou_aux1: 0.0000, loss: 0.0000, grad_norm: 0.0001

In addition, training ends with the following error:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 365/365, 13.9 task/s, elapsed: 26s, ETA: 0sTraceback (most recent call last):
File "/home/haida_huanglei/mjt/Co-DETR-main/tools/train.py", line 245, in
main()
File "/home/haida_huanglei/mjt/Co-DETR-main/tools/train.py", line 234, in main
train_detector(
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/apis/train.py", line 245, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
self.call_hook('after_train_epoch')
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
getattr(hook, fn_name)(self)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
self._do_evaluate(runner)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/core/evaluation/eval_hooks.py", line 63, in _do_evaluate
key_score = self.evaluate(runner, results)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/hooks/evaluation.py", line 363, in evaluate
eval_res = self.dataloader.dataset.evaluate(
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 641, in evaluate
result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 383, in format_results
result_files = self.results2json(results, jsonfile_prefix)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 315, in results2json
json_results = self._det2json(results)
File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 252, in _det2json
data['category_id'] = self.cat_ids[label]
IndexError: list index out of range

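I saw that someone raised the same problem before and was told it is related to the class definitions, but I have already updated the classes. The project code and image are identical on the P40 and A800 machines, yet training only fails on the A800. I am not sure whether the number of GPUs matters. I set these parameters as follows:
samples_per_gpu=1,
workers_per_gpu=2,
lr=2e-4,

I do not know what the cause is. Could the author please offer some guidance?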
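
On the `IndexError` in the traceback: `_det2json` looks up `self.cat_ids[label]` for each class slot in the results, so the lookup fails as soon as the detector emits more class slots than the dataset has category ids. A minimal sketch of that mismatch, assuming this is the cause here (the log alone does not confirm it); `first_bad_label` and the sample values are hypothetical:

```python
def first_bad_label(num_result_classes, cat_ids):
    """Return the first class index that has no category id, or None."""
    for label in range(num_result_classes):
        if label >= len(cat_ids):
            return label
    return None


# Hypothetical case: heads left at 80 classes, dataset defines 3 categories.
print(first_bad_label(80, [1, 2, 3]))
# Consistent case: heads and dataset both use 3 classes.
print(first_bad_label(3, [1, 2, 3]))
```

If the first situation applies, the `num_classes` of every head in the config and the dataset's class list would need to agree.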

Will you provide the training config file for ViT-L (66.0 AP)?

Thanks for your help.

Pytorch to onnx fail

RuntimeError: Only tuples, lists and Variables are supported as JIT inputs/outputs. Dictionaries and strings are also accepted, but their usage is not recommended. Here, received an input of unsupported type: numpy.ndarray

Can anyone help me with this issue? Thanks.
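
The message says that a `numpy.ndarray` reached the tracer as an input, so a first step is to find where in the export arguments that array sits. A minimal diagnostic sketch, assuming the inputs are nested lists, tuples and dicts; `find_ndarrays` is a hypothetical helper, and the stand-in `ndarray` class below only imitates a NumPy array so the sketch runs without NumPy installed:

```python
def find_ndarrays(obj, path="inputs"):
    """Return the paths of all NumPy arrays inside nested containers."""
    found = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            found += find_ndarrays(value, f"{path}[{key!r}]")
    elif isinstance(obj, (list, tuple)):
        for index, item in enumerate(obj):
            found += find_ndarrays(item, f"{path}[{index}]")
    # Detect arrays by type name and module to avoid importing NumPy here.
    elif type(obj).__name__ == "ndarray" and type(obj).__module__ == "numpy":
        found.append(path)
    return found


class ndarray:
    """Stand-in for numpy.ndarray, used only to make the sketch self-contained."""
    __module__ = "numpy"

    def __init__(self, values):
        self.values = list(values)


# Hypothetical export arguments: an image placeholder plus per-image metadata.
example_inputs = [
    "img placeholder",
    [{"scale_factor": ndarray([1.0, 1.0, 1.0, 1.0]), "img_shape": (800, 1333, 3)}],
]
print(find_ndarrays(example_inputs))
```

Each reported entry can then be converted to a tensor (for example with `torch.from_numpy`) or kept out of the traced arguments altogether.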

Error when measuring FLOPs

Line that raises the error: flops, params = get_model_complexity_info(model, (3, 256, 256), as_strings=True, print_per_layer_stat=True)
File "C:\Users\j\anaconda3\envs\Co_DETR\lib\site-packages\ptflops\flops_counter.py", line 37, in get_model_complexity_info
Error: NotImplementedError
Modified argument: parser.add_argument('--config', default="D:/YOLO/Co-DETR-main/projects/configs/co_dino/co_dino_5scale_r50_1x_coco.py", help='train config file path')
I hope the author can reply. Thank you very much.

Training on a custom dataset gives loss=0

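I am using your model to train on my own dataset, and started training after setting up the environment.
The last training log entry is as follows: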
2023-08-08 21:53:37,574 - mmdet - INFO - Epoch [1][800/818] lr: 2.000e-04, eta: 1:15:35, time: 0.496, data_time: 0.002, memory: 8800, enc_loss_cls: 0.0000, enc_loss_bbox: 0.0000, enc_loss_iou: 0.0000, loss_cls: 0.0000, loss_bbox: 0.0000, loss_iou: 0.0000, d0.loss_cls: 0.0000, d0.loss_bbox: 0.0000, d0.loss_iou: 0.0000, d1.loss_cls: 0.0000, d1.loss_bbox: 0.0000, d1.loss_iou: 0.0000, d2.loss_cls: 0.0000, d2.loss_bbox: 0.0000, d2.loss_iou: 0.0000, d3.loss_cls: 0.0000, d3.loss_bbox: 0.0000, d3.loss_iou: 0.0000, d4.loss_cls: 0.0000, d4.loss_bbox: 0.0000, d4.loss_iou: 0.0000, loss_rpn_cls: 0.0000, loss_rpn_bbox: 0.0000, loss_cls0: 0.0000, acc0: 100.0000, loss_bbox0: 0.0000, loss_cls1: 0.0000, loss_bbox1: 0.0000, loss_centerness1: 0.0000, loss_cls_aux0: 0.0000, loss_bbox_aux0: 0.0000, loss_iou_aux0: 0.0000, d0.loss_cls_aux0: 0.0000, d0.loss_bbox_aux0: 0.0000, d0.loss_iou_aux0: 0.0000, d1.loss_cls_aux0: 0.0000, d1.loss_bbox_aux0: 0.0000, d1.loss_iou_aux0: 0.0000, d2.loss_cls_aux0: 0.0000, d2.loss_bbox_aux0: 0.0000, d2.loss_iou_aux0: 0.0000, d3.loss_cls_aux0: 0.0000, d3.loss_bbox_aux0: 0.0000, d3.loss_iou_aux0: 0.0000, d4.loss_cls_aux0: 0.0000, d4.loss_bbox_aux0: 0.0000, d4.loss_iou_aux0: 0.0000, loss_cls_aux1: 0.0000, loss_bbox_aux1: 0.0000, loss_iou_aux1: 0.0000, d0.loss_cls_aux1: 0.0000, d0.loss_bbox_aux1: 0.0000, d0.loss_iou_aux1: 0.0000, d1.loss_cls_aux1: 0.0000, d1.loss_bbox_aux1: 0.0000, d1.loss_iou_aux1: 0.0000, d2.loss_cls_aux1: 0.0000, d2.loss_bbox_aux1: 0.0000, d2.loss_iou_aux1: 0.0000, d3.loss_cls_aux1: 0.0000, d3.loss_bbox_aux1: 0.0000, d3.loss_iou_aux1: 0.0000, d4.loss_cls_aux1: 0.0000, d4.loss_bbox_aux1: 0.0000, d4.loss_iou_aux1: 0.0000, loss: 0.0000, grad_norm: 0.0001
All loss values here are 0, which should not normally happen. Could you give some advice?
In addition, training ends with the following error:

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 365/365, 13.9 task/s, elapsed: 26s, ETA:     0sTraceback (most recent call last):
  File "/home/haida_huanglei/mjt/Co-DETR-main/tools/train.py", line 245, in <module>
    main()
  File "/home/haida_huanglei/mjt/Co-DETR-main/tools/train.py", line 234, in main
    train_detector(
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/apis/train.py", line 245, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/core/evaluation/eval_hooks.py", line 63, in _do_evaluate
    key_score = self.evaluate(runner, results)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmcv/runner/hooks/evaluation.py", line 363, in evaluate
    eval_res = self.dataloader.dataset.evaluate(
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 641, in evaluate
    result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 383, in format_results
    result_files = self.results2json(results, jsonfile_prefix)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 315, in results2json
    json_results = self._det2json(results)
  File "/home/haida_huanglei/anaconda3/envs/mjt/lib/python3.9/site-packages/mmdet/datasets/coco.py", line 252, in _det2json
    data['category_id'] = self.cat_ids[label]
IndexError: list index out of range

What might be causing this, and how should I handle it? I look forward to your reply and would be very grateful.

When I train on my dataset, the loss is always 0

2023-07-13 17:05:51,465 - mmdet - INFO - Epoch [1][450/3061] lr: 2.000e-05, eta: 1 day, 12:59:23, time: 1.170, data_time: 0.004, memory: 28919, enc_loss_cls: 0.0000, enc_loss_bbox: 0.0000, enc_loss_iou: 0.0000, loss_cls: 0.0000, loss_bbox: 0.0000, loss_iou: 0.0000, d0.loss_cls: 0.0000, d0.loss_bbox: 0.0000, d0.loss_iou: 0.0000, d1.loss_cls: 0.0000, d1.loss_bbox: 0.0000, d1.loss_iou: 0.0000, d2.loss_cls: 0.0000, d2.loss_bbox: 0.0000, d2.loss_iou: 0.0000, d3.loss_cls: 0.0000, d3.loss_bbox: 0.0000, d3.loss_iou: 0.0000, d4.loss_cls: 0.0000, d4.loss_bbox: 0.0000, d4.loss_iou: 0.0000, loss_rpn_cls: 0.0000, loss_rpn_bbox: 0.0000, loss_cls0: 0.0000, acc0: 100.0000, loss_bbox0: 0.0000, loss_cls1: 0.0000, loss_bbox1: 0.0000, loss_centerness1: 0.0000, loss_cls_aux0: 0.0000, loss_bbox_aux0: 0.0000, loss_iou_aux0: 0.0000, d0.loss_cls_aux0: 0.0000, d0.loss_bbox_aux0: 0.0000, d0.loss_iou_aux0: 0.0000, d1.loss_cls_aux0: 0.0000, d1.loss_bbox_aux0: 0.0000, d1.loss_iou_aux0: 0.0000, d2.loss_cls_aux0: 0.0000, d2.loss_bbox_aux0: 0.0000, d2.loss_iou_aux0: 0.0000, d3.loss_cls_aux0: 0.0000, d3.loss_bbox_aux0: 0.0000, d3.loss_iou_aux0: 0.0000, d4.loss_cls_aux0: 0.0000, d4.loss_bbox_aux0: 0.0000, d4.loss_iou_aux0: 0.0000, loss_cls_aux1: 0.0000, loss_bbox_aux1: 0.0000, loss_iou_aux1: 0.0000, d0.loss_cls_aux1: 0.0000, d0.loss_bbox_aux1: 0.0000, d0.loss_iou_aux1: 0.0000, d1.loss_cls_aux1: 0.0000, d1.loss_bbox_aux1: 0.0000, d1.loss_iou_aux1: 0.0000, d2.loss_cls_aux1: 0.0000, d2.loss_bbox_aux1: 0.0000, d2.loss_iou_aux1: 0.0000, d3.loss_cls_aux1: 0.0000, d3.loss_bbox_aux1: 0.0000, d3.loss_iou_aux1: 0.0000, d4.loss_cls_aux1: 0.0000, d4.loss_bbox_aux1: 0.0000, d4.loss_iou_aux1: 0.0000, loss: 0.0000, grad_norm: 0.0003

May I ask what the difference is between SwinTransformerV1 and the SwinTransformer in mmdet's swin.py?

code release

May I ask when the code will be released? We can't wait to try it.

Could you provide the code of FCOS as aux head?

The implementation of an anchor-free aux head is somewhat different from those with anchors. Could you please provide the code for the CoFCOS head? Thanks!

During testing, how can I save the bbox results from inference on the val dataset as a JSON file (similar to the COCO dataset's annotations.json)?
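
One way to get such a file is to convert the per-image, per-class detections into COCO-style result records yourself. A minimal sketch, assuming each image's result is a list with one entry per class and each detection is `[x1, y1, x2, y2, score]`; `dets_to_coco_records`, the image id and the category ids are hypothetical:

```python
import json


def dets_to_coco_records(results, img_ids, cat_ids):
    """Convert per-image, per-class [x1, y1, x2, y2, score] rows to COCO records."""
    records = []
    for img_id, per_class in zip(img_ids, results):
        for label, dets in enumerate(per_class):
            for x1, y1, x2, y2, score in dets:
                records.append({
                    "image_id": img_id,
                    # COCO result boxes are [x, y, width, height].
                    "bbox": [x1, y1, x2 - x1, y2 - y1],
                    "score": score,
                    "category_id": cat_ids[label],
                })
    return records


# Hypothetical results: one image, two classes, one detection for class 0.
results = [[[[10.0, 20.0, 50.0, 80.0, 0.9]], []]]
records = dets_to_coco_records(results, img_ids=[42], cat_ids=[1, 2])
print(json.dumps(records))
```

The list can be written to disk with `json.dump`. If the repository's tools/test.py keeps the upstream MMDetection 2.x options, `--format-only` together with `--eval-options jsonfile_prefix=...` may produce an equivalent file without custom code, though that is an assumption about this fork.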

Is soft-NMS needed at inference time?

'DataContainer' object is not subscriptable

I can train with dist_train.sh, but evaluation fails once an epoch of training finishes. I have tried adjusting the torch and mmcv versions, but the problem persists. I hope you can suggest a solution, thanks!
File "tools/test.py", line 277, in
main()
File "tools/test.py", line 250, in main
or cfg.evaluation.get('gpu_collect', False))
File "/home/fdeng/project/detection/Co-DETR/mmdet/apis/test.py", line 109, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/fdeng/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/fdeng/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
output = self._run_ddp_forward(*inputs, **kwargs)
File "/home/fdeng/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
return module_to_run(*inputs[0], **kwargs[0])
File "/home/fdeng/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/fdeng/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 110, in new_func
return old_func(*args, **kwargs)
File "/home/fdeng/project/detection/Co-DETR/mmdet/models/detectors/base.py", line 174, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/home/fdeng/project/detection/Co-DETR/mmdet/models/detectors/base.py", line 137, in forward_test
img_meta[img_id]['batch_input_shape'] = tuple(img.size()[-2:])
TypeError: 'DataContainer' object is not subscriptable

Computing FLOPs

Hello! After training finished, I wanted to measure the model's FLOPs with tools/analysis_tools/get_flops.py, where parser.add_argument('--config', default="D:/YOLO/Co-DETR-main/work_dirs/co_deformable_detr_r50_1x_coco/co_deformable_detr_r50_1x_coco.py", help='train config file path'). Running it raises: f'{obj_type} is not in the {registry.name} registry')
KeyError: 'CoDETR is not in the models registry'. What went wrong? I would greatly appreciate an answer.
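
A `KeyError` of this kind generally means that the module which defines and registers the class was never imported by the script being run. A toy sketch of the mechanism, which is not mmcv's actual `Registry`; every name below is hypothetical:

```python
class ToyRegistry:
    """Tiny name-to-class lookup that mimics the error message above."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        obj_type = cfg["type"]
        if obj_type not in self._modules:
            raise KeyError(f"{obj_type} is not in the {self.name} registry")
        return self._modules[obj_type]()


MODELS = ToyRegistry("models")


def import_project_models():
    """Stands in for importing the package that defines the detector."""

    @MODELS.register
    class CoDETR:
        pass

    return CoDETR


try:
    MODELS.build(dict(type="CoDETR"))
except KeyError as err:
    print("before import:", err)

import_project_models()
print("after import:", type(MODELS.build(dict(type="CoDETR"))).__name__)
```

For this repository the equivalent step would be importing the project package in get_flops.py, or listing it under `custom_imports` in the config, before the model is built. Whether the script already does so is worth checking.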

mmcv version

Could you tell me which mmcv version you use?
If I use the latest version (>=2.0.0), the error is "No module named 'mmcv.parallel'".
If I use version 1.4.0, the error is "No module named 'mmcv._ext'".
Thanks in advance for your help.
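
The two messages point at different problems: `mmcv.parallel` no longer exists in mmcv 2.x, while a missing `mmcv._ext` usually means a 1.x install without the compiled ops (the lite `mmcv` package rather than `mmcv-full`). A minimal sketch that tells the cases apart, assuming only the standard library; `has_module` and `diagnose_mmcv` are hypothetical helpers:

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def diagnose_mmcv(has=has_module):
    """Classify the local mmcv install from which submodules are present."""
    if not has("mmcv"):
        return "mmcv is not installed"
    if not has("mmcv.parallel"):
        return "mmcv 2.x found: mmcv.parallel is gone, a 1.x mmcv-full is likely needed"
    if not has("mmcv._ext"):
        return "mmcv 1.x without compiled ops: install mmcv-full for your torch/CUDA"
    return "mmcv 1.x with compiled ops found"


print(diagnose_mmcv())
```

The exact 1.x version to install depends on the MMDetection version the repository is based on, which this sketch does not check.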