Panoptic Scene Graph Generation
Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu
S-Lab, Nanyang Technological University & SenseTime Research


Updates

  • Oct 31, 2022: We release the full dataset used in the paper here. The competition version is here.
  • Oct 9, 2022: The preliminary round of the PSG challenge has ended. We will release the entire dataset after the final round starts. Before that, if you want to access the PSG dataset (competition version), please email me.
  • Sep 4, 2022: We introduce the PSG Classification Task for the NTU CE7454 coursework, as described here.
  • Aug 21, 2022: We provide guidance on PSG challenge registration here.
  • Aug 12, 2022: A Replicate demo and cloud API have been added; try it here!
  • Aug 10, 2022: We launched a Hugging Face demo 🤗. Try it on your own scene!
  • Aug 5, 2022: The PSG Challenge will be hosted on the International Algorithm Case Competition platform, where all the data will be released. Stay tuned!
  • July 25, 2022: 💥 We are preparing a PSG competition with the ECCV'22 SenseHuman Workshop and the International Algorithm Case Competition, starting from Aug 6, with a prize pool of 🤑 US$150K 🤑. Join us on our Slack to stay updated!
  • July 25, 2022: The PSG paper is available on arXiv.
  • July 3, 2022: PSG is accepted by ECCV'22.

What is the PSG Task?

The Panoptic Scene Graph Generation (PSG) Task aims to interpret a complex scene image with a scene graph representation, with each node in the scene graph grounded by its pixel-accurate segmentation mask in the image.

To promote comprehensive scene understanding, we take into account all the content in the image, including "things" and "stuff", to generate the scene graph.

(Figure) PSG Task: to generate a scene graph that is grounded by its panoptic segmentation.

PSG addresses many SGG problems

We believe that the biggest problem of classic scene graph generation (SGG) comes from noisy datasets. Classic SGG datasets adopt bounding-box-based object grounding, which inevitably causes a number of issues:

  • Coarse localization: bounding boxes cannot reach pixel-level accuracy;
  • Inability to ground comprehensively: bounding boxes cannot ground backgrounds ("stuff");
  • Tendency to provide trivial information: existing datasets often capture frivolous objects such as head, forming trivial relations such as person-has-head, because bounding-box annotation allows too much freedom;
  • Duplicate groundings: the same object may be grounded by multiple separate bounding boxes.

All of the problems above can be easily addressed by the PSG dataset, which grounds the objects using panoptic segmentation with an appropriate granularity of object categories (adopted from COCO).

In fact, the PSG dataset contains 49k images shared between COCO and Visual Genome. In a nutshell, we asked annotators to annotate relations on top of the COCO panoptic segmentations, i.e., relations are mask-to-mask.
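To make the data model concrete, here is a minimal, hypothetical sketch of what a PSG-style annotation boils down to (the field names and ids below are illustrative assumptions, not the released schema; see the dataset notebook in Get Started below for the authoritative layout):

# Nodes are COCO panoptic segments; each relation is an index triplet.
annotation = {
    'segments_info': [                  # panoptic segments = scene graph nodes
        {'id': 0, 'category_id': 1},    # e.g. a "person" (thing)
        {'id': 1, 'category_id': 125},  # e.g. "grass" (stuff)
    ],
    'relations': [
        [0, 1, 15],                     # person -[standing on]-> grass
    ],
}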

(Figure) Comparison between the classic VG-150 and PSG.

Clear Predicate Definition

We also find that a good definition of predicates is unfortunately ignored in previous SGG datasets. To better formulate the PSG task, we carefully define 56 predicates for the PSG dataset. We try hard to avoid trivial or duplicated relations, and find that these 56 predicates are sufficient to cover the entire PSG dataset (and common everyday scenarios).

| Type | Predicates |
| --- | --- |
| Positional Relations (6) | over, in front of, beside, on, in, attached to |
| Common Object-Object Relations (5) | hanging from, on the back of, falling off, going down, painted on |
| Common Actions (31) | walking on, running on, crossing, standing on, lying on, sitting on, leaning on, flying over, jumping over, jumping from, wearing, holding, carrying, looking at, guiding, kissing, eating, drinking, feeding, biting, catching, picking (grabbing), playing with, chasing, climbing, cleaning (washing, brushing), playing, touching, pushing, pulling, opening |
| Human Actions (4) | cooking, talking to, throwing (tossing), slicing |
| Actions in Traffic Scene (4) | driving, riding, parked on, driving on |
| Actions in Sports Scene (3) | about to hit, kicking, swinging |
| Interaction between Background (3) | entering, exiting, enclosing (surrounding, wrapping in) |

Get Started

To set up the environment, we use conda to manage our dependencies. Our experiments were run with CUDA 10.1.

You can specify the appropriate cudatoolkit version to install on your machine in the environment.yml file, then run the following to create the conda environment:

conda env create -f environment.yml

You should then manually install the following dependencies.

# Install mmcv
## CAUTION: The latest versions of mmcv 1.5.3, mmdet 2.25.0 are not well supported, due to bugs in mmdet.
pip install mmcv-full==1.4.3 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html

# Install mmdet
pip install openmim
mim install mmdet==2.20.0

# Install coco panopticapi
pip install git+https://github.com/cocodataset/panopticapi.git

# For visualization
conda install -c conda-forge pycocotools
pip install detectron2==0.5 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html

# If you're using wandb for logging
pip install wandb
wandb login

# If you develop and run openpsg directly, install it from source:
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.

Datasets and pretrained models are provided. Please unzip the files if necessary.

Before October 2022, we only released part of the PSG data for the competition, in which part of the test-set annotations are wiped out. Users should change the JSON filename in psg.py (Lines 4-5) to the correct filename for training or submission.

For the PSG competition, we provide psg_train_val.json (45697 training images + 1000 validation images with GT). Participants should use psg_val_test.json (1000 validation images with GT + 1177 test images without GT) for submission. An example submission script is here. You can use grade.sh to simulate the competition's grading mechanism locally.

Our codebase accesses the datasets from ./data/ and pretrained models from ./work_dirs/checkpoints/ by default.

If you want to play with VG, please download the VG dataset here and put it into the ./data dir. We provide a pipeline here to process the dataset.

├── ...
├── configs
├── data
│   ├── coco
│   │   ├── panoptic_train2017
│   │   ├── panoptic_val2017
│   │   ├── train2017
│   │   └── val2017
│   └── psg
│       ├── psg_train_val.json
│       ├── psg_val_test.json
│       └── ...
├── openpsg
├── scripts
├── tools
├── work_dirs
│   ├── checkpoints
│   ├── psgtr_r50
│   └── ...
├── ...

We suggest users play with ./tools/Visualize_Dataset.ipynb to quickly get familiar with the PSG dataset.
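For a quick look outside the notebook, here is a hedged sketch that decodes a COCO-style panoptic PNG with panopticapi (installed above); the example path is an assumption, so substitute any file from data/coco/panoptic_val2017:

import numpy as np
from PIL import Image
from panopticapi.utils import rgb2id

# In COCO panoptic PNGs, each pixel's RGB value encodes its segment id.
pan = np.array(Image.open('data/coco/panoptic_val2017/000000000139.png'))
seg_ids = rgb2id(pan)        # (H, W) array with one id per thing/stuff segment
print(np.unique(seg_ids))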

To train or test PSG models, please see https://github.com/Jingkang50/OpenPSG/tree/main/scripts for scripts of each method. Some example scripts are below.

Training

# Single GPU for two-stage methods, debug mode
PYTHONPATH='.':$PYTHONPATH \
python -m pdb -c continue tools/train.py \
  configs/psg/motif_panoptic_fpn_r50_fpn_1x_sgdet_psg.py

# Multiple GPUs for one-stage methods, running mode
PYTHONPATH='.':$PYTHONPATH \
python -m torch.distributed.launch \
--nproc_per_node=8 --master_port=29500 \
  tools/train.py \
  configs/psgformer/psgformer_r50_psg.py \
  --gpus 8 \
  --launcher pytorch

Testing

# sh scripts/imp/test_panoptic_fpn_r50_sgdet.sh
PYTHONPATH='.':$PYTHONPATH \
python tools/test.py \
  configs/imp/panoptic_fpn_r50_fpn_1x_sgdet_psg.py \
  path/to/checkpoint.pth \
  --eval sgdet

Submitting for PSG Competition

# sh scripts/imp/submit_panoptic_fpn_r50_sgdet.sh
PYTHONPATH='.':$PYTHONPATH \
python tools/test.py \
  configs/imp/panoptic_fpn_r50_fpn_1x_sgdet_psg.py \
  path/to/checkpoint.pth \
  --submit

OpenPSG: Benchmarking PSG Task

Supported methods (Welcome to Contribute!)

Two-Stage Methods (4)
  • IMP (CVPR'17)
  • MOTIFS (CVPR'18)
  • VCTree (CVPR'19)
  • GPSNet (CVPR'20)
One-Stage Methods (2)
  • PSGTR (ECCV'22)
  • PSGFormer (ECCV'22)

Supported datasets (Welcome to Contribute!)

  • VG-150 (IJCV'17)
  • PSG (ECCV'22)

Model Zoo

| Method | Backbone | #Epoch | R/mR@20 | R/mR@50 | R/mR@100 | ckpt | SHA256 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| IMP | ResNet-50 | 12 | 16.5 / 6.52 | 18.2 / 7.05 | 18.6 / 7.23 | link | 7be2842b6664e2b9ef6c7c05d27fde521e2401ffe67dbb936438c69e98f9783c |
| MOTIFS | ResNet-50 | 12 | 20.0 / 9.10 | 21.7 / 9.57 | 22.0 / 9.69 | link | 956471959ca89acae45c9533fb9f9a6544e650b8ea18fe62cdead495b38751b8 |
| VCTree | ResNet-50 | 12 | 20.6 / 9.70 | 22.1 / 10.2 | 22.5 / 10.2 | link | e5fdac7e6cc8d9af7ae7027f6d0948bf414a4a605ed5db4d82c5d72de55c9b58 |
| GPSNet | ResNet-50 | 12 | 17.8 / 7.03 | 19.6 / 7.49 | 20.1 / 7.67 | link | 98cd7450925eb88fa311a20fce74c96f712e45b7f29857c5cdf9b9dd57f59c51 |
| PSGTR | ResNet-50 | 60 | 28.4 / 16.6 | 34.4 / 20.8 | 36.3 / 22.1 | link | 1c4ddcbda74686568b7e6b8145f7f33030407e27e390c37c23206f95c51829ed |
| PSGFormer | ResNet-50 | 60 | 18.0 / 14.8 | 19.6 / 17.0 | 20.1 / 17.6 | link | 2f0015ce67040fa00b65986f6ce457c4f8cc34720f7e47a656b462b696a013b7 |

Contributing

We appreciate all contributions to improve OpenPSG. We sincerely welcome community users to participate in these projects. Please refer to CONTRIBUTING.md for the contributing guideline.

Acknowledgements

OpenPSG is developed based on MMDetection. Most of the two-stage SGG implementations refer to MMSceneGraph and Scene-Graph-Benchmark.pytorch. We sincerely appreciate the efforts of the developers from the previous codebases.

Citation

If you find our repository useful for your research, please consider citing our paper:

@inproceedings{yang2022psg,
    author = {Yang, Jingkang and Ang, Yi Zhe and Guo, Zujin and Zhou, Kaiyang and Zhang, Wayne and Liu, Ziwei},
    title = {Panoptic Scene Graph Generation},
    booktitle = {ECCV},
    year = {2022}
}

@inproceedings{yang2023pvsg,
    author = {Yang, Jingkang and Peng, Wenxuan and Li, Xiangtai and Guo, Zujin and Chen, Liangyu and Li, Bo and Ma, Zheng and Zhou, Kaiyang and Zhang, Wayne and Loy, Chen Change and Liu, Ziwei},
    title = {Panoptic Video Scene Graph Generation},
    booktitle = {CVPR},
    year = {2023},
}


openpsg's Issues

Could not write output to result.pkl by test.py

Script:
python tools/test.py configs/psgtr/psgtr_r50_psg.py work_dirs/checkpoints/PSGTR/epoch_psgtr_baseline.pth --out work_dirs/psgtr_r50_psg/result.pkl --eval sgdet

The log is shown below:
creating index...
index created!
load checkpoint from local path: work_dirs/checkpoints/PSGTR/epoch_psgtr_baseline.pth
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 2177/2177, 0.7 task/s, elapsed: 3003s, ETA: 0s
writing results to work_dirs/psgtr_r50_psg/result.pkl
Killed

I could not find the result.pkl in the path.

`UnboundLocalError: local variable 'keep_tri' referenced before assignment`

use_mask=True,

I changed the above line to use_mask=False, to get only bounding-box predictions rather than panoptic masks, but got the error local variable 'keep_tri' referenced before assignment on the following line at inference time, after training the first epoch.

det_bboxes = torch.cat((s_det_bboxes[keep_tri], o_det_bboxes[keep_tri]), 0)

I noticed keep_tri is defined on line 992, only when use_mask==True.

keep_tri = torch.ones_like(r_labels,dtype=torch.bool)

Could you kindly help me to resolve this issue or give any suggestions? Thank you so much :)

Full error log (two DDP worker traces were interleaved; deduplicated below):
Traceback (most recent call last):
  File "tools/train.py", line 196, in <module>
    main()
  File "tools/train.py", line 192, in main
    meta=meta)
  File "/home/shunchiz/mmdet/mmdet/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/shunchiz/mmdet/mmdet/core/evaluation/eval_hooks.py", line 119, in _do_evaluate
    gpu_collect=self.gpu_collect)
  File "/home/shunchiz/mmdet/mmdet/apis/test.py", line 98, in multi_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 110, in new_func
    return old_func(*args, **kwargs)
  File "/home/shunchiz/mmdet/mmdet/models/detectors/base.py", line 174, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/home/shunchiz/mmdet/mmdet/models/detectors/base.py", line 147, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/home/shunchiz/OpenPSG/openpsg/models/frameworks/psgtr.py", line 144, in simple_test
    rescale=rescale)
  File "/home/shunchiz/mmdet/mmdet/models/dense_heads/base_dense_head.py", line 360, in simple_test
    return self.simple_test_bboxes(feats, img_metas, rescale=rescale)
  File "/home/shunchiz/OpenPSG/openpsg/models/relation_heads/psgtr_head.py", line 1187, in simple_test_bboxes
    results_list = self.get_bboxes(*outs, img_metas, rescale=rescale)
  File "/home/shunchiz/miniconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 198, in new_func
    return old_func(*args, **kwargs)
  File "/home/shunchiz/OpenPSG/openpsg/models/relation_heads/psgtr_head.py", line 899, in get_bboxes
    scale_factor, rescale)
  File "/home/shunchiz/OpenPSG/openpsg/models/relation_heads/psgtr_head.py", line 1171, in _get_bboxes_single
    det_bboxes = torch.cat((s_det_bboxes[keep_tri], o_det_bboxes[keep_tri]), 0)
UnboundLocalError: local variable 'keep_tri' referenced before assignment
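A possible workaround (an unverified sketch, not an official fix): define keep_tri on the bbox-only branch as well, mirroring the line from the mask branch, so the final gather keeps all triplets when use_mask=False:

# Sketch: hoist the definition so both branches define keep_tri.
if not self.use_mask:
    keep_tri = torch.ones_like(r_labels, dtype=torch.bool)  # keep every triplet
det_bboxes = torch.cat((s_det_bboxes[keep_tri], o_det_bboxes[keep_tri]), 0)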

[Error]

When I run test.py --submit, I get the following error. How should I modify the code? Waiting for your reply!

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 2177/2177, 6.7 task/s, elapsed: 323s, ETA: 0s
Loading testing groundtruth...

[>>>>>>>>>>>>>>>>>>>>>> ] 1000/2177, 185.4 task/s, elapsed: 5s, ETA: 6s
Traceback (most recent call last):
  File "tools/test.py", line 245, in <module>
    main()
  File "tools/test.py", line 236, in main
    metric = dataset.evaluate(outputs, **eval_kwargs)
  File "/dahuafs/userdata/46623/PSG/OpenPSG-main/openpsg/datasets/psg.py", line 388, in evaluate
    rel_pair_idxes=ann['rels'][:, :2],
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
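This error is consistent with evaluating entries whose ground-truth relations were wiped for the competition (the 1177 test images), leaving ann['rels'] empty and 1-dimensional. A defensive sketch (assuming numpy annotations; not the repo's exact code) would guard the slice:

import numpy as np

rels = np.asarray(ann['rels'])        # ann comes from the dataset's annotation dict
if rels.ndim == 2 and rels.size > 0:  # GT present: safe to take subject/object pairs
    rel_pair_idxes = rels[:, :2]
else:                                 # wiped test entry: skip it in local evaluation
    rel_pair_idxes = np.zeros((0, 2), dtype=np.int64)

Alternatively, generating a submission with --submit (instead of evaluating locally with --eval) avoids touching the wiped annotations.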

How are the refine_bboxes and the objects mapped?

While running inference on an image, the model returns a dictionary with a key 'refine_bboxes', but how can I identify which bounding box in results['refine_bboxes'] maps to which object?

Also, a single box is described by 5 values; what exactly do the 5 dimensions of this array contain?
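If the outputs follow the usual mmdet detection convention (an assumption worth verifying against this repo's result container), each row is [x1, y1, x2, y2, score], and object identity comes from a parallel array of class labels:

# Hedged sketch; the 'labels' key is an assumption, check the actual result object.
refine_bboxes = results['refine_bboxes']   # assumed shape (N, 5)
labels = results['labels']                 # hypothetical parallel array of class ids
for (x1, y1, x2, y2, score), label in zip(refine_bboxes, labels):
    print(f'class {label}: box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}), score={score:.2f}')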

Dataset access after the challenge

Hi, I am interested in your work. However, registration had already closed when I tried to participate in the challenge. Will the dataset be accessible after the challenge?

[Feature] Support PSG Challenge Test Code

To support PSG challenge on codalab, we need to provide the following things:

OpenPSG codebase:

  • add an arg "--submit" in tools/test.py, so that the participants can easily generate the submission file.
  • add tools/grade.py, which should be deployed in the codalab server.

Note: We should make the minimum modification to realize these new features.

Reference Data:

To put ground-truth data on codalab, we should provide a data_val directory and a psg_val.json with validation data only.
We should then fix the config option in the tools/grade.py with a new config file of configs/_base_/datasets/psg_val.py.

The deadline of this feature is the following Monday, 5pm. Then @Jingkang50 will set up the competition.

We shall track our ideas and designs in this issue.

Questions about training psgformer on multiple GPUs ?

Hi,
When I train PSGFormer on multiple GPUs (2x3090), I get an NCCL error. :)

RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1603729047590/work/torch/lib/c10d/ProcessGroupNCCL.cpp:748, internal error, NCCL version 2.7.8

RuntimeError: Connection reset by peer

(openpsg) zztao@zztao-Precision-5820-Tower:~/dev/OpenPSG$ PYTHONPATH='.':$PYTHONPATH python -m torch.distributed.launch --nproc_per_node=2 --master_port=29500 tools/train.py configs/psgformer/psgformer_r50_psg.py --gpus 2 --launcher pytorch


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


zztao-Precision-5820-Tower:35819:35819 [0] NCCL INFO NCCL_SOCKET_IFNAME set by environment to eth0
zztao-Precision-5820-Tower:35819:35819 [0] NCCL INFO NCCL_SOCKET_IFNAME set to eth0

zztao-Precision-5820-Tower:35819:35819 [0] bootstrap.cc:32 NCCL WARN Bootstrap : no socket interface found
zztao-Precision-5820-Tower:35819:35819 [0] NCCL INFO init.cc:101 -> 3
zztao-Precision-5820-Tower:35819:35819 [0] NCCL INFO init.cc:123 -> 3
zztao-Precision-5820-Tower:35819:35819 [0] NCCL INFO init.cc:140 -> 3
Traceback (most recent call last):
  File "tools/train.py", line 225, in <module>
    main()
  File "tools/train.py", line 124, in main
    init_dist(args.launcher, **cfg.dist_params)
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 18, in init_dist
    _init_dist_pytorch(backend, **kwargs)
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 32, in _init_dist_pytorch
    dist.init_process_group(backend=backend, **kwargs)
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 442, in init_process_group
    barrier()
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 1947, in barrier
    work = _default_pg.barrier()
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1603729047590/work/torch/lib/c10d/ProcessGroupNCCL.cpp:748, internal error, NCCL version 2.7.8

(The second worker fails with the same call stack, ending in:)
RuntimeError: Connection reset by peer

Traceback (most recent call last):
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
    main()
  File "/home/zztao/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zztao/anaconda3/envs/openpsg/bin/python', '-u', 'tools/train.py', '--local_rank=1', 'configs/psgformer/psgformer_r50_psg.py', '--gpus', '2', '--launcher', 'pytorch']' returned non-zero exit status 1.
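The "Bootstrap : no socket interface found" warning above is the telling symptom: NCCL_SOCKET_IFNAME is set to eth0 in this environment, but the machine has no interface by that name. Unsetting the variable, or pointing it at a real interface (for example lo for single-machine training) before launching, is a plausible fix; this is environment-specific advice rather than a confirmed codebase issue.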

ERROR:root:Evaluation failed! The length of results is not equal to the dataset len: 1000 != 2177

After running test.py --submit, I got a panseg folder with only 1K PNGs and one JSON file. When I submitted to the competition, it reported an error. Here is the log.

[>>>>>>>>>> ] 816/2177, 74.1 task/s, elapsed: 11s, ETA: 18s
[>>>>>>>>>> ] 817/2177, 74.1 task/s, elapsed: 11s, ETA: 18s
[>>>>>>>>>> ] 818/2177, 74.2 task/s, elapsed: 11s, ETA: 18s
/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py:3373: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs
/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:170: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
ERROR:root:Evaluation failed! The length of results is not equal to the dataset len: 1000 != 2177
INFO:root:Response data:{
"code": -1,
"message": "failure",
"id": "2267",
"data": {}
}
INFO:root:{
"code": -1,
"message": "failure",
"id": "2267",
"data": {}
}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.cvmart.net:443

The test results of the PSGTR model are incorrect

I used epoch_60.pth of the PSGTR model to run the test scripts (screenshot omitted).

I changed detection_method='bbox' (screenshot omitted).

This is the result in the log file you provided (screenshot omitted).

There is a big difference between them. What is the problem behind this phenomenon?

Field Definition of "psg_cls_advanced.json"

Hi,

Where can I find the definition of each field in the file "psg_cls_advanced.json"? For example:

  • What do "category_id", "area", and "gqa_category_id" stand for in the "segments_info" section?
  • How does the "relations" section work? There are multiple groups of "relations", so which objects does this depict the relations between?
  • In the "annotations" part, what is "bbox_mode"? Is the definition of "category_id" the same as in the "segments_info" section?

In the "psg_cls_basic.json", do "relations" fields include ALL relationships between objects in an image?

An introduction to this kind of terminology and how these things work together will be very helpful to fresh CVers.

Best,
Mingzhe
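For orientation while waiting for an official answer: segments_info appears to follow the COCO panoptic format, so here is a hedged annotated sketch (the PSG-specific parts are assumptions and should be verified against the dataset tools):

# Illustrative values only; not the authoritative schema.
segment = {
    'id': 3226956,         # pixel id encoded in the panoptic PNG (COCO convention)
    'category_id': 1,      # index into the thing/stuff class list (COCO convention)
    'area': 1234,          # segment area in pixels (COCO convention)
    'gqa_category_id': 5,  # assumption: mapping to GQA's finer-grained categories
}
relation = [0, 2, 7]       # assumption: [subject_idx, object_idx, predicate_idx],
                           # indexing into this image's object list

Under that reading, each group in "relations" is one subject-predicate-object triplet rather than an exhaustive list of all relationships in the image.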

Cannot use `rel_loss_cls = dict(use_sigmoid=True)`

When adding the line rel_loss_cls = dict(use_sigmoid=True) to the PSGTR config file, the following error occurs:

RuntimeError: The size of tensor a (57) must match the size of tensor b (56) at non-singleton dimension 1

Full Log
Traceback (most recent call last):
  File "tools/train.py", line 225, in <module>
    main()
  File "tools/train.py", line 220, in main
    meta=meta,
  File "/opt/conda/lib/python3.7/site-packages/mmdet/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/shunchi/OpenPSG/openpsg/models/frameworks/psgtr.py", line 139, in forward_train
    gt_bboxes_ignore)
  File "/home/shunchi/OpenPSG/openpsg/models/relation_heads/psgtr_head.py", line 871, in forward_train
    losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/home/shunchi/OpenPSG/openpsg/models/relation_heads/psgtr_head.py", line 399, in loss
    img_metas_list, all_gt_bboxes_ignore_list)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/core/utils/misc.py", line 30, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/shunchi/OpenPSG/openpsg/models/relation_heads/psgtr_head.py", line 548, in loss_single
    avg_factor=cls_avg_factor)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/losses/cross_entropy_loss.py", line 250, in forward
    **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/losses/cross_entropy_loss.py", line 108, in binary_cross_entropy
    pred, label.float(), pos_weight=class_weight, reduction='none')
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2982, in binary_cross_entropy_with_logits
    return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
RuntimeError: The size of tensor a (57) must match the size of tensor b (56) at non-singleton dimension 1

I wonder whether it's because rel_cls_out_channels becomes num_relations instead of num_relations + 1. So, what is the difference between use_sigmoid=True and use_sigmoid=False, and why does it influence rel_cls_out_channels?

Thank you so much!
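That guess matches the usual mmdet convention (hedged explanation; verify against this head's code): with softmax cross-entropy, a classification head allocates num_classes + 1 logits, where the extra channel is the background / "no relation" class, while use_sigmoid=True switches to per-class binary cross-entropy over exactly num_classes logits. A toy illustration of the resulting mismatch:

num_relations = 56
softmax_out_channels = num_relations + 1  # 57 logits incl. a "no relation" slot
bce_target_dim = num_relations            # BCE one-hot targets stay 56-dim
# binary_cross_entropy_with_logits(57-channel pred, 56-dim target)
# -> "The size of tensor a (57) must match the size of tensor b (56)"

So enabling use_sigmoid likely also requires shrinking the head's output channels to num_relations.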

About the results in your paper

There are 2186 test images in psg.json; the log file shows 2177 images in the testing and validation phase, but the COCO val2017 folder contains 5000 images. In your paper, are the results obtained from the 5000 val2017 images or from the 2177 test/validation images?
(In other words, how many training and test/validation images are used to obtain the results in your paper? psg.json only marks 2186 test images.)

Why are the test results different from the article?

Hi authors, I used the PSGTR model weights from the article to run the test, but the results are quite different from the original article. I would like to know whether the published dataset is different, or whether there is some other reason. How can I reproduce the results in the article?

Trouble downloading the dataset (identity number?)

Hi, I tried downloading the dataset from the link provided in the readme file (cvmart.net), but after registering in the website, it still says that the dataset is only available after registering in the competition. To be able to register, a field that translates to "identity number" is required. Unfortunately I don't speak Chinese and the website is not available in any other language, and I'm not sure I know what kind of id is accepted.

some question about simple_test_sg_bboxes

In the simple_test_sg_bboxes function, setting the predicted dist entirely to one-hot is confusing. How do I get the predicted logits from the pre-trained model? And how can I pre-train the model?

How to record the best results?

The test set and validation set in this experiment are the same, so after running 12 epochs the checkpoint is saved as latest.pth. Is latest.pth the best? If not, can we record the best of the results validated after each epoch? If so, how can I save the best model instead of the last one?
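latest.pth is simply the last epoch, not necessarily the best one. If the standard mmcv EvalHook options apply in this codebase (an assumption; the exact metric key name may differ), the config can track and save the best checkpoint:

# Hedged config sketch for the evaluation section of a config file.
evaluation = dict(interval=1, metric='sgdet', save_best='sgdet_recall_R_20')
# mmcv's EvalHook then writes a best_*.pth alongside latest.pth.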

Some confusion on code.

In lines 323 and 329 of the psgtr_head.py file, it is confusing to use the 'view' function to change the shape from (100, 3, ...) to (3, 100, ...). Do you mean to convert the query dimension into the batch dimension? I don't think 'view' can achieve this goal. Why not use permute or transpose instead?
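For reference, view and permute are indeed not interchangeable: view only reinterprets the existing memory order, while permute actually swaps axes.

import torch

x = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
print(x.view(3, 2))                # reinterprets memory: [[0, 1], [2, 3], [4, 5]]
print(x.permute(1, 0))             # true transpose:      [[0, 3], [1, 4], [2, 5]]

Whether the code in question is a bug therefore depends on the intended memory layout of the (100, 3, ...) tensor: if the groups of 3 are contiguous per query, view regroups rather than transposes.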

Results are nearly zero when training the PSGTR

Results are always nearly zero with the ResNet-50 backbone. What is the problem behind this phenomenon? (screenshot omitted)

It seems that the model is properly/successfully loaded, and I set lr=0.0001 as in the log you provided. (screenshots omitted)

Loading pretrained model: The model and loaded state dict do not match exactly.

Hi I am testing the model with pretrained model provided by you and I have the following warning:
creating index...
index created!
load checkpoint from local path: work_dirs/checkpoints/detr_pan_r50.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: bbox_head.class_embed.weight, bbox_head.class_embed.bias, bbox_head.bbox_embed.layers.0.weight, bbox_head.bbox_embed.layers.0.bias, bbox_head.bbox_embed.layers.1.weight, bbox_head.bbox_embed.layers.1.bias, bbox_head.bbox_embed.layers.2.weight, bbox_head.bbox_embed.layers.2.bias, bbox_head.bbox_attention.q_linear.weight, bbox_head.bbox_attention.q_linear.bias, bbox_head.bbox_attention.k_linear.weight, bbox_head.bbox_attention.k_linear.bias, bbox_head.mask_head.lay1.weight, bbox_head.mask_head.lay1.bias, bbox_head.mask_head.gn1.weight, bbox_head.mask_head.gn1.bias, bbox_head.mask_head.lay2.weight, bbox_head.mask_head.lay2.bias, bbox_head.mask_head.gn2.weight, bbox_head.mask_head.gn2.bias, bbox_head.mask_head.lay3.weight, bbox_head.mask_head.lay3.bias, bbox_head.mask_head.gn3.weight, bbox_head.mask_head.gn3.bias, bbox_head.mask_head.lay4.weight, bbox_head.mask_head.lay4.bias, bbox_head.mask_head.gn4.weight, bbox_head.mask_head.gn4.bias, bbox_head.mask_head.lay5.weight, bbox_head.mask_head.lay5.bias, bbox_head.mask_head.gn5.weight, bbox_head.mask_head.gn5.bias, bbox_head.mask_head.out_lay.weight, bbox_head.mask_head.out_lay.bias, bbox_head.mask_head.adapter1.weight, bbox_head.mask_head.adapter1.bias, bbox_head.mask_head.adapter2.weight, bbox_head.mask_head.adapter2.bias, bbox_head.mask_head.adapter3.weight, bbox_head.mask_head.adapter3.bias

missing keys in source state_dict: bbox_head.obj_cls_embed.weight, bbox_head.obj_cls_embed.bias, bbox_head.obj_box_embed.layers.0.weight, bbox_head.obj_box_embed.layers.0.bias, bbox_head.obj_box_embed.layers.1.weight, bbox_head.obj_box_embed.layers.1.bias, bbox_head.obj_box_embed.layers.2.weight, bbox_head.obj_box_embed.layers.2.bias, bbox_head.sub_cls_embed.weight, bbox_head.sub_cls_embed.bias, bbox_head.sub_box_embed.layers.0.weight, bbox_head.sub_box_embed.layers.0.bias, bbox_head.sub_box_embed.layers.1.weight, bbox_head.sub_box_embed.layers.1.bias, bbox_head.sub_box_embed.layers.2.weight, bbox_head.sub_box_embed.layers.2.bias, bbox_head.rel_cls_embed.weight, bbox_head.rel_cls_embed.bias, bbox_head.sub_bbox_attention.q_linear.weight, bbox_head.sub_bbox_attention.q_linear.bias, bbox_head.sub_bbox_attention.k_linear.weight, bbox_head.sub_bbox_attention.k_linear.bias, bbox_head.obj_bbox_attention.q_linear.weight, bbox_head.obj_bbox_attention.q_linear.bias, bbox_head.obj_bbox_attention.k_linear.weight, bbox_head.obj_bbox_attention.k_linear.bias, bbox_head.sub_mask_head.lay1.weight, bbox_head.sub_mask_head.lay1.bias, bbox_head.sub_mask_head.gn1.weight, bbox_head.sub_mask_head.gn1.bias, bbox_head.sub_mask_head.lay2.weight, bbox_head.sub_mask_head.lay2.bias, bbox_head.sub_mask_head.gn2.weight, bbox_head.sub_mask_head.gn2.bias, bbox_head.sub_mask_head.lay3.weight, bbox_head.sub_mask_head.lay3.bias, bbox_head.sub_mask_head.gn3.weight, bbox_head.sub_mask_head.gn3.bias, bbox_head.sub_mask_head.lay4.weight, bbox_head.sub_mask_head.lay4.bias, bbox_head.sub_mask_head.gn4.weight, bbox_head.sub_mask_head.gn4.bias, bbox_head.sub_mask_head.lay5.weight, bbox_head.sub_mask_head.lay5.bias, bbox_head.sub_mask_head.gn5.weight, bbox_head.sub_mask_head.gn5.bias, bbox_head.sub_mask_head.out_lay.weight, bbox_head.sub_mask_head.out_lay.bias, bbox_head.sub_mask_head.adapter1.weight, bbox_head.sub_mask_head.adapter1.bias, bbox_head.sub_mask_head.adapter2.weight, bbox_head.sub_mask_head.adapter2.bias, bbox_head.sub_mask_head.adapter3.weight, bbox_head.sub_mask_head.adapter3.bias, bbox_head.obj_mask_head.lay1.weight, bbox_head.obj_mask_head.lay1.bias, bbox_head.obj_mask_head.gn1.weight, bbox_head.obj_mask_head.gn1.bias, bbox_head.obj_mask_head.lay2.weight, bbox_head.obj_mask_head.lay2.bias, bbox_head.obj_mask_head.gn2.weight, bbox_head.obj_mask_head.gn2.bias, bbox_head.obj_mask_head.lay3.weight, bbox_head.obj_mask_head.lay3.bias, bbox_head.obj_mask_head.gn3.weight, bbox_head.obj_mask_head.gn3.bias, bbox_head.obj_mask_head.lay4.weight, bbox_head.obj_mask_head.lay4.bias, bbox_head.obj_mask_head.gn4.weight, bbox_head.obj_mask_head.gn4.bias, bbox_head.obj_mask_head.lay5.weight, bbox_head.obj_mask_head.lay5.bias, bbox_head.obj_mask_head.gn5.weight, bbox_head.obj_mask_head.gn5.bias, bbox_head.obj_mask_head.out_lay.weight, bbox_head.obj_mask_head.out_lay.bias, bbox_head.obj_mask_head.adapter1.weight, bbox_head.obj_mask_head.adapter1.bias, bbox_head.obj_mask_head.adapter2.weight, bbox_head.obj_mask_head.adapter2.bias, bbox_head.obj_mask_head.adapter3.weight, bbox_head.obj_mask_head.adapter3.bias

Though the testing itself is smooth, the result shows:

SGG eval: mR @ 20: 0.0000; mR @ 50: 0.0000; mR @ 100: 0.0000; for mode=phrdet, type=Mean Recall.
SGG eval: mR @ 20: 0.0000; mR @ 50: 0.0000; mR @ 100: 0.0000; for mode=sgdet, type=NoGraphConstraint @ 56 Mean Recall.
SGG eval: mR @ 20: 0.0000; mR @ 50: 0.0000; mR @ 100: 0.0000; for mode=phrdet, type=NoGraphConstraint @ 56 Mean Recall.

{'sgdet_recall_R_20': nan, 'sgdet_recall_R_50': nan, 'sgdet_recall_R_100': nan, 'sgdet_mean_recall_mR_20': 0.0, 'sgdet_mean_recall_mR_50': 0.0, 'sgdet_mean_recall_mR_100': 0.0, ...}, followed by a per-predicate recall table in which every one of the 56 predicates (over, in front of, beside, on, in, attached to, ...) scores 0.0000. All results are zeroes.

During the testing, I notice that there are warnings in the log, not too sure if it's related to the zeroes issue:

/home/anaconda3/envs/detectron2/lib/python3.7/site-packages/mmdet/models/utils/positional_encoding.py:81: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = self.temperature**(2 * (dim_t // 2) / self.num_feats)
/home/OpenPSG/openpsg/models/relation_heads/psgtr_head.py:940: UserWarning: (same __floordiv__ deprecation warning)
  triplet_index = r_indexes // self.num_relations
/home/anaconda3/envs/detectron2/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3441: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/anaconda3/envs/detectron2/lib/python3.7/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)

When training the PSGTR with 3090, RuntimeError encountered!

When using cuda11.1, torch1.8.0, mmcv1.4.3, 3090

python -m pdb -c continue tools/train.py configs/psgtr/psgtr_r101_psg.py

wandb: 🚀 View run at https://wandb.ai/hszhoushen/psgtr/runs/3nmjnxmg
/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/mmdet/models/losses/cross_entropy_loss.py:239: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
  self.class_weight, device=cls_score.device)
Traceback (most recent call last):
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/pdb.py", line 1699, in main
    pdb._runscript(mainpyfile)
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/pdb.py", line 1568, in _runscript
    self.run(statement)
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/bdb.py", line 578, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/data/lgzhou/psgg/OpenPSG/tools/train.py", line 2, in <module>
    import argparse
  File "/data/lgzhou/psgg/OpenPSG/tools/train.py", line 223, in main
    meta=meta,
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/mmdet/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 51, in train
    self.call_hook('after_train_iter')
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/mmcv/runner/hooks/optimizer.py", line 56, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: CUDA error: an illegal memory access was encountered
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /home/lgzhou/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/autograd/__init__.py(147)backward()
-> allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
(Pdb) exit()
Post mortem debugger finished. The tools/train.py will be restarted
> /data/lgzhou/psgg/OpenPSG/tools/train.py(2)<module>()
-> import argparse
(Pdb) exit

a bug for train_psgformer?

File "/userhome/miniconda3/envs/psg/lib/python3.7/site-packages/mmdet/apis/train.py", line 158, in train_detector
cfg.device,
File "/userhome/miniconda3/envs/psg/lib/python3.7/site-packages/mmcv/utils/config.py", line 48, in getattr
raise ex
AttributeError: 'ConfigDict' object has no attribute 'device'

when i run the train_psgformer.sh of the one-stage psgformer model, i meet this problem.
how can i solve this problem?
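This usually points to an mmdet version mismatch rather than a bug in the script: newer mmdet releases (roughly 2.24+) read cfg.device inside train_detector, while this codebase pins mmdet==2.20.0 in Get Started. Downgrading to the pinned version, or adding device = 'cuda' to your config, are the commonly suggested workarounds (hedged advice; verify against your installed versions).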

About the spatial_head in openpsg/models/roi_extractors/visiual_spatial.py

I am very interested in your work, and I would really appreciate it if you could help me answer this question.
In line 516, roi_feats_result.append(head((roi_feats + rect_feats).view(roi_feats.size(0), -1))), I know that roi_feats represents the features of the union of subject and object, so what does rect_feats represent?
Given line 474, rect_input = torch.stack((head_rect, tail_rect), dim=1), and line 477, rect_feats = self.spatial_conv(rect_input): what features does rect_feats represent? (Does it represent the spatial features of subject and object? If so, what does this spatial feature mean?)

o_label = labels[o_idx] IndexError: list index out of range

Hi!
Thank you for your nice work. I tried to run your code; when I run test.py for validation, I get an error.

The command I run:
python ./tools/test.py configs/vctree/panoptic_fpn_r50_fpn_1x_sgdet_psg.py ./data/vctree_panoptic_fpn_r50_fpn_1x_sgdet_psg/vctree_panoptic_fpn_r50_fpn_1x_sgdet_psg/epoch_12.pth --eval sgdet --show-dir ./data/show_dir

The error:
  File "./tools/test.py", line 232, in main
    args.show_score_thr)
  File "/usr/local/lib/python3.7/dist-packages/mmdet/apis/test.py", line 57, in single_gpu_test
    score_thr=show_score_thr)
  File "/OpenPSG-main/openpsg/models/frameworks/sg_panoptic_fpn.py", line 969, in show_result
    o_label = labels[o_idx]
IndexError: list index out of range

Is there something wrong with my config file?

KeyError: 'relations'

Hello.
When I ran ./ce7454/main.py, I got an error message like this:

  File "~/OpenPSG/ce7454/dataset.py", line 81, in __getitem__
    soft_label[sample['relations']] = 1
KeyError: 'relations'

How should I modify the code?

No checkpoint of `GPSNet` Baseline

In Model Zoo section of README.md, it provides checkpoints of 5 baselines but none for GPSNet.

OpenPSG/README.md

Lines 223 to 230 in f8e33fb

## Model Zoo
Method | Backbone | #Epoch | R/mR@20 | R/mR@50 | R/mR@100 | ckpt
--- | --- | --- | --- | --- |--- |--- |
IMP | ResNet-50 | 12 | 16.5 / 6.52 | 18.2 / 7.05 | 18.6 / 7.23 | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EiTgJ9q2h3hDpyXSdu6BtlQBHAZNwNaYmcO7SElxhkIFXw?e=8fytHc) |
MOTIFS | ResNet-50 | 12 | 20.0 / 9.10 | 21.7 / 9.57 | 22.0 / 9.69 | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/Eh4hvXIspUFKpNa_75qwDoEBJTCIozTLzm49Ste6HaoPow?e=ZdAs6z) |
VCTree | ResNet-50 | 12 | 20.6 / 9.70 | 22.1 / 10.2 | 22.5 / 10.2 | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EhKfi9kqAd9CnSoHztQIChABeBjBD3hF7DflrNCjlHfh9A?e=lWa1bd) |
PSGTR | ResNet-50 | 60 | 28.4 / 16.6 | 34.4 / 20.8 | 36.3 / 22.1 | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/Eonc-KwOxg9EmdtGDX6ss-gB35QpKDnN_1KSWOj6U8sZwQ?e=zdqwqP) |
PSGFormer | ResNet-50 | 60 | 18.0 / 14.8 | 19.6 / 17.0 | 20.1 / 17.6 | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EnaJchJzJPtGrkl4k09evPIB5JUkkDZ2tSS9F-Hd-1KYzA?e=9QA8Nc) |

Not sure if the checkpoint of GPSNet was missed by accident or I didn't understand this table correctly.

Any responses would be appreciated. Thanks!

ValueError: need at least one array to concatenate

I got an error when training PSGTR: python tools/train.py configs/psgtr/psgtr_r50_psg.py --gpus 1

Traceback (most recent call last):
  File "tools/train.py", line 225, in <module>
    main()
  File "tools/train.py", line 220, in main
    meta=meta,
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/mmdet/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 827, in __init__
    self._reset(loader, first_iter=True)
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 857, in _reset
    self._try_put_index()
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1091, in _try_put_index
    index = self._next_index()
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 427, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 227, in __iter__
    for idx in self.sampler:
  File "/home/llipa/anaconda3/envs/openpsg/lib/python3.7/site-packages/mmdet/datasets/samplers/group_sampler.py", line 36, in __iter__
    indices = np.concatenate(indices)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate

It seems that the dataset config file goes wrong: open-mmlab/mmdetection#3628

Validation:AttributeError: 'NoneType' object has no attribute 'shape'

Hello, first of all, I am very interested in your work. I tried to run your code; after training for one epoch and saving the model, validation fails with an error:

2022-08-01 23:10:31,978 - mmdet - INFO - Epoch [1][5650/5713] lr: 3.000e-02, eta: 17:35:21, time: 0.996, data_time: 0.162, memory: 3983, loss_object: 0.9221, acc_object: 71.2541, loss_relation: 0.1331, acc_relation: 96.9015, loss: 1.0552, grad_norm: 2.0675
2022-08-01 23:11:22,457 - mmdet - INFO - Epoch [1][5700/5713] lr: 3.000e-02, eta: 17:34:32, time: 1.010, data_time: 0.169, memory: 3983, loss_object: 0.9318, acc_object: 70.6544, loss_relation: 0.1167, acc_relation: 97.2933, loss: 1.0485, grad_norm: 2.0553
2022-08-01 23:11:35,678 - mmdet - INFO - Saving checkpoint at 1 epochs
[                                                  ] 0/2177, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/train.py", line 231, in <module>
    main()
  File "tools/train.py", line 226, in main
    meta=meta,
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/core/evaluation/eval_hooks.py", line 56, in _do_evaluate
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/apis/test.py", line 26, in single_gpu_test
    for i, data in enumerate(data_loader):
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/anaconda3/envs/openpsg/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/datasets/custom.py", line 214, in __getitem__
    return self.prepare_test_img(idx)
  File "/home/wangyize/OpenPSG/openpsg/datasets/psg.py", line 289, in prepare_test_img
    return self.prepare_train_img(idx)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/datasets/custom.py", line 239, in prepare_train_img
    return self.pipeline(results)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/datasets/pipelines/compose.py", line 41, in __call__
    data = t(data)
  File "/home/wangyize/.local/lib/python3.7/site-packages/mmdet/datasets/pipelines/loading.py", line 73, in __call__
    results['img_shape'] = img.shape
AttributeError: 'NoneType' object has no attribute 'shape'

The path of the training dataset in the configuration file is the same as the path of the validation dataset. There is no problem during training; the error reported during validation seems to mean that no pictures were found. Did you encounter this problem? Looking forward to your reply!

when run test.py with arg --submit: AttributeError: 'NoneType' object has no attribute 'shape'

When I run the model for submission, the code errors out:

Traceback (most recent call last):
  File "tools/test.py", line 246, in <module>
    main()
  File "tools/test.py", line 219, in main
    save_results(outputs)
  File "/opt/data/private/Code/OpenPSG/tools/grade.py", line 35, in save_results
    img = np.full(masks.shape[1:3], 0)
AttributeError: 'NoneType' object has no attribute 'shape'

My script: python3 tools/test.py configs/motifs/panoptic_fpn_r50_fpn_1x_predcls_psg.py /opt/data/private/Pretrain/motifis_epoch_12.pth --submit

When I run the one-stage model, the code runs normally.

Getting invalid load key error

load checkpoint from local path: epoch_60.pth
---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
/tmp/ipykernel_9716/343802214.py in <module>
    442 model_ckt = "epoch_60.pth"
    443 cfg = Config.fromfile("configs/psgtr/psgtr_r50_psg_inference.py")
--> 444 model = init_detector(cfg, model_ckt, device="gpu")
    445 
    446 image = "messi.jpg"

~/openpsg_env/lib/python3.7/site-packages/mmdet/apis/inference.py in init_detector(config, checkpoint, device, cfg_options)
     40     model = build_detector(config.model, test_cfg=config.get('test_cfg'))
     41     if checkpoint is not None:
---> 42         checkpoint = load_checkpoint(model, checkpoint, map_location='cpu')
     43         if 'CLASSES' in checkpoint.get('meta', {}):
     44             model.CLASSES = checkpoint['meta']['CLASSES']

~/openpsg_env/lib/python3.7/site-packages/mmcv/runner/checkpoint.py in load_checkpoint(model, filename, map_location, strict, logger, revise_keys)
    540         dict or OrderedDict: The loaded checkpoint.
    541     """
--> 542     checkpoint = _load_checkpoint(filename, map_location, logger)
    543     # OrderedDict is a subclass of dict
    544     if not isinstance(checkpoint, dict):

~/openpsg_env/lib/python3.7/site-packages/mmcv/runner/checkpoint.py in _load_checkpoint(filename, map_location, logger)
    479            information, which depends on the checkpoint.
    480     """
--> 481     return CheckpointLoader.load_checkpoint(filename, map_location, logger)
    482 
    483 

~/openpsg_env/lib/python3.7/site-packages/mmcv/runner/checkpoint.py in load_checkpoint(cls, filename, map_location, logger)
    249         mmcv.print_log(
    250             f'load checkpoint from {class_name[10:]} path: {filename}', logger)
--> 251         return checkpoint_loader(filename, map_location)
    252 
    253 

~/openpsg_env/lib/python3.7/site-packages/mmcv/runner/checkpoint.py in load_from_local(filename, map_location)
    266     if not osp.isfile(filename):
    267         raise FileNotFoundError(f'{filename} can not be found.')
--> 268     checkpoint = torch.load(filename, map_location=map_location)
    269     return checkpoint
    270 

~/openpsg_env/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    593                     return torch.jit.load(opened_file)
    594                 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
--> 595         return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
    596 
    597 

~/openpsg_env/lib/python3.7/site-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    762             "functionality.")
    763 
--> 764     magic_number = pickle_module.load(f, **pickle_load_args)
    765     if magic_number != MAGIC_NUMBER:
    766         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, 'v'.
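An UnpicklingError with load key 'v' typically means the .pth file is not a real checkpoint but a Git LFS pointer file, whose content begins with the text "version https://git-lfs..." (hence the stray 'v'). A quick sanity check:

# Sketch: inspect the first bytes. A real torch checkpoint starts with b'PK'
# (zip archive) or a pickle magic byte; b'version ' betrays an LFS pointer.
with open('epoch_60.pth', 'rb') as f:
    print(f.read(16))

If you see b'version https://', re-download the checkpoint from the Model Zoo link (or fetch the real file with git lfs pull where applicable).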

about the pre-trained model of two-stage?

Hello. Is the pretrained model of the two-stage approach trained by you on the COCO dataset, or is it from an open-source release? If the latter, can you provide a link to the original pretrained model?

submit function error:AttributeError: 'NoneType' object has no attribute 'astype'

When I test the model for submission, the code errors out:

relations=rels.astype(np.int32).tolist(),
AttributeError: 'NoneType' object has no attribute 'astype'

My script:

PYTHONPATH='.':$PYTHONPATH \
python tools/test.py \
  configs/psgformer/psgformer_r50_psg.py \
  /userhome/data/psg_weights/psgformer_r50/epoch_60.pth \
  --submit

How can I solve this problem? Thank you very much for your help.
