vdigpku / cbnet_pytorch

CBNet implementation based on mmdetection (AAAI 2020)

License: Apache License 2.0

Python 86.70% Dockerfile 0.03% C++ 5.38% Cuda 7.82% Shell 0.07%

cbnet_pytorch's Issues

Trained models

Will you share the models trained on COCO?

Why are the instance segmentation config files not available?

I am trying to use the old repo implementation, but I am having compatibility issues. This new repo could be an alternative. However, it seems that you don't provide instance segmentation config files. Why? Could you share them?

Thanks in advance

Will COCO-pretrained weights be released?

The freeze stages may have problems?

In mmdet/models/backbones/, I find that you did not rewrite the freeze function.

How to train and test the code?

This is my first contact with mmdetection. Could you explain how to train the network?

Need coco-pretrained weights

Would you mind sharing your pretrained model weights with us?

CBNetv2

What is the difference between CBNetv2 and CBNet? Is it described in the paper?

pytorch2onnx

Hi, have you tried to convert pth models to onnx models? When running pytorch2onnx.py, 'RuntimeError: ONNX export failed: Couldn't export Python operator _RoIAlignFunction ' happens.

How to use the CBNET code in mmdetection2.11?

I find a lot of issues in mm2...

OSError: db-x101-32-4d.pth is not a checkpoint file

Thanks for your error report and we appreciate it a lot.

Checklist

I have searched related issues but cannot get the expected help.
The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

What command or script did you run?

CUDA_VISIBLE_DEVICES=0,2 PORT=29500 ./tools/dist_train.sh configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py 2

Did you make any modifications on the code or config? Did you understand what you have modified?
I just changed the category
What dataset did you use?
coco dataset
Environment
Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
sys.platform: linux
Python: 3.7.7 (default, May 7 2020, 21:25:33) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0,1,2,3: Tesla V100-DGXS-32GB
GCC: gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
PyTorch: 1.3.1
PyTorch compiling details: PyTorch built with:

GCC 7.3
Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CUDA Runtime 10.0
NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
CuDNN 7.6.5
- Built with CuDNN 7.6.4
Magma 2.5.0
Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS=-fvisibility-inlines-hidden -std=c++11 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -I/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/include -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/work=/usr/local/src/conda/pytorch-1.3.1 -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place=/usr/local/src/conda-prefix -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.4.2
OpenCV: 4.3.0
MMCV: 1.0.4
MMDetection: 1.1.0+unknown
MMDetection Compiler: GCC 7.4
MMDetection CUDA Compiler: 10.1

You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
  conda install pytorch cudatoolkit=10.0 torchvision -c pytorch
- Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

Traceback (most recent call last):
  File "./tools/train.py", line 142, in <module>
    main()
  File "./tools/train.py", line 115, in main
    cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
  File "/raid/xuwt/ting/CBNet/mmdet/models/builder.py", line 43, in build_detector
    return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/raid/xuwt/ting/CBNet/mmdet/models/builder.py", line 15, in build
    return build_from_cfg(cfg, registry, default_args)
  File "/raid/xuwt/ting/CBNet/mmdet/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/faster_rcnn.py", line 27, in __init__
    pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/two_stage.py", line 62, in __init__
    self.init_weights(pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/two_stage.py", line 70, in init_weights
    self.backbone.init_weights(pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/backbones/db_resnet.py", line 546, in init_weights
    load_checkpoint(self, pretrained, strict=False, logger=logger)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 224, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 199, in _load_checkpoint
    raise IOError(f'{filename} is not a checkpoint file')
OSError: db-x101-32-4d.pth is not a checkpoint file
2020-08-01 16:42:00,417 - mmdet - INFO - load model from: db-x101-32-4d.pth
Traceback (most recent call last):
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/torch/distributed/launch.py", line 253, in <module>
    main()
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/torch/distributed/launch.py", line 249, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/raid/xuwt/anaconda3/envs/detection_cb/bin/python', '-u', './tools/train.py', '--local_rank=1', 'configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
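
The traceback above ends in mmcv's `_load_checkpoint`, which appears to raise this error when the given local path is not an existing file. One plausible cause, then, is that the relative path `db-x101-32-4d.pth` does not resolve from the directory in which training was launched. Below is a minimal sketch of a check that could be run before launching; `resolve_checkpoint` and its `search_dirs` parameter are hypothetical names, not part of mmdetection or mmcv.

```python
import os.path as osp


def resolve_checkpoint(path, search_dirs=(".",)):
    """Return the absolute path of the first existing file named `path`.

    Hypothetical helper: looks for `path` relative to each directory in
    `search_dirs` and raises FileNotFoundError listing every place tried.
    """
    tried = []
    for base in search_dirs:
        # osp.join keeps `path` unchanged when it is already absolute.
        candidate = osp.abspath(osp.join(base, path))
        if osp.isfile(candidate):
            return candidate
        tried.append(candidate)
    raise FileNotFoundError(
        "checkpoint not found, tried: " + ", ".join(tried))
```

Using the returned absolute path as the `pretrained` value in the config would remove the dependence on the current working directory.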

RuntimeError Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'

RuntimeError
RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 0 does not equal 1 (while checking arguments for cudnn_convolution)

environment
One computer with 3 GPUs (only 2 are used)
GPU: GeForce RTX 2080 Ti
CPU: i7, 64 GB
Docker

modified

Added os.environ['CUDA_VISIBLE_DEVICES'] = '2,1' at the beginning of the script file mmdet/tools/train.py:

from __future__ import division
import argparse
import copy
import os
import os.path as osp
import time

os.environ['CUDA_VISIBLE_DEVICES']= '2,1'

import mmcv
import torch
from mmcv import Config
from mmcv.runner import init_dist

from mmdet import __version__
from mmdet.apis import set_random_seed, train_detector
from mmdet.datasets import build_dataset
from mmdet.models import build_detector
from mmdet.utils import collect_env, get_root_logger

dataset
coco2017

bash shell

#!/usr/bin/env bash

PYTHON=${PYTHON:-"python"}
CONFIG_FILE=configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py
GPUS=2
PORT=${PORT:-29500}
CUDA_VISIBLE_DEVICES=2,1 $PYTHON -m torch.distributed.launch \
    --nnodes=1 \
    --nproc_per_node=${GPUS} \
    --master_port=$PORT \
    tools/train.py ${CONFIG_FILE} --local_rank=1 --launcher pytorch --validate
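
One possible cause of the device mismatch (an assumption, not confirmed in this issue) is the hard-coded `--local_rank=1` in the script above. `torch.distributed.launch` already passes its own `--local_rank=<rank>` to each process, and when argparse sees the same option twice it keeps the last value, so every process may end up with rank 1. The sketch below reproduces only the argument parsing; `parse_train_args` is a hypothetical stand-in that mirrors a few options of tools/train.py, not the real parser.

```python
import argparse


def parse_train_args(argv):
    # Mirrors only the arguments relevant here; the real tools/train.py
    # defines more options.
    parser = argparse.ArgumentParser()
    parser.add_argument("config")
    parser.add_argument("--launcher", default="none")
    parser.add_argument("--validate", action="store_true")
    parser.add_argument("--local_rank", type=int, default=0)
    return parser.parse_args(argv)


# The launcher places --local_rank=<rank> right after the script name;
# the extra --local_rank=1 from the shell script comes later and wins.
for launcher_rank in (0, 1):
    args = parse_train_args([
        f"--local_rank={launcher_rank}",
        "configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py",
        "--local_rank=1", "--launcher", "pytorch", "--validate",
    ])
    print(launcher_rank, "->", args.local_rank)
```

If this is indeed the cause, dropping `--local_rank=1` from the shell script would let each process keep the rank the launcher assigns.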

composite connections

Using convert_db.py to convert model_zoo models to cbnet version cannot get weights of composite connections, right? If I want to test cbnet version models on coco, after python convert_db.py, I have to train a model by tools/train.py . Is there any error in my understanding? Or could you tell me where to find the cbnet model which can be tested on coco directly?
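
Whether a converted checkpoint covers the composite connections can be checked by comparing parameter names. The sketch below uses plain Python collections and invented key names purely for illustration; it does not reproduce what convert_db.py does, and `missing_keys` is a hypothetical helper.

```python
def missing_keys(model_keys, checkpoint_keys):
    """Return parameter names the model expects but the checkpoint lacks."""
    have = set(checkpoint_keys)
    return sorted(k for k in model_keys if k not in have)


# Hypothetical parameter names, for illustration only.
converted_checkpoint = {
    "backbone.cb1.conv1.weight",
    "backbone.cb2.conv1.weight",
}
model_parameters = [
    "backbone.cb1.conv1.weight",
    "backbone.cb2.conv1.weight",
    "backbone.composite.0.weight",
]

print(missing_keys(model_parameters, converted_checkpoint))
```

With a real model, the two inputs would come from `model.state_dict().keys()` and the keys of the loaded checkpoint's state dict. Any names reported as missing would keep their initial values until the model is trained.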
