cbnet_pytorch's People

Contributors

borda, donnyyou, erotemic, eugenelawrence, fanqie03, gfjiangly, hellock, innerlee, korabelnikov, lindahua, liushuchun, melikovk, michaelisc, mxbonn, myownskyw7, oceanpang, rydenisbak, sovrasov, stevehjc, ternaus, thangvubk, tyomj, vdigpku, whikwon, wswday, xvjiarui, yhcao6, youkaichao, zhihuagao, zwwwayne

cbnet_pytorch's Issues

Trained models

Will you share the models trained on COCO?

CBNetv2

What is the difference between CBNetv2 and CBNet? Is it described in the paper?

pytorch2onnx

Hi, have you tried converting the .pth models to ONNX models? Running pytorch2onnx.py fails with 'RuntimeError: ONNX export failed: Couldn't export Python operator _RoIAlignFunction'.

OSError: db-x101-32-4d.pth is not a checkpoint file

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
    CUDA_VISIBLE_DEVICES=0,2 PORT=29500 ./tools/dist_train.sh configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py 2

  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    I just changed the category

  3. What dataset did you use?
    coco dataset

Environment

  1. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
    sys.platform: linux
    Python: 3.7.7 (default, May 7 2020, 21:25:33) [GCC 7.3.0]
    CUDA available: True
    CUDA_HOME: /usr/local/cuda-10.1
    NVCC: Cuda compilation tools, release 10.1, V10.1.243
    GPU 0,1,2,3: Tesla V100-DGXS-32GB
    GCC: gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
    PyTorch: 1.3.1
    PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  • CuDNN 7.6.5
    • Built with CuDNN 7.6.4
  • Magma 2.5.0
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS=-fvisibility-inlines-hidden -std=c++11 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -I/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/include -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/work=/usr/local/src/conda/pytorch-1.3.1 -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place=/usr/local/src/conda-prefix -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.4.2
OpenCV: 4.3.0
MMCV: 1.0.4
MMDetection: 1.1.0+unknown
MMDetection Compiler: GCC 7.4
MMDetection CUDA Compiler: 10.1

  1. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
      conda install pytorch cudatoolkit=10.0 torchvision -c pytorch
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

Traceback (most recent call last):
  File "./tools/train.py", line 142, in <module>
    main()
  File "./tools/train.py", line 115, in main
    cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
  File "/raid/xuwt/ting/CBNet/mmdet/models/builder.py", line 43, in build_detector
    return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/raid/xuwt/ting/CBNet/mmdet/models/builder.py", line 15, in build
    return build_from_cfg(cfg, registry, default_args)
  File "/raid/xuwt/ting/CBNet/mmdet/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/faster_rcnn.py", line 27, in __init__
    pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/two_stage.py", line 62, in __init__
    self.init_weights(pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/two_stage.py", line 70, in init_weights
    self.backbone.init_weights(pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/backbones/db_resnet.py", line 546, in init_weights
    load_checkpoint(self, pretrained, strict=False, logger=logger)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 224, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 199, in _load_checkpoint
    raise IOError(f'{filename} is not a checkpoint file')
OSError: db-x101-32-4d.pth is not a checkpoint file
2020-08-01 16:42:00,417 - mmdet - INFO - load model from: db-x101-32-4d.pth
Traceback (most recent call last):
  File "./tools/train.py", line 142, in <module>
    main()
  File "./tools/train.py", line 115, in main
    cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
  File "/raid/xuwt/ting/CBNet/mmdet/models/builder.py", line 43, in build_detector
    return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/raid/xuwt/ting/CBNet/mmdet/models/builder.py", line 15, in build
    return build_from_cfg(cfg, registry, default_args)
  File "/raid/xuwt/ting/CBNet/mmdet/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/faster_rcnn.py", line 27, in __init__
    pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/two_stage.py", line 62, in __init__
    self.init_weights(pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/detectors/two_stage.py", line 70, in init_weights
    self.backbone.init_weights(pretrained=pretrained)
  File "/raid/xuwt/ting/CBNet/mmdet/models/backbones/db_resnet.py", line 546, in init_weights
    load_checkpoint(self, pretrained, strict=False, logger=logger)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 224, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 199, in _load_checkpoint
    raise IOError(f'{filename} is not a checkpoint file')
OSError: db-x101-32-4d.pth is not a checkpoint file
Traceback (most recent call last):
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/torch/distributed/launch.py", line 253, in <module>
    main()
  File "/raid/xuwt/anaconda3/envs/detection_cb/lib/python3.7/site-packages/torch/distributed/launch.py", line 249, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/raid/xuwt/anaconda3/envs/detection_cb/bin/python', '-u', './tools/train.py', '--local_rank=1', 'configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
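
The traceback ends in mmcv's `_load_checkpoint`. In mmcv releases around 1.0 this error appears to be raised when the given path is neither a recognised URL scheme nor an existing local file, so it may simply mean the file was not found. A relative path such as `db-x101-32-4d.pth` is resolved against the current working directory, so a quick check is whether the file is really there. The helper below is a minimal sketch using only the standard library; `check_pretrained_path` is a hypothetical name and is not part of mmcv or mmdet.

```python
import os
import os.path as osp


def check_pretrained_path(filename, cwd=None):
    """Return (absolute_path, exists) for a local checkpoint path.

    Hypothetical helper for illustration; it is not part of mmcv or mmdet.
    """
    base = cwd if cwd is not None else os.getcwd()
    # A relative path is resolved against the working directory,
    # not against the location of the config file.
    abs_path = filename if osp.isabs(filename) else osp.join(base, filename)
    return osp.normpath(abs_path), osp.isfile(abs_path)


if __name__ == "__main__":
    path, exists = check_pretrained_path("db-x101-32-4d.pth")
    print(f"{path}: {'found' if exists else 'missing'}")
```

If the file turns out to be missing, pointing `pretrained` in the config at the absolute path of the converted checkpoint would be one thing to try.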

RuntimeError Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'

RuntimeError
RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 0 does not equal 1 (while checking arguments for cudnn_convolution)

Environment
One machine with 3 GPUs (only 2 are used)
GPU: GeForce RTX 2080 Ti
CPU: i7, 64 GB
Docker

Modifications

Added os.environ['CUDA_VISIBLE_DEVICES'] = '2,1' at the beginning of the script file mmdet/tools/train.py:

from __future__ import division
import argparse
import copy
import os
import os.path as osp
import time

# Set before torch is imported so that it takes effect when CUDA is initialised.
os.environ['CUDA_VISIBLE_DEVICES'] = '2,1'

import mmcv
import torch
from mmcv import Config
from mmcv.runner import init_dist

from mmdet import __version__
from mmdet.apis import set_random_seed, train_detector
from mmdet.datasets import build_dataset
from mmdet.models import build_detector
from mmdet.utils import collect_env, get_root_logger
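
For context on the setting above: CUDA renumbers the devices listed in `CUDA_VISIBLE_DEVICES`, so with `'2,1'` the process sees two devices, where logical device 0 is physical GPU 2 and logical device 1 is physical GPU 1. The "device 0 does not equal 1" in the error message therefore refers to these logical indices. A small standard-library sketch of the mapping follows; `logical_to_physical` is a hypothetical helper, and it assumes the variable holds plain integer indices rather than GPU UUIDs.

```python
def logical_to_physical(cuda_visible_devices):
    """Map logical CUDA device indices to the physical indices in the variable.

    Hypothetical helper; assumes comma-separated integer indices (no UUIDs).
    """
    entries = [e.strip() for e in cuda_visible_devices.split(",") if e.strip()]
    # The position in the list becomes the logical index seen by the process.
    return {logical: int(physical) for logical, physical in enumerate(entries)}


if __name__ == "__main__":
    print(logical_to_physical("2,1"))
```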

dataset
coco2017

bash shell

#!/usr/bin/env bash

PYTHON=${PYTHON:-"python"}
CONFIG_FILE=configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py
GPUS=2
PORT=${PORT:-29500}
CUDA_VISIBLE_DEVICES=2,1 $PYTHON -m torch.distributed.launch \
    --nnodes=1 \
    --nproc_per_node=${GPUS} \
    --master_port=$PORT \
    tools/train.py ${CONFIG_FILE} --local_rank=1 --launcher pytorch --validate
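
One thing that may be worth checking in the script above: `torch.distributed.launch` already passes its own `--local_rank=<n>` to each worker, and the script then appends `--local_rank=1` again. With Python's `argparse`, when an option is repeated the last occurrence wins, so every worker could end up with local rank 1. The sketch below reproduces only that parsing behaviour with the standard library; the parser is a simplified, hypothetical stand-in for the one in `tools/train.py`, not a copy of it.

```python
import argparse


def parse_local_rank(argv):
    """Parse a worker command line with a simplified, hypothetical parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument("config")
    parser.add_argument("--local_rank", type=int, default=0)
    parser.add_argument("--launcher", default="none")
    parser.add_argument("--validate", action="store_true")
    return parser.parse_args(argv).local_rank


if __name__ == "__main__":
    config = "configs/cbnet/faster_rcnn_db_x101_32x4d_fpn_1x.py"
    for launcher_rank in (0, 1):
        # The launcher's flag comes first; the hard-coded one from the script follows.
        argv = [f"--local_rank={launcher_rank}", config,
                "--local_rank=1", "--launcher", "pytorch", "--validate"]
        print(launcher_rank, "->", parse_local_rank(argv))
```

If that is what happens here, removing the hard-coded `--local_rank=1` so that each worker keeps the value supplied by the launcher would be the first thing to try.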

composite connections

Using convert_db.py to convert model_zoo models to the CBNet version does not produce weights for the composite connections, right? So if I want to test CBNet models on COCO, after running python convert_db.py I still have to train a model with tools/train.py. Is my understanding correct? If not, could you tell me where to find a CBNet model that can be tested on COCO directly?
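
One way to check this without training is to compare the parameter names the model expects with the names stored in the converted checkpoint; names present only on the model side would be left at their initial values. The sketch below uses plain sets and made-up key names (everything in `EXAMPLE_MODEL_KEYS` and `EXAMPLE_CKPT_KEYS` is hypothetical and does not reflect the real layer names in this repository). With PyTorch, the two sets would typically come from `model.state_dict().keys()` and the keys of the loaded checkpoint's state dict.

```python
def split_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names as sorted lists.

    missing: expected by the model but absent from the checkpoint.
    unexpected: stored in the checkpoint but not used by the model.
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
EXAMPLE_MODEL_KEYS = ["backbone_a.conv1.weight", "backbone_b.conv1.weight",
                      "composite.conv.weight"]
EXAMPLE_CKPT_KEYS = ["backbone_a.conv1.weight", "backbone_b.conv1.weight"]

if __name__ == "__main__":
    missing, unexpected = split_keys(EXAMPLE_MODEL_KEYS, EXAMPLE_CKPT_KEYS)
    print("missing from checkpoint:", missing)
    print("unexpected in checkpoint:", unexpected)
```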
