mask-rcnn-pytorch's Introduction

Pytorch--mask-rcnn

We modify the original Mask/Faster R-CNN implementation in torchvision in four aspects: the backbone, the region proposal network (RPN), the RoI head, and an inverted attention (IA) module.
The changes are either modifications or re-implementations of the papers listed below.

Backbone

CSPNet: A New Backbone that can Enhance Learning Capability of CNN
CBAM: Convolutional Block Attention Module
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

RPN

Cascade RPN: Delving into High-Quality Region Proposal Network with Adaptive Convolution

RoI Head

Rethinking Classification and Localization for Object Detection

Inverted Attention

Improving Object Detection with Inverted Attention

Result

Installation

You can install the requirements with Anaconda or pip:

  • python=3.7.7
  • pytorch=1.6.0=py3.7_cuda10.1.243_cudnn7.6.3_0
  • torchvision=0.7.0=py37_cu101
  • numpy=1.19.1=py37hbc911f0_0
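
After creating the environment, a quick sanity check can confirm that the installed versions match the pins above. The following is a minimal sketch and not part of the repository: `REQUIRED`, `release`, and `check_versions` are hypothetical names, the conda package `pytorch` is listed under its Python module name `torch`, and only the numeric release is compared because build strings and local suffixes such as `+cu101` vary between installs.

```python
import re

# Version pins copied from the list above (conda build strings dropped).
REQUIRED = {"torch": "1.6.0", "torchvision": "0.7.0", "numpy": "1.19.1"}


def release(version):
    """Return the leading numeric release, e.g. '1.6.0+cu101' -> (1, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def check_versions(installed, required=REQUIRED):
    """Return {package: (installed, required)} for each missing or mismatched pin."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None or release(found) != release(wanted):
            mismatches[name] = (found, wanted)
    return mismatches
```

Inside the activated environment, `installed` could be built from `torch.__version__`, `torchvision.__version__`, and `numpy.__version__`; an empty result means every pin matches.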

Create the conda environment

conda env create -f environment.yml -n rcnn

Activate the conda environment

conda activate rcnn

Data Preparation

You can use the provided .sh scripts to download the datasets you need:

./download_coco2017.sh
./download_PASCAL.sh
./download_pedestrian.sh

Run the pedestrian test to make sure your model works:

./download_pedestrian.sh
python test_pedestrian.py

Training on PASCAL

To train on PASCAL VOC, you don't need to run the .sh file first, because downloading the dataset is built into voc_utils.

python train_voc.py

Training on COCO in a distributed setup

Run distributed training:

python -m torch.distributed.launch --nproc_per_node=4 --use_env train_coco.py

python -m torch.distributed.launch --nproc_per_node= --use_env train_coco.py \
    --dataset coco --model maskrcnn_resnet50_fpn --epochs 26 \
    --lr-steps 16 22 --aspect-ratio-group-factor 3

python -m torch.distributed.launch --nproc_per_node=4 --use_env train_coco.py \
    --dataset coco --model maskrcnn_resnet50_fpn --epochs 26 \
    --lr-steps 16 22 --aspect-ratio-group-factor 3
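
With `--epochs 26 --lr-steps 16 22`, the learning rate is held constant and then reduced at epochs 16 and 22, assuming the script follows torchvision's reference detection script, where `--lr-steps` are the milestones of a multi-step schedule. The sketch below reproduces that schedule in plain Python. The base rate `0.02` and decay factor `0.1` are assumptions for illustration (check the defaults in `train_coco.py`), and `lr_at_epoch` is a hypothetical helper.

```python
def lr_at_epoch(epoch, base_lr=0.02, milestones=(16, 22), gamma=0.1):
    """Learning rate used during `epoch` (0-based) under a multi-step schedule.

    base_lr and gamma are assumed values, not taken from the repository.
    """
    # Each milestone that has been reached multiplies the rate by gamma once.
    drops = sum(1 for milestone in milestones if epoch >= milestone)
    return base_lr * gamma ** drops
```

Under these assumptions the rate stays at the base value for epochs 0 to 15, is ten times smaller for epochs 16 to 21, and a hundred times smaller for epochs 22 to 25.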

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --use_env train_voc.py
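
The `--use_env` flag tells `torch.distributed.launch` to pass each worker's local rank through the `LOCAL_RANK` environment variable instead of a `--local_rank` argument; the launcher also sets `RANK` and `WORLD_SIZE`. A minimal sketch of how a training script might read these values follows. `read_dist_env` is a hypothetical helper and not necessarily what `train_coco.py` or `train_voc.py` does.

```python
import os


def read_dist_env(env=None):
    """Read the process layout set by the launcher; fall back to single-process."""
    env = os.environ if env is None else env
    if "RANK" not in env or "WORLD_SIZE" not in env:
        # Started with plain `python train_*.py`: no distributed setup.
        return {"distributed": False, "rank": 0, "world_size": 1, "local_rank": 0}
    return {
        "distributed": True,
        "rank": int(env["RANK"]),                     # global index of this process
        "world_size": int(env["WORLD_SIZE"]),         # total number of processes
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # process index on this node
    }
```

The local rank is typically used to select the GPU for the process. With `CUDA_VISIBLE_DEVICES=0,1` and `--nproc_per_node=2`, local ranks 0 and 1 map onto the two visible devices.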

Kill zombie training processes that still hold the NVIDIA GPUs

kill $(ps aux | grep train_coco.py | grep -v grep | awk '{print $2}') 
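
The pipeline above lists all processes, keeps the lines that mention `train_coco.py`, drops the `grep` process itself, and passes the second column (the PID) to `kill`. The same filtering can be sketched in Python to inspect the matches before killing anything. `matching_pids` is a hypothetical helper, and the sample `ps` output is made up for illustration.

```python
def matching_pids(ps_output, pattern="train_coco.py"):
    """Return PIDs from `ps aux` text whose line contains `pattern`."""
    pids = []
    for line in ps_output.splitlines():
        # Mirror `grep pattern | grep -v grep`.
        if pattern not in line or "grep" in line:
            continue
        fields = line.split()
        # Column 2 of `ps aux` is the PID (awk's $2).
        if len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


# Made-up sample output, for illustration only.
SAMPLE = """\
USER       PID %CPU %MEM COMMAND
alice     4101 99.0  3.1 python train_coco.py --dataset coco
alice     4102 98.7  3.0 python train_coco.py --dataset coco
alice     4200  0.0  0.0 grep train_coco.py
alice     4300  1.2  0.4 python train_voc.py
"""

print(matching_pids(SAMPLE))  # [4101, 4102]
```

On a live system the text could come from `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`, and each PID could then be passed to `os.kill(pid, signal.SIGTERM)`.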

Troubleshooting

  1. Cannot run a .sh file: make it executable first.
chmod 777 YOUR_SH_FILE_NAME.sh

Citation

The backbone folder timm and the pretrained models are from the awesome GitHub repo rwightman/pytorch-image-models.

mask-rcnn-pytorch's People

Contributors

derbychen, pkchen1129, robintzeng, yuqing-zhou, yushuz

mask-rcnn-pytorch's Issues

Add multiple class

I have been training the model on the pedestrian data provided in the repository; training succeeded and gave great results.

Now I am trying to train the model on my custom dataset, which has four classes including the background class. Unfortunately, I am unable to do so, because the current code only supports two classes (background + 1 class).

Can anyone help me out with this?
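
One point worth checking for a question like this: in torchvision's detection models, `num_classes` counts the background, so a dataset with three object classes plus background needs `num_classes=4`, and each instance needs an integer label from 1 to 3 (label 0 is reserved for background). The sketch below shows only that bookkeeping. The class names are invented, `build_label_map` is a hypothetical helper, and how this repository's dataset and model code must be adapted is not covered here.

```python
def build_label_map(foreground_names):
    """Map class names to integer labels, reserving 0 for background."""
    if "background" in foreground_names:
        raise ValueError("'background' is added automatically as label 0")
    if len(set(foreground_names)) != len(foreground_names):
        raise ValueError("class names must be unique")
    label_map = {"background": 0}
    for index, name in enumerate(foreground_names, start=1):
        label_map[name] = index
    return label_map


# Invented class names, for illustration only.
label_map = build_label_map(["car", "person", "bicycle"])
num_classes = len(label_map)  # three object classes + background

# Per-instance labels for one image containing two cars and one person.
labels = [label_map[name] for name in ["car", "car", "person"]]
print(num_classes, labels)  # 4 [1, 1, 2]
```

The resulting `num_classes` is the value a constructor such as `torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=...)` expects, and the dataset would return `labels` per image instead of a constant label for every instance.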

conv2d(): argument 'input' (position 1) must be Tensor, not tuple

Hello. I'm training on the COCO dataset with the Mask R-CNN model you provide. However, when a batch is loaded, I noticed that the code sends the model a list of images (not a Tensor), presumably because the loaded images have different resolutions. This does not run and returns the error conv2d(): argument 'input' (position 1) must be Tensor, not tuple.

How are you avoiding this error?
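
For context on this error: torchvision's detection models take a list of 3-D image tensors (one `[C, H, W]` tensor per image, sizes may differ) and batch them internally, so a data loader for them usually uses a collate function that groups samples into tuples instead of stacking them. A `conv2d` complaint about a tuple is what results when such a group reaches a plain convolution directly, for example when a backbone is called without the model's own batching step. Whether that is the cause in this issue is an assumption. The sketch below shows only the grouping, with strings standing in for tensors.

```python
def collate_fn(batch):
    """Group [(image, target), ...] into (images, targets) without stacking."""
    return tuple(zip(*batch))


# Stand-ins for tensors of different sizes; real code would hold torch.Tensor objects.
batch = [
    ("image_480x640", {"labels": [1]}),
    ("image_375x500", {"labels": [1, 1]}),
]

images, targets = collate_fn(batch)
print(type(images).__name__, len(images))  # tuple 2

# A detection model is typically called with a list of images, not a stacked tensor.
images = list(images)
```

Each element of `images` stays a separate per-image object, which is why the batch is not a single Tensor.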
