py-R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

It is highly recommended to use the MXNet version of R-FCN/Deformable R-FCN, which supports multi-GPU training and testing.

WARNING: This code does not support CPU-only mode. (See #28).

Disclaimer

The official R-FCN code (written in MATLAB) is available here.

py-R-FCN is modified from the official R-FCN implementation and the py-faster-rcnn code, and its usage is quite similar to py-faster-rcnn's.

There are slight differences between py-R-FCN and the official R-FCN implementation.

  • py-R-FCN is ~10% slower at test time, because some operations execute on the CPU in Python layers (e.g., 90 ms/image for the official code vs. 99 ms/image here with ResNet-50)
  • py-R-FCN supports both joint training and alternating optimization of R-FCN.

Some modifications

The original py-faster-rcnn uses class-aware bounding box regression. However, R-FCN uses class-agnostic bounding box regression to reduce model complexity, so I added an AGNOSTIC option to fast_rcnn/config.py, with a default value of False. You should set it to True for both the training and the test phase if you want class-agnostic training and testing.

OHEM needs all RoIs in order to select the hard examples, so I changed the sampling strategy: set BATCH_SIZE: -1 for OHEM, otherwise OHEM will not take effect.

In conclusion:

AGNOSTIC: True is required for class-agnostic bounding box regression

BATCH_SIZE: -1 is required for OHEM

And I've already provided two configuration files for you (w/ OHEM and w/o OHEM) under the experiments/cfgs folder; you can use them as-is without changing anything. A sketch of the relevant entries follows.
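For reference, this is roughly what those entries look like (a hypothetical excerpt in the style of experiments/cfgs/rfcn_end2end_ohem.yml; the provided files are authoritative):

    TRAIN:
      AGNOSTIC: True   # class-agnostic bbox regression, must match the prototxts
      BATCH_SIZE: -1   # keep all RoIs so OHEM can select the hard examples
    TEST:
      AGNOSTIC: True   # must match the training-time setting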

License

R-FCN is released under the MIT License (refer to the LICENSE file for details).

Citing R-FCN

If you find R-FCN useful in your research, please consider citing:

@article{dai16rfcn,
    Author = {Jifeng Dai and Yi Li and Kaiming He and Jian Sun},
    Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks},
    Journal = {arXiv preprint arXiv:1605.06409},
    Year = {2016}
}

Main Results

joint training

|                   | training data      | test data   | mAP@0.5 | time/img (Titan X) |
|-------------------|--------------------|-------------|---------|--------------------|
| R-FCN, ResNet-50  | VOC 07+12 trainval | VOC 07 test | 77.6%   | 0.099 sec          |
| R-FCN, ResNet-101 | VOC 07+12 trainval | VOC 07 test | 79.4%   | 0.136 sec          |

|                   | training data   | test data     | mAP@[0.5:0.95] | time/img (Titan X) |
|-------------------|-----------------|---------------|----------------|--------------------|
| R-FCN, ResNet-101 | COCO 2014 train | COCO 2014 val | 27.9%          | 0.138 sec          |

alternating optimization

|                   | training data      | test data   | mAP@0.5 | time/img (Titan X) |
|-------------------|--------------------|-------------|---------|--------------------|
| R-FCN, ResNet-50  | VOC 07+12 trainval | VOC 07 test | 77.4%   | 0.099 sec          |
| R-FCN, ResNet-101 | VOC 07+12 trainval | VOC 07 test | 79.4%   | 0.136 sec          |

VOC 0712 model (trained on VOC07+12 trainval) of R-FCN

COCO model (trained on 2014 train) of R-FCN

Requirements: software

  1. Important: Please use the Microsoft version of Caffe (@commit 1a2be8e). This Caffe supports the R-FCN layers, and the prototxts in this repository follow the Microsoft-version Caffe's layer names. You need to put the Caffe root folder under the py-R-FCN folder, just like py-faster-rcnn does.

  2. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

  3. Python packages you might not have: cython, opencv-python, easydict (see the pip sketch below)
  4. [Optional] MATLAB is required for official PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.
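
If you're missing those Python packages, a quick way to get them (a sketch, assuming pip is available):

    pip install cython opencv-python easydict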

Requirements: hardware

Any NVIDIA GPU with 6 GB or more memory is OK (4 GB is enough for ResNet-50).

Installation

  1. Clone the R-FCN repository
git clone https://github.com/Orpine/py-R-FCN.git

We'll call the directory that you cloned R-FCN into RFCN_ROOT.

  2. Clone the Caffe repository
cd $RFCN_ROOT
git clone https://github.com/Microsoft/caffe.git

[optional]

cd caffe
git reset --hard 1a2be8e

(I have only tested on this commit, and I am not sure whether future versions of this Caffe will remain compatible with the prototxts in this repository.)

If you followed the instructions above, the Python code will add $RFCN_ROOT/caffe/python to PYTHONPATH automatically; otherwise you need to add $CAFFE_ROOT/python yourself. Check $RFCN_ROOT/tools/_init_paths.py for details.
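If your Caffe checkout lives elsewhere, a minimal sketch of the manual alternative (assuming a bash shell and that $CAFFE_ROOT points at your Caffe directory):

    export PYTHONPATH=$CAFFE_ROOT/python:$PYTHONPATH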

  3. Build the Cython modules

    cd $RFCN_ROOT/lib
    make
  4. Build Caffe and pycaffe

    cd $RFCN_ROOT/caffe
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe

Demo

  1. To use the demo you need the pretrained R-FCN models; please download them manually from OneDrive and put them under $RFCN_ROOT/data.

    Make sure it looks like this:

    $RFCN_ROOT/data/rfcn_models/resnet50_rfcn_final.caffemodel
    $RFCN_ROOT/data/rfcn_models/resnet101_rfcn_final.caffemodel
  2. To run the demo

    $RFCN_ROOT/tools/demo_rfcn.py

The demo performs detection using a ResNet-101 network trained for detection on PASCAL VOC 2007.

Preparation for Training & Testing

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
    tar xvf VOCtrainval_11-May-2012.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    $VOCdevkit/VOC2012                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Since py-faster-rcnn does not support multiple training datasets, we need to merge the VOC 2007 and VOC 2012 data manually. Just make a new directory named VOC0712 and put all subfolders except ImageSets from VOC2007 and VOC2012 into VOC0712 (you'll merge some folders). I provide a merged ImageSets folder for you; please put it into VOCdevkit/VOC0712/. A sketch of the merge is shown below.
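A minimal shell sketch of the merge (an illustration only, assuming the standard VOCdevkit layout; only Annotations and JPEGImages are needed for detection, so merge any segmentation folders the same way if you want them):

    cd $VOCdevkit
    mkdir VOC0712
    # Copy VOC2007's subfolders, then merge VOC2012's into them;
    # the two years' filenames do not collide.
    cp -r VOC2007/Annotations VOC2007/JPEGImages VOC0712/
    cp -r VOC2012/Annotations/. VOC0712/Annotations/
    cp -r VOC2012/JPEGImages/. VOC0712/JPEGImages/
    # Finally, put the provided merged ImageSets folder at VOC0712/ImageSets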

  5. Then the folder structure should look like this

	$VOCdevkit/                           # development kit
	$VOCdevkit/VOCcode/                   # VOC utility code
	$VOCdevkit/VOC2007                    # image sets, annotations, etc.
	$VOCdevkit/VOC2012                    # image sets, annotations, etc.
	$VOCdevkit/VOC0712                    # you just created this folder
	# ... and several other directories ...
  6. Create symlinks for the PASCAL VOC dataset

    cd $RFCN_ROOT/data
    ln -s $VOCdevkit VOCdevkit0712
  7. Please download the ImageNet-pretrained ResNet-50 and ResNet-101 models manually, and put them into $RFCN_ROOT/data/imagenet_models

  8. Then everything is done; you can train your own model.

Usage

To train and test an R-FCN detector using the approximate joint training method, use experiments/scripts/rfcn_end2end.sh. Output is written underneath $RFCN_ROOT/output.

To train and test an R-FCN detector using the approximate joint training method with OHEM, use experiments/scripts/rfcn_end2end_ohem.sh. Output is written underneath $RFCN_ROOT/output.

To train and test an R-FCN detector using the alternating optimization method with OHEM, use experiments/scripts/rfcn_alt_opt_5stage_ohem.sh. Output is written underneath $RFCN_ROOT/output.

cd $RFCN_ROOT
./experiments/scripts/rfcn_end2end[_ohem].sh [GPU_ID] [NET] [DATASET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ResNet-50, ResNet-101} is the network arch to use
# DATASET in {pascal_voc, coco} is the dataset to use (I only tested pascal_voc)
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701
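
For example, a typical invocation (assuming GPU 0 and the PASCAL VOC setup described above):

    cd $RFCN_ROOT
    ./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-50 pascal_voc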

Trained R-FCN networks are saved under:

output/<experiment directory>/<dataset name>/

Test outputs are saved under:

output/<experiment directory>/<dataset name>/<network snapshot name>/

Misc

Tested on Ubuntu 14.04 with a Titan X / GTX 1080 GPU and an Intel Xeon CPU E5-2620 v2 @ 2.10GHz.

The py-faster-rcnn code also works properly in this repository, but I have not added any extra features (such as ResNet or OHEM) to it.


py-R-FCN's Issues

Error when I set IMS_PER_BATCH: 2 in the rfcn_end2end_ohem.yml

I get this error when I set IMS_PER_BATCH: 2 in the rfcn_end2end_ohem.yml
F1025 15:56:37.929327 3406 loss_layer.cpp:19] Check failed: bottom[0]->num() == bottom[1]->num() (2 vs. 1) The data and label should have the same number.

*** Check failure stack trace: ***

Is there another setting I should change when I want to train 2 images per batch?

About R-FCN+ resnet ensemble

I have noticed that the R-FCN + ResNet ensemble has achieved the best result on VOC 2012. So what is the difference between resnet and ensemble? And is the R-FCN used for VOC 2012 exactly the same as this py-R-FCN?

About position-sensitive score maps

How are the position-sensitive score maps generated? I cannot find anything detailed in the paper; the authors just say they use a bank of specialized convolutional layers as the FCN output.

Thank you very much.

In demo_rfcn.py, if cfg.TEST.HAS_RPN is set to False, then how do I pass a value for blobs['rois'] in the _get_blobs function in /py-R-FCN/lib/fast_rcnn/test.py?

I understand that blobs['rois'] is being set to None. However, if None is sent to _get_rois_blob (in the if branch), then another function, _project_im_rois, raises this error: AttributeError: 'NoneType' object has no attribute 'astype'.

def _get_blobs(im, rois):
    """Convert an image and RoIs within that image into network inputs."""
    blobs = {'data': None, 'rois': None}

    blobs['data'], im_scale_factors = _get_image_blob(im)
    if not cfg.TEST.HAS_RPN:
        blobs['rois'] = _get_rois_blob(rois, im_scale_factors)
    print blobs['rois'], im_scale_factors, "<<<<<<<<<<<"
    return blobs, im_scale_factors

Error: Message type "caffe.LayerParameter" has no field named "psroi_pooling_param".

W1221 22:46:14.682919 8972 _caffe.cpp:125] Net('/RFCN_root/py-R-FCN/models/pascal_voc/ResNet-101/rfcn_end2end/test_agnostic.prototxt', 1, weights='/RFCN_root/py-R-FCN/data/rfcn_models/resnet101_rfcn_final.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 7106:25: Message type "caffe.LayerParameter" has no field named "psroi_pooling_param".
F1221 22:46:14.688835 8972 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /RFCN_root/py-R-FCN/models/pascal_voc/ResNet-101/rfcn_end2end/test_agnostic.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

I got the above error; any idea how to solve it?

What's the benefit of warmup?

It's so nice of you to share the code with us; it's wonderful work!
I know OHEM selects hard examples and re-trains on them, but what is the use of warmup?
Waiting for your reply!

train problem

I modified the size of the position-sensitive score map from 7x7 to 3x3 and trained the net for 200,000 iterations with ResNet-50, but the mAP of the output model was ~6%. The pre-trained ResNet was downloaded from https://github.com/daijifeng001/R-FCN, and I followed your training steps. Are there not enough training iterations?

training error

Hello,

While training I am getting the following error

I0204 03:08:08.855443 21819 solver.cpp:228] Iteration 920, loss = 0.266415
I0204 03:08:08.855478 21819 solver.cpp:244] Train net output #0: accuarcy = 0.941176
I0204 03:08:08.855485 21819 solver.cpp:244] Train net output #1: loss_bbox = 0.000348185 (* 1 = 0.000348185 loss)
I0204 03:08:08.855489 21819 solver.cpp:244] Train net output #2: loss_cls = 0.202786 (* 1 = 0.202786 loss)
I0204 03:08:08.855494 21819 solver.cpp:244] Train net output #3: rpn_cls_loss = 0.0552239 (* 1 = 0.0552239 loss)
I0204 03:08:08.855497 21819 solver.cpp:244] Train net output #4: rpn_loss_bbox = 0.0136765 (* 1 = 0.0136765 loss)
I0204 03:08:08.855501 21819 sgd_solver.cpp:106] Iteration 920, lr = 0.001
Traceback (most recent call last):
  File "./tools/train_net.py", line 112, in <module>
    max_iters=args.max_iters)
  File "/home/sandeep/workspace/py-R-FCN/tools/../lib/fast_rcnn/train.py", line 205, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/sandeep/workspace/py-R-FCN/tools/../lib/fast_rcnn/train.py", line 146, in train_model
    self.solver.step(1)
  File "/home/sandeep/workspace/py-R-FCN/tools/../lib/rpn/proposal_target_layer.py", line 66, in forward
    rois_per_image, self._num_classes)
  File "/home/sandeep/workspace/py-R-FCN/tools/../lib/rpn/proposal_target_layer.py", line 185, in _sample_rois
    fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False)
  File "mtrand.pyx", line 1176, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18822)
TypeError: 'numpy.float64' object cannot be interpreted as an index

Please help me...

Thanks in advance

IOError: [Errno 2] No such file or directory

4.Since py-faster-rcnn does not support multiple training datasets, we need to merge VOC 2007 data and VOC 2012 data manually. Just make a new directory named VOC0712, put all subfolders except ImageSets in VOC2007 and VOC2012 into VOC0712(you'll merge some folders). I provide a merged-version ImageSets folder for you, please put it into VOCdevkit/VOC0712/.
When I follow this step and merge with the provided ImageSets, I get this error:

'USE_GPU_NMS': True}
Loaded dataset voc_0712_trainval for training
Set proposal method: gt
Appending horizontally-flipped training examples...
voc_0712_trainval gt roidb loaded from /home/user01/Music/fc/py-R-FCN/data/cache/voc_0712_trainval_gt_roidb.pkl
Traceback (most recent call last):
  File "./tools/train_net.py", line 104, in <module>
    imdb, roidb = combined_roidb(args.imdb_name)
  File "./tools/train_net.py", line 69, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "./tools/train_net.py", line 66, in get_roidb
    roidb = get_training_roidb(imdb)
  File "/home/user01/Music/fc/py-R-FCN/tools/../lib/fast_rcnn/train.py", line 142, in get_training_roidb
    imdb.append_flipped_images()
  File "/home/user01/Music/fc/py-R-FCN/tools/../lib/datasets/imdb.py", line 111, in append_flipped_images
    assert (boxes[:, 2] >= boxes[:, 0]).all()
AssertionError

But if I use the ImageSets from VOC2007, then after all iterations finish I get the following error. Does anyone face the same problem?

im_detect: 4948/4952 0.097s 0.001s
im_detect: 4949/4952 0.097s 0.001s
im_detect: 4950/4952 0.097s 0.001s
im_detect: 4951/4952 0.097s 0.001s
im_detect: 4952/4952 0.097s 0.001s
Evaluating detections
Writing aeroplane VOC results file
Traceback (most recent call last):
  File "./tools/test_net.py", line 90, in <module>
    test_net(net, imdb, max_per_image=args.max_per_image, vis=args.vis)
  File "/home/user01/Music/fc/py-R-FCN/tools/../lib/fast_rcnn/test.py", line 298, in test_net
    imdb.evaluate_detections(all_boxes, output_dir)
  File "/home/user01/Music/fc/py-R-FCN/tools/../lib/datasets/pascal_voc.py", line 321, in evaluate_detections
    self._write_voc_results_file(all_boxes)
  File "/home/user01/Music/fc/py-R-FCN/tools/../lib/datasets/pascal_voc.py", line 248, in _write_voc_results_file
    with open(filename, 'wt') as f:
IOError: [Errno 2] No such file or directory: '/home/user01/Music/fc/py-R-FCN/data/VOCdevkit0712/results/VOC0712/Main/comp4_33830d8c-2942-4517-b6ca-6a2beeaa1c62_det_test_aeroplane.txt'

why does roi_data_layer forward twice in one solver step?

I tried to train rfcn_end2end_ohem.

In /roi_data_layer/layer.py, I added print "forward RoiDataLayer" in forward() (line 146).
In fast_rcnn/train.py, I added

    for layer_name, blob in self.solver.net.blobs.iteritems():
        print layer_name + '\t' + str(blob.data.shape)

in train_model() (line 148).

The output is
forward RoiDataLayer
forward RoiDataLayer
I1110 11:00:53.827004 25709 solver.cpp:228] Iteration 0, loss = 3.86447
I1110 11:00:53.827035 25709 solver.cpp:244] Train net output #0: accuarcy = 0
I1110 11:00:53.827045 25709 solver.cpp:244] Train net output #1: loss_bbox = 0.0267997 (* 1 = 0.0267997 loss)
I1110 11:00:53.827050 25709 solver.cpp:244] Train net output #2: loss_cls = 3.04452 (* 1 = 3.04452 loss)
I1110 11:00:53.827055 25709 solver.cpp:244] Train net output #3: rpn_cls_loss = 0.693147 (* 1 = 0.693147 loss)
I1110 11:00:53.827060 25709 solver.cpp:244] Train net output #4: rpn_loss_bbox = 0.111002 (* 1 = 0.111002 loss)
I1110 11:00:53.827067 25709 sgd_solver.cpp:106] Iteration 0, lr = 0.001
data (1, 3, 600, 943)
im_info (1, 3)
gt_boxes (1, 5, 1, 1)
data_input-data_0_split_0 (1, 3, 600, 943)
data_input-data_0_split_1 (1, 3, 600, 943)
im_info_input-data_1_split_0 (1, 3)
im_info_input-data_1_split_1 (1, 3)
gt_boxes_input-data_2_split_0 (1, 5, 1, 1)
gt_boxes_input-data_2_split_1 (1, 5, 1, 1)

It seems that it runs forward twice in one solver step, and I'm sure IMS_PER_BATCH is 1. Why is that?

About running it on Ubuntu

I followed most of your steps (Caffe is copied from py-faster-rcnn), yet it shows:

[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 7106:25: Message type "caffe.LayerParameter" has no field named "psroi_pooling_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1106 19:37:52.764528  5270 upgrade_proto.cpp:68] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/yyc/Documents/py-R-FCN-master/models/pascal_voc/ResNet-101/rfcn_end2end/test_agnostic.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

Am I doing something wrong? I looked up "psroi_pooling", yet I didn't find the layer's definition anywhere. Please help!

What is the function of box_annotator_ohem_layer?

After reading the code in box_annotator_ohem_layer.cu, I still have no idea what the purpose of this layer is. Is it used to select 128 RoIs to backpropagate? If so, why is top->num not 128?

These are the output shapes:
gt_inds (2,)
all_rois shape: (138, 5)
gt_boxes shape: (2, 5)
rois_per_image inf
fg_rois_per_image inf
I1109 21:24:01.311867 32751 box_annotator_ohem_layer.cu:71] bottom_rois 140
I1109 21:24:01.311887 32751 box_annotator_ohem_layer.cu:72] bottom_loss 140
I1109 21:24:01.311890 32751 box_annotator_ohem_layer.cu:73] bottom_labels 140
I1109 21:24:01.311893 32751 box_annotator_ohem_layer.cu:74] num_imgs 1
I1109 21:24:01.311895 32751 box_annotator_ohem_layer.cu:75] roi_per_img_ 128
I1109 21:24:01.311898 32751 box_annotator_ohem_layer.cu:76] top_labels 140
I1109 21:24:01.311916 32751 box_annotator_ohem_layer.cu:77] top_bbox_loss_weights 140
gt_inds (2,)
all_rois shape: (300, 5)
gt_boxes shape: (2, 5)
rois_per_image inf
fg_rois_per_image inf
I1109 21:24:01.395092 32751 box_annotator_ohem_layer.cu:71] bottom_rois 302
I1109 21:24:01.395112 32751 box_annotator_ohem_layer.cu:72] bottom_loss 302
I1109 21:24:01.395129 32751 box_annotator_ohem_layer.cu:73] bottom_labels 302
I1109 21:24:01.395133 32751 box_annotator_ohem_layer.cu:74] num_imgs 1
I1109 21:24:01.395135 32751 box_annotator_ohem_layer.cu:75] roi_per_img_ 128
I1109 21:24:01.395138 32751 box_annotator_ohem_layer.cu:76] top_labels 302
I1109 21:24:01.395140 32751 box_annotator_ohem_layer.cu:77] top_bbox_loss_weights 302
gt_inds (2,)
all_rois shape: (216, 5)
gt_boxes shape: (2, 5)
rois_per_image inf
fg_rois_per_image inf
I1109 21:24:01.517530 32751 box_annotator_ohem_layer.cu:71] bottom_rois 218
I1109 21:24:01.517550 32751 box_annotator_ohem_layer.cu:72] bottom_loss 218
I1109 21:24:01.517554 32751 box_annotator_ohem_layer.cu:73] bottom_labels 218
I1109 21:24:01.517556 32751 box_annotator_ohem_layer.cu:74] num_imgs 1
I1109 21:24:01.517559 32751 box_annotator_ohem_layer.cu:75] roi_per_img_ 128
I1109 21:24:01.517561 32751 box_annotator_ohem_layer.cu:76] top_labels 218
I1109 21:24:01.517563 32751 box_annotator_ohem_layer.cu:77] top_bbox_loss_weights 218

py-R-FCN slower when using cuDNN

Testing on GPUs (NVIDIA K80) is 2-3 times slower with cuDNN than without it. I am using CUDA 7.5 (and 8.0), cuDNN 5.1 (and 5.0), and Python 2.7, on an Ubuntu 14.04 server.

I checked everything twice and this behavior is consistent. I don't understand how this could be the case.

training error with layer issues

@orpine
F1013 12:05:22.696523 14673 net.cpp:784] Cannot copy param 0 weights from layer 'rpn_conv/3x3'; shape mismatch. Source param shape is 512 1024 3 3 (4718592); target param shape is 512 2048 3 3 (9437184). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***

Segmentation fault in cpu mode.

I have successfully trained the R-FCN model and tested it in GPU mode. But when I run the demo in CPU mode, the program core dumps with "Segmentation fault". Can anyone help me with this problem?

The gdb error info is:

#0 0x00007ffff6e44653 in __memcpy_ssse3_back () from /usr/lib64/libc.so.6
#1 0x00007fffe368c848 in caffe::ScaleLayer::Forward_cpu(std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&, std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&) ()

Does this project support trained by multiple GPUs?

I wonder whether your project can be trained with multiple GPUs.
I have tried a command like --gpu 0,1,2 but it seems this is not supported.
But in Dai's paper, they use 8 GPUs to train the model.

Pretrain model extract error

Hi Orpine,
I downloaded the pretrained model from the OneDrive link, but when I extract the file it raises an error about "truncated gzip input". I have downloaded the model several times, but the error still appears. Is it a problem with the model?

availability of demo code

Good work making the Python code available.
I have trained ResNet-50 and would like to check its performance on some images.
Do you know when the demo code will be available?

Thanks

Fine Tuning

How can I fine-tune your model? I don't have sufficient data to retrain it from scratch. I want to fine-tune it on my own data, which has only two classes.

How to train on own dataset?

Hi, I'm trying to train on my own dataset with different classes. When I revised train.prototxt, I only revised three places (the new conv layers' num_output and the input data's num_classes); is that right?

Small documentation issues for train&test ResNet-50 (without OHEM)

Thanks for making this code available. I am trying to train & test ResNet-50 (without OHEM). Here are the issues I have found in the docs:

  • You have to run make in the $RFCN_ROOT/lib/ folder for the cython_bbox compilation
  • Before running the training script, make sure you add $CAFFE_ROOT/python to your PYTHONPATH environment variable
  • The training script takes an extra argument for the dataset (pascal_voc or coco). This is not documented in the README, but it is documented in the script itself.
  • Despite what is stated in the README, TEST.AGNOSTIC and TRAIN.AGNOSTIC are set to True by default by cfgs/rfcn_end2end.yml

Use on own dataset

Hi,

How would I go about using this model on my own dataset with different classes? Any pointers?

Thanks!

AssertionError when running end to end training

Hi, I met a problem when running end-to-end training with the PASCAL dataset.

Traceback (most recent call last):
  File "./tools/train_net.py", line 107, in <module>
    imdb, roidb = combined_roidb(args.imdb_name)
  File "./tools/train_net.py", line 70, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "./tools/train_net.py", line 67, in get_roidb
    roidb = get_training_roidb(imdb)
  File "/home/test/xianyan/py-R-FCN/tools/../lib/fast_rcnn/train.py", line 142, in get_training_roidb
    imdb.append_flipped_images()
  File "/home/test/xianyan/py-R-FCN/tools/../lib/datasets/imdb.py", line 111, in append_flipped_images
    assert (boxes[:, 2] >= boxes[:, 0]).all()
AssertionError

py-faster-rcnn supports multiple imdbs with --imdb voc_2007_trainval+voc_2012_trainval

Since py-faster-rcnn does not support multiple training datasets, we need to merge VOC 2007 data and VOC 2012 data manually. Just make a new directory named VOC0712, put all subfolders except ImageSets in VOC2007 and VOC2012 into VOC0712(you'll merge some folders). I provide a merged-version ImageSets folder for you, please put it into VOCdevkit/VOC0712/

To my knowledge, py-faster-rcnn supports multiple imdbs with --imdb voc_2007_trainval+voc_2012_trainval.

cls_num in prototxt is not the same

For example, in /models/pascal_voc/ResNet-101/rfcn_end2end/class-aware/test.prototxt, at lines 7054 and 7079: the value of cls_num at line 7054 is 21, while at line 7079 it is 8. I don't know why; can anyone explain it? Thank you!

Question about 'RPN' location!

Thanks for this great work.
A question here: in the Faster R-CNN work itself and its implementation in the ResNet paper, the 'RPN' is inserted right after Res4X, but in your implementation you insert it right after Res5X. Will this affect the final results?

can't reproduce the mAP reported in the readme, only 28.0 mAP on COCO minival

Hi,

I have run the training code three times without changing anything, but I get a lower mAP than you report in the readme. Here are the details:

VOC: 07+12 trainval, 07 test; mine: 79.1; reported: 79.4
COCO: COCO 2014 train, COCO 2014 minival; mine: 28.0 (at 1,920,000 iters); reported: 29.0 (though I test on 2014 minival, I think it won't be much different from 2014 val)

My test env: Python 2.7, TITAN X (Pascal) 12 GB.
I also did git reset to 1a2be8e.
So is anything wrong? I want to know what I should pay attention to in order to reproduce the results.

Best,
Jemmy Li

SystemError: NULL result without error in PyObject_Call

Traceback (most recent call last):
  File "/home/deepinsight/py-R-FCN/tools/../lib/roi_data_layer/layer.py", line 15, in <module>
    from roi_data_layer.minibatch import get_minibatch
  File "/home/deepinsight/py-R-FCN/tools/../lib/roi_data_layer/minibatch.py", line 12, in <module>
    import cv2
ImportError: /usr/local/lib/libopencv_ocl.so.2.4: undefined symbol: _ZN2cv16TLSDataContainerD2Ev
Traceback (most recent call last):
  File "./tools/train_net.py", line 112, in <module>
    max_iters=args.max_iters)
  File "/home/deepinsight/py-R-FCN/tools/../lib/fast_rcnn/train.py", line 202, in train_net
    pretrained_model=pretrained_model)
  File "/home/deepinsight/py-R-FCN/tools/../lib/fast_rcnn/train.py", line 43, in __init__
    self.solver = caffe.SGDSolver(solver_prototxt)
SystemError: NULL result without error in PyObject_Call

How to train the py-R-FCN using my own data?

I am trying to train py-R-FCN on my own data, which has only 2 categories. I changed the code referring to faster R-CNN.
But what should I do with train_agnostic.prototxt and test_agnostic.prototxt? I have changed param_str: "'num_classes': 21" to param_str: "'num_classes': 3" at line 11 of train_agnostic.prototxt.
Should I change num_output: 1029 # 21*(7^2), i.e., cls_num*(score_maps_size^2), at line 3736 and output_dim: 21 at line 3790?
Anything else?
Thank you!
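
For reference, the arithmetic behind that prototxt comment works out as follows (a sketch only; the exact layer names and line numbers depend on your own prototxt and should be verified against your files):

    # Position-sensitive output sizes, per the cls_num*(score_maps_size^2) comment.
    k = 7                     # score_maps_size (k x k spatial bins)
    num_classes = 3           # 2 object classes + background, as in this question
    rfcn_cls_num_output = num_classes * k * k   # 3 * 49 = 147 (was 21 * 49 = 1029)
    # With class-agnostic regression, the bbox branch regresses 2 "classes"
    # (background/foreground) x 4 coordinates, independent of num_classes:
    rfcn_bbox_num_output = 2 * 4 * k * k        # 8 * 49 = 392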

train voc0712 error

layer {
  name: "res4c_branch2b_relu"
  type: "ReLU"
  bottom: "res4c_branch2b"
  top: "res4c
I0107 03:54:09.780325  9558 layer_factory.hpp:77] Creating layer input-data
./experiments/scripts/rfcn_end2end.sh: line 57:  9558 Segmentation fault      (core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${PT_DIR}/${NET}/rfcn_end2end/solver.prototxt --weights data/imagenet_models/${NET}-model.caffemodel --imdb ${TRAIN_IMDB} --iters ${ITERS

Extracting features

Without an fc layer, what would be the best way to extract features (for a t-SNE plot, for example)? Should I take the res5c layer output and flatten it?

Why is the length of the output bbox 8?

Thanks for this great work, first of all.
I have a question about the network structure: why is the length of the output bbox 8? If we use class-agnostic bbox regression, the length of the bbox vector should be 4. Are the first 4 values regressed for background, and the last 4 regressed for objects?

What is the benefit of using OHEM?

As the author said, OHEM needs all RoIs to select the hard examples.
What is the benefit of using OHEM specifically? And how do the effectiveness and efficiency change when using OHEM?

RFCN with python layers on Windows

A static libcaffe.lib can be built on Windows following the R-FCN Caffe branch. However, it reports that PythonLayer cannot be initialized when importing libcaffe.lib from C++ in an external project. Any idea? Thanks.

question about the bottom of rpn_conv/3x3

Hi,
For ResNet-50, in train_agnostic.prototxt rpn_conv/3x3 is connected to res5c, while in the OHEM version rpn_conv/3x3 is connected to res4f, and in test_agnostic.prototxt it is connected to res4f as in the OHEM version. Is this a typo or something?

Segmentation fault when running demo in CPU mode

./tools/demo_rfcn.py --cpu

leads to a segmentation fault after these lines:

I0922 20:40:52.680196  4690 net.cpp:771] Ignoring source layer silence
I0922 20:40:52.680202  4690 net.cpp:771] Ignoring source layer loss
I0922 20:40:52.680209  4690 net.cpp:771] Ignoring source layer accuarcy
I0922 20:40:52.680217  4690 net.cpp:771] Ignoring source layer loss_bbox

I don't have a GPU :(

The OHEM question

I read the code and I'm confused about OHEM: you just put all the proposal RoIs (300 + the number of gt boxes in total) into the next step in end2end training. Where do you sort the RoI losses and hard-mine the proposal RoIs? Thank you!

./lib/datasets/voc_eval.py BB = BB [sorted_ind, :] IndexError: too many indices

Hi @orpine, I want to train on my own dataset with one target, which is just a binary classification problem. I modified the prototxts in /models/pascal_voc/ResNet-50/rfcn_alt_opt_5step_ohem/ and changed each cls_num from 21 to 2, then ran ./experiments/scripts/rfcn_alt_opt_5stage_ohem.sh 0 ResNet-50 pascal_voc.
The training is very successful and the loss decreases, but in test_net.py I get this error:

./lib/datasets/voc_eval.py BB = BB [sorted_ind, :]
IndexError: too many indices

I tried to print the values of BB and sorted_ind, and both of them seem to be empty.
I am wondering how to solve this problem; can anyone give me some hints?

What is the benefit of using OHEM?

In the readme, the author said that OHEM needs all RoIs to select the hard examples.
What is the benefit of using OHEM specifically? And how will the efficiency and effectiveness change?
Thank you.
