fast-rcnn's People

Contributors

drozdvadym, kotaharausa, rbgirshick, tianzhi0549, wkentaro

fast-rcnn's Issues

getting detailed time analysis on every layer

Hi,

In your paper, you have two pie charts showing a detailed timing analysis for individual layers (almost individual). I was wondering if I can do the same in your forked version of Caffe. Could you please give me some hints about it? Thanks.
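For context, I know the stock caffe time tool reports per-layer forward/backward times; from Python, what I have in mind is something like the sketch below, assuming pycaffe's Net.forward(start=..., end=...) interface (the paths are just the demo's, and GPU kernels are asynchronous, so the numbers would only be rough):

import time
import caffe

caffe.set_mode_gpu()
net = caffe.Net('models/VGG16/test.prototxt',
                'data/fast_rcnn_models/vgg16_fast_rcnn_iter_40000.caffemodel',
                caffe.TEST)
net.forward()  # warm-up pass so one-time allocations are not billed to a layer

for name in net._layer_names:
    t0 = time.time()
    net.forward(start=name, end=name)  # run only this layer
    print('%-25s %8.3f ms' % (name, 1000 * (time.time() - t0)))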

Multiple tops for input blob

Hi,
This might not be the best place for asking this question, so apologies in advance.
In all the network definitions, for the training phase, in the models folder, the data layer seems to be of the following format:

layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_loss_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}

Are data layers with multiple top blobs (corresponding to multiple labels in this case) a part of Caffe right now, or are they incorporated here using the roi_data_layer Python layer that is present in the lib folder?
I want to train a network on multiple tasks (similar to the classification and bounding-box regression that are jointly learnt in Fast R-CNN), and hence was wondering what would be the best way to have a data layer that outputs multiple tops, one for each type of label.
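From what I can tell, multiple top blobs are part of Caffe's core layer interface, and a Python layer simply fills the tops it declares in the prototxt. A minimal sketch of what I am considering, with illustrative names and shapes rather than the repo's:

import caffe
import numpy as np

class MultiTaskDataLayer(caffe.Layer):
    """Toy data layer with one top per task: image, class label, box label."""

    def setup(self, bottom, top):
        # The order here must match the top: lines in the prototxt.
        top[0].reshape(1, 3, 224, 224)  # 'data'
        top[1].reshape(1)               # 'cls_label'
        top[2].reshape(1, 4)            # 'bbox_label'

    def reshape(self, bottom, top):
        pass  # tops are reshaped per minibatch in forward if needed

    def forward(self, bottom, top):
        # Stub minibatch: replace with real image/label loading.
        top[0].data[...] = np.zeros((1, 3, 224, 224), dtype=np.float32)
        top[1].data[...] = 0
        top[2].data[...] = np.zeros((1, 4), dtype=np.float32)

    def backward(self, top, propagate_down, bottom):
        pass  # a data layer has nothing to backpropagate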

Build fails with WITH_PYTHON_LAYER := 1 in Makefile.config

With WITH_PYTHON_LAYER := 0 in Makefile.config, the build completes without error.
With WITH_PYTHON_LAYER := 1, the build fails under "make -j8" and the following error messages pop out:

CXX/LD -o .build_release/examples/siamese/convert_mnist_siamese_data.bin
.build_release/lib/libcaffe.so: undefined reference to "boost::re_detail::raise_runtime_error(std::runtime_error const&)"
.build_release/lib/libcaffe.so: undefined reference to "boost::re_detail::put_mem_block(void*)"
.build_release/lib/libcaffe.so: undefined reference to "boost::match_results<__gnu_cxx::__normal_iterator<char const*, std::string>, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<char const*, std::string> > > >::maybe_assign(boost::match_results<__gnu_cxx::__normal_iterator<char const*, std::string>, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<char const*, std::string> > > > const&)"
..... 

These are my steps:

  1. Clone the Fast R-CNN repository and build the Cython modules.
  2. Build Caffe with WITH_PYTHON_LAYER := 0; this completes without any error and I can successfully run the demo.
  3. I want to train FRCN on my own dataset, so I changed WITH_PYTHON_LAYER from 0 to 1 in Makefile.config.
  4. Then I executed "make clean" after changing the variable.
  5. Built Caffe again with "make -j8", and got the error messages above.

I have searched the Internet and it seems to be a libboost_regex linking problem.
But I am not familiar with Makefiles; could you provide some hints?

Defining multiple ROI for Python Layer

This is actually one of my questions:

If we have multiple ROI files (Multi-Region), in which part of the code should this input be added?

Currently, I noticed that your code has this structure (please correct me if I'm wrong):

Dataset --> ROI ----->  |
        |               |   Network
        ------------->  |

About DEDUP_BOXES and spatial_scale

Hi rbgirshick,

First, thanks for sharing fast rcnn code.

I have a question about DEDUP_BOXES and spatial_scale. I don't understand why they are set to 1/16 (= 0.0625), and I can't find any relevant explanation in your paper. Could you please explain this setting? Thank you.
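My current guess is that 1/16 is simply VGG16's total downsampling factor: four stride-2 poolings sit before the conv5 feature map, so each feature-map cell covers 16x16 input pixels. If that is right, spatial_scale projects ROI coordinates from image space onto conv5, and DEDUP_BOXES uses the same factor so that boxes landing on identical feature-map cells are collapsed. The arithmetic:

# Where 0.0625 would come from for VGG16 (standard architecture assumed).
strides = [2, 2, 2, 2]            # pool1 .. pool4, each with stride 2
downsample = 1
for s in strides:
    downsample *= s               # 2 * 2 * 2 * 2 = 16
spatial_scale = 1.0 / downsample
print(spatial_scale)              # 0.0625: image coords -> conv5 coords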

Utilizing CuDNN

Hi,

I had a question: by default I was trying to build your Caffe fork with CuDNN, but after seeing your Makefile.config I noticed you don't use CuDNN.

Why is that? Wouldn't that accelerate the code more?

extracting features instead of scores

Hi,
I'd like to use the framework to extract the penultimate features used for each bounding box. This is similar in nature to extracting the fc7 features from AlexNet and using them as generic features for classification.
Where should I look in the code to do this?
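Concretely (with net, im and obj_proposals set up as in tools/demo.py), something like this is what I'm after; 'fc7' is the penultimate blob name in the VGG16 test.prototxt:

from fast_rcnn.test import im_detect

# After a normal detection pass, the per-ROI activations are still in
# the net's blobs; 'fc7' should hold one 4096-d row per input box.
scores, boxes = im_detect(net, im, obj_proposals)
feats = net.blobs['fc7'].data.copy()   # shape: (num_rois, 4096)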
Thanks,
-Amir

Training Problem on MS COCO

I would like to train a fast-rcnn model on the MS COCO dataset using the COCO branch.
I have generated selective search proposals, and coco.py loads all 82783 .mat files successfully.
However, the code stops at roidb = datasets.imdb.merge_roidbs(gt_roidb, method_roidb) with the error:
*** ValueError: blocks[0,:] is all None

How can I solve this problem? I appreciate it if someone could help.

An error when doing snapshot

I am new to Caffe and FRCN, so sorry if these are naive questions.
I am training a 2-class detector (1 foreground class plus background). I got an error when snapshot was called at iteration 10,000; below is the corresponding log:

I0715 11:37:58.141914 22505 solver.cpp:464] Iteration 9940, lr = 0.001
I0715 11:38:06.983641 22505 solver.cpp:189] Iteration 9960, loss = 0.148392
I0715 11:38:06.983675 22505 solver.cpp:204] Train net output #0: loss_bbox = 0.0771341 (* 1 = 0.0771341 loss)
I0715 11:38:06.983683 22505 solver.cpp:204] Train net output #1: loss_cls = 0.0712577 (* 1 = 0.0712577 loss)
I0715 11:38:06.983691 22505 solver.cpp:464] Iteration 9960, lr = 0.001
I0715 11:38:15.777523 22505 solver.cpp:189] Iteration 9980, loss = 0.0926433
I0715 11:38:15.777556 22505 solver.cpp:204] Train net output #0: loss_bbox = 0.0715442 (* 1 = 0.0715442 loss)
I0715 11:38:15.777565 22505 solver.cpp:204] Train net output #1: loss_cls = 0.0210991 (* 1 = 0.0210991 loss)
I0715 11:38:15.777572 22505 solver.cpp:464] Iteration 9980, lr = 0.001
speed: 0.441s / iter
Traceback (most recent call last):
File "./tools/train_net.py", line 87, in
max_iters=args.max_iters)
File "/home/shu/Documents/Deeplearning/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 123, in train_net
sw.train_model(max_iters)
File "/home/shu/Documents/Deeplearning/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 98, in train_model
self.snapshot()
File "/home/shu/Documents/Deeplearning/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 62, in snapshot
self.bbox_stds[:, np.newaxis])
ValueError: operands could not be broadcast together with shapes (84,1024) (8,1)

Does anyone have an idea about this broadcasting error?
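For the record, the two shapes seem to decode neatly (the traceback shows snapshot() in lib/fast_rcnn/train.py scaling the bbox_pred weights by the per-class target stds):

# bbox_pred weights: (84, 1024) -> 84 outputs = 21 classes * 4 coords
# bbox_stds:         (8, 1)     ->  8 values  =  2 classes * 4 coords
net_classes = 84 // 4     # 21: what the train.prototxt heads declare
roidb_classes = 8 // 4    #  2: what the 2-class roidb produced
print(net_classes == roidb_classes)   # False -> shapes cannot broadcast

So presumably the cls_score/bbox_pred heads in train.prototxt were left at the 21 PASCAL classes; for 2 classes their num_output should be 2 and 8 respectively.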
Thank you.

About C.PIXEL_MEANS for CaffeNet

Hi,
I am using CaffeNet to train fast-rcnn, and I noticed this in the fast-rcnn config
(these are the values originally used for training VGG16):
__C.PIXEL_MEANS = np.array([[[102.9801, 115.9465, 122.7717]]])
Should I modify __C.PIXEL_MEANS for CaffeNet? If so, how should I modify it?
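In case recomputing is the answer, this is how I would compute per-channel means over my own training images (cv2 loads BGR, matching the PIXEL_MEANS ordering; the helper name is mine, not the repo's):

import cv2
import numpy as np

def compute_pixel_means(image_paths):
    # Mean B, G, R over all pixels of all images (hypothetical helper).
    acc = np.zeros(3, dtype=np.float64)
    n = 0
    for path in image_paths:
        im = cv2.imread(path).astype(np.float64)  # H x W x 3, BGR order
        acc += im.reshape(-1, 3).sum(axis=0)
        n += im.shape[0] * im.shape[1]
    return acc / n

# __C.PIXEL_MEANS = compute_pixel_means(paths)[np.newaxis, np.newaxis, :]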

selective search coordinates

After downloading the precomputed selective search boxes via fetch_selective_search_data.sh, I wonder whether the coordinates are [x, y, w, h], [x1, y1, x2, y2], [y, x, h, w], or some other combination.
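From reading lib/datasets/pascal_voc.py, my understanding is that the raw rows are 1-based [y1, x1, y2, x2] (the MATLAB convention), and the loader permutes and shifts them into 0-based [x1, y1, x2, y2]. The relevant transformation, as I read it (file path as laid out by the fetch script):

import scipy.io as sio

raw_data = sio.loadmat('data/selective_search_data/voc_2007_trainval.mat')['boxes'].ravel()
# 1-based [y1, x1, y2, x2]  ->  0-based [x1, y1, x2, y2]
boxes = raw_data[0][:, (1, 0, 3, 2)] - 1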

Best

Out of memory while running demo.py

Hello,

I have found several similar out-of-memory issues regarding Caffe.
I am not sure whether my issue is connected to Caffe or to fast-rcnn tuning.
I have an Amazon instance; nvidia-smi gives:

+------------------------------------------------------+                       
| NVIDIA-SMI 340.29     Driver Version: 340.29         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 0000:00:03.0     Off |                  N/A |
| N/A   36C    P0    43W / 125W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GRID K520           Off  | 0000:00:04.0     Off |                  N/A |
| N/A   30C    P0    41W / 125W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GRID K520           Off  | 0000:00:05.0     Off |                  N/A |
| N/A   38C    P0    44W / 125W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GRID K520           Off  | 0000:00:06.0     Off |                  N/A |
| N/A   34C    P0    36W / 125W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|  No running compute processes found                                         |
+-----------------------------------------------------------------------------+

The actual error output I get is:

I0502 15:34:45.246454 36085 net.cpp:218] Memory required for data: 114633208
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 538766130


Loaded network data/fast_rcnn_models/vgg16_fast_rcnn_iter_40000.caffemodel

Demo for data/demo/000004.jpg
F0502 15:34:46.777649 36085 syncedmem.cpp:57] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***

I have found suggestions that I have to reduce the batch size, but I cannot figure out in which file to decrease it, whether that is the only thing needed, or whether I have to recompile something afterwards.
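From what I can tell so far, demo.py processes a single image at a time, so there is no training-style batch size; memory seems to scale with the input resolution instead. Would lowering the test scales through the Python config be enough (no recompilation, if I understand correctly)? E.g. near the top of tools/demo.py:

from fast_rcnn.config import cfg

# Defaults are 600 / 1000; smaller values shrink the conv feature maps
# and therefore the GPU memory the VGG16 forward pass needs.
cfg.TEST.SCALES = (400,)   # target size of the image's shorter side
cfg.TEST.MAX_SIZE = 700    # cap on the image's longer side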

Kind thanks

Generate imdb file from annotations file

Hi,

I'd like to train my own object detector with fast-rcnn, but I can't figure out how to generate the imdb file with ROI annotations for training. Could you please give me some pointers on how to do that?
How do I create an imdb file from a txt file like this:
image_name.jpg x1 y1 x2 y2 class
My goal is to have an imdb file to use in train.py.
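For concreteness, this is the kind of conversion I have sketched so far; the entry key names follow lib/datasets/pascal_voc.py, while everything else (function name, arguments) is my own guess at what gt_roidb() in an imdb subclass should return:

import numpy as np
import scipy.sparse

def parse_annotations(txt_path, class_to_ind, num_classes):
    # Group the annotation lines by image name.
    per_image = {}
    for line in open(txt_path):
        name, x1, y1, x2, y2, cls = line.split()
        box = [float(x1), float(y1), float(x2), float(y2)]
        per_image.setdefault(name, []).append((box, class_to_ind[cls]))

    # Build one ground-truth entry per image, with the keys pascal_voc.py uses.
    roidb = {}
    for name, objs in per_image.items():
        boxes = np.array([b for b, _ in objs], dtype=np.uint16)
        gt_classes = np.array([c for _, c in objs], dtype=np.int32)
        overlaps = np.zeros((len(objs), num_classes), dtype=np.float32)
        overlaps[np.arange(len(objs)), gt_classes] = 1.0  # gt overlaps itself
        roidb[name] = {'boxes': boxes,
                       'gt_classes': gt_classes,
                       'gt_overlaps': scipy.sparse.csr_matrix(overlaps),
                       'flipped': False}
    return roidb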

Thank you in advance.

Did anyone try LPO to generate proposals?

I am trying to use the .hf5 files generated by LPO's selective search proposals.
I want to modify demo.py to use my *.hf5 box files instead of *.mat. Has anyone tried that before?

Jetson TK1

I'm working on a project using the NVIDIA Jetson TK1 embedded computer system. I'm curious: I read the paper and I think this will be definitely perfect for it. Will it be able to run on the TK1? Also, are there any pretrained models using this method, or do I have to train my own model? Thanks!

CUDNN_STATUS_ARCH_MISMATCH

Hi, I have installed fast-rcnn according to your instructions. When I run ./tools/demo.py I get a core dump. Below is the output:

F0817 22:53:22.694056 24820 cudnn_conv_layer.cpp:32] Check failed: status == CUDNN_STATUS_SUCCESS (6 vs. 0) CUDNN_STATUS_ARCH_MISMATCH
*** Check failure stack trace: ***
Aborted (core dumped)

It seems I got a CUDNN_STATUS_ARCH_MISMATCH. I googled it, and it seems to be due to my hardware: I am running this demo on an NVIDIA Tesla M2090 GPU, whose CUDA compute capability is 2.0. How can I fix this problem?

Thanks.

CPU Training error: smooth_L1_loss_layer Not Implemented Yet

I modified train_net.py to use the CPU for training. I also disabled MATLAB and used the pre-computed selective search bounding boxes for VOC2007. However, I got this error during training:

Solving...
F0906 14:48:34.166416 13116 smooth_L1_loss_layer.cpp:39] Not Implemented Yet
*** Check failure stack trace: ***

I looked into fast-rcnn/caffe-fast-rcnn/src/caffe/layers/smooth_L1_loss_layer.cpp and found the following code:

template <typename Dtype>
void SmoothL1LossLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  NOT_IMPLEMENTED;
}

template <typename Dtype>
void SmoothL1LossLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  NOT_IMPLEMENTED;
}

So I guess this layer is only implemented for GPU, not for CPU.
Does this mean I can't use CPU to train fast-rcnn?
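In case I end up filling in the CPU path myself: my understanding is that the forward computation is just the smooth L1 from the paper. A numpy sketch of it (not repo code):

import numpy as np

def smooth_l1(x):
    # 0.5 * x^2 if |x| < 1, else |x| - 0.5 (elementwise)
    x = np.asarray(x, dtype=np.float64)
    return np.where(np.abs(x) < 1, 0.5 * x ** 2, np.abs(x) - 0.5)

# diff = bbox_loss_weights * (bbox_pred - bbox_targets)
# loss = smooth_l1(diff).sum() / num_rois   # normalized by the batch size,
#                                           # if I read the .cu file right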

Error during training

When I run this command
./tools/train_net.py --gpu 0 --solver models/VGG16/solver.prototxt
--weights data/imagenet_models/VGG16.v2.caffemodel

I get the following:
F0618 21:14:26.843530 24813 layer_factory.hpp:77] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Python

I was able to make both caffe and pycaffe; make test and make runtest passed without any errors.

Any ideas why this might be happening?

Training issue

I'm trying to train on a custom dataset following the instructions, but I'm getting some errors. I guess the problem is with the input image format (mine are 640x480 and 480x640). Can anyone please give me some help with this?

The error I'm getting when training is:

Traceback (most recent call last):
File "./tools/train_net.py", line 80, in
roidb = get_training_roidb(imdb)
File "/workspace/fastrcnn/fastrcnn/tools/../lib/fast_rcnn/train.py", line 107, in get_training_roidb
imdb.append_flipped_images()
File "/workspace/fastrcnn/fastrcnn/tools/../lib/datasets/imdb.py", line 104, in append_flipped_images
assert (boxes[:, 2] >= boxes[:, 0]).all()
AssertionError
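One lead I found while digging: this assertion reportedly fires when annotation coordinates are 1-based or exceed the image size. pascal_voc.py subtracts 1 from every coordinate and stores boxes as uint16, so an annotated x-coordinate of 0 wraps around to 65535 and the flipped x1 ends up greater than x2. A per-image sanity check (a sketch, not repo code):

import numpy as np

def check_boxes(boxes, width):
    # boxes: (N, 4) 0-based [x1, y1, x2, y2] for an image of this width
    assert (boxes[:, 0] <= boxes[:, 2]).all(), 'x1 > x2 (uint16 wrap-around?)'
    assert boxes[:, 2].max() < width, 'x2 exceeds the image width'
    # This mirrors what append_flipped_images computes before its assert:
    new_x1 = width - boxes[:, 2] - 1
    new_x2 = width - boxes[:, 0] - 1
    assert (new_x2 >= new_x1).all()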

fast-rcnn roi proposal?

Hello everyone,

I am trying fast-rcnn on my own dataset and have some confusion about the ROI proposals; I hope someone can give me some hints. I am a newbie to CNNs and would be grateful for any advice. Thank you very much in advance.

My training set consists of food images, with the food item mostly centered and each image containing only one food item. I want to use fast-rcnn to detect food items in test images that contain more than one food item. I do not know whether the setting I describe below makes sense and would like to have some advice from you.

I used selective_search_rcnn.m to compute the ROIs. Since my training set contains only one food item per image, I set a minSize of [300, 400] (training images are 512x512), and the number of ROIs per image is around 200. An eroded version of the training image is going to be used as the ground-truth bounding box.

Best Regards,
Chunfang

Error in generating selective search: MAP must be a Mx3 array.

I am trying to run selective_search.m to generate the proposals, and I got the error below. Does anyone have an idea how to fix it?
I am getting selective_search.m from here:
https://github.com/EdisonResearch/fast-rcnn/tree/master/selective_search
@zeyuanxy @rbgirshick @drozdvadym

Error using rgb2hsv>parseInputs (line 95)
MAP must be a Mx3 array.

Error in rgb2hsv (line 36)
[r, g, b, isColorMap, isEmptyInput, isThreeChannel] = parseInputs(varargin{:});

Error in Image2ColourSpace

Error in Image2HierarchicalGrouping (line 24)
[colourIm imageToSegment] = Image2ColourSpace(im, colourType);

Error in selective_search_rcnn (line 64)
[boxesT{idx} blobIndIm blobBoxes hierarchy priorityT{idx}] = ...

Error in selective_search (line 13)
selective_search_rcnn(image_filenames, 'train.mat');

Error using textread (line 166) File not found.

First I trained a VGG_CNN_M_1024 network, using the command
"./tools/train_net.py --gpu 0 --solver models/VGG_CNN_M_1024/solver.prototxt --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel"

Then I ran the test:
"./tools/test_net.py --gpu 0 --def models/VGG_CNN_M_1024/test.prototxt --net output/default/voc_2007_trainval/vgg_cnn_m_1024_fast_rcnn_iter_40000.caffemodel"

At the end of the test I got an error in VOCevaldet (line 30); here is that part of the output:

Running:
cd /home/andsonye/github/fast-rcnn/tools/../lib/datasets/VOCdevkit-matlab-wrapper && matlab -nodisplay -nodesktop -r "dbstop if error; voc_eval('/home/andsonye/github/fast-rcnn/tools/../lib/datasets/../../data/VOCdevkit2007','comp4-19028','test','/home/andsonye/github/fast-rcnn/output/default/voc_2007_test/vgg_cnn_m_1024_fast_rcnn_iter_40000',1); quit;"

                        < M A T L A B (R) >
              Copyright 1984-2013 The MathWorks, Inc.
                R2013b (8.2.0.701) 64-bit (glnxa64)
                          August 13, 2013

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

aeroplane: pr: load: 120/4952
aeroplane: pr: load: 294/4952
aeroplane: pr: load: 468/4952
aeroplane: pr: load: 629/4952
aeroplane: pr: load: 794/4952
aeroplane: pr: load: 959/4952
aeroplane: pr: load: 1120/4952
aeroplane: pr: load: 1288/4952
aeroplane: pr: load: 1446/4952
aeroplane: pr: load: 1610/4952
aeroplane: pr: load: 1777/4952
aeroplane: pr: load: 1931/4952
aeroplane: pr: load: 2101/4952
aeroplane: pr: load: 2259/4952
aeroplane: pr: load: 2435/4952
aeroplane: pr: load: 2545/4952
aeroplane: pr: load: 2700/4952
aeroplane: pr: load: 2863/4952
aeroplane: pr: load: 3022/4952
aeroplane: pr: load: 3181/4952
aeroplane: pr: load: 3331/4952
aeroplane: pr: load: 3496/4952
aeroplane: pr: load: 3662/4952
aeroplane: pr: load: 3817/4952
aeroplane: pr: load: 3979/4952
aeroplane: pr: load: 4136/4952
aeroplane: pr: load: 4293/4952
aeroplane: pr: load: 4465/4952
aeroplane: pr: load: 4633/4952
aeroplane: pr: load: 4791/4952
aeroplane: pr: load: 4950/4952
Error using textread (line 166)
File not found.

Error in VOCevaldet (line 30)
[ids,confidence,b1,b2,b3,b4]=textread(sprintf(VOCopts.detrespath,id,cls),'%s %f %f %f %f %f');

Error in voc_eval>voc_eval_cls (line 36)
[recall, prec, ap] = VOCevaldet(VOCopts, comp_id, cls, true);

Error in voc_eval (line 8)
res(i) = voc_eval_cls(cls, VOCopts, comp_id, output_dir, rm_res);

166 error(message('MATLAB:textread:FileNotFound'));
K>>

Can anyone tell me what the problem is? Thanks.

selective search parameters for generating the .mat files

I am trying to re-generate the .mat files used in the demo (e.g. 000004_boxes.mat).

I have downloaded the selective search IJCV code and used the fast mode to generate boxes for 000004.jpg.

I have saved the generated boxes in a .mat file. However, I see that the default parameters of the IJCV code do not generate the same boxes as 000004_boxes.mat.

I have disabled the fast mode and re-generated the .mat file, but the generated boxes are still not the same (the file sizes differ, and the demo detects only two cars in 000004.jpg).

So I was wondering: what are the correct selective search parameters for generating 000004_boxes.mat?

Thanks!

How to use other methods to get proposals?

Instead of selective search, I want to use other methods to generate bounding-box proposals. What is the proper format I should use to feed the bounding boxes to the net when training it? Thanks!
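From reading tools/demo.py and lib/fast_rcnn/test.py, it looks like the net simply consumes an (N, 4) array of 0-based [x1, y1, x2, y2] boxes per image, so presumably any proposal method works once its output is in that shape? E.g. at test time (my_proposal_method is a hypothetical stand-in):

import numpy as np
from fast_rcnn.test import im_detect

obj_proposals = my_proposal_method(im)                       # hypothetical generator
obj_proposals = np.asarray(obj_proposals, dtype=np.float32)  # (N, 4)
scores, boxes = im_detect(net, im, obj_proposals)            # [x1, y1, x2, y2], 0-based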

Background Dataset for Training a fast-rcnn

Hello,
I have prepared the dataset for my classification, but how should I handle the background class? Do I have to create a random set of background images? And why is it necessary? Are they just acting as negative examples?

Thank you,

Error while running 'make runtest' for the caffe of fast-rcnn

I really appreciate this research, which greatly improves the performance of the original R-CNN. However, something goes wrong when I try to compile the Caffe bundled with fast-rcnn: errors occur while running "make runtest".

[----------] 1 test from ROIPoolingLayerTest/0, where TypeParam = caffe::FloatCPU
[ RUN ] ROIPoolingLayerTest/0.TestGradient
F0505 10:07:28.363382 28868 roi_pooling_layer.cpp:130] Not Implemented Yet
*** Check failure stack trace: ***
@ 0x2b5e374c8daa (unknown)
@ 0x2b5e374c8ce4 (unknown)
@ 0x2b5e374c86e6 (unknown)
@ 0x2b5e374cb687 (unknown)
@ 0x2b5e38e8201c caffe::ROIPoolingLayer<>::Backward_cpu()
@ 0x4b86d4 caffe::GradientChecker<>::CheckGradientSingle()
@ 0x4b9649 caffe::GradientChecker<>::CheckGradientExhaustive()
@ 0x4fd21b caffe::ROIPoolingLayerTest_TestGradient_Test<>::TestBody()
@ 0x73201d testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x729fb1 testing::Test::Run()
@ 0x72a096 testing::TestInfo::Run()
@ 0x72a1d7 testing::TestCase::Run()
@ 0x72a52e testing::internal::UnitTestImpl::RunAllTests()
@ 0x731b9d testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x72960e testing::UnitTest::Run()
@ 0x443b82 main
@ 0x2b5e39ad9ec5 (unknown)
@ 0x448b64 (unknown)
@ (nil) (unknown)

make: *** [runtest] Aborted (core dumped)

I can't figure it out. Any help will be appreciated!

How to reproduce the experiment in the paper

Hello.

I'm new to fast-rcnn, and I'd like to get the same mAP on the VOC2007 test set as in your arXiv paper (http://arxiv.org/abs/1504.08083).
In the paper, mAP = 66.9% is reported for the method called FRCN [ours].

However, when I tried, I got 67.5% by running the command below.
(Of course, I read the "Experiment scripts" instructions in README.md and tried to follow them.)
./experiments/scripts/default_vgg16.sh 0

Is there some mistake or misunderstanding on my part?

Thank you in advance.

Error during make

Hi all.
I hope you can help me.
I get the following error during make.

...
...
CXX src/caffe/layers/concat_layer.cpp
CXX src/caffe/layers/contrastive_loss_layer.cpp
CXX src/caffe/layers/pooling_layer.cpp
CXX src/caffe/layers/cudnn_softmax_layer.cpp
CXX src/caffe/layers/slice_layer.cpp
CXX src/caffe/layers/image_data_layer.cpp
CXX src/caffe/layers/loss_layer.cpp
CXX src/caffe/syncedmem.cpp
NVCC src/caffe/util/math_functions.cu
NVCC src/caffe/util/im2col.cu
NVCC src/caffe/layers/prelu_layer.cu
NVCC src/caffe/layers/relu_layer.cu
src/caffe/layers/prelu_layer.cu(58): error: a host function call cannot be configured
detected during instantiation of "void caffe::PReLULayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]"
(127): here

src/caffe/layers/prelu_layer.cu(92): error: a host function call cannot be configured
detected during instantiation of "void caffe::PReLULayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]"
(127): here

src/caffe/layers/prelu_layer.cu(118): error: a host function call cannot be configured
detected during instantiation of "void caffe::PReLULayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]"
(127): here

src/caffe/layers/prelu_layer.cu(58): error: a host function call cannot be configured
detected during instantiation of "void caffe::PReLULayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]"
(127): here

src/caffe/layers/prelu_layer.cu(92): error: a host function call cannot be configured
detected during instantiation of "void caffe::PReLULayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]"
(127): here

src/caffe/layers/prelu_layer.cu(118): error: a host function call cannot be configured
detected during instantiation of "void caffe::PReLULayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]"
(127): here

6 errors detected in the compilation of "/tmp/tmpxft_000008af_00000000-16_prelu_layer.compute_50.cpp1.ii".
make: *** [.build_release/cuda/src/caffe/layers/prelu_layer.o] Error 1
make: *** Waiting for unfinished jobs....
src/caffe/layers/relu_layer.cu(25): error: a host function call cannot be configured
detected during instantiation of "void caffe::ReLULayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]"
(62): here

src/caffe/layers/relu_layer.cu(55): error: a host function call cannot be configured
detected during instantiation of "void caffe::ReLULayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]"
(62): here

src/caffe/layers/relu_layer.cu(25): error: a host function call cannot be configured
detected during instantiation of "void caffe::ReLULayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]"
(62): here

src/caffe/layers/relu_layer.cu(55): error: a host function call cannot be configured
detected during instantiation of "void caffe::ReLULayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]"
(62): here

4 errors detected in the compilation of "/tmp/tmpxft_000008e4_00000000-16_relu_layer.compute_50.cpp1.ii".
make: *** [.build_release/cuda/src/caffe/layers/relu_layer.o] Error 1

...

I have already tried to solve it but failed.
I get the same error with different versions of CUDA (I'm using 7.0 by default).

PS: With the original version of Caffe everything works.

Thanks

Selective Search Configuration

Hi Ross,

I ran your demo and it works well.
But when I use your selective search MATLAB code from your rcnn repo to generate object proposals for image 000001.jpg, I can only get 2393 proposals, which is not the same as your .mat file (2880 proposals). Using these proposals, only 2 cars are detected.
So would you please tell me your configuration of the selective search method?
Or how can I get 2880 proposals for image 000001.jpg?
Thanks.

How to use negative examples when training

Hi, I tried training on INRIA Person, but it does not contain annotation files for the negative examples, so I just set the objs in _load_pascal_annotation to be empty. Is that okay? Thanks!
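i.e. an entry shaped like this (key names as in pascal_voc.py):

import numpy as np
import scipy.sparse

num_classes = 2  # background + person
negative_entry = {
    'boxes': np.zeros((0, 4), dtype=np.uint16),
    'gt_classes': np.zeros((0,), dtype=np.int32),
    'gt_overlaps': scipy.sparse.csr_matrix(
        np.zeros((0, num_classes), dtype=np.float32)),
    'flipped': False,
}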

How to train rcnn for GoogLeNet

I have used the following proto file: https://github.com/BVLC/caffe/blob/master/models/bvlc_googlenet/train_val.prototxt

and created this train.prototxt for 33 output classes (including 1 for background). But as soon as I run tools/train_net.py, I get the following error:

insert_splits.cpp:35] Unknown blob input label to layer 1

Can you please have a quick look at the modified train.prototxt provided below:

name: "GoogleNet"
layer {
name: 'data'
type: 'Python'
top: 'data'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_loss_weights'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 33"
}
}
layer {
name: "conv1/7x7_s2"
type: "Convolution"
bottom: "data"
top: "conv1/7x7_s2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 3
kernel_size: 7
stride: 2
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "conv1/relu_7x7"
type: "ReLU"
bottom: "conv1/7x7_s2"
top: "conv1/7x7_s2"
}
layer {
name: "pool1/3x3_s2"
type: "Pooling"
bottom: "conv1/7x7_s2"
top: "pool1/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "pool1/norm1"
type: "LRN"
bottom: "pool1/3x3_s2"
top: "pool1/norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2/3x3_reduce"
type: "Convolution"
bottom: "pool1/norm1"
top: "conv2/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "conv2/relu_3x3_reduce"
type: "ReLU"
bottom: "conv2/3x3_reduce"
top: "conv2/3x3_reduce"
}
layer {
name: "conv2/3x3"
type: "Convolution"
bottom: "conv2/3x3_reduce"
top: "conv2/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "conv2/relu_3x3"
type: "ReLU"
bottom: "conv2/3x3"
top: "conv2/3x3"
}
layer {
name: "conv2/norm2"
type: "LRN"
bottom: "conv2/3x3"
top: "conv2/norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2/3x3_s2"
type: "Pooling"
bottom: "conv2/norm2"
top: "pool2/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "inception_3a/1x1"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_1x1"
type: "ReLU"
bottom: "inception_3a/1x1"
top: "inception_3a/1x1"
}
layer {
name: "inception_3a/3x3_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3_reduce"
}
layer {
name: "inception_3a/3x3"
type: "Convolution"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3"
type: "ReLU"
bottom: "inception_3a/3x3"
top: "inception_3a/3x3"
}
layer {
name: "inception_3a/5x5_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5_reduce"
}
layer {
name: "inception_3a/5x5"
type: "Convolution"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5"
type: "ReLU"
bottom: "inception_3a/5x5"
top: "inception_3a/5x5"
}
layer {
name: "inception_3a/pool"
type: "Pooling"
bottom: "pool2/3x3_s2"
top: "inception_3a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_3a/pool_proj"
type: "Convolution"
bottom: "inception_3a/pool"
top: "inception_3a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_pool_proj"
type: "ReLU"
bottom: "inception_3a/pool_proj"
top: "inception_3a/pool_proj"
}
layer {
name: "inception_3a/output"
type: "Concat"
bottom: "inception_3a/1x1"
bottom: "inception_3a/3x3"
bottom: "inception_3a/5x5"
bottom: "inception_3a/pool_proj"
top: "inception_3a/output"
}
layer {
name: "inception_3b/1x1"
type: "Convolution"
bottom: "inception_3a/output"
top: "inception_3b/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_1x1"
type: "ReLU"
bottom: "inception_3b/1x1"
top: "inception_3b/1x1"
}
layer {
name: "inception_3b/3x3_reduce"
type: "Convolution"
bottom: "inception_3a/output"
top: "inception_3b/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3b/3x3_reduce"
top: "inception_3b/3x3_reduce"
}
layer {
name: "inception_3b/3x3"
type: "Convolution"
bottom: "inception_3b/3x3_reduce"
top: "inception_3b/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_3x3"
type: "ReLU"
bottom: "inception_3b/3x3"
top: "inception_3b/3x3"
}
layer {
name: "inception_3b/5x5_reduce"
type: "Convolution"
bottom: "inception_3a/output"
top: "inception_3b/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3b/5x5_reduce"
top: "inception_3b/5x5_reduce"
}
layer {
name: "inception_3b/5x5"
type: "Convolution"
bottom: "inception_3b/5x5_reduce"
top: "inception_3b/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_5x5"
type: "ReLU"
bottom: "inception_3b/5x5"
top: "inception_3b/5x5"
}
layer {
name: "inception_3b/pool"
type: "Pooling"
bottom: "inception_3a/output"
top: "inception_3b/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_3b/pool_proj"
type: "Convolution"
bottom: "inception_3b/pool"
top: "inception_3b/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_pool_proj"
type: "ReLU"
bottom: "inception_3b/pool_proj"
top: "inception_3b/pool_proj"
}
layer {
name: "inception_3b/output"
type: "Concat"
bottom: "inception_3b/1x1"
bottom: "inception_3b/3x3"
bottom: "inception_3b/5x5"
bottom: "inception_3b/pool_proj"
top: "inception_3b/output"
}
layer {
name: "pool3/3x3_s2"
type: "Pooling"
bottom: "inception_3b/output"
top: "pool3/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "inception_4a/1x1"
type: "Convolution"
bottom: "pool3/3x3_s2"
top: "inception_4a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_1x1"
type: "ReLU"
bottom: "inception_4a/1x1"
top: "inception_4a/1x1"
}
layer {
name: "inception_4a/3x3_reduce"
type: "Convolution"
bottom: "pool3/3x3_s2"
top: "inception_4a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4a/3x3_reduce"
top: "inception_4a/3x3_reduce"
}
layer {
name: "inception_4a/3x3"
type: "Convolution"
bottom: "inception_4a/3x3_reduce"
top: "inception_4a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 208
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_3x3"
type: "ReLU"
bottom: "inception_4a/3x3"
top: "inception_4a/3x3"
}
layer {
name: "inception_4a/5x5_reduce"
type: "Convolution"
bottom: "pool3/3x3_s2"
top: "inception_4a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4a/5x5_reduce"
top: "inception_4a/5x5_reduce"
}
layer {
name: "inception_4a/5x5"
type: "Convolution"
bottom: "inception_4a/5x5_reduce"
top: "inception_4a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 48
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_5x5"
type: "ReLU"
bottom: "inception_4a/5x5"
top: "inception_4a/5x5"
}
layer {
name: "inception_4a/pool"
type: "Pooling"
bottom: "pool3/3x3_s2"
top: "inception_4a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4a/pool_proj"
type: "Convolution"
bottom: "inception_4a/pool"
top: "inception_4a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_pool_proj"
type: "ReLU"
bottom: "inception_4a/pool_proj"
top: "inception_4a/pool_proj"
}
layer {
name: "inception_4a/output"
type: "Concat"
bottom: "inception_4a/1x1"
bottom: "inception_4a/3x3"
bottom: "inception_4a/5x5"
bottom: "inception_4a/pool_proj"
top: "inception_4a/output"
}
layer {
name: "loss1/ave_pool"
type: "Pooling"
bottom: "inception_4a/output"
top: "loss1/ave_pool"
pooling_param {
pool: AVE
kernel_size: 5
stride: 3
}
}
layer {
name: "loss1/conv"
type: "Convolution"
bottom: "loss1/ave_pool"
top: "loss1/conv"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "loss1/relu_conv"
type: "ReLU"
bottom: "loss1/conv"
top: "loss1/conv"
}
layer {
name: "loss1/fc"
type: "InnerProduct"
bottom: "loss1/conv"
top: "loss1/fc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "loss1/relu_fc"
type: "ReLU"
bottom: "loss1/fc"
top: "loss1/fc"
}
layer {
name: "loss1/drop_fc"
type: "Dropout"
bottom: "loss1/fc"
top: "loss1/fc"
dropout_param {
dropout_ratio: 0.7
}
}
layer {
name: "loss1/classifier"
type: "InnerProduct"
bottom: "loss1/fc"
top: "loss1/classifier"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1000
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss1/loss"
type: "SoftmaxWithLoss"
bottom: "loss1/classifier"
bottom: "label"
top: "loss1/loss1"
loss_weight: 0.3
}
layer {
name: "loss1/top-1"
type: "Accuracy"
bottom: "loss1/classifier"
bottom: "label"
top: "loss1/top-1"
include {
phase: TEST
}
}
layer {
name: "loss1/top-5"
type: "Accuracy"
bottom: "loss1/classifier"
bottom: "label"
top: "loss1/top-5"
include {
phase: TEST
}
accuracy_param {
top_k: 5
}
}
layer {
name: "inception_4b/1x1"
type: "Convolution"
bottom: "inception_4a/output"
top: "inception_4b/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 160
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_1x1"
type: "ReLU"
bottom: "inception_4b/1x1"
top: "inception_4b/1x1"
}
layer {
name: "inception_4b/3x3_reduce"
type: "Convolution"
bottom: "inception_4a/output"
top: "inception_4b/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 112
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4b/3x3_reduce"
top: "inception_4b/3x3_reduce"
}
layer {
name: "inception_4b/3x3"
type: "Convolution"
bottom: "inception_4b/3x3_reduce"
top: "inception_4b/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 224
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_3x3"
type: "ReLU"
bottom: "inception_4b/3x3"
top: "inception_4b/3x3"
}
layer {
name: "inception_4b/5x5_reduce"
type: "Convolution"
bottom: "inception_4a/output"
top: "inception_4b/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4b/5x5_reduce"
top: "inception_4b/5x5_reduce"
}
layer {
name: "inception_4b/5x5"
type: "Convolution"
bottom: "inception_4b/5x5_reduce"
top: "inception_4b/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_5x5"
type: "ReLU"
bottom: "inception_4b/5x5"
top: "inception_4b/5x5"
}
layer {
name: "inception_4b/pool"
type: "Pooling"
bottom: "inception_4a/output"
top: "inception_4b/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4b/pool_proj"
type: "Convolution"
bottom: "inception_4b/pool"
top: "inception_4b/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_pool_proj"
type: "ReLU"
bottom: "inception_4b/pool_proj"
top: "inception_4b/pool_proj"
}
layer {
name: "inception_4b/output"
type: "Concat"
bottom: "inception_4b/1x1"
bottom: "inception_4b/3x3"
bottom: "inception_4b/5x5"
bottom: "inception_4b/pool_proj"
top: "inception_4b/output"
}
layer {
name: "inception_4c/1x1"
type: "Convolution"
bottom: "inception_4b/output"
top: "inception_4c/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_1x1"
type: "ReLU"
bottom: "inception_4c/1x1"
top: "inception_4c/1x1"
}
layer {
name: "inception_4c/3x3_reduce"
type: "Convolution"
bottom: "inception_4b/output"
top: "inception_4c/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4c/3x3_reduce"
top: "inception_4c/3x3_reduce"
}
layer {
name: "inception_4c/3x3"
type: "Convolution"
bottom: "inception_4c/3x3_reduce"
top: "inception_4c/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_3x3"
type: "ReLU"
bottom: "inception_4c/3x3"
top: "inception_4c/3x3"
}
layer {
name: "inception_4c/5x5_reduce"
type: "Convolution"
bottom: "inception_4b/output"
top: "inception_4c/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4c/5x5_reduce"
top: "inception_4c/5x5_reduce"
}
layer {
name: "inception_4c/5x5"
type: "Convolution"
bottom: "inception_4c/5x5_reduce"
top: "inception_4c/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_5x5"
type: "ReLU"
bottom: "inception_4c/5x5"
top: "inception_4c/5x5"
}
layer {
name: "inception_4c/pool"
type: "Pooling"
bottom: "inception_4b/output"
top: "inception_4c/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4c/pool_proj"
type: "Convolution"
bottom: "inception_4c/pool"
top: "inception_4c/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_pool_proj"
type: "ReLU"
bottom: "inception_4c/pool_proj"
top: "inception_4c/pool_proj"
}
layer {
name: "inception_4c/output"
type: "Concat"
bottom: "inception_4c/1x1"
bottom: "inception_4c/3x3"
bottom: "inception_4c/5x5"
bottom: "inception_4c/pool_proj"
top: "inception_4c/output"
}
layer {
name: "inception_4d/1x1"
type: "Convolution"
bottom: "inception_4c/output"
top: "inception_4d/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 112
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_1x1"
type: "ReLU"
bottom: "inception_4d/1x1"
top: "inception_4d/1x1"
}
layer {
name: "inception_4d/3x3_reduce"
type: "Convolution"
bottom: "inception_4c/output"
top: "inception_4d/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 144
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4d/3x3_reduce"
top: "inception_4d/3x3_reduce"
}
layer {
name: "inception_4d/3x3"
type: "Convolution"
bottom: "inception_4d/3x3_reduce"
top: "inception_4d/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 288
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_3x3"
type: "ReLU"
bottom: "inception_4d/3x3"
top: "inception_4d/3x3"
}
layer {
name: "inception_4d/5x5_reduce"
type: "Convolution"
bottom: "inception_4c/output"
top: "inception_4d/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4d/5x5_reduce"
top: "inception_4d/5x5_reduce"
}
layer {
name: "inception_4d/5x5"
type: "Convolution"
bottom: "inception_4d/5x5_reduce"
top: "inception_4d/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_5x5"
type: "ReLU"
bottom: "inception_4d/5x5"
top: "inception_4d/5x5"
}
layer {
name: "inception_4d/pool"
type: "Pooling"
bottom: "inception_4c/output"
top: "inception_4d/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4d/pool_proj"
type: "Convolution"
bottom: "inception_4d/pool"
top: "inception_4d/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_pool_proj"
type: "ReLU"
bottom: "inception_4d/pool_proj"
top: "inception_4d/pool_proj"
}
layer {
name: "inception_4d/output"
type: "Concat"
bottom: "inception_4d/1x1"
bottom: "inception_4d/3x3"
bottom: "inception_4d/5x5"
bottom: "inception_4d/pool_proj"
top: "inception_4d/output"
}
layer {
name: "loss2/ave_pool"
type: "Pooling"
bottom: "inception_4d/output"
top: "loss2/ave_pool"
pooling_param {
pool: AVE
kernel_size: 5
stride: 3
}
}
layer {
name: "loss2/conv"
type: "Convolution"
bottom: "loss2/ave_pool"
top: "loss2/conv"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "loss2/relu_conv"
type: "ReLU"
bottom: "loss2/conv"
top: "loss2/conv"
}
layer {
name: "loss2/fc"
type: "InnerProduct"
bottom: "loss2/conv"
top: "loss2/fc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "loss2/relu_fc"
type: "ReLU"
bottom: "loss2/fc"
top: "loss2/fc"
}
layer {
name: "loss2/drop_fc"
type: "Dropout"
bottom: "loss2/fc"
top: "loss2/fc"
dropout_param {
dropout_ratio: 0.7
}
}
layer {
name: "loss2/classifier"
type: "InnerProduct"
bottom: "loss2/fc"
top: "loss2/classifier"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1000
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss2/loss"
type: "SoftmaxWithLoss"
bottom: "loss2/classifier"
bottom: "label"
top: "loss2/loss1"
loss_weight: 0.3
}
layer {
name: "loss2/top-1"
type: "Accuracy"
bottom: "loss2/classifier"
bottom: "label"
top: "loss2/top-1"
include {
phase: TEST
}
}
layer {
name: "loss2/top-5"
type: "Accuracy"
bottom: "loss2/classifier"
bottom: "label"
top: "loss2/top-5"
include {
phase: TEST
}
accuracy_param {
top_k: 5
}
}
layer {
name: "inception_4e/1x1"
type: "Convolution"
bottom: "inception_4d/output"
top: "inception_4e/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_1x1"
type: "ReLU"
bottom: "inception_4e/1x1"
top: "inception_4e/1x1"
}
layer {
name: "inception_4e/3x3_reduce"
type: "Convolution"
bottom: "inception_4d/output"
top: "inception_4e/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 160
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4e/3x3_reduce"
top: "inception_4e/3x3_reduce"
}
layer {
name: "inception_4e/3x3"
type: "Convolution"
bottom: "inception_4e/3x3_reduce"
top: "inception_4e/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 320
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_3x3"
type: "ReLU"
bottom: "inception_4e/3x3"
top: "inception_4e/3x3"
}
layer {
name: "inception_4e/5x5_reduce"
type: "Convolution"
bottom: "inception_4d/output"
top: "inception_4e/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4e/5x5_reduce"
top: "inception_4e/5x5_reduce"
}
layer {
name: "inception_4e/5x5"
type: "Convolution"
bottom: "inception_4e/5x5_reduce"
top: "inception_4e/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_5x5"
type: "ReLU"
bottom: "inception_4e/5x5"
top: "inception_4e/5x5"
}
layer {
name: "inception_4e/pool"
type: "Pooling"
bottom: "inception_4d/output"
top: "inception_4e/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4e/pool_proj"
type: "Convolution"
bottom: "inception_4e/pool"
top: "inception_4e/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_pool_proj"
type: "ReLU"
bottom: "inception_4e/pool_proj"
top: "inception_4e/pool_proj"
}
layer {
name: "inception_4e/output"
type: "Concat"
bottom: "inception_4e/1x1"
bottom: "inception_4e/3x3"
bottom: "inception_4e/5x5"
bottom: "inception_4e/pool_proj"
top: "inception_4e/output"
}
layer {
name: "pool4/3x3_s2"
type: "Pooling"
bottom: "inception_4e/output"
top: "pool4/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "inception_5a/1x1"
type: "Convolution"
bottom: "pool4/3x3_s2"
top: "inception_5a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_1x1"
type: "ReLU"
bottom: "inception_5a/1x1"
top: "inception_5a/1x1"
}
layer {
name: "inception_5a/3x3_reduce"
type: "Convolution"
bottom: "pool4/3x3_s2"
top: "inception_5a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 160
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_5a/3x3_reduce"
top: "inception_5a/3x3_reduce"
}
layer {
name: "inception_5a/3x3"
type: "Convolution"
bottom: "inception_5a/3x3_reduce"
top: "inception_5a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 320
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_3x3"
type: "ReLU"
bottom: "inception_5a/3x3"
top: "inception_5a/3x3"
}
layer {
name: "inception_5a/5x5_reduce"
type: "Convolution"
bottom: "pool4/3x3_s2"
top: "inception_5a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_5a/5x5_reduce"
top: "inception_5a/5x5_reduce"
}
layer {
name: "inception_5a/5x5"
type: "Convolution"
bottom: "inception_5a/5x5_reduce"
top: "inception_5a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_5x5"
type: "ReLU"
bottom: "inception_5a/5x5"
top: "inception_5a/5x5"
}
layer {
name: "inception_5a/pool"
type: "Pooling"
bottom: "pool4/3x3_s2"
top: "inception_5a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_5a/pool_proj"
type: "Convolution"
bottom: "inception_5a/pool"
top: "inception_5a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_pool_proj"
type: "ReLU"
bottom: "inception_5a/pool_proj"
top: "inception_5a/pool_proj"
}
layer {
name: "inception_5a/output"
type: "Concat"
bottom: "inception_5a/1x1"
bottom: "inception_5a/3x3"
bottom: "inception_5a/5x5"
bottom: "inception_5a/pool_proj"
top: "inception_5a/output"
}
layer {
name: "inception_5b/1x1"
type: "Convolution"
bottom: "inception_5a/output"
top: "inception_5b/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_1x1"
type: "ReLU"
bottom: "inception_5b/1x1"
top: "inception_5b/1x1"
}
layer {
name: "inception_5b/3x3_reduce"
type: "Convolution"
bottom: "inception_5a/output"
top: "inception_5b/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_5b/3x3_reduce"
top: "inception_5b/3x3_reduce"
}
layer {
name: "inception_5b/3x3"
type: "Convolution"
bottom: "inception_5b/3x3_reduce"
top: "inception_5b/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_3x3"
type: "ReLU"
bottom: "inception_5b/3x3"
top: "inception_5b/3x3"
}
layer {
name: "inception_5b/5x5_reduce"
type: "Convolution"
bottom: "inception_5a/output"
top: "inception_5b/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 48
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_5b/5x5_reduce"
top: "inception_5b/5x5_reduce"
}
layer {
name: "inception_5b/5x5"
type: "Convolution"
bottom: "inception_5b/5x5_reduce"
top: "inception_5b/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_5x5"
type: "ReLU"
bottom: "inception_5b/5x5"
top: "inception_5b/5x5"
}
layer {
name: "inception_5b/pool"
type: "Pooling"
bottom: "inception_5a/output"
top: "inception_5b/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_5b/pool_proj"
type: "Convolution"
bottom: "inception_5b/pool"
top: "inception_5b/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_pool_proj"
type: "ReLU"
bottom: "inception_5b/pool_proj"
top: "inception_5b/pool_proj"
}
layer {
name: "inception_5b/output"
type: "Concat"
bottom: "inception_5b/1x1"
bottom: "inception_5b/3x3"
bottom: "inception_5b/5x5"
bottom: "inception_5b/pool_proj"
top: "inception_5b/output"
}
layer {
name: "pool5/7x7_s1"
type: "Pooling"
bottom: "inception_5b/output"
top: "pool5/7x7_s1"
pooling_param {
pool: AVE
kernel_size: 7
stride: 1
}
}
layer {
name: "pool5/drop_7x7_s1"
type: "Dropout"
bottom: "pool5/7x7_s1"
top: "pool5/7x7_s1"
dropout_param {
dropout_ratio: 0.4
}
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "pool5/7x7_s1"
top: "cls_score"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 33
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "pool5/7x7_s1"
top: "bbox_pred"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 132
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels"
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets"
bottom: "bbox_loss_weights"
top: "loss_bbox"
loss_weight: 1
}
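
A note on the two heads above: they are coupled. cls_score's num_output is the number of classes including background (33 here), and bbox_pred's num_output must be four regression targets per class, i.e. 4 x 33 = 132. A minimal sanity check in Python, assuming you are adapting these values to a new dataset:

# Hedged sketch: keep the two head sizes consistent when changing datasets.
num_classes = 33                # cls_score num_output (incl. background)
bbox_outputs = 4 * num_classes  # bbox_pred num_output: (dx, dy, dw, dh) per class
assert bbox_outputs == 132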

Combined 2007+2012 trainval code

To replicate the training of the networks on 2007+2012 trainval, what part of the code must be modified? Does the current code support this?
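
For what it's worth, here is a hedged sketch of one way this could be wired up. It leans on the get_training_roidb/train_net helpers that tools/train_net.py already imports, and assumes voc_2012_trainval is registered in datasets.factory; a roidb is a plain Python list of per-image dicts, so the two sets can simply be concatenated before training:

from fast_rcnn.train import get_training_roidb, train_net
from datasets.factory import get_imdb

# Merge the 2007 and 2012 trainval roidbs into one training set.
roidb = (get_training_roidb(get_imdb('voc_2007_trainval')) +
         get_training_roidb(get_imdb('voc_2012_trainval')))
train_net('models/VGG16/solver.prototxt', roidb, 'output/combined',
          pretrained_model='data/imagenet_models/VGG16.v2.caffemodel',
          max_iters=40000)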

Error When Training

Hi, when I run the command:

./tools/train_net.py --gpu 0 --solver models/VGG16/solver.prototxt
--weights data/imagenet_models/VGG16.v2.caffemodel

the output is:

Traceback (most recent call last):
File "./tools/test_net.py", line 13, in <module>
from fast_rcnn.test import test_net
File "/home/szy/fast-rcnn/tools/../lib/fast_rcnn/__init__.py", line 9, in <module>
from . import train
File "/home/szy/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 10, in <module>
import caffe
File "/home/szy/fast-rcnn/tools/../caffe-fast-rcnn/python/caffe/__init__.py", line 1, in <module>
from .pycaffe import Net, SGDSolver
File "/home/szy/fast-rcnn/tools/../caffe-fast-rcnn/python/caffe/pycaffe.py", line 14, in <module>
import caffe.io
File "/home/szy/fast-rcnn/tools/../caffe-fast-rcnn/python/caffe/io.py", line 3, in <module>
from scipy.ndimage import zoom
ImportError: cannot import name zoom

Perhaps there is a problem with the dependencies?
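
The failing import points at scipy rather than fast-rcnn itself; zoom lives in scipy.ndimage in all recent releases, so an old or broken scipy install is the usual suspect. A quick check from the same interpreter (a hedged diagnostic, not a fix):

import scipy
print(scipy.__version__)        # an outdated install usually shows up here
from scipy.ndimage import zoom  # should succeed once scipy is (re)installed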

COCO model

Can you provide a model trained on COCO?

nms error

In some cases I get an error when using nms.
The following code is from the demo:

dets = np.hstack((cls_boxes,cls_scores[:, np.newaxis])).astype(np.float32)
keep = nms(dets, NMS_THRESH)
dets = dets[keep, :]

It gave me the following error:
Traceback (most recent call last):

File "", line 7, in <module>
keep = nms(dets, NMS_THRESH)

File "nn/lib/utils/nms.pyx", line 64, in utils.cython_nms.nms (utils/nms.c:2134)

ZeroDivisionError: float division
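
The division inside the Cython nms is by the union area of two boxes, so degenerate proposals with non-positive width or height can produce exactly this ZeroDivisionError. A hedged pre-filter, assuming dets is laid out (x1, y1, x2, y2, score) as in the demo:

import numpy as np

def drop_degenerate(dets):
    # Keep only boxes with positive width and height so the IoU
    # denominator inside nms can never be zero.
    w = dets[:, 2] - dets[:, 0] + 1
    h = dets[:, 3] - dets[:, 1] + 1
    return dets[(w > 0) & (h > 0)]

# keep = nms(drop_degenerate(dets), NMS_THRESH)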

PASCAL VOC 2007 Dataset (x-post)

I've been trying to get this implementation up and running, but am hitting a roadblock with the PASCAL 2007 dataset. It appears as though pascallin.ecs.soton.ac.uk is down and unreachable.

Does anybody have a working mirror, or is anyone willing to host a copy of the data?

If I can get a copy of the data, I will add it to the academictorrents.com tracker so this doesn't happen again.

Edit: I'm still trying to track them all down, but I've added some of the missing VOC dataset files to academictorrents.com: http://academictorrents.com/browse.php?search=voc

Note: this is a repost from rbgirshick/rcnn#48, for visibility.

install CPU mode

Hi,
Can I install fast-rcnn in CPU mode? And in CPU mode, how many times faster is fast-rcnn than the usual CNN approach?

Train a fast-rcnn ConvNet on another dataset

Hi, I am trying to train a fast-rcnn ConvNet on another dataset, and something goes wrong at snapshot time, at this line:
net.params['bbox_pred'][0].data[...] = (net.params['bbox_pred'][0].data * self.bbox_stds[:, np.newaxis])
ValueError: operands could not be broadcast together with shapes (84,4096) (12,1)
I think the difference in the number of classes between my dataset and VOC accounts for the issue. Is it necessary to change some parameters of the pretrained model? If so, how should I change them?
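
The shapes in the error tell the story: the network still has the VOC-sized head (84 = 4 x 21 bbox_pred outputs), while the computed stds correspond to 3 classes (12 = 4 x 3). A hedged illustration, with the class counts inferred from the error rather than known:

import numpy as np

weights = np.zeros((4 * 21, 4096))  # net.params['bbox_pred'][0].data (VOC head)
stds    = np.zeros((4 * 3, 1))      # self.bbox_stds[:, np.newaxis] (3 classes)
# weights * stds -> ValueError: shapes (84,4096) and (12,1) don't broadcast.
# The fix is in the train prototxt, not the pretrained model: set cls_score
# num_output to your class count (incl. background) and bbox_pred num_output
# to 4x that count; layers whose names don't match the pretrained model are
# re-initialized, which is the usual way to resize the heads.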

NMS performance

Hi,
When I run demo.py, NMS (non-maximum suppression) takes quite a lot of time.

I ran demo.py with all 20 classes on my PC (i5 CPU, GTX 970),
which means nms runs 20 times per image.

The results show that nms takes a large portion of the overall detection time.

[ 000004.jpg ]
im_detect : 0.44s
total nms : 0.39s

[ 001551.jpg ]
im_detect : 0.34s
total nms : 0.17s

The nms time will grow as the number of detection classes increases.
Even though the current Cython implementation of nms reduces the elapsed time,
a much faster implementation would be very helpful.

Any suggestions for a faster implementation of nms?
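
Two hedged suggestions. First, thresholding out low-scoring detections before nms shrinks the quadratic overlap computation far more cheaply than a faster nms kernel would. Second, greedy nms is small enough to vectorize in plain numpy, which makes a convenient reference for profiling or for porting elsewhere (a sketch, not a drop-in replacement for utils.cython_nms):

import numpy as np

def nms_py(dets, thresh):
    # Greedy NMS over dets = (N, 5) rows of (x1, y1, x2, y2, score).
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # indices by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top box with every remaining box.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1 + 1) * np.maximum(0.0, yy2 - yy1 + 1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= thresh]  # drop boxes overlapping too much
    return keep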

libprotobuf ERROR google/protobuf/text_format.cc:245

I compiled the code following the guide, but when I try to run the demo (tools/demo.py), this error occurs:

libdc1394 error: Failed to initialize libdc1394
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 392:21: Message type "caffe.LayerParameter" has no field named "roi_pooling_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0501 11:01:21.328140 11559 upgrade_proto.cpp:928] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: models/VGG16/test.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

I think there is something wrong with the prototxt or the pre-trained model.
How can I fix this problem?
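
The "no field named roi_pooling_param" message means the prototxt is being parsed by a Caffe build that doesn't know the RoI pooling layer, i.e. a stock Caffe is being picked up instead of the bundled caffe-fast-rcnn fork. A quick hedged check:

import caffe
# Should print a path under .../caffe-fast-rcnn/python/caffe/; if it points at
# a system-wide Caffe, fix PYTHONPATH and rebuild caffe-fast-rcnn instead.
print(caffe.__file__)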

How to use new snapshotting?

fast-rcnn doesn't take --snapshot as an argument, so I'm not sure how to use a snapshot.

I'm asking because models/VGG16/solver.prototxt says:
"We disable standard caffe solver snapshotting and implement our own snapshot"

Thanks
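
For reference, a hedged sketch of what "our own snapshot" means in practice: the Python-side snapshotting writes ordinary .caffemodel files, so resuming amounts to loading one back as initial weights (the snapshot path below is an assumption, and the iteration count starts over):

import caffe

caffe.set_mode_gpu()
solver = caffe.SGDSolver('models/VGG16/solver.prototxt')
# Loading a snapshot is just loading weights; the solver state is not restored.
solver.net.copy_from(
    'output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_20000.caffemodel')

Equivalently, pass the snapshot back to ./tools/train_net.py via --weights.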

An error in VOCevaldet (line 30)

I really like this project. demo.py and train.py both run for me, but when I execute the command "./tools/test_net.py ...", I hit an error after the features are extracted.
I would very much appreciate some help.

aeroplane: pr: load: 3795/4952
aeroplane: pr: load: 4013/4952
aeroplane: pr: load: 4237/4952
aeroplane: pr: load: 4462/4952
aeroplane: pr: load: 4690/4952
aeroplane: pr: load: 4899/4952
Error using textread (line 165)
File not found.

Error in VOCevaldet (line 30)
[ids,confidence,b1,b2,b3,b4]=textread(sprintf(VOCopts.detrespath,id,cls),'%s %f %f %f %f %f');

Error in voc_eval>voc_eval_cls (line 36)
[recall, prec, ap] = VOCevaldet(VOCopts, comp_id, cls, true);

Error in voc_eval (line 8)
res(i) = voc_eval_cls(cls, VOCopts, comp_id, output_dir, rm_res);

165 error(message('MATLAB:textread:FileNotFound'));

demo.py only supports 20 classes?

I'm trying to use the hybridCNN model from the model zoo with a modified version of the demo code, and for some reason the scores only cover 20 classes. Is there a reason for this? I can't find anything in the code that explains why this might be.
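
The class count isn't decided by demo.py alone: it comes from the num_output of the classification head in the test prototxt (21 for VOC, i.e. 20 classes plus background), and demo.py's CLASSES tuple merely labels those outputs. A hedged way to inspect what a loaded model actually emits (stock VOC paths assumed):

import caffe

net = caffe.Net('models/VGG16/test.prototxt',
                'data/fast_rcnn_models/vgg16_fast_rcnn_iter_40000.caffemodel',
                caffe.TEST)
print(net.blobs['cls_prob'].data.shape)  # trailing dim = number of classes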

Using dlib's selective search

Posting this here, as I feel the audience is most relevant...

The selective search .mat files produced by running a dataset through the original implementation of SS are fairly small, on the order of 100-200 megabytes. I put the ImageNet-200 dataset through dlib's selective search implementation (which, I know, is slightly different from the original), and I'm getting tens of gigabytes of output. I assume the dlib implementation isn't nearly as selective as the original SS MATLAB code. Using dlib SS boxes in fast-rcnn works well.

Anyway, I'm interested in training some fast-rcnn models, but since the SS data needs to fit into memory, the dlib data is not usable as-is. I'd need to modify some files to look up the SS data from disk while training.

Has anyone encountered such an issue? How would you deal with it? Or does anyone know of a good object proposal method that doesn't require MATLAB?
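
One hedged way to deal with the memory issue: shard the dlib proposals into one small file per image and load them lazily from the roidb code instead of holding one giant array in memory; peak usage then scales with the busiest single image rather than the whole dataset. A minimal sketch (the directory layout and helper name are assumptions, not part of fast-rcnn):

import os
import numpy as np

def load_proposals(image_index, proposal_dir='data/dlib_ss'):
    # One (N, 4) array of (x1, y1, x2, y2) boxes saved per image via np.save.
    return np.load(os.path.join(proposal_dir, image_index + '.npy'))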
