
deep-residual-networks's Issues

How to preprocess images when testing the ResNet-50/101 models?

@KaimingHe
Hello, I am testing your ResNet-50 model, but the accuracy does not look good. For example, the top-5 classification result for 'cat.jpg' is:
Probability 285 0.29% => [n02124075 Egyptian cat]
Probability 277 0.24% => [n02119022 red fox, Vulpes vulpes]
Probability 278 0.12% => [n02119789 kit fox, Vulpes macrotis]
Probability 287 0.10% => [n02127052 lynx, catamount]
Probability 282 0.07% => [n02123159 tiger cat]
I am not sure about this result. Can you provide some info about the image mean that should be subtracted, and about the expected input data range?
Thank you very much.
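If it helps to sanity-check, here is a minimal preprocessing sketch assuming the usual Caffe convention for these released models (BGR channel order, [0, 255] input range, mean subtraction); the mean pixel below is the common ImageNet BGR mean, an assumption rather than the authors' confirmed pipeline:

import numpy as np
from PIL import Image

MEAN_BGR = np.array([104.0, 117.0, 123.0])   # assumed ImageNet mean pixel (BGR)

def load_for_caffe(path):
    img = Image.open(path).convert('RGB').resize((224, 224))
    x = np.asarray(img, dtype=np.float32)    # HxWx3, RGB, values in [0, 255]
    x = x[:, :, ::-1] - MEAN_BGR             # swap RGB -> BGR, subtract the mean
    return x.transpose(2, 0, 1)              # HxWxC -> CxHxW, as Caffe expects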

Training ResNet on the PASCAL VOC dataset

I need a ResNet-50 model trained on the PASCAL VOC dataset. Is it better to crop the training images based on the annotations, or to simply feed the whole images to ResNet for training?

ResNet-50 finetuning: the BN layer's use_global_stats

Hello everyone, I am fine-tuning ResNet-50 on my data, and I have a question about the BN layer parameter use_global_stats.

First, I have learned that this parameter should be false when training and true when testing.
In my train_val.prototxt I set this parameter to false and added an Accuracy layer to see the train accuracy and the test accuracy.
I find that the test accuracy output by train_val.prototxt is very low, while the test accuracy from "caffe/build/tools/caffe test -model test.prototxt -weights xxx.caffemodel" is higher.
I think this is because the first test accuracy is computed by train_val.prototxt, whose use_global_stats is set to false. Which test accuracy is believable? Does anyone else have this question? And why does the official train_val.prototxt not need an Accuracy layer?
Thank you.
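For what it's worth, a hedged sketch (assuming standard BVLC Caffe; file names are illustrative) of generating a test prototxt whose BatchNorm layers all use the stored global statistics:

from caffe.proto import caffe_pb2
from google.protobuf import text_format

net = caffe_pb2.NetParameter()
with open('train_val.prototxt') as f:        # illustrative path
    text_format.Merge(f.read(), net)

for layer in net.layer:
    if layer.type == 'BatchNorm':
        layer.batch_norm_param.use_global_stats = True   # test-time behaviour

with open('test.prototxt', 'w') as f:
    f.write(text_format.MessageToString(net))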

Request for pre-trained version of ResNet-34

Hi! I am a student interested in deep nets (especially ResNet).

I would like to train and test out the Caffe implementation of ResNet-34.
Just in case, do you have a pre-trained model of ResNet-34? If so, I would appreciate it if you could upload the .caffemodel file for sharing.

Thanks in advance. :)

Need ResNet-50 caffemodel

For some reasons, I need to finetune the ResNet-50 caffemodel on my data.
Can anyone give me a download link other than OneDrive or Google Drive?

Batch normalisation with gradient accumulation or multi-GPU

During the initial training, how are the batch normalisation (BN) mean and variance computed? The batch size of 256 is large and has to be split into multiple sub-batches, in which case each sub-batch's mean and variance will be different. I am guessing the procedure below; please let me know if I am wrong (a small numpy illustration follows the list):

  1. conduct training using the sub-batch mean and variance, even though they differ between sub-batches
  2. compute the mean and variance over a very large number of training samples
  3. freeze the updates to the convolutional weights, and update the BN parameters (scale and bias) using the fixed mean and variance from step 2
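To make the sub-batch concern concrete, here is a small numpy illustration (not the repo's code): the per-sub-batch statistics average out to the full-batch mean but are individually noisier, which is why naive gradient accumulation does not reproduce batch-256 BN exactly.

import numpy as np

np.random.seed(0)
batch = np.random.randn(256, 64)             # 256 samples, 64 channels
sub_batches = np.split(batch, 8)             # e.g. 8 sub-batches of 32

full_mean = batch.mean(axis=0)
sub_means = [sb.mean(axis=0) for sb in sub_batches]

print(np.allclose(np.mean(sub_means, axis=0), full_mean))   # True
print(np.abs(sub_means[0] - full_mean).max())               # non-zero gap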

train_val.prototxt and solver.prototxt for the 152-layer ResNet

Hi All,
For the past few weeks I have been trying to train ResNet-152 on BVLC Caffe with the ImageNet 2012 dataset, but my accuracy stays flat the whole time. Can someone share the train_val.prototxt and solver.prototxt? I tried pynetbuilder but somehow couldn't crack it. If someone has been successful, please share these prototxt files.

The test loss is flat

I hope someone can spare several minutes to read the log; it is not long. Thanks in advance.
My training set has 400k samples and the test set has 100k. Besides, I have a question: I set the test batch size to 1, but my test set has 100k samples. Does that mean I should set test_iter: 100000? That seems too big.
my_log.txt
The test loss is greater than the train loss. Is that overfitting? I tried setting dropout to 0.5 but it doesn't work. Can anyone help?
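A quick arithmetic check of the test_iter relationship (the batch sizes below are illustrative): test_iter times the test batch size should cover the test set once, so a larger test batch size shrinks test_iter proportionally.

import math

num_test = 100_000
print(math.ceil(num_test / 1))     # batch size 1  -> test_iter 100000
print(math.ceil(num_test / 50))    # batch size 50 -> test_iter 2000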

Problem when finetuning with ResNet caffemodel

Hi, I was trying to finetune a residual network using the ResNet caffemodel, but it does not work: it reaches UpgradeNetInput() and keeps giving me a segmentation fault.
Is there something wrong with my solver.prototxt or my train_val.sh?

# this is my solver
net: "/v8/eccv/residual/train_val_50.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.1
lr_policy: "step"
average_loss: 40
stepsize: 320000
gamma: 0.96
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0001
snapshot: 10000
snapshot_prefix: "/v8/dummy"
solver_mode: GPU

# this is my train_val.sh
/v8/eccv/caffe/build/tools/caffe train --gpu=all \
    --solver=/v8/eccv/residual/solver.prototxt \
    --weights=/ResNet-50.caffemodel

Sorry about my poor English.

Object detection using ResNet-50 on COCO

I tried ResNet-50 for COCO object detection, but I get poor detection results. I also read the paper Deep Residual Learning for Image Recognition. In the paper you said both ResNet-50 and ResNet-101 were experimented with, but I only see the ResNet-101 results table. How is the performance of ResNet-50 on COCO detection? Thanks!

Not able to build on Debian Jessie

I'm not able to build; it fails with a gcc C++ error. I'm running on Debian Jessie.
Is there a particular version of gcc that works?

Here's the error:

[ 28%] Performing build step for 'caffe_dd'
[ 33%] No install step for 'caffe_dd'
[ 38%] Completed 'caffe_dd'
[ 38%] Built target caffe_dd
Scanning dependencies of target ddetect
[ 42%] Building CXX object src/CMakeFiles/ddetect.dir/deepdetect.cc.o
In file included from /home/anand/deepdetect/build/caffe_dd/src/caffe_dd/include/caffe/caffe.hpp:13:0,
from /home/anand/deepdetect/src/caffeinputconns.h:28,
from /home/anand/deepdetect/src/imginputfileconn.h:343,
from /home/anand/deepdetect/src/services.h:29,
from /home/anand/deepdetect/src/apistrategy.h:26,
from /home/anand/deepdetect/src/deepdetect.h:25,
from /home/anand/deepdetect/src/deepdetect.cc:22:
/home/anand/deepdetect/build/caffe_dd/src/caffe_dd/include/caffe/parallel.hpp:99:35: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
inline const int initial_iter() const { return initial_iter_; }
^
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-4.9/README.Bugs for instructions.
src/CMakeFiles/ddetect.dir/build.make:54: recipe for target 'src/CMakeFiles/ddetect.dir/deepdetect.cc.o' failed
make[2]: *** [src/CMakeFiles/ddetect.dir/deepdetect.cc.o] Error 4
CMakeFiles/Makefile2:110: recipe for target 'src/CMakeFiles/ddetect.dir/all' failed
make[1]: *** [src/CMakeFiles/ddetect.dir/all] Error 2
Makefile:76: recipe for target 'all' failed
make: *** [all] Error 2

My gcc -v says this:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.9/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.9.2-10' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.9.2 (Debian 4.9.2-10)

Python training and test example

@KaimingHe

Thanks for putting this repo together. It's great to see this coming along quickly.

My request: could you please post a simple Python example script for running training and testing on a sample mini dataset (say, 5 object classes of 50 images each)? It would be really great for getting going.

This would straightaway help address all the unknowns around setting the right parameters for training.

It would also help if you could publish a train_val.prototxt.

resnet18 and 34

I am trying to train ResNet-18 and ResNet-34 from scratch on the ImageNet-12 training set. The preprocessing is the same as in the Facebook Torch implementation; however, the top-1 accuracy on the validation set is still very low (49%), quite below the accuracy reported in the paper. I wonder if these ResNet-18 and ResNet-34 models will be released as well.

Best

Deep resnet runs out of memory

Hi all,

Currently I want to use ResNet as the base model for training FCN segmentation, but both ResNet-101 and ResNet-152 run out of memory on a 12 GB Titan GPU. I am wondering how you train this kind of very deep network. I think model parallelism across multiple GPUs is necessary, but I can't find any resources about this kind of implementation.

when the dimensions increase

Hello everyone.
When the dimensions increase, there are two options: (A) extra zero entries are padded for the increased dimensions; (B) 1×1 convolutions are used to match dimensions. For both options, they are performed with a stride of 2.
I'm confused: handling the decrease in spatial size with stride-2 (kernel size 1×1) convolutions discards 75% of the spatial positions, as the sketch below illustrates.
Wouldn't a max/average pooling or a convolution layer with kernel 2×2 and stride 2 retain more information?
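A toy numpy sketch of the 75% figure (the array here is purely illustrative): a 1×1 kernel with stride 2 reads only one in four spatial positions, so the other three never influence the output.

import numpy as np

x = np.arange(16).reshape(4, 4)    # a 4x4 feature map
sampled = x[::2, ::2]              # what a stride-2, 1x1 kernel sees
print(sampled.size / x.size)       # 0.25 -> only 25% of positions are read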

Any help whatsoever is valuable.

Accuracy numbers on the test set?

Does anyone have accuracy numbers on the ImageNet 50k test set for ResNet-18 and ResNet-50? I could find them for the validation set but not for the test set. My numbers are 65.2 and 72.9, lower than what is seen on the validation set. Also, the TF ResNet-50 accuracy is similar to the Caffe validation accuracy, strangely. (https://github.com/tensorflow/models/tree/master/research/slim)

My preprocessing for ResNet: resize to 256, center-crop to 224, ResNet mean subtraction. Strangely, my accuracy is bad if my first resize preserves the aspect ratio and scales the smaller side to 256. Any comments/suggestions are appreciated. Thanks!
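For concreteness, a hedged sketch of the two resize policies being compared (Pillow-based; interpolation and rounding details are assumptions, not the original pipeline):

from PIL import Image

def center_crop_224(img, keep_aspect=True):
    w, h = img.size
    if keep_aspect:                          # scale the shorter side to 256
        s = 256.0 / min(w, h)
        img = img.resize((int(round(w * s)), int(round(h * s))))
    else:                                    # plain resize to 256x256
        img = img.resize((256, 256))
    w, h = img.size
    left, top = (w - 224) // 2, (h - 224) // 2
    return img.crop((left, top, left + 224, top + 224))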

One question about regularization

I am new to deep learning and I am trying to implement ResNet-34 in TensorFlow myself. I want to know if you applied weight decay to the trainable weights in EVERY layer.

Cannot import from _caffe

After I downloaded the code, I tried to run /project/caffe-b590f1d27eb5cbd9bc7b9157d447706407c68682/python/detect.py, but I cannot import from _caffe.
It fails when it reaches this line in pycaffe.py:

from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
    RMSPropSolver, AdaDeltaSolver, AdamSolver

/home/jane/anaconda3/bin/python3.6 /home/jane/pycharm-community-2017.3.2/helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 42569 --file /home/jane/project/caffe-b590f1d27eb5cbd9bc7b9157d447706407c68682/python/detect.py
pydev debugger: process 5970 is connecting

Connected to pydev debugger (build 173.4127.16)
Backend Qt5Agg is interactive backend. Turning interactive mode on.
Traceback (most recent call last):
File "/home/jane/pycharm-community-2017.3.2/helpers/pydev/pydevd.py", line 1668, in
main()
File "/home/jane/pycharm-community-2017.3.2/helpers/pydev/pydevd.py", line 1662, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/jane/pycharm-community-2017.3.2/helpers/pydev/pydevd.py", line 1072, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/jane/pycharm-community-2017.3.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/jane/project/caffe-b590f1d27eb5cbd9bc7b9157d447706407c68682/python/detect.py", line 24, in
import caffe
File "/home/jane/project/caffe-b590f1d27eb5cbd9bc7b9157d447706407c68682/python/caffe/init.py", line 1, in
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver
File "/home/jane/project/caffe-b590f1d27eb5cbd9bc7b9157d447706407c68682/python/caffe/pycaffe.py", line 13, in
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver,
ModuleNotFoundError: No module named 'caffe._caffe'
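One common cause (an assumption; the log alone does not prove it) is that the compiled _caffe extension was never built with "make pycaffe", or is not on the Python path for this interpreter. A quick check:

import sys

# Path taken from the traceback above; adjust to your caffe root.
sys.path.insert(0, '/home/jane/project/caffe-b590f1d27eb5cbd9bc7b9157d447706407c68682/python')
import caffe   # still fails with the same error if _caffe.so is missing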

need resnet-18 caffemodel

I want to do some experiments with a ResNet-18 caffemodel on my own data. Can anyone give me a download link?

Thanks in advance!

Is there a training .prototxt?

I am actually looking for a ResNet training net, or I need to make one myself, especially filling in the weight_filler terms. Any pointers on initialization for the batch normalization layers?

Solver.prototxt and train.prototxt are missing

Hi,

I think that to finetune residual networks on a different dataset using Caffe, solver.prototxt and train.prototxt are needed, but they are missing; only deploy.prototxt has been provided.

It would also be good if you could post pretrained models for the PASCAL VOC benchmark, plus the prototxt files.
Best,
Seyedmajid

Finetuning problem

Hi, thanks for making your code available.

I am trying to finetune my network using your model. Should I run "caffe.bin train" as with standard Caffe, or should I change the command?

Thanks.

Not able to train the network with the AdaGrad solver.

The parameters for the solver are:
net: "train_val.prototxt"

test_iter: 5000

test_interval: 5000

test_initialization: false

display: 1000

average_loss: 40

base_lr: 0.001

lr_policy: "fixed"

power: 2.0

gamma: 0.96

max_iter: 10000

weight_decay: 0.0002

snapshot: 2500

snapshot_prefix: "/disk1/classification/model/image_net_quick"
solver_mode: GPU

device_id: 2

type: "AdaGrad"

But the net is solving using sgd_solver.

The log file shows the following output.

I0428 02:33:17.797746 31652 solver.cpp:280] Learning Rate Policy: fixed
I0428 02:33:18.508793 31652 solver.cpp:228] Iteration 0, loss = 9.23844
I0428 02:33:18.508919 31652 solver.cpp:244] Train net output #0: loss1/loss1 = 20.0148 (* 0.3 = 6.00443 loss)
I0428 02:33:18.508967 31652 solver.cpp:244] Train net output #1: loss2/loss1 = 8.38504 (* 0.3 = 2.51551 loss)
I0428 02:33:18.508993 31652 solver.cpp:244] Train net output #2: loss3/loss3_tune = 0.718495 (* 1 = 0.718495 loss)
I0428 02:33:18.509017 31652 sgd_solver.cpp:106] Iteration 0, lr = 0.001
I0428 02:44:05.604993 31652 solver.cpp:228] Iteration 1000, loss = 0.422412
I0428 02:44:05.605200 31652 solver.cpp:244] Train net output #0: loss1/loss1 = 0.15632 (* 0.3 = 0.0468961 loss)
I0428 02:44:05.605213 31652 solver.cpp:244] Train net output #1: loss2/loss1 = 0.194293 (* 0.3 = 0.0582878 loss)
I0428 02:44:05.605224 31652 solver.cpp:244] Train net output #2: loss3/loss3_tune = 0.182052 (* 1 = 0.182052 loss)
I0428 02:44:05.605235 31652 sgd_solver.cpp:106] Iteration 1000, lr = 0.001

Why is the solver not picking AdaGrad, but instead solving with SGD?

I used the prototxt for CIFAR-10, but there are errors.

This is my error info:
I0523 15:05:40.603726 14782 caffe.cpp:185] Using GPUs 0
I0523 15:05:40.619964 14782 caffe.cpp:190] GPU 0: GeForce GTX 750 Ti
I0523 15:05:40.734799 14782 solver.cpp:48] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.001
display: 100
max_iter: 4000
lr_policy: "fixed"
momentum: 0.9
weight_decay: 0.004
snapshot: 4000
snapshot_prefix: "examples/cifar10/cifar10_batch_norm"
solver_mode: GPU
device_id: 0
net: "examples/cifar10/cifar10_batch_norm_net.prototxt"
snapshot_format: HDF5
I0523 15:05:40.734958 14782 solver.cpp:91] Creating training net from net file: examples/cifar10/cifar10_batch_norm_net.prototxt
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 2:1: Expected identifier.
F0523 15:05:40.735024 14782 upgrade_proto.cpp:79] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: examples/cifar10/cifar10_batch_norm_net.prototxt
*** Check failure stack trace: ***

Would you please help me?

Test mode VRAM requirements with cuDNN

Running in test mode on a GeForce GTX TITAN X (driver version 367.57) with cuDNN v4, batch size 1, and a 224x224 image requires 10740 MB of VRAM. This doesn't seem correct, given that VGGNet etc. require < 4 GB of VRAM in test mode for similar batches. Does storing intermediate gradients require this much memory? Are my results sane?

Training with deep residual network

@KaimingHe Thanks for sharing the network.

I've modified your 50-layer ResNet deploy prototxt for training (8 GPUs, batch size 128, learning rate 0.01); however, it does not converge. Can you suggest any possible causes? Thanks.

I added initial parameters for the convolution and scale layers, for example:

Convolution Layer

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 64
    pad: 3
    kernel_size: 7
    stride: 2
    bias_term: false
    weight_filler {
      type: "msra"
    }
  }
}

Scale Layer

layer {
  name: "conv1_scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 2
    decay_mult: 0
  } 
  scale_param {
    bias_term: true
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

I trained the ResNet on my own dataset (6000 images), which converges with GoogLeNet/AlexNet.


How to compute output volume of CONV1 and POOL1?

As the input size is 224x224 (ignoring channels), conv1 has kernel_size: 7, pad: 3, stride: 2. As I compute it, (224 + 3×2 − 7)/2 + 1 = 112.5, so we use floor to get 112? But pool1 has kernel_size: 3, stride: 2, and (112 − 3)/2 + 1 = 55.5, so we use ceil to get 56? Can anyone help? Thanks!
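A worked check of the size formulas (assuming Caffe's conventions: convolution rounds down, pooling rounds up), which yields exactly 112 and 56:

import math

def conv_out(i, k, p, s):                  # Caffe convolution: floor
    return int(math.floor((i + 2 * p - k) / s)) + 1

def pool_out(i, k, p, s):                  # Caffe pooling: ceil
    return int(math.ceil((i + 2 * p - k) / s)) + 1

print(conv_out(224, 7, 3, 2))   # conv1 -> 112
print(pool_out(112, 3, 0, 2))   # pool1 -> 56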

Preprocessing?

Hi @KaimingHe,
I'm wondering about the exact preprocessing involved. The paper just says you subtract the test-set mean? Do you switch RGB to BGR like in the VGG nets? I assume the input is in [0, 255]? Do you use the same means as VGG?

Experimentation seems to indicate that the same preprocessing as used for VGG works:

import numpy as np

def preprocess(img):
    # img: HxWx3 RGB array with values in [0, 1]
    VGG_MEAN = [103.939, 116.779, 123.68]    # BGR mean pixel
    out = np.copy(img) * 255                 # scale to [0, 255]
    out = out[:, :, [2, 1, 0]]               # swap channels from RGB to BGR
    out[:, :, 0] -= VGG_MEAN[0]
    out[:, :, 1] -= VGG_MEAN[1]
    out[:, :, 2] -= VGG_MEAN[2]
    return out

Segmentation with ResNet

Hey, are there example ResNet-50/101/152 *.prototxt files for image segmentation?

Fine-tuning ResNet-50 on Caffe

When I was fine-tuning Kaiming's pre-trained ResNet-50 model, downloaded from the link given, I got this error:
I0113 18:32:41.627681 22 net.cpp:110] Creating Layer res2a_branch1
I0113 18:32:41.627684 22 net.cpp:448] res2a_branch1 <- pool1_pool1_0_split_0
I0113 18:32:41.627688 22 net.cpp:419] res2a_branch1 -> res2a_branch1
I0113 18:32:41.628553 22 net.cpp:160] Setting up res2a_branch1
I0113 18:32:41.628568 22 net.cpp:167] Top shape: 32 256 56 56 (25690112)
I0113 18:32:41.628571 22 net.cpp:175] Memory required for data: 610140288
F0113 18:32:41.628585 22 net.cpp:179] Check failed: param_size <= num_param_blobs (2 vs. 1) Too many params specified for layer res2a_branch1
*** Check failure stack trace: ***
And this is my prototxt:
layer {
  bottom: "pool1"
  top: "res2a_branch1"
  name: "res2a_branch1"
  type: "Convolution"
  convolution_param {
    num_output: 256
    kernel_size: 1
    pad: 0
    stride: 1
    bias_term: false
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
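A hedged way to spot the mismatch the check is reporting (assuming standard BVLC Caffe; the path is illustrative): with bias_term: false a Convolution layer owns a single weight blob, so a second param { } block, which would configure a bias blob, over-specifies it.

from caffe.proto import caffe_pb2
from google.protobuf import text_format

net = caffe_pb2.NetParameter()
with open('train_val.prototxt') as f:        # illustrative path
    text_format.Merge(f.read(), net)

for layer in net.layer:
    if layer.type == 'Convolution':
        num_blobs = 2 if layer.convolution_param.bias_term else 1
        if len(layer.param) > num_blobs:
            print('too many param blocks:', layer.name)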

Error loading ResNet in Torch

I am trying to load the ResNet model in Torch using loadcaffe:

require 'loadcaffe'

model = loadcaffe.load('ResNet-50-deploy.prototxt', 'ResNet-50-model.caffemodel')

The output in the terminal is this:

Successfully loaded ResNet-50-model.caffemodel

warning: module 'bn_conv1 [type BatchNorm]' not found

warning: module 'scale_conv1 [type Scale]' not found

warning: module 'pool1_pool1_0_split [type Split]' not found

warning: module 'res2a [type Eltwise]' not found

...

conv1: 64 3 7 7

res2a_branch1: 256 64 1 1

Segmentation fault (core dumped)

and the resulting ResNet-50-deploy.prototxt.lua is:

require 'nn'

local model = {}

table.insert(model, {'conv1', nn.SpatialConvolution(3, 64, 7, 7, 2, 2, 3, 3)})

-- warning: module 'bn_conv1 [type BatchNorm]' not found

-- warning: module 'scale_conv1 [type Scale]' not found

table.insert(model, {'conv1_relu', nn.ReLU(true)})

table.insert(model, {'pool1', nn.SpatialMaxPooling(3, 3, 2, 2, 0, 0):ceil()})

-- warning: module 'pool1_pool1_0_split [type Split]' not found

...

-- warning: module 'res5c [type Eltwise]' not found

table.insert(model, {'res5c_relu', nn.ReLU(true)})

table.insert(model, {'pool5', nn.SpatialAveragePooling(7, 7, 1, 1, 0, 0):ceil()})

table.insert(model, {'fc1000', nn.Linear(2048, 1000)})

return model

Probably some layers cannot be recognized by Torch. Has anyone managed to resolve this issue?

Thanks,
Lina

Increase download speed of networks?

This may seem a little out of place, but is there any way you could increase the pretrained model weight download speed? I'm getting 200 KB/s from the MSR mirror, meaning it takes about 30 minutes to download the model. When I host things on AWS, people can download at about 20 MB/s, which is literally 100x faster. Maybe MSR could consider increasing their download speeds?

Using ResNet for object localization and segmentation

Hello.
I have a simple question.
It is said that ResNet took first place in image detection, segmentation, and object localization.
I know how to use it for detection (blobs from the prob layer), but I have no idea how to get bounding boxes out of ResNet, for example. Could someone give me info about how I can do that? Thanks.
P.S. Sorry for my English

Should the convolution layers have biases?

This is so helpful! Looking at the prototxt, I have a very quick question about whether the convolution layers should have non-zero bias terms.

Regarding the bias terms in your convolutions, the paper says that "biases are omitted for simplifying notations", which suggests that there should be biases but you didn't bother writing them down. However, in this published prototxt, I see bias_term: false in the convolution_params (excluding conv1).

Hence I am just looking for clarification, or a few words in general: which was/is the intent? Is there a bug?

Thanks in advance!
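One possible reading (an assumption, not the authors' stated rationale): a convolution bias directly before BatchNorm is redundant, because BN's mean subtraction cancels any constant offset, and the Scale layer's beta re-introduces a learnable one. A tiny numpy check:

import numpy as np

x = np.random.randn(1000)
for b in (0.0, 3.7):                 # with and without a constant "bias"
    z = (x + b) - (x + b).mean()     # BN's mean subtraction
    print(z[:3])                     # identical output regardless of b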

Stupid question?

How come, when I don't use biases at all in any layer, the ResNet still seems to learn very well?

Thanks
