feathercnn's Introduction

Introduction

FeatherCNN is a high-performance lightweight CNN inference library developed by the Tencent AI Platform Department. FeatherCNN originated in our game AI project for King of Glory (Chinese: 王者荣耀), in which we aim to build a neural model for MOBA game AI and run it on mobile devices. FeatherCNN currently targets ARM CPUs. We will extend it to cover other architectures in the near future.

Compared with other libraries, FeatherCNN has the following features:

  • High Performance FeatherCNN delivers state-of-the-art inference performance on a wide range of devices, including mobile phones (iOS/Android), embedded devices (Linux), and ARM-based servers (Linux).

  • Easy Deployment FeatherCNN packs everything into a single code base, free of third-party dependencies, which facilitates deployment on mobile platforms.

  • Featherweight The compiled FeatherCNN library is small-sized (hundreds of KBs).

Please kindly open an issue in this repo for bug reports and enhancement suggestions. We are grateful for user feedback and will actively polish this library.

Citation

FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures (TPDS September 2019, In press, DOI:10.1109/TPDS.2019.2939785)

Clone hints

The FeatherCNN repository has a heavy development history, so please clone only the master branch as follows:

git clone -b master --single-branch https://github.com/tencent/FeatherCNN.git

Detailed Instructions for iOS/Android/Linux

Build From Source

iOS Guide

Android Guide

Android ADB Guide

Usage

Model Format Conversion

FeatherCNN accepts Caffe models. It merges the structure file (.prototxt) and the weight file (.caffemodel) into a single binary model (.feathermodel). The conversion tool requires protobuf, but the library itself does not.

Model Convert Guide.
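The converter invocation, as used in the issue reports later on this page, takes the prototxt path and the caffemodel path as its two arguments; the file names below are placeholders:

./feather_convert_caffe your_model.prototxt your_model.caffemodel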

Runtime Interfaces

The basic user interfaces are listed in feather/net.h. Currently we are using raw pointers to reference data. We may provide more convenient interfaces in the near future.

Before inference, FeatherCNN requires two steps to initialize the network.

feather::Net forward_net(num_threads);
forward_net.InitFromPath(FILE_PATH_TO_FEATHERMODEL);

The net can also be initialized from raw buffers and FILE pointers. Forward computation can then be performed with a raw float* buffer.

forward_net.Forward(PTR_TO_YOUR_INPUT_DATA);

The output can be extracted from the net by blob name. Blob names are kept consistent with the Caffe prototxt.

forward_net.ExtractBlob(PTR_TO_YOUR_OUTPUT_BUFFER, BLOB_NAME);

You can also get a blob's data size by calling

size_t data_size = 0;
forward_net.GetBlobDataSize(&data_size, BLOB_NAME);
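Putting these interfaces together, a complete inference pass can be sketched as follows. The all-caps names are placeholders as above, and the exact allocation contract of ExtractBlob should be checked against feather/net.h:

feather::Net forward_net(num_threads);
forward_net.InitFromPath(FILE_PATH_TO_FEATHERMODEL);
forward_net.Forward(PTR_TO_YOUR_INPUT_DATA);
size_t data_size = 0;
forward_net.GetBlobDataSize(&data_size, BLOB_NAME);
// allocate PTR_TO_YOUR_OUTPUT_BUFFER with room for data_size floats
forward_net.ExtractBlob(PTR_TO_YOUR_OUTPUT_BUFFER, BLOB_NAME);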

Performance Benchmarks

We have tested FeatherCNN on a range of devices; see this page for details.

User Groups

Telegram: https://t.me/FeatherCNN

QQ: 728147343

feathercnn's People

Contributors

mengjintao, turbo0628

feathercnn's Issues

Performance testing using experimental branch on Android

Please, has anyone tried running performance tests using the experimental branch on Android?

I have tried this, but I had to manually update the framework to build for Android; it did not work by just following the "Android ADB guide". Also, I had to change the source in the target to test_txt.cpp instead of test.bin.cpp. What is the difference between these files?

Also, the benchmark results from CPU-only runs using the experimental branch are quite slow.
Mate10 - MobileNet
A73 - 220 ms
A53 - 589 ms

This is based on a loop size of 10 and a thread count of 4.

Finally, when I configured the framework for GPU runs (i.e. setting DeviceType::GPU_CL), the times I obtained were ridiculously fast, ~5 ms for MobileNet.

Please, are there additional steps for running the performance test using GPU_CL config on Android?

Typo, and why such a large input buffer?

The two lines below are from feather/test_txt.cpp:

size_t input_size = 224 * 2224 * 3 ;
float *input = new float[input_size * 20];
  1. Typo: should 2224 be 224?
  2. Why allocate 20 times the input size? Each iteration seems to use only float[input_size]; is there a reason for the 20x buffer?

Build error

Running

./build_scripts/build_linux.sh

fails with:

In file included from /output/FeatherCNN/src/layer_factory.cpp:39:0:
/output/FeatherCNN/src/layers/filter_layer.h: In constructor ‘feather::FilterLayer::FilterLayer(const feather::LayerParameter*, const RuntimeParameter<float>*)’:
/output/FeatherCNN/src/layers/filter_layer.h:29:39: error: ‘const struct feather::LayerParameter’ has no member named ‘filter_param’ num_output = layer_param->filter_param()->num_output();
src/CMakeFiles/feather.dir/build.make:101: recipe for target 'src/CMakeFiles/feather.dir/layer_factory.cpp.o' failed
make[2]: *** [src/CMakeFiles/feather.dir/layer_factory.cpp.o] Error 1
CMakeFiles/Makefile2:87: recipe for target 'src/CMakeFiles/feather.dir/all' failed

Ubuntu 16.04

What are the differences between this framework and ncnn?

Hello, what are the differences and connections between this framework and ncnn? What advantages does it have? Also, I see that ncnn implements int8 quantization, though in a fairly simple way. What are your plans here, and will you come up with new methods?
Looking forward to your reply, thanks!

Model Convert Error

Hi, thank you for your hard work on FeatherCNN. I need help running benchmarks on different networks.

I have seen the benchmark results for different networks (MobileNet, SqueezeNet, GoogLeNet and VGG16):
https://github.com/Tencent/FeatherCNN/wiki/Benchmarks
Can you share the feathermodel files you used, please?

I've tried to convert some Caffe models (prototxt / caffemodel, upgraded) into feathermodel using tools/feather_convert_caffe, and hit an error.
[libprotobuf FATAL /HDD/usr/local/include/google/protobuf/repeated_field.h:1431] CHECK failed: (index) < (current_size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Aborted

Note: MobileNet and SqueezeNet hit the same error.

I've found a way to work around it:
in feather_convert_caffe.cc:217, change
layer_num = net_param.layer_size();
to
layer_num = net_param_prototxt.layer_size();

This makes the conversion work, and the MobileNet benchmark runs without issues.

However, SqueezeNet hits another issue when running the benchmark.
++++++Start Loader++++++
Finished loading from file
feather_benchmark: /root/feather/src/flatbuffers/flatbuffers.h:242: flatbuffers::Vector<T>::return_type flatbuffers::Vector<T>::Get(flatbuffers::uoffset_t) const [with T = flatbuffers::Offset<feather::BlobProto>; flatbuffers::Vector<T>::return_type = const feather::BlobProto*; flatbuffers::uoffset_t = unsigned int]: Assertion `i < size()' failed.
Aborted

I guess this comes from a layer_num mismatch, since I changed layer_num to the prototxt's. But if I hadn't, the conversion to feathermodel would fail in the first place.

Can you shed some light on this, please?

One more question: how do I create an input data file? The repo has input_3x224x224, but shouldn't AlexNet use input_3x227x227? I'm not sure where the input file comes from, so I cannot create one for 227x227. Am I missing something?

Thanks!

build error

./build_scripts/build_android.sh fails to build on macOS (NDK r17b) with the following errors:
make: *** No targets specified and no makefile found. Stop.
make: *** No rule to make target `install'. Stop.
What is the problem, and how can it be solved?

Documentation or examples for ARM usage.

Hi,
first of all, kudos to fantastic developers like you guys... I am very excited to use this library. I plan to use it on an embedded system, but I can't get past the basic compilation of ./build_scripts/build_linux.sh.
It would be very nice if you could point me to a use case or other resource for using the library on an embedded OS. :)

Is this project still maintained?

The build script is broken: there is no .cl directory.
The code fails to compile: there is an ambiguous function call at net.cpp line 288.
In conv_layer.h line 71, this->name should be this->name.c_str().

loadparam

// printf("bottom name %s\n", bottom_name);
// layer->bottoms[j] = new Blob<float>(bottom_name);
std::map<std::string, Blob<float> *>::iterator map_iter = blob_map.find(bottom_name); 
if (( map_iter == blob_map.end()) && (layer->type.compare("Input") != 0))
{
        LOGE("Topology error: bottom blob %s of layer %s type %s not found in map.", bottom_name, layer_name, layer_type);
        return -300;
}

When parsing the network configuration file, each blob in the config is looked up in blob_map, and the function returns an error if it is not found. Shouldn't a new bottom blob be created instead? If it simply returns, can the network parsing complete?

supported layer mismatch between layer_factory.cpp and feather_convert_caffe.cc

These are the layers supported in layer_factory.cpp:
void register_layer_creators()
{
REGISTER_LAYER_CREATOR(Input, GetInputLayer);
REGISTER_LAYER_CREATOR(Convolution, GetConvolutionLayer);
REGISTER_LAYER_CREATOR(DepthwiseConvolution, GetDepthwiseConvolutionLayer);
REGISTER_LAYER_CREATOR(BatchNorm, GetBatchNormLayer);
REGISTER_LAYER_CREATOR(LRN, GetLRNLayer);
REGISTER_LAYER_CREATOR(Concat, GetConcatLayer);
REGISTER_LAYER_CREATOR(Dropout, GetDropoutLayer);
REGISTER_LAYER_CREATOR(ReLU, GetReluLayer);
REGISTER_LAYER_CREATOR(PReLU, GetPReluLayer);
REGISTER_LAYER_CREATOR(Scale, GetScaleLayer);
REGISTER_LAYER_CREATOR(Slice, GetSliceLayer);
REGISTER_LAYER_CREATOR(Pooling, GetPoolingLayer);
REGISTER_LAYER_CREATOR(Eltwise, GetEltwiseLayer);
REGISTER_LAYER_CREATOR(InnerProduct, GetInnerProductLayer);
REGISTER_LAYER_CREATOR(Softmax, GetSoftmaxLayer);
REGISTER_LAYER_CREATOR(Filter, GetFilterLayer);
REGISTER_LAYER_CREATOR(Reshape, GetReshapeLayer);
}
These are the layers supported in feather_convert_caffe.cc:

layer_type.compare("Input")
if (layer_type.compare("Convolution") == 0 || (layer_type.compare("DepthwiseConvolution") == 0))
else if (layer_type.compare("LRN") == 0)
else if (layer_type.compare("Pooling") == 0)
else if (layer_type.compare("Interp") == 0)
else if (layer_type.compare("InnerProduct") == 0)
else if (layer_type.compare("Softmax") == 0)
else if (layer_type.compare("Scale") == 0)
else if (layer_type.compare("Eltwise") == 0)
else if (layer_type.compare("Flatten") == 0)
else if (layer_type.compare("Filter") == 0)

There are a few layers not supported by this parser. Can you confirm that only the layers Input, Convolution, LRN, Pooling, Interp, InnerProduct, Softmax, Scale, Eltwise, Flatten, and Filter are supported?

Dead loop during net init fuse stage

Info:

  1. Conv A (top blob a, bottom blob x)
  2. BatchNormal B (bottom blob a, top blob a)
  3. Scale C (bottom blob a, top blob a)

The case above runs into a dead loop.

Layer type Deconvolution not registered

Failed to call InitFromPath().

Error:

Finished loading from file
Layer type Deconvolution not registered
Layer type Deconvolution not registered
Layer type Sigmoid not registered
bottom name ...
...
Segmentation fault (core dumped)

Can't we use the Deconvolution or Sigmoid layers?

Net::ExtractBlob() error

When running the iOS program, calling Net::ExtractBlob(float** output_ptr, std::string name) to extract the final output reports an error.
The calling code looks like this:
float *p = NULL;
forward_net.ExtractBlob(&p, "fc7");
Is this a problem with how the pointer is initialized?
The implementation of forward_net.ExtractBlob(float** output_ptr, std::string name) allocates memory for the array pointer; why does the function internally do
assert(output_ptr == NULL);

Cannot convert the bvlc_googlenet.caffemodel

Hello, thank you for your contributions to FeatherCNN. I have a question: when I use ./feather_convert_caffe bvlc_googlenet.prototxt bvlc_googlenet.caffemodel to convert bvlc_googlenet.caffemodel, something goes wrong:

Input Num 0
Input Layer
Input dim 10
Input dim 3
Input dim 224
Input dim 224
Layer num 0
Legacy layer num 169
[libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:1522] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Aborted (core dumped)

Is something wrong with my bvlc_googlenet.prototxt? I downloaded the prototxt from https://github.com/BVLC/caffe/blob/master/models/bvlc_googlenet/deploy.prototxt and the caffemodel from
http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel

Ubuntu 16.04 64-bit host
Looking forward to your reply!
Thanks in advance!

InitFromPath never returns

I've converted Yahoo's Open NSFW model to your format. Whenever I try to load the model with InitFromPath, it never returns: it pegs a single CPU core at 100% and just spins. I'm wondering if such a slow startup time is expected with this model (23 MB, a ResNet model).

Running benchmark with SqueezeNet model, segmentation fault occurred

Platform: Hikey960 / Linux Debian 4.4.74 aarch64 GNU/Linux
Model: ./data/squeezenet.feathermodel from (http://hpcc.siat.ac.cn/jintao/feathercnn/)
Input data: ./data/input_3x224x224.txt

Description:
On running the benchmark test with squeezenet.feathermodel, I get a run-time error (segmentation fault).

The issue occurs when a release (master branch) ahead of e8f2d95 is used, i.e. it first appears in the next commit, "add implementation for reshape layer" (d12e42b).

Please see the logs below:

root@debian:~/feather# ./feather_benchmark ./data/squeezenet.feathermodel ./data/input_3x224x224.txt 20 4
++++++Start Loader++++++
Finished loading from file
bottom name fire2/relu_squeeze1x1 ptr 0x557515ffc0
Output shape 128 55 55
bottom name fire3/relu_squeeze1x1 ptr 0x5575160a70
Output shape 128 55 55
bottom name fire4/relu_squeeze1x1 ptr 0x5575161520
Output shape 256 55 55
bottom name fire5/relu_squeeze1x1 ptr 0x5575178da0
Output shape 256 27 27
bottom name fire6/relu_squeeze1x1 ptr 0x55751904e0
Output shape 384 27 27
bottom name fire7/relu_squeeze1x1 ptr 0x5575190f90
Output shape 384 27 27
bottom name fire8/relu_squeeze1x1 ptr 0x5575191a40
Output shape 512 27 27
bottom name fire9/relu_squeeze1x1 ptr 0x557519cf40
Output shape 512 13 13
input size 150528 parts size 150528
Segmentation fault

Evaluation

Hello, why can't I find ./build_scripts/build_linux_test.sh?

Model convert error - libprotobuf

I am trying to convert some Caffe models (prototxt / caffemodel, upgraded) into feathermodel using tools/feather_convert_caffe, and hit this error.

[libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:1514] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (current_size_):
Aborted (core dumped)

Note: I get this error when trying to convert any Caffe model. Also, I am using the experimental branch (commit 5023303).

Please, do you have any ideas for a solution?

Also, is there an input file for 3x227x227?
This is required for some models such as AlexNet. Currently there is only one input file, input_3x224x224.txt. How is this file created?
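For what it's worth, input_3x224x224.txt is presumably a plain-text list of input values, but its exact format is not documented in the repo. Under that assumption (whitespace-separated floats in c*h*w order), a 3x227x227 file could be generated with a small helper like the one below; write_input_file is a hypothetical name, not part of FeatherCNN:

```cpp
#include <cstddef>
#include <fstream>

// Hypothetical helper, not part of FeatherCNN: writes `count`
// whitespace-separated float values to `path`. This assumes the benchmark
// input files are plain-text floats in c*h*w order; verify against the
// repo's input_3x224x224.txt before relying on it.
bool write_input_file(const char* path, std::size_t count, float value)
{
    std::ofstream out(path);
    if (!out)
        return false;
    for (std::size_t i = 0; i < count; ++i)
        out << value << ((i + 1 == count) ? '\n' : ' ');
    return out.good();
}
```

A call such as write_input_file("input_3x227x227.txt", 3 * 227 * 227, 0.5f) would then produce a constant-valued 3x227x227 input; real test data would come from an actual image.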

build_linux.sh

Compiling and Install
cd FeatherCNN
./build_scripts/build_linux.sh

The build_scripts directory no longer contains build_linux.sh.

Is the tool for converting Caffe models feather_convert_caffe or caffe_model_convert?

Building the conversion tools only produced the file feather_convert_caffe, and it does convert Caffe models into feathermodel. Is a FeatherCNN model converted this way correct? Also, does building require protobuf to be installed?
One more question: my Caffe network has 44 layers (of the Convolution, Eltwise, and ReLU types). When calling Forward(float *input) to run the network, layers.size is 33, and there are 11 ReLU layers. Is this parsed layers.size correct?

Model convert error

layer {
name: "input"
type: "Input"
top: "data"
input_param {
shape {
dim: 1
dim: 3
dim: 22
dim: 92
}
}
}

If the input layer prototxt is as above, it reports the error "Blob not setup yet, may be casued by wrong layer order" and aborts.

test error

When I run feather_benchmark like this:
./feather_benchmark ./data/mobilenet.feathermodel ./data/input_3x224x224.txt 20 4
it can't run:
bash: ./feather_benchmark: cannot execute binary file: Exec format error

Running bvlc_googlenet with one thread, segmentation fault occurred

Platform: Raspberry Pi 3 / Linux ubuntu 4.14.37 aarch64 GNU/Linux
Model: bvlc_googlenet in caffe/models
Input data: ./data/input_3x224x224.txt
Description:
Running the model with 2 or more threads, it functions well.

./feather_benchmark ./bvlc_googlenet/bvlc_googlenet.feathermodel ./data/input_3x224x224.txt 20 2
++++++Start Loader++++++
Finished loading from file
-- Loading 143 layers
input num 1 input dim num 4
input_name data (n c h w) (10 3 224 224)
stride 2, 2
_bottom data
setup layer conv1/7x7_s2
_bottom conv1/7x7_s2
setup layer conv1/relu_7x7
_bottom conv1/relu_7x7
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool1/3x3_s2
_bottom pool1/3x3_s2
localsize 5 alpha 0.000100 beta 0.750000 k 1.000000
setup layer pool1/norm1
stride 1, 1
_bottom pool1/norm1
setup layer conv2/3x3_reduce
_bottom conv2/3x3_reduce
setup layer conv2/relu_3x3_reduce
stride 1, 1
_bottom conv2/relu_3x3_reduce
setup layer conv2/3x3
_bottom conv2/3x3
setup layer conv2/relu_3x3
_bottom conv2/relu_3x3
localsize 5 alpha 0.000100 beta 0.750000 k 1.000000
setup layer conv2/norm2
_bottom conv2/norm2
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool2/3x3_s2
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/1x1
_bottom inception_3a/1x1
setup layer inception_3a/relu_1x1
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/3x3_reduce
_bottom inception_3a/3x3_reduce
setup layer inception_3a/relu_3x3_reduce
stride 1, 1
_bottom inception_3a/relu_3x3_reduce
setup layer inception_3a/3x3
_bottom inception_3a/3x3
setup layer inception_3a/relu_3x3
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/5x5_reduce
_bottom inception_3a/5x5_reduce
setup layer inception_3a/relu_5x5_reduce
stride 1, 1
_bottom inception_3a/relu_5x5_reduce
setup layer inception_3a/5x5
_bottom inception_3a/5x5
setup layer inception_3a/relu_5x5
_bottom pool2/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_3a/pool
stride 1, 1
_bottom inception_3a/pool
setup layer inception_3a/pool_proj
_bottom inception_3a/pool_proj
setup layer inception_3a/relu_pool_proj
_bottom inception_3a/relu_1x1
_bottom inception_3a/relu_3x3
_bottom inception_3a/relu_5x5
_bottom inception_3a/relu_pool_proj
setup layer inception_3a/output
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/1x1
_bottom inception_3b/1x1
setup layer inception_3b/relu_1x1
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/3x3_reduce
_bottom inception_3b/3x3_reduce
setup layer inception_3b/relu_3x3_reduce
stride 1, 1
_bottom inception_3b/relu_3x3_reduce
setup layer inception_3b/3x3
_bottom inception_3b/3x3
setup layer inception_3b/relu_3x3
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/5x5_reduce
_bottom inception_3b/5x5_reduce
setup layer inception_3b/relu_5x5_reduce
stride 1, 1
_bottom inception_3b/relu_5x5_reduce
setup layer inception_3b/5x5
_bottom inception_3b/5x5
setup layer inception_3b/relu_5x5
_bottom inception_3a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_3b/pool
stride 1, 1
_bottom inception_3b/pool
setup layer inception_3b/pool_proj
_bottom inception_3b/pool_proj
setup layer inception_3b/relu_pool_proj
_bottom inception_3b/relu_1x1
_bottom inception_3b/relu_3x3
_bottom inception_3b/relu_5x5
_bottom inception_3b/relu_pool_proj
setup layer inception_3b/output
_bottom inception_3b/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool3/3x3_s2
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/1x1
_bottom inception_4a/1x1
setup layer inception_4a/relu_1x1
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/3x3_reduce
_bottom inception_4a/3x3_reduce
setup layer inception_4a/relu_3x3_reduce
stride 1, 1
_bottom inception_4a/relu_3x3_reduce
setup layer inception_4a/3x3
_bottom inception_4a/3x3
setup layer inception_4a/relu_3x3
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/5x5_reduce
_bottom inception_4a/5x5_reduce
setup layer inception_4a/relu_5x5_reduce
stride 1, 1
_bottom inception_4a/relu_5x5_reduce
setup layer inception_4a/5x5
_bottom inception_4a/5x5
setup layer inception_4a/relu_5x5
_bottom pool3/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4a/pool
stride 1, 1
_bottom inception_4a/pool
setup layer inception_4a/pool_proj
_bottom inception_4a/pool_proj
setup layer inception_4a/relu_pool_proj
_bottom inception_4a/relu_1x1
_bottom inception_4a/relu_3x3
_bottom inception_4a/relu_5x5
_bottom inception_4a/relu_pool_proj
setup layer inception_4a/output
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/1x1
_bottom inception_4b/1x1
setup layer inception_4b/relu_1x1
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/3x3_reduce
_bottom inception_4b/3x3_reduce
setup layer inception_4b/relu_3x3_reduce
stride 1, 1
_bottom inception_4b/relu_3x3_reduce
setup layer inception_4b/3x3
_bottom inception_4b/3x3
setup layer inception_4b/relu_3x3
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/5x5_reduce
_bottom inception_4b/5x5_reduce
setup layer inception_4b/relu_5x5_reduce
stride 1, 1
_bottom inception_4b/relu_5x5_reduce
setup layer inception_4b/5x5
_bottom inception_4b/5x5
setup layer inception_4b/relu_5x5
_bottom inception_4a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4b/pool
stride 1, 1
_bottom inception_4b/pool
setup layer inception_4b/pool_proj
_bottom inception_4b/pool_proj
setup layer inception_4b/relu_pool_proj
_bottom inception_4b/relu_1x1
_bottom inception_4b/relu_3x3
_bottom inception_4b/relu_5x5
_bottom inception_4b/relu_pool_proj
setup layer inception_4b/output
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/1x1
_bottom inception_4c/1x1
setup layer inception_4c/relu_1x1
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/3x3_reduce
_bottom inception_4c/3x3_reduce
setup layer inception_4c/relu_3x3_reduce
stride 1, 1
_bottom inception_4c/relu_3x3_reduce
setup layer inception_4c/3x3
_bottom inception_4c/3x3
setup layer inception_4c/relu_3x3
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/5x5_reduce
_bottom inception_4c/5x5_reduce
setup layer inception_4c/relu_5x5_reduce
stride 1, 1
_bottom inception_4c/relu_5x5_reduce
setup layer inception_4c/5x5
_bottom inception_4c/5x5
setup layer inception_4c/relu_5x5
_bottom inception_4b/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4c/pool
stride 1, 1
_bottom inception_4c/pool
setup layer inception_4c/pool_proj
_bottom inception_4c/pool_proj
setup layer inception_4c/relu_pool_proj
_bottom inception_4c/relu_1x1
_bottom inception_4c/relu_3x3
_bottom inception_4c/relu_5x5
_bottom inception_4c/relu_pool_proj
setup layer inception_4c/output
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/1x1
_bottom inception_4d/1x1
setup layer inception_4d/relu_1x1
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/3x3_reduce
_bottom inception_4d/3x3_reduce
setup layer inception_4d/relu_3x3_reduce
stride 1, 1
_bottom inception_4d/relu_3x3_reduce
setup layer inception_4d/3x3
_bottom inception_4d/3x3
setup layer inception_4d/relu_3x3
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/5x5_reduce
_bottom inception_4d/5x5_reduce
setup layer inception_4d/relu_5x5_reduce
stride 1, 1
_bottom inception_4d/relu_5x5_reduce
setup layer inception_4d/5x5
_bottom inception_4d/5x5
setup layer inception_4d/relu_5x5
_bottom inception_4c/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4d/pool
stride 1, 1
_bottom inception_4d/pool
setup layer inception_4d/pool_proj
_bottom inception_4d/pool_proj
setup layer inception_4d/relu_pool_proj
_bottom inception_4d/relu_1x1
_bottom inception_4d/relu_3x3
_bottom inception_4d/relu_5x5
_bottom inception_4d/relu_pool_proj
setup layer inception_4d/output
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/1x1
_bottom inception_4e/1x1
setup layer inception_4e/relu_1x1
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/3x3_reduce
_bottom inception_4e/3x3_reduce
setup layer inception_4e/relu_3x3_reduce
stride 1, 1
_bottom inception_4e/relu_3x3_reduce
setup layer inception_4e/3x3
_bottom inception_4e/3x3
setup layer inception_4e/relu_3x3
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/5x5_reduce
_bottom inception_4e/5x5_reduce
setup layer inception_4e/relu_5x5_reduce
stride 1, 1
_bottom inception_4e/relu_5x5_reduce
setup layer inception_4e/5x5
_bottom inception_4e/5x5
setup layer inception_4e/relu_5x5
_bottom inception_4d/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4e/pool
stride 1, 1
_bottom inception_4e/pool
setup layer inception_4e/pool_proj
_bottom inception_4e/pool_proj
setup layer inception_4e/relu_pool_proj
_bottom inception_4e/relu_1x1
_bottom inception_4e/relu_3x3
_bottom inception_4e/relu_5x5
_bottom inception_4e/relu_pool_proj
setup layer inception_4e/output
_bottom inception_4e/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool4/3x3_s2
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/1x1
_bottom inception_5a/1x1
setup layer inception_5a/relu_1x1
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/3x3_reduce
_bottom inception_5a/3x3_reduce
setup layer inception_5a/relu_3x3_reduce
stride 1, 1
_bottom inception_5a/relu_3x3_reduce
setup layer inception_5a/3x3
_bottom inception_5a/3x3
setup layer inception_5a/relu_3x3
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/5x5_reduce
_bottom inception_5a/5x5_reduce
setup layer inception_5a/relu_5x5_reduce
stride 1, 1
_bottom inception_5a/relu_5x5_reduce
setup layer inception_5a/5x5
_bottom inception_5a/5x5
setup layer inception_5a/relu_5x5
_bottom pool4/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5a/pool
stride 1, 1
_bottom inception_5a/pool
setup layer inception_5a/pool_proj
_bottom inception_5a/pool_proj
setup layer inception_5a/relu_pool_proj
_bottom inception_5a/relu_1x1
_bottom inception_5a/relu_3x3
_bottom inception_5a/relu_5x5
_bottom inception_5a/relu_pool_proj
setup layer inception_5a/output
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/1x1
_bottom inception_5b/1x1
setup layer inception_5b/relu_1x1
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/3x3_reduce
_bottom inception_5b/3x3_reduce
setup layer inception_5b/relu_3x3_reduce
stride 1, 1
_bottom inception_5b/relu_3x3_reduce
setup layer inception_5b/3x3
_bottom inception_5b/3x3
setup layer inception_5b/relu_3x3
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/5x5_reduce
_bottom inception_5b/5x5_reduce
setup layer inception_5b/relu_5x5_reduce
stride 1, 1
_bottom inception_5b/relu_5x5_reduce
setup layer inception_5b/5x5
_bottom inception_5b/5x5
setup layer inception_5b/relu_5x5
_bottom inception_5a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5b/pool
stride 1, 1
_bottom inception_5b/pool
setup layer inception_5b/pool_proj
_bottom inception_5b/pool_proj
setup layer inception_5b/relu_pool_proj
_bottom inception_5b/relu_1x1
_bottom inception_5b/relu_3x3
_bottom inception_5b/relu_5x5
_bottom inception_5b/relu_pool_proj
setup layer inception_5b/output
_bottom inception_5b/output
kernel (7 7) pad (0 0) stride (1 1) global_pooling 0
setup layer pool5/7x7_s1
_bottom pool5/7x7_s1
setup layer pool5/drop_7x7_s1
_bottom pool5/drop_7x7_s1
----BlobInfo----
Shape in nchw (1000 1024 1 1)
----------------
setup layer loss3/classifier
_bottom loss3/classifier
setup layer prob
Output shape 256 28 28
Output shape 480 28 28
Output shape 512 14 14
Output shape 512 14 14
Output shape 512 14 14
Output shape 528 14 14
Output shape 832 14 14
Output shape 832 7 7
Output shape 1024 7 7
input 1024 1 1
----BlobInfo----
Shape in nchw (1 1000 1 1)
----------------
old bottom conv2/relu_3x3 to new bottom conv2/3x3
*old bottom conv2/relu_3x3 to new bottom conv2/3x3
+old bottom conv2/relu_3x3 to new bottom conv2/3x3
Erasing layer 8 conv2/relu_3x3
Layer 8 after erasing: conv2/norm2 type LRN
old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
*old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
+old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
Erasing layer 15 inception_3a/relu_3x3
Layer 15 after erasing: inception_3a/5x5_reduce type Convolution
old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
*old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
+old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
Erasing layer 28 inception_3b/relu_3x3
Layer 28 after erasing: inception_3b/5x5_reduce type Convolution
old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
*old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
+old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
Erasing layer 42 inception_4a/relu_3x3
Layer 42 after erasing: inception_4a/5x5_reduce type Convolution
old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
*old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
+old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
Erasing layer 55 inception_4b/relu_3x3
Layer 55 after erasing: inception_4b/5x5_reduce type Convolution
old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
*old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
+old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
Erasing layer 68 inception_4c/relu_3x3
Layer 68 after erasing: inception_4c/5x5_reduce type Convolution
old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
*old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
+old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
Erasing layer 81 inception_4d/relu_3x3
Layer 81 after erasing: inception_4d/5x5_reduce type Convolution
old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
*old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
+old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
Erasing layer 94 inception_4e/relu_3x3
Layer 94 after erasing: inception_4e/5x5_reduce type Convolution
old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
*old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
+old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
Erasing layer 108 inception_5a/relu_3x3
Layer 108 after erasing: inception_5a/5x5_reduce type Convolution
old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
*old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
+old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
Erasing layer 121 inception_5b/relu_3x3
Layer 121 after erasing: inception_5b/5x5_reduce type Convolution
input size 150528 parts size 150528
Forward
----------Prediction costs 1217.012244ms
Forward
----------Prediction costs 1147.220138ms
Forward
----------Prediction costs 1117.961493ms
Forward
----------Prediction costs 682.398315ms
Forward
----------Prediction costs 682.362640ms
Forward
----------Prediction costs 682.154001ms
Forward
----------Prediction costs 683.279697ms
Forward
----------Prediction costs 683.667342ms
Forward
----------Prediction costs 682.445607ms
Forward
----------Prediction costs 682.233636ms
Forward
----------Prediction costs 682.563834ms
Forward
----------Prediction costs 682.186085ms
Forward
----------Prediction costs 682.851743ms
Forward
----------Prediction costs 682.801224ms
Forward
----------Prediction costs 682.941741ms
Forward
----------Prediction costs 684.093530ms
Forward
----------Prediction costs 682.520608ms
Forward
----------Prediction costs 682.639928ms
Forward
----------Prediction costs 682.725135ms
Forward
----------Prediction costs 682.587327ms
--------Average runtime 730.086001ms------
Warning: common memory not freed before pool destruction. Proceed with free.
Default common pool stat: size 8463360, ptr 2092d760
double free or corruption (!prev)

But with one thread:

++++++Start Loader++++++
Finished loading from file
-- Loading 143 layers
input num 1 input dim num 4
input_name data (n c h w) (10 3 224 224)
stride 2, 2
_bottom data
setup layer conv1/7x7_s2
_bottom conv1/7x7_s2
setup layer conv1/relu_7x7
_bottom conv1/relu_7x7
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool1/3x3_s2
_bottom pool1/3x3_s2
localsize 5 alpha 0.000100 beta 0.750000 k 1.000000
setup layer pool1/norm1
stride 1, 1
_bottom pool1/norm1
setup layer conv2/3x3_reduce
_bottom conv2/3x3_reduce
setup layer conv2/relu_3x3_reduce
stride 1, 1
_bottom conv2/relu_3x3_reduce
setup layer conv2/3x3
_bottom conv2/3x3
setup layer conv2/relu_3x3
_bottom conv2/relu_3x3
localsize 5 alpha 0.000100 beta 0.750000 k 1.000000
setup layer conv2/norm2
_bottom conv2/norm2
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool2/3x3_s2
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/1x1
_bottom inception_3a/1x1
setup layer inception_3a/relu_1x1
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/3x3_reduce
_bottom inception_3a/3x3_reduce
setup layer inception_3a/relu_3x3_reduce
stride 1, 1
_bottom inception_3a/relu_3x3_reduce
setup layer inception_3a/3x3
_bottom inception_3a/3x3
setup layer inception_3a/relu_3x3
stride 1, 1
_bottom pool2/3x3_s2
setup layer inception_3a/5x5_reduce
_bottom inception_3a/5x5_reduce
setup layer inception_3a/relu_5x5_reduce
stride 1, 1
_bottom inception_3a/relu_5x5_reduce
setup layer inception_3a/5x5
_bottom inception_3a/5x5
setup layer inception_3a/relu_5x5
_bottom pool2/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_3a/pool
stride 1, 1
_bottom inception_3a/pool
setup layer inception_3a/pool_proj
_bottom inception_3a/pool_proj
setup layer inception_3a/relu_pool_proj
_bottom inception_3a/relu_1x1
_bottom inception_3a/relu_3x3
_bottom inception_3a/relu_5x5
_bottom inception_3a/relu_pool_proj
setup layer inception_3a/output
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/1x1
_bottom inception_3b/1x1
setup layer inception_3b/relu_1x1
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/3x3_reduce
_bottom inception_3b/3x3_reduce
setup layer inception_3b/relu_3x3_reduce
stride 1, 1
_bottom inception_3b/relu_3x3_reduce
setup layer inception_3b/3x3
_bottom inception_3b/3x3
setup layer inception_3b/relu_3x3
stride 1, 1
_bottom inception_3a/output
setup layer inception_3b/5x5_reduce
_bottom inception_3b/5x5_reduce
setup layer inception_3b/relu_5x5_reduce
stride 1, 1
_bottom inception_3b/relu_5x5_reduce
setup layer inception_3b/5x5
_bottom inception_3b/5x5
setup layer inception_3b/relu_5x5
_bottom inception_3a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_3b/pool
stride 1, 1
_bottom inception_3b/pool
setup layer inception_3b/pool_proj
_bottom inception_3b/pool_proj
setup layer inception_3b/relu_pool_proj
_bottom inception_3b/relu_1x1
_bottom inception_3b/relu_3x3
_bottom inception_3b/relu_5x5
_bottom inception_3b/relu_pool_proj
setup layer inception_3b/output
_bottom inception_3b/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool3/3x3_s2
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/1x1
_bottom inception_4a/1x1
setup layer inception_4a/relu_1x1
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/3x3_reduce
_bottom inception_4a/3x3_reduce
setup layer inception_4a/relu_3x3_reduce
stride 1, 1
_bottom inception_4a/relu_3x3_reduce
setup layer inception_4a/3x3
_bottom inception_4a/3x3
setup layer inception_4a/relu_3x3
stride 1, 1
_bottom pool3/3x3_s2
setup layer inception_4a/5x5_reduce
_bottom inception_4a/5x5_reduce
setup layer inception_4a/relu_5x5_reduce
stride 1, 1
_bottom inception_4a/relu_5x5_reduce
setup layer inception_4a/5x5
_bottom inception_4a/5x5
setup layer inception_4a/relu_5x5
_bottom pool3/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4a/pool
stride 1, 1
_bottom inception_4a/pool
setup layer inception_4a/pool_proj
_bottom inception_4a/pool_proj
setup layer inception_4a/relu_pool_proj
_bottom inception_4a/relu_1x1
_bottom inception_4a/relu_3x3
_bottom inception_4a/relu_5x5
_bottom inception_4a/relu_pool_proj
setup layer inception_4a/output
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/1x1
_bottom inception_4b/1x1
setup layer inception_4b/relu_1x1
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/3x3_reduce
_bottom inception_4b/3x3_reduce
setup layer inception_4b/relu_3x3_reduce
stride 1, 1
_bottom inception_4b/relu_3x3_reduce
setup layer inception_4b/3x3
_bottom inception_4b/3x3
setup layer inception_4b/relu_3x3
stride 1, 1
_bottom inception_4a/output
setup layer inception_4b/5x5_reduce
_bottom inception_4b/5x5_reduce
setup layer inception_4b/relu_5x5_reduce
stride 1, 1
_bottom inception_4b/relu_5x5_reduce
setup layer inception_4b/5x5
_bottom inception_4b/5x5
setup layer inception_4b/relu_5x5
_bottom inception_4a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4b/pool
stride 1, 1
_bottom inception_4b/pool
setup layer inception_4b/pool_proj
_bottom inception_4b/pool_proj
setup layer inception_4b/relu_pool_proj
_bottom inception_4b/relu_1x1
_bottom inception_4b/relu_3x3
_bottom inception_4b/relu_5x5
_bottom inception_4b/relu_pool_proj
setup layer inception_4b/output
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/1x1
_bottom inception_4c/1x1
setup layer inception_4c/relu_1x1
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/3x3_reduce
_bottom inception_4c/3x3_reduce
setup layer inception_4c/relu_3x3_reduce
stride 1, 1
_bottom inception_4c/relu_3x3_reduce
setup layer inception_4c/3x3
_bottom inception_4c/3x3
setup layer inception_4c/relu_3x3
stride 1, 1
_bottom inception_4b/output
setup layer inception_4c/5x5_reduce
_bottom inception_4c/5x5_reduce
setup layer inception_4c/relu_5x5_reduce
stride 1, 1
_bottom inception_4c/relu_5x5_reduce
setup layer inception_4c/5x5
_bottom inception_4c/5x5
setup layer inception_4c/relu_5x5
_bottom inception_4b/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4c/pool
stride 1, 1
_bottom inception_4c/pool
setup layer inception_4c/pool_proj
_bottom inception_4c/pool_proj
setup layer inception_4c/relu_pool_proj
_bottom inception_4c/relu_1x1
_bottom inception_4c/relu_3x3
_bottom inception_4c/relu_5x5
_bottom inception_4c/relu_pool_proj
setup layer inception_4c/output
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/1x1
_bottom inception_4d/1x1
setup layer inception_4d/relu_1x1
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/3x3_reduce
_bottom inception_4d/3x3_reduce
setup layer inception_4d/relu_3x3_reduce
stride 1, 1
_bottom inception_4d/relu_3x3_reduce
setup layer inception_4d/3x3
_bottom inception_4d/3x3
setup layer inception_4d/relu_3x3
stride 1, 1
_bottom inception_4c/output
setup layer inception_4d/5x5_reduce
_bottom inception_4d/5x5_reduce
setup layer inception_4d/relu_5x5_reduce
stride 1, 1
_bottom inception_4d/relu_5x5_reduce
setup layer inception_4d/5x5
_bottom inception_4d/5x5
setup layer inception_4d/relu_5x5
_bottom inception_4c/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4d/pool
stride 1, 1
_bottom inception_4d/pool
setup layer inception_4d/pool_proj
_bottom inception_4d/pool_proj
setup layer inception_4d/relu_pool_proj
_bottom inception_4d/relu_1x1
_bottom inception_4d/relu_3x3
_bottom inception_4d/relu_5x5
_bottom inception_4d/relu_pool_proj
setup layer inception_4d/output
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/1x1
_bottom inception_4e/1x1
setup layer inception_4e/relu_1x1
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/3x3_reduce
_bottom inception_4e/3x3_reduce
setup layer inception_4e/relu_3x3_reduce
stride 1, 1
_bottom inception_4e/relu_3x3_reduce
setup layer inception_4e/3x3
_bottom inception_4e/3x3
setup layer inception_4e/relu_3x3
stride 1, 1
_bottom inception_4d/output
setup layer inception_4e/5x5_reduce
_bottom inception_4e/5x5_reduce
setup layer inception_4e/relu_5x5_reduce
stride 1, 1
_bottom inception_4e/relu_5x5_reduce
setup layer inception_4e/5x5
_bottom inception_4e/5x5
setup layer inception_4e/relu_5x5
_bottom inception_4d/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_4e/pool
stride 1, 1
_bottom inception_4e/pool
setup layer inception_4e/pool_proj
_bottom inception_4e/pool_proj
setup layer inception_4e/relu_pool_proj
_bottom inception_4e/relu_1x1
_bottom inception_4e/relu_3x3
_bottom inception_4e/relu_5x5
_bottom inception_4e/relu_pool_proj
setup layer inception_4e/output
_bottom inception_4e/output
kernel (3 3) pad (0 0) stride (2 2) global_pooling 0
setup layer pool4/3x3_s2
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/1x1
_bottom inception_5a/1x1
setup layer inception_5a/relu_1x1
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/3x3_reduce
_bottom inception_5a/3x3_reduce
setup layer inception_5a/relu_3x3_reduce
stride 1, 1
_bottom inception_5a/relu_3x3_reduce
setup layer inception_5a/3x3
_bottom inception_5a/3x3
setup layer inception_5a/relu_3x3
stride 1, 1
_bottom pool4/3x3_s2
setup layer inception_5a/5x5_reduce
_bottom inception_5a/5x5_reduce
setup layer inception_5a/relu_5x5_reduce
stride 1, 1
_bottom inception_5a/relu_5x5_reduce
setup layer inception_5a/5x5
_bottom inception_5a/5x5
setup layer inception_5a/relu_5x5
_bottom pool4/3x3_s2
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5a/pool
stride 1, 1
_bottom inception_5a/pool
setup layer inception_5a/pool_proj
_bottom inception_5a/pool_proj
setup layer inception_5a/relu_pool_proj
_bottom inception_5a/relu_1x1
_bottom inception_5a/relu_3x3
_bottom inception_5a/relu_5x5
_bottom inception_5a/relu_pool_proj
setup layer inception_5a/output
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/1x1
_bottom inception_5b/1x1
setup layer inception_5b/relu_1x1
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/3x3_reduce
_bottom inception_5b/3x3_reduce
setup layer inception_5b/relu_3x3_reduce
stride 1, 1
_bottom inception_5b/relu_3x3_reduce
setup layer inception_5b/3x3
_bottom inception_5b/3x3
setup layer inception_5b/relu_3x3
stride 1, 1
_bottom inception_5a/output
setup layer inception_5b/5x5_reduce
_bottom inception_5b/5x5_reduce
setup layer inception_5b/relu_5x5_reduce
stride 1, 1
_bottom inception_5b/relu_5x5_reduce
setup layer inception_5b/5x5
_bottom inception_5b/5x5
setup layer inception_5b/relu_5x5
_bottom inception_5a/output
kernel (3 3) pad (1 1) stride (1 1) global_pooling 0
setup layer inception_5b/pool
stride 1, 1
_bottom inception_5b/pool
setup layer inception_5b/pool_proj
_bottom inception_5b/pool_proj
setup layer inception_5b/relu_pool_proj
_bottom inception_5b/relu_1x1
_bottom inception_5b/relu_3x3
_bottom inception_5b/relu_5x5
_bottom inception_5b/relu_pool_proj
setup layer inception_5b/output
_bottom inception_5b/output
kernel (7 7) pad (0 0) stride (1 1) global_pooling 0
setup layer pool5/7x7_s1
_bottom pool5/7x7_s1
setup layer pool5/drop_7x7_s1
_bottom pool5/drop_7x7_s1
----BlobInfo----
Shape in nchw (1000 1024 1 1)
----------------
setup layer loss3/classifier
_bottom loss3/classifier
setup layer prob
Output shape 256 28 28
Output shape 480 28 28
Output shape 512 14 14
Output shape 512 14 14
Output shape 512 14 14
Output shape 528 14 14
Output shape 832 14 14
Output shape 832 7 7
Output shape 1024 7 7
input 1024 1 1
----BlobInfo----
Shape in nchw (1 1000 1 1)
----------------
old bottom conv2/relu_3x3 to new bottom conv2/3x3
*old bottom conv2/relu_3x3 to new bottom conv2/3x3
+old bottom conv2/relu_3x3 to new bottom conv2/3x3
Erasing layer 8 conv2/relu_3x3
Layer 8 after erasing: conv2/norm2 type LRN
old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
*old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
+old bottom inception_3a/relu_3x3 to new bottom inception_3a/3x3
Erasing layer 15 inception_3a/relu_3x3
Layer 15 after erasing: inception_3a/5x5_reduce type Convolution
old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
*old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
+old bottom inception_3b/relu_3x3 to new bottom inception_3b/3x3
Erasing layer 28 inception_3b/relu_3x3
Layer 28 after erasing: inception_3b/5x5_reduce type Convolution
old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
*old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
+old bottom inception_4a/relu_3x3 to new bottom inception_4a/3x3
Erasing layer 42 inception_4a/relu_3x3
Layer 42 after erasing: inception_4a/5x5_reduce type Convolution
old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
*old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
+old bottom inception_4b/relu_3x3 to new bottom inception_4b/3x3
Erasing layer 55 inception_4b/relu_3x3
Layer 55 after erasing: inception_4b/5x5_reduce type Convolution
old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
*old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
+old bottom inception_4c/relu_3x3 to new bottom inception_4c/3x3
Erasing layer 68 inception_4c/relu_3x3
Layer 68 after erasing: inception_4c/5x5_reduce type Convolution
old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
*old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
+old bottom inception_4d/relu_3x3 to new bottom inception_4d/3x3
Erasing layer 81 inception_4d/relu_3x3
Layer 81 after erasing: inception_4d/5x5_reduce type Convolution
old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
*old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
+old bottom inception_4e/relu_3x3 to new bottom inception_4e/3x3
Erasing layer 94 inception_4e/relu_3x3
Layer 94 after erasing: inception_4e/5x5_reduce type Convolution
old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
*old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
+old bottom inception_5a/relu_3x3 to new bottom inception_5a/3x3
Erasing layer 108 inception_5a/relu_3x3
Layer 108 after erasing: inception_5a/5x5_reduce type Convolution
old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
*old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
+old bottom inception_5b/relu_3x3 to new bottom inception_5b/3x3
Erasing layer 121 inception_5b/relu_3x3
Layer 121 after erasing: inception_5b/5x5_reduce type Convolution
input size 150528 parts size 150528
Forward
----------Prediction costs 1138.138452ms
Forward
Segmentation fault

Benchmarking

How do I build feather_benchmark? Could you please help?

Supported layers/operators

Can you provide the full list of layers/operators supported by your engine for each framework (Caffe/TensorFlow)?

Comparison with ncnn?

Hi, I am new to ncnn and FeatherCNN. Could you please explain the differences between these two frameworks? Thank you!
