peace195 / sppnet

Spatial Pyramid Pooling on top of AlexNet using TensorFlow. New updates for SPPnet in PyTorch.

Home Page: https://peace195.github.io/spatial-pyramid-pooling/

Python 100.00%

tensorflow spp sppnet spp-layer deep-learning cnn alexnet spatial-pyramid-pooling 102-category-flower flower-identification plant-identification spp-net pytorch spp-pytorch

sppnet's Introduction

Spatial Pyramid Pooling in Deep Convolutional Networks using tensorflow

New updates

Instead of sppnet, you can use the following PyTorch code to train a neural network on variable-sized inputs:

# loss.backward() accumulates gradients across calls, so each batch can hold a single
# sample (batch_size=1) and the parameters are only updated every 64 steps with
# optimizer.step(), followed by optimizer.zero_grad() to reset the accumulated gradients.
# https://discuss.pytorch.org/t/how-are-optimizer-step-and-loss-backward-related/7350
import torch

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=1, num_workers=8, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1, num_workers=8, shuffle=False)
for i, (seqs, labels) in enumerate(train_loader):
    ...
    loss = criterion(outputs, labels)
    loss.backward()
    # Update after every 64 samples, and once more for the final partial group.
    if (i + 1) % 64 == 0 or i == len(train_loader) - 1:
        optimizer.step()
        optimizer.zero_grad()
    ...
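The accumulation schedule depends on how the step condition is written. The sketch below is plain Python with no PyTorch, and its helper names are hypothetical. It lists the batch indices at which `optimizer.step()` would fire and how many samples each update accumulates, comparing `i % n == 0` with `(i + 1) % n == 0`.

```python
def update_steps(num_batches, accum_steps, count_from_one=True):
    """Return the batch indices at which optimizer.step() would be called."""
    steps = []
    for i in range(num_batches):
        if count_from_one:
            boundary = (i + 1) % accum_steps == 0
        else:
            boundary = i % accum_steps == 0
        # The final batch always triggers an update so no gradients are dropped.
        if boundary or i == num_batches - 1:
            steps.append(i)
    return steps


def group_sizes(steps):
    """Number of samples whose gradients are accumulated before each update."""
    sizes, previous = [], -1
    for s in steps:
        sizes.append(s - previous)
        previous = s
    return sizes


if __name__ == "__main__":
    for flag in (False, True):
        steps = update_steps(130, 64, count_from_one=flag)
        print(flag, steps, group_sizes(steps))
```

With 130 batches and 64 accumulation steps, `i % 64 == 0` yields groups of 1, 64, 64 and 1 samples, while `(i + 1) % 64 == 0` yields 64, 64 and 2.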

Descriptions

I implemented Spatial Pyramid Pooling on top of AlexNet in TensorFlow and applied it to the 102 Category Flower identification task. The implementation covers identification only. If you are interested in this project, I will continue to develop it for the object detection task. Do not hesitate to contact me at [email protected]. :)

More information: https://peace195.github.io/spatial-pyramid-pooling/
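To illustrate why spatial pyramid pooling yields a fixed-length vector for inputs of any size, here is a minimal pure-Python sketch for a single-channel feature map. It is not the repository's implementation. The function name, the pyramid levels (4, 2, 1), and the adaptive bin boundaries (floor for the start, ceil for the end) are all assumptions made for this example.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive integers."""
    return -(-a // b)


def spp_max_pool(fmap, levels=(4, 2, 1)):
    """Max-pool an H x W map (list of lists) into sum(n * n for n in levels) values."""
    height, width = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin i spans rows [floor(i*H/n), ceil((i+1)*H/n)), so it is never empty.
            r0, r1 = (i * height) // n, ceil_div((i + 1) * height, n)
            for j in range(n):
                c0, c1 = (j * width) // n, ceil_div((j + 1) * width, n)
                pooled.append(max(fmap[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


if __name__ == "__main__":
    small = [[1, 2, 3], [4, 5, 6]]                      # 2 x 3 map
    large = [[r * 9 + c for c in range(9)] for r in range(13)]  # 13 x 9 map
    print(spp_max_pool(small, levels=(2, 1)))
    print(len(spp_max_pool(small)), len(spp_max_pool(large)))
```

Both maps produce 16 + 4 + 1 = 21 values with the default levels, regardless of their height and width. With C channels the vector would have 21 * C entries.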

Data

102 Category Flower Dataset

Requirements

python 2.7
tensorflow 1.2
pretrained parameters of AlexNet in ImageNet dataset: bvlc_alexnet.npy

Running

$ python alexnet_spp.py

Result

82% accuracy rate (the state-of-the-art is 94%).

Author

Binh Do

sppnet's Issues

About the SPP layer

TypeError: Expected binary or unicode string, got None
on the line: spp = tf.reshape(max_pool, [num_sample, -1])
How can I resolve it?

hope to continue to develop it in object detection task

Hello, your work is good. I hope you will continue to develop it for the object detection task.

Sppnet input image preprocessing

x_train = tf.image.resize_images(x_train,
[int(size_cluster_keys[it%len(size_cluster_keys)][1]/2),
int(size_cluster_keys[it%len(size_cluster_keys)][0]/2)],
method=1, align_corners=False)

Shouldn't it be like this:
x_train = tf.image.resize_images(x_train,
[int(size_cluster_keys[it%len(size_cluster_keys)][0]/2),
int(size_cluster_keys[it%len(size_cluster_keys)][1]/2)],
method=1, align_corners=False)
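`tf.image.resize_images` takes its `size` argument as `[new_height, new_width]`, so which of the two snippets is correct depends on how each entry of `size_cluster_keys` is ordered. The sketch below makes that order explicit. The helper name and the `key_order` parameter are hypothetical, and whether the keys are stored as (width, height) or (height, width) is an assumption that has to be checked against the code that builds `size_cluster_keys`.

```python
def half_size_for_resize(key, key_order="wh"):
    """Return [new_height, new_width], each halved, for one size key.

    key_order="wh" means the key is (width, height);
    key_order="hw" means the key is (height, width).
    """
    if key_order == "wh":
        width, height = key
    elif key_order == "hw":
        height, width = key
    else:
        raise ValueError("key_order must be 'wh' or 'hw'")
    # Integer halving, matching int(x / 2) for positive sizes.
    return [height // 2, width // 2]


if __name__ == "__main__":
    # A 500-wide, 300-tall image stored as (width, height).
    print(half_size_for_resize((500, 300), key_order="wh"))  # [150, 250]
```

If the keys are (width, height), indexing `[1]` first and `[0]` second gives the height-first order that `resize_images` expects. If they are (height, width), the indices should be `[0]` then `[1]`.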

Problem about maxpooling kernel size and strides

Your code computes the max-pooling kernel size and strides like this:

for i in range(len(out_pool_size)):
    h_strd = previous_conv_size[0] / out_pool_size[i]
    w_strd = previous_conv_size[1] / out_pool_size[i]
    h_wid = previous_conv_size[0] - h_strd * out_pool_size[i] + 1
    w_wid = previous_conv_size[1] - w_strd * out_pool_size[i] + 1
    max_pool = tf.nn.max_pool(previous_conv,
                              ksize=[1, h_wid, w_wid, 1],
                              strides=[1, h_strd, w_strd, 1],
                              padding='VALID')
    if (i == 0):
        spp = tf.reshape(max_pool, [num_sample, -1])
    else:
        spp = tf.concat(axis=1, values=[spp, tf.reshape(max_pool, [num_sample, -1])])
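To see what the quoted formulas produce, here is a minimal sketch that evaluates them along one dimension and compares them with the rule stated in the SPP-net paper (window = ceil(a/n), stride = floor(a/n)). The helper names are hypothetical, and `//` stands in for Python 2 integer division. The output size assumes 'VALID' padding. It is an illustration only and does not reproduce the Caffe implementation.

```python
def quoted_window_stride(a, n):
    """Window and stride along one dimension, as computed in the snippet above."""
    stride = a // n
    window = a - stride * n + 1
    return window, stride


def paper_window_stride(a, n):
    """Window = ceil(a / n), stride = floor(a / n), as in the SPP-net paper."""
    stride = a // n
    window = -(-a // n)  # integer ceiling division
    return window, stride


def valid_pool_extent(a, window, stride):
    """Output length under 'VALID' padding and the number of input rows covered."""
    out = (a - window) // stride + 1
    covered = (out - 1) * stride + window
    return out, covered


if __name__ == "__main__":
    a = 13  # e.g. a 13 x 13 feature map
    for n in (3, 2, 1):
        for name, rule in (("quoted", quoted_window_stride),
                           ("paper", paper_window_stride)):
            window, stride = rule(a, n)
            print(n, name, (window, stride), valid_pool_extent(a, window, stride))
```

For a = 13 and n = 3, the quoted formulas give window 2 and stride 4, which yields 3 outputs but covers only 10 of the 13 rows. The paper's rule gives window 5 and stride 4, which yields 3 outputs and covers all 13. For n = 1 the quoted formulas pool over a single 1 x 1 window.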

But the official caffe code is this:

Why doesn't the net converge?
