Giter Club home page Giter Club logo

sppnet's Introduction

Spatial Pyramid Pooling in Deep Convolutional Networks using tensorflow

New updates

Instead of sppnet, you can use this block of code in Pytorch to train a neural network with variable-sized inputs:

#With these lines of code below, we can memorize the gradient for later updates using pytorch because the
#loss.backward()function accumulates the gradient. After 64 steps, we call optimizer.step() for updating the parameters.
#https://discuss.pytorch.org/t/how-are-optimizer-step-and-loss-backward-related/7350
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=1, num_workers=8, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1, num_workers=8, shuffle=False)
for i, (seqs, labels) in enumerate(train_loader):
	...
	loss = criterion(outputs, labels)
	loss.backward()
	if i % 64 == 0 or i == len(train_loader) - 1:
    		optimizer.step()
    		optimizer.zero_grad()
	...

Descriptions

I implemented a Spatial Pyramid Pooling on top of AlexNet in tensorflow. Then I applied it to 102 Category Flower identification task. I implemented for identification task only. If you are interested in this project, I will continue to develop it in object detection task. Do not hesitate to contact me at [email protected]. :)

More information: https://peace195.github.io/spatial-pyramid-pooling/

Data

102 Category Flower Dataset

Requirements

  • python 2.7
  • tensorflow 1.2
  • pretrained parameters of AlexNet in ImageNet dataset: bvlc_alexnet.npy

Running

$ python alexnet_spp.py

Result

82% accuracy rate (the state-of-the-art is 94%).

Author

Binh Do

sppnet's People

Contributors

peace195 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

sppnet's Issues

About the SPP layer

TypeError: Expected binary or unicode string , got None
in line: spp = tf.reshape(max_pool, [num_sample, -1])
How to resolve it ?

Sppnet input image preprocessing

x_train = tf.image.resize_images(x_train,
[int(size_cluster_keys[it%len(size_cluster_keys)][1]/2),
int(size_cluster_keys[it%len(size_cluster_keys)][0]/2)],
method=1, align_corners=False)

Shouldn't it be like this:
x_train = tf.image.resize_images(x_train,
[int(size_cluster_keys[it%len(size_cluster_keys)][0]/2),
int(size_cluster_keys[it%len(size_cluster_keys)][1]/2)],
method=1, align_corners=False)

Problem about maxpooling kernel size and strides

The implementation of kernel size and strides of maxpooling in your code is this:

 for i in range(len(out_pool_size)):
        h_strd = previous_conv_size[0] / out_pool_size[i]
        w_strd = previous_conv_size[1] / out_pool_size[i]
        h_wid = previous_conv_size[0] - h_strd * out_pool_size[i] + 1
        w_wid = previous_conv_size[1] - w_strd * out_pool_size[i] + 1
        max_pool = tf.nn.max_pool(previous_conv,
                                   ksize=[1,h_wid,w_wid, 1],
                                   strides=[1,h_strd, w_strd,1],
                                   padding='VALID')
        if (i == 0):
            spp = tf.reshape(max_pool, [num_sample, -1])
        else:
            spp = tf.concat(axis=1, values=[spp, tf.reshape(max_pool, [num_sample, -1])])

But the official caffe code is this:
image

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.