
stn.pytorch's Introduction

A PyTorch version of the spatial transformer network.

Ported from https://github.com/qassemoquab/stnbhwd following the PyTorch tutorial. It now supports both CPU and GPU. To use the FFI extension, you need to install the cffi package from pip.

Build and test

cd script
./make.sh  # build the CUDA code; don't forget to adjust the -arch argument to match your GPU's compute capability
python build.py
python test.py

There is a demo in test_stn.ipynb

Modules

STN is the spatial transformer module. It takes a B*H*W*D tensor and a B*H*W*2 grid normalized to [-1, 1] as inputs and performs bilinear sampling.

AffineGridGen takes a B*2*3 matrix and generates an affine transformation grid.

CylinderGridGen takes a B*1 theta vector and generates a transformation grid to remap equirectangular images along the x axis.

DenseAffineGridGen takes a B*H*W*6 tensor and applies an affine transformation to each pixel. An example of a convolutional spatial transformer can be found in test_conv_stn.ipynb.

An example of the loss landscape of a simple STN with an L1 loss can be found in the demo.
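
A minimal usage sketch, put together from the demo code in the issues below; it assumes the old Variable-based API this repository was written for, and the exact module names and signatures (STN, AffineGridGenV2) may differ across versions:

import torch
import numpy as np
from torch.autograd import Variable
from modules.stn import STN
from modules.gridgen import AffineGridGenV2

s = STN()                      # bilinear sampler
g = AffineGridGenV2(64, 64)    # generates a 64x64 sampling grid

# B*2*3 affine parameters (here: the identity mapping)
theta = Variable(torch.from_numpy(
    np.array([[[1, 0, 0], [0, 1, 0]]], dtype=np.float32)),
    requires_grad=True)
images = Variable(torch.rand(1, 64, 64, 3))   # B*H*W*D input

grid = g(theta)         # B*H*W*2 grid, normalized to [-1, 1]
out = s(images, grid)   # bilinear sampling of images at the grid locations
print(out.size())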

Train hacks

  • set a learning rate multiplier; 1e-3 or 1e-4 works fine.
  • add an auxiliary loss to regularize the difference of the affine transformation from the identity mapping, to avoid sampling outside the original image (see the sketch below).
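
A hedged sketch of one possible auxiliary loss, assuming the affine parameters theta form a B*2*3 tensor as fed to AffineGridGen; it is written against a recent PyTorch API and is not the repository's own implementation:

import torch

def identity_regularizer(theta, weight=1e-2):
    # theta: (B, 2, 3) affine parameters; weight is a hypothetical multiplier.
    # Penalizes deviation from the identity mapping [[1, 0, 0], [0, 1, 0]].
    identity = torch.tensor([[1.0, 0.0, 0.0],
                             [0.0, 1.0, 0.0]],
                            dtype=theta.dtype, device=theta.device)
    return weight * ((theta - identity) ** 2).mean()

# total_loss = task_loss + identity_regularizer(theta)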

Complex grid demo

STN can handle a complex grid; the open question is how to parameterize the grid.
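
As a rough illustration (not the repository's demo), a dense grid can be built by hand and passed straight to STN, since the sampler only expects a B*H*W*2 grid normalized to [-1, 1]; the sinusoidal warp and the coordinate ordering below are assumptions, so check gridgen.py for the ordering your version uses:

import torch
from torch.autograd import Variable
from modules.stn import STN

H, W = 64, 128
ys = torch.linspace(-1, 1, H).view(H, 1).expand(H, W)
xs = torch.linspace(-1, 1, W).view(1, W).expand(H, W)
# identity grid plus an arbitrary sinusoidal warp along x
grid = torch.stack([ys, xs + 0.1 * torch.sin(3.1416 * ys)], dim=2).unsqueeze(0)

s = STN()
images = Variable(torch.rand(1, H, W, 3))
out = s(images, Variable(grid))   # output has the same B*H*W*D shape as the input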


stn.pytorch's People

Contributors

aarcosg, berleon, fxia22, jj0mst, lim0606


stn.pytorch's Issues

Example of use in Training

Is there an example of STN on any dataset, maybe MNIST? Like here.
It is not so easy to get it working. Such an example could clear up some ambiguities, like:

  1. How to implement auxiliary loss for Grid Generation?
  2. In test_conv_stn.ipynb some very strange things are happening, like:
conv = self.conv(input2.transpose(2,3).transpose(1,2)) # Why Transpose?
conv = conv.transpose(1,2).transpose(2,3)  # Why Transpose Back ?
iden = Variable(torch.cat([torch.ones(1, 328, 582, 1), torch.zeros(1, 328, 582, 3), torch.ones(1, 328, 582, 1), torch.zeros(1, 328, 582, 1)],3 )) # Why we need it?
out = self.g(conv + iden) # Why we add this values?

I'm planning to try the STN but I'm not sure how I should start.

Bug in gpu support

Hi, thank you for sharing this great code.
I found that the code works fine with gpu_id = 0, but it raises a cuda runtime error if I try to run it on a different gpu. I guess some parts of the code implicitly assume the default gpu; it would be great if this could be fixed in the future.

how to run the demo

I want to use STN to correct a distorted picture of text in order to recognize the text. How do I use this?

compatibility with DataParallel

Thank you for this implementation! Have you tried using it within a network that is wrapped in DataParallel in order to make use of multiple graphics cards? I am getting an "illegal memory access was encountered" error when replacing

with torch.cuda.device(3):
    input1 = input1.cuda()
    input2 = input2.cuda()
    start = time.time()
    out = s(input1, input2)
    print(out.size(), 'time:', time.time() - start)
    start = time.time()
    out.backward(input1.data.cuda())
    print('time:', time.time() - start)

in test.py with

s = torch.nn.DataParallel(s)
if True:
    input1 = input1.cuda()
    input2 = input2.cuda()
    start = time.time()
    out = s(input1, input2)
    print(out.size(), 'time:', time.time() - start)
    start = time.time()
    out.backward(input1.data.cuda())
    print('time:', time.time() - start)

Interestingly, the code works with

export CUDA_VISIBLE_DEVICES="0"

but fails with

export CUDA_VISIBLE_DEVICES="0,1"

I see that you are explicitly setting the CUDA device before executing the kernel, which might be the reason for the illegal memory access. Any ideas? Thank you!

How can it do backpropagation?

After reading your code, it seems that the code doesn't mention anything about backpropagation. Does PyTorch do it automatically?
Thanks!

cuda runtime error

Hi, I'm getting this error during backpropagation:

Traceback (most recent call last):
  File "/home/eriba/code/main.py", line 259, in <module>
    main()
  File "/home/eriba/code/main.py", line 235, in main
    loss.backward()
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 152, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/home/eriba/software/stn.pytorch/script/functions/stn.py", line 72, in backward
    grad_input1 = grad_input1.cuda(self.device)
  File "/usr/local/lib/python2.7/dist-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /home/eriba/software/pytorch/pytorch/torch/lib/THC/generic/THCTensorCopy.c:18

Any idea why that would happen?

error in BilinearSampler.updateOutput: invalid device function

I compiled the code and ran test.py, but I got the error message below:

script $ python test.py
Variable containing:
(0 ,.,.) =
0.8000 0.3000 1.0000
0.5000 0.0000 0.0000
[torch.FloatTensor of size 1x2x3]

(1L, 64L, 128L, 2L)
(1L, 2L, 3L)
(64L, 64L, 128L, 64L) time: 0.219668865204
(64L, 64L, 128L, 64L) time: 0.432946920395
error in BilinearSampler.updateOutput: invalid device function
Traceback (most recent call last):
  File "test.py", line 48, in <module>
    out = s(input1, input2)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "stn.pytorch/script/modules/stn.py", line 12, in forward
    return self.f(input1, input2)
  File "stn.pytorch/script/functions/stn.py", line 24, in forward
    my_lib.BilinearSamplerBHWD_updateOutput_cuda(input1, input2, output, self.device_c)
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/__init__.py", line 177, in safe_call
    result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: aborting at stn.pytorch/script/src/my_lib_cuda.c:50

I was able to compile the code by running ./make.sh; the compilation seems to go fine.

I am using CUDA 8.0, PyTorch 0.1.12.post2, and Ubuntu 16.04.

nan gradients with BCHW layout

I recently tried to use the new BCHW functions with my network, since I always use that layout and it simplifies my code.

I noticed that all the gradients of my convolutional layers are now nan, which also makes the weights full of nan after the parameter update.

I'm sure I made all the necessary conversions, and I get no error such as an inconsistency between tensors or anything else.

Windows support ?

Hi, does the module support Windows (CPU)? For your information, I have installed Windows-based PyTorch and VS2017. Thank you.

BCHW format

Excellent work!

I would like to use this in the middle of my PyTorch network, so my tensors are in [Batch x Channel x Height x Width] format. I tried to use torch.permute to change their dimension order, but it was not successful.
For example, when a = torch.randn((2,3,4,5)), a.stride() is (60, 20, 5, 1), but
if I do b = a.permute(3, 0, 1, 2), b.stride() is (1, 60, 20, 5), while torch.randn(5,2,3,4).stride() is (24, 12, 4, 1).

Is there an easy and efficient way to do this, or do I need to change the .c and .cu files in src?

I guess a.permute(3, 0, 1, 2).contiguous() might be a solution, but I'm not sure it is safe for Variable (autograd).

Thank you.
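
For reference, a minimal sketch of the layout conversion described in this question, written against a recent PyTorch API and assuming the sampler wants BHWC tensors as in the README; permute followed by .contiguous() is differentiable, so autograd handles it:

import torch

x = torch.randn(2, 3, 4, 5, requires_grad=True)    # BCHW
x_bhwc = x.permute(0, 2, 3, 1).contiguous()        # BCHW -> BHWC, contiguous copy
# ... pass x_bhwc (plus a B*H*W*2 grid) through the sampler here ...
y_bchw = x_bhwc.permute(0, 3, 1, 2).contiguous()   # back to BCHW
y_bchw.sum().backward()                            # gradients flow through permute/contiguous
print(x.grad.shape)                                # torch.Size([2, 3, 4, 5])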

cannot compute gradients when STN is called multiple times

Hello, I get the following error when the code below is executed:

<ipython-input-4-7d15b88aed95> in <module>()
     19     loss += crit(input1, input1.detach()*0)
     20 
---> 21 loss.backward()

~/anaconda2/envs/python36/lib/python3.6/site-packages/torch/autograd/variable.py in backward(self, gradient, retain_variables)
    144                     'or with gradient w.r.t. the variable')
    145             gradient = self.data.new().resize_as_(self.data).fill_(1)
--> 146         self._execution_engine.run_backward((self,), (gradient,), retain_variables)
    147 
    148     def register_hook(self, hook):

RuntimeError: could not compute gradients for some functions

This happens whenever the STN module is called more than once per backward pass (the same bug seems to appear when using cuda). Does this code work for somebody else? Does somebody have an idea where it comes from?

import torch
import numpy as np
from modules.stn import STN
from modules.gridgen import AffineGridGenV2

s = STN()
g = AffineGridGenV2(64, 64)
crit = torch.nn.MSELoss()

inputImages = torch.rand(1, 64, 64, 3)
input = torch.autograd.Variable(torch.from_numpy(np.array([[[1, 0.5, 0], [0.5, 1, 0]]], dtype=np.float32)), requires_grad = True)
input1 = torch.autograd.Variable(inputImages)

loss = 0
for i in range(2):
    out = g(input)
    input1 = s(input1, out)
    loss += crit(input1, input1.detach()*0)
    
loss.backward()

Runtime error: dimension out of range

python test.py 
Variable containing:
(0 ,.,.) = 
  0.8000  0.3000  1.0000
  0.5000  0.0000  0.0000
[torch.FloatTensor of size 1x2x3]

Traceback (most recent call last):
  File "test.py", line 28, in <module>
    out, aux = g(input)
  File "/home/kaiyin/miniconda3/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaiyin/PycharmProjects/stn.pytorch/script/modules/gridgen.py", line 30, in forward
    loss = torch.sum(loss,2)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)

Pytorch version: 0.3.0.post4
Python 3.6

Bug with 'BCHW' version & Question about high CPU utilization when in GPU mode

Hi, fxia22,

  1. I used the 'BCHW' version of STN for an affine transformation (to rotate an image); however, the output image does not have the same size as the input. Instead, the output size is abnormal: the input image to the STN has shape (1, 3, 328, 582), but the returned one has shape (1, 3, 582, 2). Here is my simple code:
import sys
import os.path as osp
sys.path.insert(0, osp.expanduser('~/Project/stn.pytorch/script'))

import torch
import numpy as np
from torch.autograd import Variable
from modules.stn import STN
from modules.gridgen import AffineGridGenV2
import matplotlib.pyplot as plt

img = plt.imread(osp.expanduser('~/Project/stn.pytorch/script/cat.jpg'))

# plt.imshow(img)
# plt.show()

img = img / 255.
# shape [3, H, W]
img_batch = img.transpose(2, 0, 1)
# shape [1, 3, H, W]
img_batch = np.expand_dims(img_batch, 0)
inputImages = Variable(torch.from_numpy(img_batch.astype(np.float32)))

print 'inputImages.size:', inputImages.size()

stn = STN(layout='BCHW')
grid_generator = AffineGridGenV2(328, 582)
trans_mat = Variable(torch.from_numpy(
  np.array([[[np.cos(45./180*np.pi), np.sin(45./180*np.pi), 0],
             [np.sin(-45./180*np.pi), np.cos(45./180*np.pi), 0]]],
           dtype=np.float32)),
  requires_grad = True)
grid = grid_generator(trans_mat)
res = stn(inputImages, grid)
res = res.data.cpu().numpy()

print 'res.shape:', res.shape

plt.imshow((res[0].transpose(1, 2, 0)*255).astype(np.uint8))
plt.show()
  2. When I use the GPU version of STN to transform features (I switch to GPU mode by moving the input features and transformation matrices to cuda), I find it occupies many threads (with different pids) and nearly 800% CPU. I am sure on my side that it is the STN module that takes up these resources. Why does the GPU version take so much CPU? Is this an intentional design or a bug?

I am scratching my head over these two problems and am not that clear about the inner implementation. I hope to see your test results and your kind explanation. Thank you very much.

GPU support

Does it support GPU now? I saw a commit that says "add cuda".

output blank image

Is there any reason why STNFunctionBCHW() would return a blank image? I'm running the CPU version.
