
stn.pytorch's Introduction

A PyTorch version of the spatial transformer network.

Ported from https://github.com/qassemoquab/stnbhwd following the PyTorch tutorial. It now supports both CPU and GPU. To use the FFI extension, you need to install the cffi package from pip.

Build and test

cd script
./make.sh  # build the CUDA code; don't forget to adjust the -arch argument to match your GPU's compute capability
python build.py
python test.py

There is a demo in test_stn.ipynb

Modules

STN is the spatial transformer module. It takes a B*H*W*D tensor and a B*H*W*2 grid normalized to [-1, 1] as inputs and performs bilinear sampling.

AffineGridGen takes a B*2*3 matrix and generates an affine transformation grid.

CylinderGridGen takes a B*1 theta vector and generates a transformation grid to remap equirectangular images along the x axis.

DenseAffineGridGen takes a B*H*W*6 tensor and applies an affine transformation to each pixel. An example of a convolutional spatial transformer can be found in test_conv_stn.ipynb.

An example of the loss landscape of a simple STN with an L1 loss can be found in the demo.
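
A minimal usage sketch, put together from the demo code in the issues below; it assumes the old Variable-based API this repository was written for, and the exact module names and signatures (STN, AffineGridGenV2) may differ across versions:

import torch
import numpy as np
from torch.autograd import Variable
from modules.stn import STN
from modules.gridgen import AffineGridGenV2

s = STN()                      # bilinear sampler
g = AffineGridGenV2(64, 64)    # generates a 64x64 sampling grid

# B*2*3 affine parameters (here: the identity mapping)
theta = Variable(torch.from_numpy(
    np.array([[[1, 0, 0], [0, 1, 0]]], dtype=np.float32)),
    requires_grad=True)
images = Variable(torch.rand(1, 64, 64, 3))   # B*H*W*D input

grid = g(theta)         # B*H*W*2 grid, normalized to [-1, 1]
out = s(images, grid)   # bilinear sampling of images at the grid locations
print(out.size())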

Train hacks

  • set a learning rate multiplier; 1e-3 or 1e-4 works fine.
  • add an auxiliary loss to regularize the difference of the affine transformation from the identity mapping, to avoid sampling outside the original image (see the sketch below).
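
A hedged sketch of one possible auxiliary loss, assuming the affine parameters theta form a B*2*3 tensor as fed to AffineGridGen; it is written against a recent PyTorch API and is not the repository's own implementation:

import torch

def identity_regularizer(theta, weight=1e-2):
    # theta: (B, 2, 3) affine parameters; weight is a hypothetical multiplier.
    # Penalizes deviation from the identity mapping [[1, 0, 0], [0, 1, 0]].
    identity = torch.tensor([[1.0, 0.0, 0.0],
                             [0.0, 1.0, 0.0]],
                            dtype=theta.dtype, device=theta.device)
    return weight * ((theta - identity) ** 2).mean()

# total_loss = task_loss + identity_regularizer(theta)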

Complex grid demo

STN can handle a complex grid; the open question is how to parameterize the grid.
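
As a rough illustration (not the repository's demo), a dense grid can be built by hand and passed straight to STN, since the sampler only expects a B*H*W*2 grid normalized to [-1, 1]; the sinusoidal warp and the coordinate ordering below are assumptions, so check gridgen.py for the ordering your version uses:

import torch
from torch.autograd import Variable
from modules.stn import STN

H, W = 64, 128
ys = torch.linspace(-1, 1, H).view(H, 1).expand(H, W)
xs = torch.linspace(-1, 1, W).view(1, W).expand(H, W)
# identity grid plus an arbitrary sinusoidal warp along x
grid = torch.stack([ys, xs + 0.1 * torch.sin(3.1416 * ys)], dim=2).unsqueeze(0)

s = STN()
images = Variable(torch.rand(1, H, W, 3))
out = s(images, Variable(grid))   # output has the same B*H*W*D shape as the input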


stn.pytorch's People

Contributors

aarcosg, berleon, fxia22, jj0mst, lim0606


stn.pytorch's Issues

Example of use in Training

Is there an example of STN on any dataset, maybe MNIST? Like here.
It is not so easy to get it working. Such an example could clear up some ambiguities, like:

  1. How to implement auxiliary loss for Grid Generation?
  2. In test_conv_stn.ipynb some very strange things are happening, like:
conv = self.conv(input2.transpose(2,3).transpose(1,2)) # Why Transpose?
conv = conv.transpose(1,2).transpose(2,3)  # Why Transpose Back ?
iden = Variable(torch.cat([torch.ones(1, 328, 582, 1), torch.zeros(1, 328, 582, 3), torch.ones(1, 328, 582, 1), torch.zeros(1, 328, 582, 1)],3 )) # Why we need it?
out = self.g(conv + iden) # Why we add this values?

I'm planning to try the STN but I'm not sure how I should start.

Bug in gpu support

Hi, thank you for sharing this great code.
I found that the code works fine with gpu_id = 0, but it raises a cuda runtime error if I try to run it on a different gpu. I guess some parts of the code implicitly assume the default gpu; it would be great if this could be fixed in the future.

how to run the demo

I want to use STN to correct a distorted picture of text in order to recognize the text. How do I use this?

compatibility with DataParallel

Thank you for this implementation! Have you tried using it within a network that is wrapped in DataParallel in order to make use of multiple graphics cards? I am getting an "illegal memory access was encountered" error when replacing

with torch.cuda.device(3):
    input1 = input1.cuda()
    input2 = input2.cuda()
    start = time.time()
    out = s(input1, input2)
    print(out.size(), 'time:', time.time() - start)
    start = time.time()
    out.backward(input1.data.cuda())
    print('time:', time.time() - start)

in test.py with

s = torch.nn.DataParallel(s)
if True:
    input1 = input1.cuda()
    input2 = input2.cuda()
    start = time.time()
    out = s(input1, input2)
    print(out.size(), 'time:', time.time() - start)
    start = time.time()
    out.backward(input1.data.cuda())
    print('time:', time.time() - start)

Interestingly, the code works with

export CUDA_VISIBLE_DEVICES="0"

but fails with

export CUDA_VISIBLE_DEVICES="0,1"

I see that you are explicitly setting the CUDA device before executing the kernel, which might be the reason for the illegal memory access. Any ideas? Thank you!

How can it do backpropagation?

After reading your code, it seems that the code doesn't mention anything about backpropagation. Does PyTorch do it automatically?
Thanks!

cuda runtime error

Hi, I'm getting this error during backpropagation:

Traceback (most recent call last):
  File "/home/eriba/code/main.py", line 259, in <module>
    main()
  File "/home/eriba/code/main.py", line 235, in main
    loss.backward()
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 152, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/home/eriba/software/stn.pytorch/script/functions/stn.py", line 72, in backward
    grad_input1 = grad_input1.cuda(self.device)
  File "/usr/local/lib/python2.7/dist-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /home/eriba/software/pytorch/pytorch/torch/lib/THC/generic/THCTensorCopy.c:18

Any idea why that would happen?

error in BilinearSampler.updateOutput: invalid device function

I compiled the code and ran test.py, but I got the error message below:

script $ python test.py
Variable containing:
(0 ,.,.) =
0.8000 0.3000 1.0000
0.5000 0.0000 0.0000
[torch.FloatTensor of size 1x2x3]

(1L, 64L, 128L, 2L)
(1L, 2L, 3L)
(64L, 64L, 128L, 64L) time: 0.219668865204
(64L, 64L, 128L, 64L) time: 0.432946920395
error in BilinearSampler.updateOutput: invalid device function
Traceback (most recent call last):
  File "test.py", line 48, in <module>
    out = s(input1, input2)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "stn.pytorch/script/modules/stn.py", line 12, in forward
    return self.f(input1, input2)
  File "stn.pytorch/script/functions/stn.py", line 24, in forward
    my_lib.BilinearSamplerBHWD_updateOutput_cuda(input1, input2, output, self.device_c)
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/__init__.py", line 177, in safe_call
    result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: aborting at stn.pytorch/script/src/my_lib_cuda.c:50

I was able to compile the code by running ./make.sh; the compilation seems to go fine.

I am using CUDA 8.0, PyTorch 0.1.12.post2, and Ubuntu 16.04.

nan gradients with BCHW layout

I recently tried to use the new BCHW functions with my network, since I always use that layout and it simplifies my code.

I noticed that all the gradients of my convolutional layers are now nan, which also makes the weights full of nan after the parameter update.

I'm sure I made all the necessary conversions, and I get no error such as an inconsistency between tensors or anything else.

Windows support ?

Hi, does the module support Windows (CPU)? For your information, I have installed Windows-based PyTorch and VS2017. Thank you.

BCHW format

Excellent work!

I would like to use this in the middle of my PyTorch network, so my tensors are in [Batch x Channel x Height x Width] format. I tried to use torch.permute to change their dimension order, but it was not successful.
For example, when a = torch.randn((2,3,4,5)), a.stride() is (60, 20, 5, 1), but
if I do b = a.permute(3, 0, 1, 2), b.stride() is (1, 60, 20, 5), while torch.randn(5,2,3,4).stride() is (24, 12, 4, 1).

Is there an easy and efficient way to do this, or do I need to change the .c and .cu files in src?

I guess a.permute(3, 0, 1, 2).contiguous() might be a solution, but I'm not sure it is safe for Variable (autograd).

Thank you.
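
For reference, a minimal sketch of the layout conversion described in this question, written against a recent PyTorch API and assuming the sampler wants BHWC tensors as in the README; permute followed by .contiguous() is differentiable, so autograd handles it:

import torch

x = torch.randn(2, 3, 4, 5, requires_grad=True)    # BCHW
x_bhwc = x.permute(0, 2, 3, 1).contiguous()        # BCHW -> BHWC, contiguous copy
# ... pass x_bhwc (plus a B*H*W*2 grid) through the sampler here ...
y_bchw = x_bhwc.permute(0, 3, 1, 2).contiguous()   # back to BCHW
y_bchw.sum().backward()                            # gradients flow through permute/contiguous
print(x.grad.shape)                                # torch.Size([2, 3, 4, 5])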

cannot compute gradients when STN is called multiple times

Hello, I get the following error when the code below is executed:

<ipython-input-4-7d15b88aed95> in <module>()
     19     loss += crit(input1, input1.detach()*0)
     20 
---> 21 loss.backward()

~/anaconda2/envs/python36/lib/python3.6/site-packages/torch/autograd/variable.py in backward(self, gradient, retain_variables)
    144                     'or with gradient w.r.t. the variable')
    145             gradient = self.data.new().resize_as_(self.data).fill_(1)
--> 146         self._execution_engine.run_backward((self,), (gradient,), retain_variables)
    147 
    148     def register_hook(self, hook):

RuntimeError: could not compute gradients for some functions

This happens whenever the STN module is called more than once per backward pass (the same bug seems to appear when using cuda). Does this code work for somebody else? Does somebody have an idea where it comes from?

import torch
import numpy as np
from modules.stn import STN
from modules.gridgen import AffineGridGenV2

s = STN()
g = AffineGridGenV2(64, 64)
crit = torch.nn.MSELoss()

inputImages = torch.rand(1, 64, 64, 3)
input = torch.autograd.Variable(torch.from_numpy(np.array([[[1, 0.5, 0], [0.5, 1, 0]]], dtype=np.float32)), requires_grad = True)
input1 = torch.autograd.Variable(inputImages)

loss = 0
for i in range(2):
    out = g(input)
    input1 = s(input1, out)
    loss += crit(input1, input1.detach()*0)
    
loss.backward()

Runtime error: dimension out of range

python test.py 
Variable containing:
(0 ,.,.) = 
  0.8000  0.3000  1.0000
  0.5000  0.0000  0.0000
[torch.FloatTensor of size 1x2x3]

Traceback (most recent call last):
  File "test.py", line 28, in <module>
    out, aux = g(input)
  File "/home/kaiyin/miniconda3/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaiyin/PycharmProjects/stn.pytorch/script/modules/gridgen.py", line 30, in forward
    loss = torch.sum(loss,2)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)

Pytorch version: 0.3.0.post4
Python 3.6

Bug with 'BCHW' version & Question about high CPU utilization when in GPU mode

Hi, fxia22,

  1. I used the 'BCHW' version of STN for an affine transformation (to rotate an image); however, the output image does not have the same size as the input. Instead, the output size is abnormal: the input image to the STN has shape (1, 3, 328, 582), but the returned one has shape (1, 3, 582, 2). Here is my simple code:
import sys
import os.path as osp
sys.path.insert(0, osp.expanduser('~/Project/stn.pytorch/script'))

import torch
import numpy as np
from torch.autograd import Variable
from modules.stn import STN
from modules.gridgen import AffineGridGenV2
import matplotlib.pyplot as plt

img = plt.imread(osp.expanduser('~/Project/stn.pytorch/script/cat.jpg'))

# plt.imshow(img)
# plt.show()

img = img / 255.
# shape [3, H, W]
img_batch = img.transpose(2, 0, 1)
# shape [1, 3, H, W]
img_batch = np.expand_dims(img_batch, 0)
inputImages = Variable(torch.from_numpy(img_batch.astype(np.float32)))

print 'inputImages.size:', inputImages.size()

stn = STN(layout='BCHW')
grid_generator = AffineGridGenV2(328, 582)
trans_mat = Variable(torch.from_numpy(
  np.array([[[np.cos(45./180*np.pi), np.sin(45./180*np.pi), 0],
             [np.sin(-45./180*np.pi), np.cos(45./180*np.pi), 0]]],
           dtype=np.float32)),
  requires_grad = True)
grid = grid_generator(trans_mat)
res = stn(inputImages, grid)
res = res.data.cpu().numpy()

print 'res.shape:', res.shape

plt.imshow((res[0].transpose(1, 2, 0)*255).astype(np.uint8))
plt.show()
  2. When I use the GPU version of STN to transform features (I switch to GPU mode by moving the input features and transformation matrices to cuda), I find it occupies many threads (with different pids) and nearly 800% CPU. I am sure on my side that it is the STN module that takes up these resources. Why does the GPU version take so much CPU? Is this an intentional design or a bug?

I am scratching my head over these two problems and am not that clear about the inner implementation. I hope to see your test results and your kind explanation. Thank you very much.

GPU support

Does it support GPU now? I saw a commit that says "add cuda".

output blank image

Is there any reason why STNFunctionBCHW() would return a blank image? I'm running the CPU version.
