tps_stn_pytorch's People

Contributors

stewartsetha, warbean

tps_stn_pytorch's Issues

Is it possible to add a license to this code?

Hi,

Thanks for sharing the code. It's very helpful for my own research project. Would it be possible to add a license (e.g., MIT) so that I can properly cite your code in my own repository?

Thanks!

Misleading argument names

In the argument list, you define grid_size with a default value of 4, and then set two further arguments, args.grid_height and args.grid_width, from it.
However, I think these names are misleading.

The grid actually used for sampling in your program has the same size as the input image, right?
For example, mnist_model.py contains this line:
grid = source_coordinate.view(batch_size, self.args.image_height, self.args.image_width, 2)
So args.grid_height and args.grid_width are not the real height and width of the sampling grid in the STN.

args.grid_size actually refers to the grid of control points, not to the sampling grid used in the STN step.

So I just think the argument names may be misleading.
Of course, the program works very well. Good job!
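
The distinction drawn in this issue can be sketched as follows. This is a minimal illustration, not the repository's code: make_control_points, make_sampling_grid, and the span parameter are hypothetical names, and NumPy stands in for PyTorch.

```python
import numpy as np

def make_control_points(grid_height, grid_width, span=0.9):
    """Coarse lattice of TPS control points in normalized [-span, span] coordinates."""
    ys = np.linspace(-span, span, grid_height)
    xs = np.linspace(-span, span, grid_width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=1)  # (grid_height * grid_width, 2)

def make_sampling_grid(image_height, image_width):
    """Dense grid with one normalized coordinate pair per output pixel."""
    ys = np.linspace(-1.0, 1.0, image_height)
    xs = np.linspace(-1.0, 1.0, image_width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=1)  # (image_height * image_width, 2)

control_points = make_control_points(4, 4)   # grid_size = 4
sampling_grid = make_sampling_grid(28, 28)   # one entry per pixel of a 28x28 image
print(control_points.shape, sampling_grid.shape)
```

The number of control points depends only on the grid size, while the sampling grid grows with the image, which is why a name such as grid_height can be read either way.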

Running the training on larger input images (width = height >> 28)

Great project!
I managed to run the code on my own images with an input size of 28.
I then tried a different input size (e.g. width = height = 300).
As soon as I change args.image_height = args.image_width to any value other than 28 (in my data_loader and mnist_train), I get the following error:

File "/home/myaccount/tps_stn_pytorch/tps_grid_gen.py", line 67, in forward
assert source_control_points.size(1) == self.num_points
AssertionError

I tried modifying the tps_grid_gen code, but nothing worked.
Any help, please?
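
One possible cause, offered as a guess rather than a diagnosis: if the localization network flattens its convolutional features to a size hard-coded for 28x28 inputs, a larger image yields more features per sample, and reshaping them produces more control points than the grid generator expects. The sketch below only does the size arithmetic. The layer layout (two unpadded 5x5 convolutions, each followed by 2x2 max pooling, with 20 output channels) is an assumed LeNet-style network, not something confirmed in this thread, and flattened_features is a hypothetical helper.

```python
def flattened_features(image_size, channels=20, kernel=5, pool=2, num_blocks=2):
    """Feature count after `num_blocks` of (unpadded conv -> max pool) on a square input."""
    side = image_size
    for _ in range(num_blocks):
        side = side - (kernel - 1)   # unpadded convolution, stride 1
        side = side // pool          # non-overlapping max pooling
    return channels * side * side

features_28 = flattened_features(28)
features_300 = flattened_features(300)
# If the flatten size is hard-coded for 28x28 inputs, each 300x300 sample
# is split into this many rows before the fully connected layers:
rows_per_sample = features_300 // features_28
print(features_28, features_300, rows_per_sample)
```

Under these assumptions, making the flatten size depend on the input size, or pooling to a fixed spatial size before the fully connected layers, would keep the number of predicted control points equal to grid_height * grid_width.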

span_range

What is the meaning of span_range?

Many thanks for your kind help!

Training question

May I ask: does this require rectified ground-truth images as the learning target, or does it only learn a transformation?

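About the initialization of the localization network

Hello, and thank you for your work. The paper "Robust Scene Text Recognition with Automatic Rectification" mentions that, when initializing the network that produces the TPS control points, random initialization sometimes does not work and training fails to converge. Could you explain why this happens?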
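
A common way to sidestep this problem is to zero the weights of the last fully connected layer and choose its bias so that the network initially outputs a regular control-point pattern, whatever the input. The NumPy sketch below illustrates that idea with a tanh-bounded output; the layer sizes, the 4x4 pattern, and all variable names are assumptions made for illustration, not taken from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Regular 4x4 pattern of control points in [-0.9, 0.9] (assumed layout).
axis = np.linspace(-0.9, 0.9, 4)
initial_points = np.array([[y, x] for y in axis for x in axis])  # (16, 2)

feature_dim = 50                                   # hypothetical size of the penultimate layer
weight = np.zeros((initial_points.size, feature_dim))
bias = np.arctanh(initial_points.ravel())          # tanh(bias) reproduces the pattern

def predict_control_points(features):
    """Last layer of a localization network with a tanh-bounded output."""
    out = np.tanh(features @ weight.T + bias)
    return out.reshape(features.shape[0], -1, 2)

features = rng.normal(size=(8, feature_dim))       # arbitrary inputs
predicted = predict_control_points(features)
```

Because the weights start at zero, every input maps to the same well-spread pattern at the start of training, and the network then learns deviations from it.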

Please help me!!!

Thanks for the awesome repo.

I have a question:
I warped my image using the grid points and noisy points, and it works perfectly.
Now I have many points on the original image and want to get their corresponding coordinates on the transformed image. How can I do that?

My current solution is to compare each point against the transformed grid and take the closest grid cell as its corresponding coordinate.
For example, see the picture below.
[Image: Untitled1]

My code:

import os
import torch
from PIL import Image
import cv2
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
import morphops as mops
import torchvision

....

points = torch.tensor([0.33145134998962683,0.37168062334870805,0.291440680477757,0.36570985028286535,0.5959663317625441,0.35376830415117994,0.6433863845173527,0.3268998253548878,0.6737648558134021,0.3224217455555058,0.7078480187309208,0.32988521188780917,0.7330399217569128,0.34705118445210686,0.705625203758039,0.36645619691609566,0.667096410894757,0.3701879300822474,0.6367179395987077,0.36421715701640467,0.3603479446370884,0.7135073813682026,0.4099908123647787,0.6829071694057588,0.46852493998399575,0.6597704237756183,0.4996443496043389,0.6672338901079217,0.5307637592246821,0.6605167704088487,0.589297886843899,0.6829071694057588,0.6359770012744138,0.716492767901124,0.5841113185738419,0.7403758601644947,0.5441006490619721,0.7530637529294105,0.4981624729557511,0.7575418327287925,0.4529652351738241,0.7538100995626408,0.41369550398624816,0.7388831668980341,0.37442577279867223,0.7105219948352812,0.459633680092469,0.6978341020703654,0.4996443496043389,0.6963414088039048,0.5403959574405026,0.6970877554371352,0.6263448030585934,0.7142537280014329,0.5389140807919149,0.7135073813682026,0.4996443496043389,0.7157464212678936,0.46037461841676297,0.7127610347349723,0.32552384339527574,0.34555849118564624,0.6730239174891082,0.34406579791918557]).reshape((-1,2))

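# `tps`, `source_control_points`, `h` and `w` are defined in the elided code above.
warped_C = tps(Variable(torch.unsqueeze(source_control_points, 0)))

# Rescale the query points from [0, 1] to [-1, 1], then pick the nearest warped grid cell for each.
dis = euclidean_distances(points.cpu().numpy() * 2 - 1, warped_C)
dis = np.argmin(dis, 1)
Y, X = np.unravel_index(dis, (h, w))
new_point = np.array([[x, y] for x, y in zip(X, Y)]).reshape((-1, 2)).astype(int)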

It takes a lot of time and memory.
However, I wonder whether there is another way that relies on the matrices (K, U, P).
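
A matrix-based alternative could look like the sketch below, which fits a thin-plate spline directly on the control points and evaluates it only at the query points, so no dense grid or nearest-neighbour search is needed. It is a NumPy illustration under stated assumptions, not the repository's code: tps_kernel, fit_tps, and apply_tps are hypothetical helpers, the kernel is taken to be U(r) = r^2 log r^2, and the spline is fitted from control points in the original image to their positions in the transformed image. A grid generator used for image warping typically maps in the opposite direction (output pixels back to source coordinates), so the result here only approximates the inverse of that warp.

```python
import numpy as np

def tps_kernel(a, b):
    """Radial basis U(r) = r^2 log(r^2) between rows of a (N, 2) and b (M, 2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    # Replace zeros inside the log so that U(0) evaluates to 0 without warnings.
    return d2 * np.log(np.where(d2 > 0, d2, 1.0))

def fit_tps(src, dst):
    """Solve [[K, P], [P^T, 0]] @ params = [dst; 0] for the spline parameters."""
    n = src.shape[0]
    K = tps_kernel(src, src)
    P = np.hstack([np.ones((n, 1)), src])
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    return np.linalg.solve(L, rhs)                  # (n + 3, 2)

def apply_tps(params, src, pts):
    """Map arbitrary points (M, 2) with one matrix product."""
    U = tps_kernel(pts, src)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return np.hstack([U, P]) @ params

# Illustrative data: a 4x4 control grid, slightly perturbed targets, random queries.
rng = np.random.default_rng(0)
axis = np.linspace(-0.9, 0.9, 4)
control_src = np.array([[x, y] for y in axis for x in axis])
control_dst = control_src + rng.uniform(-0.05, 0.05, size=control_src.shape)
params = fit_tps(control_src, control_dst)
query = rng.uniform(-1.0, 1.0, size=(32, 2))
mapped = apply_tps(params, control_src, query)      # (32, 2), still in [-1, 1] coordinates
```

The cost is one (M, N + 3) matrix product for M query points and N control points, instead of an (M, h*w) distance matrix. Converting the result to pixel indices would follow the same rescaling used for the points above.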

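Request for help

May I ask: if I only want to use the TPS part at the front of the network to rotate images, what loss function would be appropriate?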
