warbean / tps_stn_pytorch
PyTorch implementation of Spatial Transformer Network (STN) with Thin Plate Spline (TPS)
Hi,
Thanks for sharing the code. It's very helpful for my own research project. Would it be possible to add a license (e.g., MIT) so that I can properly cite your code in my own repository?
Thanks!
In the argument list, you set grid_size to a default value of 4. You then define two arguments, args.grid_height and args.grid_width, and set their values from grid_size.
However, I think the names of these arguments are misleading.
The size of the grid in your program is actually equal to the size of the input image, right?
In mnist_model.py there is this line:
grid = source_coordinate.view(batch_size, self.args.image_height, self.args.image_width, 2)
So args.grid_height and args.grid_width are not the real height and width of the sampling grid in the STN.
args.grid_size actually refers to the grid of control points, not the grid used in the STN sampling step.
So I just think the argument names may be misleading.
Of course, the program works very well. Good job!
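The distinction raised above can be made concrete with a minimal sketch. It assumes normalized coordinates in [-1, 1] and uses illustrative variable names that are not taken from the repository; only the shapes matter here, not the coordinate order.

```python
import torch

grid_size = 4                       # control points per side (cf. args.grid_size)
image_height, image_width = 28, 28  # size of the image being resampled

# Control-point grid: grid_size * grid_size landmarks that parameterize the TPS.
ctrl_axis = torch.linspace(-1.0, 1.0, grid_size)
control_points = torch.cartesian_prod(ctrl_axis, ctrl_axis)

# Sampling grid: one coordinate per output pixel, as consumed by grid sampling.
rows = torch.linspace(-1.0, 1.0, image_height)
cols = torch.linspace(-1.0, 1.0, image_width)
sampling_grid = torch.cartesian_prod(rows, cols).view(1, image_height, image_width, 2)

print(control_points.shape)  # number of control points, 2
print(sampling_grid.shape)   # batch, height, width, 2
```

The control-point grid stays at 16 points whatever the image size, while the sampling grid always has one entry per pixel.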
Great project!
I managed to run the code on my own images with an input size of 28.
I then tried to run the code with a different input size (e.g., width = height = 300).
As soon as I set args.image_height = args.image_width to any value other than 28 (in my data_loader and mnist_train), I get the following error:
File "/home/myaccount/tps_stn_pytorch/tps_grid_gen.py", line 67, in forward
assert source_control_points.size(1) == self.num_points
AssertionError
I tried modifying the tps_grid_gen code, but nothing worked.
Any help, please?
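One possible cause, stated as an assumption rather than something verified against the repository: if the localisation network flattens its convolutional features to a size hard-coded for 28×28 inputs, a larger image changes how many values reach the grid generator, and the control-point count no longer equals num_points. A minimal sketch of a size-agnostic localisation network follows; the class name `SizeAgnosticLocNet` and its layer sizes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SizeAgnosticLocNet(nn.Module):
    """Hypothetical localisation net: always predicts grid_size**2 control points."""

    def __init__(self, grid_size=4):
        super().__init__()
        self.num_points = grid_size * grid_size
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        # Adaptive pooling fixes the feature-map size regardless of input size.
        self.pool = nn.AdaptiveAvgPool2d((4, 4))
        self.fc = nn.Linear(20 * 4 * 4, self.num_points * 2)

    def forward(self, x):
        batch_size = x.size(0)
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = self.pool(x).reshape(batch_size, -1)
        # Shape (batch, num_points, 2), which is what the failing assertion checks.
        return self.fc(x).view(batch_size, self.num_points, 2)


net = SizeAgnosticLocNet(grid_size=4)
small = net(torch.zeros(2, 1, 28, 28))
large = net(torch.zeros(2, 1, 300, 300))
print(small.shape, large.shape)
```

With this layout, both the 28×28 and the 300×300 batch produce the same number of control points.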
Sorry, but I don't understand how to train it, what kind of data I should prepare, or how to rectify distorted text images.
Hi author:
I'm a little confused about the usage of canvas in grid_sample() in your code.
Thanks!
What is the meaning of span_range?
Many thanks for your kind help!
I gave the same input to your script and to another TPS implementation (https://github.com/iwyoo/tf_ThinPlateSpline/blob/master/test2.py).
The results differ noticeably, and I cannot figure out why.
Have you ever compared your method with others?
Hello! When I run python mnist_visualize.py --model unbounded_stn --angle 90 --grid_size 4, I get the following output:
create model with STN
Traceback (most recent call last):
File "mnist_visualize.py", line 47, in
data_list = target2data_list[target]
KeyError: tensor(7)
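Any help would be appreciated!
May I ask what source_control_points is?
May I ask whether this needs rectified ground-truth images as the learning target, or whether it is just a transformation?
Hello, and thank you very much for your work. The paper "Robust Scene Text Recognition with Automatic Rectification" mentions that, when initializing the TPS control points, random initialization sometimes does not work and the network training fails to converge. Could you explain why this happens?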
Awesome repo!
I have a question:
I warped my image using the grid points and noisy points, and it works perfectly.
Now I have many points on the original image, and I want to get the corresponding coordinates on the transformed image. How can I do that?
My current solution is to compare each point against the transformed grid and pick the nearest grid coordinate.
An example picture was attached to the original post.
My code:
import os
import torch
from PIL import Image
import cv2
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
import morphops as mops
import torchvision
....
points = torch.tensor([0.33145134998962683,0.37168062334870805,0.291440680477757,0.36570985028286535,0.5959663317625441,0.35376830415117994,0.6433863845173527,0.3268998253548878,0.6737648558134021,0.3224217455555058,0.7078480187309208,0.32988521188780917,0.7330399217569128,0.34705118445210686,0.705625203758039,0.36645619691609566,0.667096410894757,0.3701879300822474,0.6367179395987077,0.36421715701640467,0.3603479446370884,0.7135073813682026,0.4099908123647787,0.6829071694057588,0.46852493998399575,0.6597704237756183,0.4996443496043389,0.6672338901079217,0.5307637592246821,0.6605167704088487,0.589297886843899,0.6829071694057588,0.6359770012744138,0.716492767901124,0.5841113185738419,0.7403758601644947,0.5441006490619721,0.7530637529294105,0.4981624729557511,0.7575418327287925,0.4529652351738241,0.7538100995626408,0.41369550398624816,0.7388831668980341,0.37442577279867223,0.7105219948352812,0.459633680092469,0.6978341020703654,0.4996443496043389,0.6963414088039048,0.5403959574405026,0.6970877554371352,0.6263448030585934,0.7142537280014329,0.5389140807919149,0.7135073813682026,0.4996443496043389,0.7157464212678936,0.46037461841676297,0.7127610347349723,0.32552384339527574,0.34555849118564624,0.6730239174891082,0.34406579791918557]).reshape((-1,2))
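# tps returns one warped coordinate per pixel, shape (1, h*w, 2)
warped_C = tps(Variable(torch.unsqueeze(source_control_points, 0)))
# drop the batch dimension and convert to NumPy before computing pairwise distances
warped_C = warped_C[0].detach().cpu().numpy()
dis = euclidean_distances(points.cpu().numpy() * 2 - 1, warped_C)
# index of the nearest grid coordinate for each point, then back to (row, col)
dis = np.argmin(dis, 1)
Y, X = np.unravel_index(dis, (h, w))
new_point = np.stack([X, Y], axis=1).astype(int)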
This takes a lot of time and memory.
I wonder whether there is another way that relies on the TPS matrices (K, U, P) instead.
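One way to use the TPS matrices directly, sketched here under assumptions: fit a forward spline from the source control points to the target control points by solving the usual linear system built from K (kernel values U(r) between control points) and P (homogeneous control-point coordinates), then evaluate it at the query points. The names `tps_kernel`, `fit_tps`, and `apply_tps` are hypothetical and not part of the repository, and the kernel U(r) = r² log r² is the standard TPS choice, which may differ from the repository's by a constant factor. This forward spline only approximates the inverse of the backward warp used for image sampling; the two agree exactly at the control points.

```python
import numpy as np


def tps_kernel(a, b):
    """U(r) = r^2 * log(r^2) for every pair of rows in a (N, 2) and b (M, 2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    # Replace zeros by 1 inside the log so that U(0) evaluates to 0.
    return d2 * np.log(np.where(d2 == 0, 1.0, d2))


def fit_tps(src_ctrl, dst_ctrl):
    """Solve [[K, P], [P^T, 0]] @ params = [dst; 0] for the spline parameters."""
    n = src_ctrl.shape[0]
    K = tps_kernel(src_ctrl, src_ctrl)
    P = np.hstack([np.ones((n, 1)), src_ctrl])
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.vstack([dst_ctrl, np.zeros((3, 2))])
    return np.linalg.solve(L, rhs)  # shape (n + 3, 2)


def apply_tps(params, src_ctrl, pts):
    """Map arbitrary points (M, 2) through the fitted spline."""
    U = tps_kernel(pts, src_ctrl)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return np.hstack([U, P]) @ params


# Illustrative data: a 4x4 control grid in [-1, 1] and a small smooth displacement.
axis = np.linspace(-1.0, 1.0, 4)
src_ctrl = np.array([(x, y) for y in axis for x in axis])
dst_ctrl = src_ctrl + 0.05 * np.sin(3.0 * src_ctrl[:, ::-1])

params = fit_tps(src_ctrl, dst_ctrl)
pts = np.array([[0.0, 0.0], [0.3, -0.4]])  # query points in the same coordinates
mapped = apply_tps(params, src_ctrl, pts)
print(mapped)
```

The cost is one small linear solve plus a kernel matrix of size (number of points) × (number of control points), instead of a distance matrix against every pixel. Points given in [0, 1] would first need the same `* 2 - 1` rescaling used in the code above.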
Could I ask: if I only want to use the TPS part at the front to rotate images, what loss function would be appropriate?