Comments (14)
I fixed some bugs in image loading and added a settings section. In the settings you should be able to edit the slider size and other variables.
If you want to use your own data, create a folder in data/gapdet/large/ and place your images there, named label_timestamp.jpg (label is 0 or 1). Images should be 60x120 px; the final crop is done by the slider variable in the code (the height is currently fixed to 60 px).
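A minimal sketch of how labels could be read back from file names that follow the label_timestamp.jpg convention. The helper name parse_label and the example file names are hypothetical and are not taken from the repository.

```python
from pathlib import Path


def parse_label(path):
    """Return the 0/1 label encoded at the start of a label_timestamp.jpg name."""
    stem = Path(path).stem              # e.g. "1_1523456789"
    label_text = stem.split("_", 1)[0]  # text before the first underscore
    if label_text not in ("0", "1"):
        raise ValueError(f"unexpected label in file name: {path}")
    return int(label_text)


# Hypothetical file names, used only for illustration.
examples = [
    "data/gapdet/large/1_1523456789.jpg",
    "data/gapdet/large/0_1523456790.jpg",
]
labels = [parse_label(p) for p in examples]
print(labels)
```

Rejecting anything other than 0 or 1 makes a misnamed file fail early instead of silently producing a wrong label.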
from handwriting-ocr.
Hi,
your code was probably set up for gapdata/large, which uses large images (60x120 px). If you want to use smaller images, you have to change the size of the input placeholder. Just change this line:
x = tf.placeholder(tf.float32, [None, 7200], name='x')
to
x = tf.placeholder(tf.float32, [None, 3600], name='x')
The images are flattened, so the resulting size is 60x60 = 3600.
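As a quick check of the arithmetic, here is a small NumPy sketch that crops a 60x120 image to a 60x60 window and flattens it. The variable names and the centered crop position are assumptions for illustration, not code from the repository.

```python
import numpy as np

HEIGHT, FULL_WIDTH, SLIDER_WIDTH = 60, 120, 60  # sizes mentioned in this thread

# Stand-in for a loaded grayscale image of 60x120 px.
image = np.zeros((HEIGHT, FULL_WIDTH), dtype=np.float32)

# Hypothetical centered crop; the real crop position is set by the slider in the code.
start = (FULL_WIDTH - SLIDER_WIDTH) // 2
crop = image[:, start:start + SLIDER_WIDTH]

flat_full = image.reshape(-1)  # matches a placeholder of shape [None, 7200]
flat_crop = crop.reshape(-1)   # matches a placeholder of shape [None, 3600]
print(flat_full.size, flat_crop.size)  # 7200 3600
```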
Hope this helps, feel free to ask if there is anything else.
from handwriting-ocr.
Sure, thank you so much. Actually, I got it fixed, but then I got a dimension error. I tried changing the reshape to `reshape_images = tf.reshape(x, [-1, 32, 2, 1])`, which let me start training with your data.
But with my own data, it gives the following error:
InvalidArgumentError: Input to reshape is a tensor with 32400 values, but the requested shape requires a multiple of 64
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_x_0_2, Reshape/shape)]]
from handwriting-ocr.
Oh, you are reshaping it incorrectly. Reshape it to:
tf.reshape(x, [-1, 60, 60, 1])
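The reshape error reported earlier follows from simple arithmetic: a reshape with -1 only works when the total number of values is divisible by the product of the remaining dimensions. A small plain-Python sketch (the helper inferred_batch is hypothetical) shows why 32400 values fit [-1, 60, 60, 1] but not [-1, 32, 2, 1]:

```python
from math import prod


def inferred_batch(total_values, shape):
    """Return the size that -1 would take in `shape`, or None if it does not fit."""
    known = prod(d for d in shape if d != -1)
    if total_values % known != 0:
        return None
    return total_values // known


print(inferred_batch(32400, [-1, 32, 2, 1]))   # None: 32400 is not a multiple of 64
print(inferred_batch(32400, [-1, 60, 60, 1]))  # 9: nine flattened 60x60 images
```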
from handwriting-ocr.
Actually, I did that before and got the following error, which is why I tried 32, 2.
InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [32,2] and labels shape [64]
[[Node: sparse_softmax_cross_entropy_loss/xentropy/xentropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"](add_3, _arg_Placeholder_0_0)]]
from handwriting-ocr.
OK, I think I may know the problem. Could you please share the code you are running (using a gist or something similar)?
from handwriting-ocr.
Thank you so much. Let me check it out and I will update you as soon as possible.
from handwriting-ocr.
How do I prepare my own gap classifier data?
from handwriting-ocr.
First, it depends on which gap classifier you want to train. I would recommend training GapClassifier-BiRNN.ipynb because it gives the best accuracy. To train this model you need the data provided in the words2 folder. This folder contains images along with text files (with the same names), each of which contains an array of positions (x-coordinates) of the vertical lines separating the letters.
To extend this folder you can use the WordClassDM.py script, where you specify the data folder as the folder containing raw word images. The script loads and normalizes the images and then shows them. You can then manually (click and drag) place lines at the positions where the letters should be separated. The lines are saved together with the image by pressing the s key.
from handwriting-ocr.
It is because I am not predicting the array of x-coordinates; I am predicting whether or not there is a gap in each slide. I think it is more efficient than predicting the array, but you can try it the other way.
Right now, I feed an array of images (slides) into the classifier, and I use a slider to extract these images from the word image. The slides (patches) overlap and are processed by a CNN before they are fed into the RNN, which evaluates for each slide whether or not it contains a gap. If you want, you can replace the slider with a CNN that extracts these slides (patches), or with the tf.extract_image_patches function. But you would have to change the code a bit more to predict the array of x-coordinates.
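The sliding-window extraction described above could look roughly like the following NumPy sketch. The function name extract_slides, the slider width of 60 px and the step of 2 px are assumptions for illustration; the notebook's own values may differ.

```python
import numpy as np


def extract_slides(word_image, slider_width=60, step=2):
    """Cut overlapping vertical slides (patches) out of a word image.

    word_image: 2-D array of shape (height, width).
    Returns an array of shape (num_slides, height, slider_width).
    """
    height, width = word_image.shape
    if width < slider_width:
        raise ValueError("image is narrower than the slider")
    starts = range(0, width - slider_width + 1, step)
    return np.stack([word_image[:, s:s + slider_width] for s in starts])


# Stand-in word image: 60 px high, 100 px wide.
word = np.arange(60 * 100, dtype=np.float32).reshape(60, 100)
slides = extract_slides(word)
print(slides.shape)  # (21, 60, 60)
```

Each slide in the result would then receive one gap / no-gap label in the target sequence.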
from handwriting-ocr.
Yes, it looks a little strange. First, look at the line:
targets_seq[i] = np.ones((length[i])) * NEG
The target sequence has the same length as the image sequence and holds a label for each image in it. targets_seq is initialized with zeros, so I have to calculate the indexes of the positive labels and change those to ones, as you can see in the line:
targets_seq[i][ind] = POS
In the line you are referring to, I was experimenting with adding more positive labels around each ground-truth label.
For example, suppose you specify gap_span = 3 (3 positive labels for each ground-truth label). indices[i] stores the indexes of the ground-truth labels. In the first iteration of the loop, offset is 0, so the indices are unchanged. In the second, offset is 1, so -1 is added to each ground-truth index. In the third, offset is 2, so 1 is added to each ground-truth index (for a higher gap_span it continues as -2, 2, -3, 3 and so on).
The trick to notice is that -1 // 2 == -1 (not zero).
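The offset pattern can be reproduced with floor division. The expression below is one formula that yields the sequence 0, -1, 1, -2, 2, ... and relies on -1 // 2 == -1; it is a reconstruction from the description above, not necessarily the exact line from the notebook. NEG = 0, POS = 1, the helper names and the example indices are assumptions as well.

```python
import numpy as np

NEG, POS = 0, 1  # assumed label values


def shift_for(offset):
    """Map offset 0, 1, 2, 3, 4, ... to a shift of 0, -1, 1, -2, 2, ..."""
    return ((-1) ** offset * offset) // 2  # floor division: -1 // 2 == -1


def make_targets(length, gap_indices, gap_span=3):
    """Build one label sequence with `gap_span` positives per ground-truth gap."""
    targets = np.ones(length) * NEG
    for offset in range(gap_span):
        ind = np.asarray(gap_indices) + shift_for(offset)
        ind = ind[(ind >= 0) & (ind < length)]  # drop positions outside the sequence
        targets[ind] = POS
    return targets


print([shift_for(o) for o in range(6)])  # [0, -1, 1, -2, 2, -3]
print(make_targets(10, [3, 8]))
```

With truncating division (as in C), -1 / 2 would be 0 and the second iteration would repeat the unshifted index instead of marking the neighbor on the left.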
from handwriting-ocr.