handwrittentextrecognition_mxnet's People

Contributors

jonomon, simoncorstonoliver, thomasdelteil

handwrittentextrecognition_mxnet's Issues

Kernel initializes automatically

When I execute this line of code, the kernel initializes automatically. Any help, please?

    net.collect_params().initialize(mx.init.Xavier(), ctx=ctx)

Link to alicewonder.txt

Hi,
Could you please provide a link to alicewonder.txt, which is needed for beam search with the language model?

Thank you

Question on the shape of feature map of OCR_LSTM_CTC

In handwriting_recognition.ipynb:

    def forward(self, x):
        x = x.transpose((0, 3, 1, 2))
        x = x.flatten()
        x = x.split(num_outputs=max_seq_len, axis=1) # (SEQ_LEN, N, CHANNELS)
        x = nd.concat(*[elem.expand_dims(axis=0) for elem in x], dim=0)
        x = self.lstm(x)
        x = x.transpose((1, 0, 2)) #(N, SEQ_LEN, HIDDEN_UNITS)
        return x

I notice that the input feature map for EncoderLayer is first rearranged by x = x.transpose((0, 3, 1, 2)), but I think this line may be unnecessary: this kind of transpose is usually applied to image arrays that have the channel as the last dimension, not to feature maps. Is there a special reason for it?

In addition, for the reshape before the LSTM, I first replaced this code:

    x = x.split(num_outputs=max_seq_len, axis=1) # (SEQ_LEN, N, CHANNELS)
    x = nd.concat(*[elem.expand_dims(axis=0) for elem in x], dim=0)

with x = x.reshape(SEQ_LEN, BATCH_SIZE, -1) and found that the elements are ordered differently from the original, although the final shapes are the same. Is there a reason to reshape it the way you did?
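The ordering difference described above can be reproduced outside MXNet. The following is a minimal sketch that uses NumPy as a stand-in for the NDArray calls (np.split and np.stack here play the role of split and expand_dims + concat); the toy sizes and variable names are assumptions made for illustration, not values from the notebook. It suggests that a direct reshape to (SEQ_LEN, N, -1) re-reads memory in row-major order and mixes samples, whereas reshaping per sample and then swapping the first two axes matches the split-based result.

```python
import numpy as np

N, SEQ_LEN, C = 2, 3, 2  # toy sizes, chosen only for illustration

# Flattened feature map of shape (N, SEQ_LEN * C).
# Sample 0 holds 0..5 and sample 1 holds 100..105, so mixing is easy to spot.
x = np.stack([np.arange(SEQ_LEN * C) + 100 * n for n in range(N)])

# Split-based approach: cut along axis 1, then stack the pieces on a new axis 0.
pieces = np.split(x, SEQ_LEN, axis=1)      # SEQ_LEN arrays of shape (N, C)
split_version = np.stack(pieces, axis=0)   # (SEQ_LEN, N, C)

# Direct reshape: same shape, but values are refilled in row-major order.
naive = x.reshape(SEQ_LEN, N, -1)

# Reshape per sample first, then swap the batch and sequence axes.
equivalent = x.reshape(N, SEQ_LEN, -1).transpose(1, 0, 2)

print(np.array_equal(split_version, equivalent))  # True
print(np.array_equal(split_version, naive))       # False
print(split_version[1])  # time step 1 taken from each sample
print(naive[1])          # rows drawn from both samples at unrelated positions
```

In the split-based layout, every time step contains one slice per sample; in the direct reshape, a single time step can hold the tail of one sample next to the head of the next.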

Training CNN LSTM with images of different size without padding images

Hi,

I trained the CNN LSTM CTC model on the IAM line images of size (128, 1600) and I am getting good results with that. But when I tried to test the model with images of a different size, (100, 500), I got shape errors near the LSTM. So I resized the images to (128, 1600), but the results are not good because the image resolution changed.

Is there a way to train the CNN LSTM model with images of different sizes without padding the images with 0, and to test it with images of different sizes?

Any suggestions would be helpful.

Thanks,
Harathi
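One common workaround for the distortion mentioned above is to scale an image while keeping its aspect ratio and fill only the remaining area. This does not avoid padding, so it is not a full answer to the question. The sketch below only computes the target size and padding; the function name fit_to_canvas and its defaults are hypothetical and taken from the (128, 1600) size quoted in the post, not from the repository.

```python
def fit_to_canvas(height, width, target_height=128, target_width=1600):
    """Return the aspect-preserving (height, width) and the (bottom, right) padding."""
    # Use the smaller ratio so the scaled image fits inside the canvas on both axes.
    scale = min(target_height / height, target_width / width)
    new_height = max(1, min(target_height, round(height * scale)))
    new_width = max(1, min(target_width, round(width * scale)))
    # Whatever the scaled image does not cover has to be filled.
    pad_bottom = target_height - new_height
    pad_right = target_width - new_width
    return (new_height, new_width), (pad_bottom, pad_right)


size, padding = fit_to_canvas(100, 500)
print(size, padding)  # (128, 640) (0, 960)
```

For the (100, 500) test image, the text keeps its proportions at (128, 640) instead of being stretched to 1600 pixels wide.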

Pickling error on Windows

When I run 1_b_paragraph_segmentation_dcnn, I get this error:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\Mateen\AppData\Local\conda\conda\envs\pentoscan\lib\multiprocessing\spawn.py", line 105, in spawn_main
        exitcode = _main(fd)
      File "C:\Users\Mateen\AppData\Local\conda\conda\envs\pentoscan\lib\multiprocessing\spawn.py", line 115, in _main
        self = reduction.pickle.load(from_parent)
    AttributeError: Can't get attribute 'augment_transform' on <module '__main__' (built-in)>
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\Mateen\AppData\Local\conda\conda\envs\pentoscan\lib\multiprocessing\spawn.py", line 105, in spawn_main
        exitcode = _main(fd)
      File "C:\Users\Mateen\AppData\Local\conda\conda\envs\pentoscan\lib\multiprocessing\spawn.py", line 115, in _main
        self = reduction.pickle.load(from_parent)
    AttributeError: Can't get attribute 'augment_transform' on <module '__main__' (built-in)>
    Traceback (most recent call last):
      File "<string>", line 1, in <module>

Please help me. I am working on my final-year project, and this would help me a lot.

Thanks!
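A likely cause of this kind of error is that pickle stores functions by module and name rather than by value. With the spawn start method that Windows uses, each worker process starts fresh and has to look the name up again. The sketch below reproduces that behaviour with the standard library only; the module name notebook_main is a hypothetical stand-in for the namespace that defines augment_transform, and the function body is a placeholder.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the namespace in which the transform was defined.
notebook = types.ModuleType("notebook_main")
exec("def augment_transform(data, label):\n    return data, label\n", notebook.__dict__)
sys.modules["notebook_main"] = notebook

# Pickle records only the module name and function name, not the function body.
payload = pickle.dumps(notebook.augment_transform)
print(b"augment_transform" in payload)  # True

# In the same process the name can still be looked up, so loading works.
restored = pickle.loads(payload)
print(restored is notebook.augment_transform)  # True

# Simulate a freshly spawned worker whose module never defined the function.
del notebook.augment_transform
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc
print(error)  # an AttributeError naming 'augment_transform'
```

Two remedies that are often suggested, assuming the notebook passes the transform to a data loader with worker processes: move augment_transform into an importable .py file and import it from there, or create the loader with num_workers=0 so that nothing has to be pickled.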

Adding license

I wonder under what license this repository can be used. Would it be possible to add one?

Question: is it possible to hybridize lstm_ocr_ctc?

Hi, first of all, amazing work with HandwrittenTextRecognition_MXNet.
It's also my first time working with MXNet, so please be patient.

I've successfully trained my own word OCR net, but when trying to hybridize the net I have not been able to arrive at a usable solution.

I think the problem is the one raised in "Question on the shape of feature map of OCR_LSTM_CTC" (https://github.com//issues/9): that transformation relies on split, which returns an NDArray.

Is there a way to hybridize the net and train with it? Or perhaps to hybridize an already trained net?

Unable to replicate the results

Hi, I used the same code and parameter values that are provided in the downloads section, but I am unable to replicate your results.

[Screenshot from 2019-01-05 22-53-46]

The bounding boxes are always in the same position, no matter the image. Could you please let me know whether handwriting_ocr.ipynb gives you the same problem?
