
calamari's People

Contributors

alexander-winkler, andbue, bertsky, chreul, chwick, curtlh, fstrunz, jeanm, kba, maxnth, mikegerber, mr-mojo, nesbi, poke1024, stweil, synap5e, wosiu


calamari's Issues

Can Calamari predict using the RawDataSet class?

I want to do real-time OCR using Calamari.

So I tried to predict on images held in memory (not saved as image files).

I found that Calamari has two types of datasets (FileDataSet and RawDataSet).

During prediction, Calamari currently uses FileDataSet.

I guess that if I use the RawDataSet class for prediction, it will work as I intend.

So, does Calamari prediction work well with the RawDataSet class?

Can you give me any advice on this?

Argument list too long error for large train set

This is more of a command line syntax question.

I tried to train a model with 100K images. With all the training images in the same folder,
calamari-train --files folder/*.png
returns an "Argument list too long" error because the shell cannot pass 100K arguments to one command.

I then split the folder into 5 sub-folders with 20K images each. But
calamari-train --files folder1/*.png folder2/*.png folder3/*.png folder4/*.png folder5/*.png
still returns the "Argument list too long" error.

Is there a good way to pass the argument to --files to train calamari with a large set of data?
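
The error depends on the total size of the expanded argument list, not on how many folders the files sit in, which is why splitting into sub-folders does not help: the shell still expands every pattern into the same 100K paths. A minimal sketch that estimates this size is shown below. The file names are hypothetical, and the sketch only illustrates the limit; it does not claim anything about how calamari-train handles its --files argument.

```python
import os

def argv_bytes(paths):
    """Approximate bytes the expanded file list occupies on the command line."""
    # Each argument is passed as a NUL-terminated string, hence the +1.
    return sum(len(os.fsencode(p)) + 1 for p in paths)

def exceeds_limit(paths, limit):
    """True if the expanded arguments alone would not fit into `limit` bytes."""
    return argv_bytes(paths) > limit

# Hypothetical example: 100K files named like folder/000001.png
paths = ["folder/%06d.png" % i for i in range(100_000)]

# On POSIX systems the real limit can be queried with os.sysconf("SC_ARG_MAX");
# the fallback value is only a placeholder for platforms without sysconf.
limit = os.sysconf("SC_ARG_MAX") if hasattr(os, "sysconf") else 128 * 1024
print(argv_bytes(paths), limit)
```

The environment variables count against the same limit, so the usable space is smaller than the reported value.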

OCR numbers

It seems that the model cannot recognize numbers.

Single file model format?

We're building an engine-agnostic training server. Throw ground truth at it, track training progress, download (intermediary) models to evaluate, that sort of thing.

As a convenience, we'd like to bundle the multiple files that represent a model. Do you have an opinion on how or maybe even a plan/code to do that?

We thought the easiest way would be to zip

  • xyz.ckpt.data-00000-of-00001
  • xyz.ckpt.index
  • xyz.ckpt.json
  • xyz.ckpt.meta

into a flat zip archive and send it to clients with a media type like application/vnd.calamari.tf+zip.
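
The flat zip archive described above could be produced along these lines. This is a sketch only: the function name `bundle_checkpoint` and the idea of passing the checkpoint prefix are assumptions, and the suffix list simply mirrors the four files named in this issue.

```python
import zipfile
from pathlib import Path

# Suffixes of the four files listed above.
MODEL_SUFFIXES = (".data-00000-of-00001", ".index", ".json", ".meta")

def bundle_checkpoint(prefix, archive_path):
    """Write all files of one checkpoint into a flat zip archive.

    `prefix` is the checkpoint path without suffix, e.g. "models/xyz.ckpt".
    """
    prefix = Path(prefix)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for suffix in MODEL_SUFFIXES:
            part = prefix.with_name(prefix.name + suffix)
            # arcname drops the directory so the archive stays flat.
            archive.write(part, arcname=part.name)
    return archive_path
```

A call such as `bundle_checkpoint("models/xyz.ckpt", "xyz.zip")` would fail with `FileNotFoundError` if one of the four files is missing, which doubles as a completeness check before sending the archive to clients.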

Use pre-trained Calamari models

Thanks for the great work!

I installed Calamari and calamari-models on a new AWS P2 instance and tried a simple example with

calamari-predict --checkpoint calamari_models/default/ModernEnglish.ckpt --files data.png

The detected text is way off. I guess it is related to the loading of the model.

I got these warnings:

Found 1 files in the dataset
2018-08-05 17:12:16.976735: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Attempting a workaround: New graph and load weights
Using CUDNN compatible LSTM backend on CPU
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/calamari/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py:417: calling reverse_sequence (from tensorflow.python.ops.array_ops) with seq_dim is deprecated and will be removed in a future version.
Instructions for updating:
seq_dim is deprecated, use seq_axis instead
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/calamari/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py:432: calling reverse_sequence (from tensorflow.python.ops.array_ops) with batch_dim is deprecated and will be removed in a future version.
Instructions for updating:
batch_dim is deprecated, use batch_axis instead
2018-08-05 17:12:20.637472: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key Minimum/ExponentialMovingAverage not found in checkpoint
Attempting workaround: only loading trainable variables
Loading Dataset: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 109.32it/s]
Data Preprocessing: 100%|██████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 104.47it/s]
Prediction: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.74it/s]
Prediction of 1 models took 0.14934062957763672s

Is it due to the TensorFlow version that the ExponentialMovingAverage variables are not loaded? Installing calamari currently installs tensorflow 1.9. What tf version do you use in your development?

Thanks!

Check input width and height before training

I am trying to combine the whole OCR process (text detection -> image pre-processing -> text recognition).

For the pre-processing step, I want to know the input size of the text recognition (Calamari) model.

I see that Calamari normalizes the height (default=48) and resizes the width proportionally to the height.

So the input sizes are not fixed. Am I right?

What I wonder is: if the input sizes are not fixed, how does the model work?
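
The height normalization described above can be written down as simple arithmetic. The sketch below assumes plain aspect-ratio-preserving scaling with rounding to the nearest pixel; the function name and the rounding rule are assumptions, not Calamari's actual code. Only the default height of 48 is taken from the issue text.

```python
def scaled_width(width, height, target_height=48):
    """Width of a line image after scaling it to `target_height`,
    keeping the aspect ratio."""
    if height <= 0:
        raise ValueError("height must be positive")
    return max(1, round(width * target_height / height))

print(scaled_width(600, 96))   # 300
print(scaled_width(1000, 32))  # 1500
```

Each line image therefore ends up with the same height but its own width. In general, recurrent networks trained with a CTC loss read an image column by column as a sequence, so the sequence length may differ from line to line.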

Syntax error with latest release

Running calamari-predict results in a syntax error:

[...]
  File "/usr/local/lib/python3.5/dist-packages/calamari_ocr-0.2.0-py3.5.egg/calamari_ocr/ocr/datasets/abbyy_dataset/reader.py", line 127
    block: Block = Block(type, name, XMLReader.parseRect(blockNode, required=False))
         ^
SyntaxError: invalid syntax

I tried both the pip installation in a virtual environment and an installation from the latest Git sources. Both show the same error.

use os.path.splitext instead of internal split_all_ext

I've got a training dataset containing pairs like:

0_1_10129.feature-B.png
0_0_10101.feature-B.gt.txt

Note the "." in two places.

The split_all_ext function implemented in this project works as follows:

>>> split_all_ext("/home/m/0_1_10129.feature-B.png")
("/home/m/0_1_10129", ".feature-B.png")

This is not desired, because the corresponding "gt.txt" file is not found for the prefix /home/m/0_1_10129, and the exception "Dataset is empty." is thrown.

Is there any reason for not using os.path.splitext? It works as follows:

>>> os.path.splitext("/home/m/0_1_10129.feature-B.png")
('/home/m/0_1_10129.feature-B', '.png')

Please let me know if you're OK with this change and I'll open a pull request.

Otherwise, with the current behaviour, the documentation/help text of the "--files" flag should at least mention that image names must not contain dots other than the one before the final extension.
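
The difference between the two splitting strategies can be shown side by side. In this sketch `split_all_ext_like` is a hypothetical re-creation of the behaviour reported in this issue (splitting at the first dot of the file name), not the project's actual source, and `gt_path_for` is an illustrative helper.

```python
import os

def split_all_ext_like(path):
    """Hypothetical re-creation of the behaviour described above:
    split at the first dot of the file name."""
    directory, sep, name = path.rpartition("/")
    stem, dot, rest = name.partition(".")
    return directory + sep + stem, dot + rest

def gt_path_for(image_path, gt_ext=".gt.txt"):
    """Derive the ground-truth path by replacing only the final extension."""
    base, _ = os.path.splitext(image_path)
    return base + gt_ext

print(split_all_ext_like("/home/m/0_1_10129.feature-B.png"))
# ('/home/m/0_1_10129', '.feature-B.png')
print(gt_path_for("/home/m/0_1_10129.feature-B.png"))
# /home/m/0_1_10129.feature-B.gt.txt
```

One caveat, stated as an assumption: data sets that use double extensions on the image side (for example names ending in .bin.png) may rely on all extensions being stripped, so a change to os.path.splitext would need to be checked against such naming schemes.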

How to predict with Calamari on multiple languages at once

I am trying to analyze an image that contains both Korean and English text.

How can I run a single process that recognizes both Korean and English characters?

I guess I have two options:

  1. Making a dataset containing both Korean and English text images.

  2. Using the voting algorithm.

Am I right?

Can you give me some advice?

Fine tune model by freezing layers

I am training an OCR model for digitizing historical Arabic documents in Calamari. I wrote an image generator that can produce an unlimited number of synthetic images that look similar enough to the original documents. But I only have a very small set of labeled images from the original documents. I would like to explore the idea of training an OCR model on synthetic images, freezing the layers before the LSTM layers, and fine-tuning the model with real data.

I know there are two ways to freeze layers in TensorFlow, by either specifying the trainable variables in the Optimizer
tf.train.AdamOptimizer(lr).minimize(loss, var_list = list_of_trainable_variables)
or modifying the option for the individual parameter
tf.Variable(trainable= False)
I am having trouble figuring out where to apply these changes in Calamari code. Could you give me some suggestions?
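
For the first option, the list passed as `var_list` has to be built somewhere. A minimal sketch of that selection step is shown below; it works on any objects with a `.name` attribute, which is what TensorFlow variables provide. The variable names and the `conv2d_` prefix are hypothetical and are not taken from Calamari's graph.

```python
from collections import namedtuple

def select_trainable(variables, frozen_prefixes):
    """Return the variables whose name does not start with a frozen prefix."""
    frozen_prefixes = tuple(frozen_prefixes)
    return [v for v in variables if not v.name.startswith(frozen_prefixes)]

# Stand-ins for graph variables so the sketch runs without TensorFlow.
Var = namedtuple("Var", "name")
variables = [
    Var("conv2d_0/kernel:0"),
    Var("conv2d_1/kernel:0"),
    Var("lstm/kernel:0"),
    Var("logits/bias:0"),
]

trainable = select_trainable(variables, ["conv2d_"])
print([v.name for v in trainable])  # ['lstm/kernel:0', 'logits/bias:0']
```

With real variables, the resulting list would be what goes into `minimize(loss, var_list=...)`. Where the optimizer is created in Calamari is not shown here.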

Also, I couldn't find any useful literature on fine-tuning OCR models besides the Tesseract training guide. Do you have suggestions for other methods/literature on fine-tuning OCR models trained on synthetic data with real data?

Thank you!

How can I make a whitelist?

I noticed that the engine can use whitelist_files to protect some characters from being removed during fine-tuning. I loaded a pre-trained model and made a txt file listing the characters. After training, I found that the model had forgotten the characters from the pre-trained model, so it seems the whitelist I made doesn't work. Could you give me an example of a whitelist file (preferably a txt file with some characters)?

Questions about the four model files

As for the four model files:
xx.ckpt.data
xx.ckpt.index
xx.ckpt.json
xx.ckpt.meta
Could you please explain what each file contains?
For example, when I add more layers to the network for training new models, the xx.ckpt.data file gets very large (~680MB). Hence, I am wondering about the content of each file.
And which file contains the network hyper-parameters and weights?

Thank you in advance for your help!

Dependency error while using environment_master_gpu.yml

I've cloned the repo and executed the following command.
conda env create -f environment_master_gpu.yml

I've encountered the following error.

Solving environment: failed

UnsatisfiableError: The following specifications were found to be in conflict:
  - mkl==2019.0=118
  - numpy==1.15.2=py36h1d66e8a_1 -> numpy-base==1.15.2=py36h81de0dd_1 -> mkl[version='>=2018.0.3,<2019.0a0']
Use "conda info <package>" to see the dependencies for each package.

I think this can be fixed by downgrading mkl version. But I want to bring this into the view of developers to get an official fix.

use a pre-trained model

Hello.
By executing the command
calamari-train --files pathOfDataSet/*.bin.png --weights pathOfPretrainedModel/model_number.ckpt --checkpoint_frequency 1000 --output_dir pathOfOutputModels
I have the impression that training started over from scratch: it did not use the weights of the pre-trained model.
Thanks for the help.

General

This is a general topic for Calamari.

hmmmm... The Logo, Nothing personal :)

@ChWick @chreul Thank you for your hard work, YOU ARE APPRECIATED!

Your current logo is a work-of-art that many might not fully understand.
So... I was thinking, if you would accept that I provide a logo for the Calamari project? just to show my appreciation to your hard work.

Waiting for your reply

Do color features affect accuracy?

Dear author, I have some questions about your engine.
First, I trained a model on images with white characters on a black background and used it to predict images with black characters on a white background. The accuracy was too low. I then built a dataset mixing white and black characters, but the accuracy is only 88%. So I want to know whether color features influence accuracy, because I want to use the engine on complex scenes such as headlines in TV broadcasts.
Second, I have trained a robust model and want to predict some data, but running predict.py takes a long time (almost 2 s) and I want to reduce that to below 500 ms, because I want to wrap it in a class for real-time prediction. Could you give me some advice? Thank you.

Regarding '--network' options

I am training a new model for Chinese+English. I think I need a more complicated network structure.
Hence, I want to add more layers.
When I add a CNN layer and a pooling layer using the parameter '--network', how do I specify the stride length?
What is the default stride length used for CNN and pooling in Calamari-OCR?

Thank you in advance for your help!

Question regarding a code block inside text_processing/text_processor.py

Thank you for providing us with such a nice open-source OCR tool.
I am analyzing the code of Calamari-OCR. I have a question regarding a code block inside text_processing/text_processor.py.
In class MultiTextProcessor(TextProcessor):

def _apply_single(self, txt):
    for proc in self.sub_processors:
        txt = proc._apply_single(txt)
    return txt

This for loop is very difficult to understand. What does this block of code do? Can we skip this for loop?
Looking forward to your reply.
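
The loop quoted above reads like a processing chain: the output of each sub-processor becomes the input of the next. A self-contained sketch of that pattern follows. Only `MultiTextProcessor`, `TextProcessor`, `sub_processors` and `_apply_single` come from the quoted snippet; `StripProcessor` and `LowerProcessor` are made-up examples, not classes from Calamari.

```python
class TextProcessor:
    def _apply_single(self, txt):
        raise NotImplementedError

class StripProcessor(TextProcessor):
    """Hypothetical step: remove surrounding whitespace."""
    def _apply_single(self, txt):
        return txt.strip()

class LowerProcessor(TextProcessor):
    """Hypothetical step: convert to lower case."""
    def _apply_single(self, txt):
        return txt.lower()

class MultiTextProcessor(TextProcessor):
    def __init__(self, sub_processors):
        self.sub_processors = sub_processors

    def _apply_single(self, txt):
        # Feed the result of each step into the next one, in order.
        for proc in self.sub_processors:
            txt = proc._apply_single(txt)
        return txt

chain = MultiTextProcessor([StripProcessor(), LowerProcessor()])
print(repr(chain._apply_single("  Hello World ")))  # 'hello world'
```

Under this reading, skipping the loop would return the text unchanged, meaning none of the configured processing steps would run.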

Error occurred when I trained a new model.

I tried to train a new model using

calamari-train --files your_images.*.png

While saving ckpt.json, I got an error like this.

Storing checkpoint to '/home/hclee/calamari/model_00001000.ckpt'
Traceback (most recent call last):
  File "/home/hclee/anaconda3/envs/calamari/bin/calamari-train", line 11, in <module>
    load_entry_point('calamari-ocr==0.1.8', 'console_scripts', 'calamari-train')()
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/site-packages/calamari_ocr-0.1.8-py3.5.egg/calamari_ocr/scripts/train.py", line 233, in main
    run(args)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/site-packages/calamari_ocr-0.1.8-py3.5.egg/calamari_ocr/scripts/train.py", line 226, in run
    trainer.train(progress_bar=not args.no_progress_bars)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/site-packages/calamari_ocr-0.1.8-py3.5.egg/calamari_ocr/ocr/trainer.py", line 259, in train
    last_checkpoint = make_checkpoint(checkpoint_params.output_dir, checkpoint_params.output_model_prefix)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/site-packages/calamari_ocr-0.1.8-py3.5.egg/calamari_ocr/ocr/trainer.py", line 210, in make_checkpoint
    f.write(json_format.MessageToJson(checkpoint_params))
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/site-packages/protobuf-3.6.0-py3.5.egg/google/protobuf/json_format.py", line 127, in MessageToJson
    return printer.ToJsonString(message, indent, sort_keys)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/site-packages/protobuf-3.6.0-py3.5.egg/google/protobuf/json_format.py", line 178, in ToJsonString
    return json.dumps(js, indent=indent, sort_keys=sort_keys)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/encoder.py", line 200, in encode
    chunks = list(chunks)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/encoder.py", line 429, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/encoder.py", line 403, in _iterencode_dict
    yield from chunks
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/encoder.py", line 324, in _iterencode_list
    yield from chunks
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/encoder.py", line 436, in _iterencode
    o = _default(o)
  File "/home/hclee/anaconda3/envs/calamari/lib/python3.5/json/encoder.py", line 179, in default
    raise TypeError(repr(o) + " is not JSON serializable")

How should I fix this?

Regarding LSTM

Is the LSTM used in Calamari-OCR a 2D-LSTM?
If no, can we use 2D-LSTM in Calamari-OCR?

Is the LSTM used in Calamari-OCR a bidirectional LSTM?

Thank you in advance for your help!

What is the maximum line length that Calamari-OCR can support?

Regarding: Text line length
What is the maximum line length that Calamari-OCR can support?
How does Calamari-OCR determine the line length of an input text line image?
Can we change the maximum line length?

Regarding time-steps:
What is the number of time-steps used in Calamari-OCR?

Thank you in advance for your help!

Can I train other languages using this model?

First of all, thanks for sharing this amazing library.

I want to train a model for Korean.

Unlike English, Korean uses around 10000 different characters.

I've been training using this model, but it does not seem to work well.

#00065275: loss=7.06757734 ler=1.00000000 dt=0.05988510s
 PRED: '‪껜‬'
 TRUE: '‪쑤‬'
#00065276: loss=7.07436581 ler=1.00000000 dt=0.06000578s
 PRED: '‪솽‬'
 TRUE: '‪쯔‬'
#00065277: loss=7.07185513 ler=1.00000000 dt=0.06007761s
 PRED: '‪혀‬'
 TRUE: '‪쌈‬'
#00065278: loss=7.07688745 ler=1.00000000 dt=0.06012164s
 PRED: '‪힐‬'
 TRUE: '‪엊‬'
#00065279: loss=7.08528412 ler=1.00000000 dt=0.06010656s
 PRED: '‪뺙‬'
 TRUE: '‪뱐‬'
#00065280: loss=7.10926293 ler=1.00000000 dt=0.06014606s
 PRED: '‪띔‬'
 TRUE: '‪졀‬'
#00065281: loss=7.11099953 ler=1.00000000 dt=0.06006387s
 PRED: '‪솰‬'
 TRUE: '‪팰‬'

Should I change the model hyperparameters or the model structure?

Any advice?

A version of data flow on the fly?

Thanks for this open source code!
Training on Chinese succeeded:
[screenshot from 2018-11-15 08-38-35]

I wonder if there could be a version that trains with data loaded on the fly?
The dataset is pretty large! Right now, I have only used a small portion of the whole dataset.

Also checked 37.

I think loading data on the fly would be more practical than dumping all the data into RAM.

Regarding training new models

When training a new model, calamari-ocr prints "loss, ler, dt" on the screen (default).
I understand loss, ler (line error rate).
What is dt?
How are loss, ler, and dt defined? Or, is there any reference/paper covering these definitions?

Thank you in advance for your help!

Prediction step using very deep neural networks feature of calamari

Hi,
I installed calamari-0.2.4 and tried to test it on this simple example, "https://user-images.githubusercontent.com/33478216/46499779-a909b480-c829-11e8-87f2-d4a34d84ab69.png",
with:
calamari-predict --checkpoint calamari_models/default/ModernEnglish.ckpt --files data.png

It returns this error:
Found 1 files in the dataset
Traceback (most recent call last):
  File "/home/pc/my_calamari_env/bin/calamari-predict", line 11, in <module>
    load_entry_point('calamari-ocr==0.2.4', 'console_scripts', 'calamari-predict')()
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/scripts/predict.py", line 151, in main
    run(args)
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/scripts/predict.py", line 61, in run
    predictor = MultiPredictor(checkpoints=args.checkpoint, batch_size=args.batch_size, processes=args.processes)
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/predictor.py", line 202, in __init__
    self.predictors = [Predictor(cp, batch_size=batch_size, processes=processes) for cp in checkpoints]
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/predictor.py", line 202, in <listcomp>
    self.predictors = [Predictor(cp, batch_size=batch_size, processes=processes) for cp in checkpoints]
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/predictor.py", line 100, in __init__
    ckpt = Checkpoint(checkpoint, auto_update=self.auto_update_checkpoints)
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/checkpoint.py", line 20, in __init__
    self.json = json.load(f)
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 7 column 1 (char 6)

Thanks for your help :)

predict_raw : Unknown instance of txts: <class 'tuple'>

First of all, I would like to thank you for the work you put in Calamari!

I have the following code snippet:

image = np.zeros((2, 2, 1))  # placeholder image
images = [image]
dataset = RawDataSet(images=images)

checkpoint = get_path_to_model()
predictor = Predictor(checkpoint=checkpoint)
#  TODO: load via network

prediction_results = predictor.predict_dataset(dataset, progress_bar=False)

for prediction, sample in prediction_results:
    pass

The error I get is the following:

File "[...]\calamari_ocr\ocr\predictor.py", line 132, in predict_dataset
for prediction, sample in zip(prediction_results, dataset.samples()):
File "[...]\calamari_ocr\ocr\predictor.py", line 158, in predict_raw
datas, params = zip(*self.data_preproc.apply(datas, processes=self.processes,
progress_bar=progress_bar))
File "[...]\calamari_ocr\ocr\data_processing\data_preprocessor.py", line 21, in apply
raise Exception("Unknown instance of txts: {}. Supported list and str".format(type(data)))
Exception: Unknown instance of txts: <class 'tuple'>. Supported list and str

The zipping found in the predict_dataset method transforms the list, respectively the numpy-array, into a tuple.

data_params = zip(datas, [None] * len(datas))
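
The tuple in the error message is consistent with how `zip` behaves in Python: both pairing and unzipping produce tuples, never lists. The sketch below demonstrates this with string stand-ins for the images. It illustrates the language behaviour only and is not a patch for Calamari.

```python
# Stand-ins for line images; real inputs would be numpy arrays.
datas = ["line-image-0", "line-image-1"]

# Pairing each item with a parameter slot yields tuples ...
pairs = list(zip(datas, [None] * len(datas)))
print(pairs[0])  # ('line-image-0', None)

# ... and unzipping with zip(*...) yields tuples as well, not lists.
unzipped_datas, unzipped_params = zip(*pairs)
print(type(unzipped_datas).__name__)  # tuple

# A function that accepts only list or str therefore needs an explicit conversion.
datas_as_list = list(unzipped_datas)
print(datas_as_list == datas)  # True
```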

Calamari-version: 0.2.2
Installed via pip

Can you give me any advices about that issue?

Thank you!

Training crashes when checkpoint is saved

Error Description: The training crashes when it comes to the point where it has to store the model weights to the checkpoint. I reviewed issues #38 and #40 but they didn't work out for me. Here is the complete description of my software stack and error:

OS: Windows 10
Python: 3.6
CUDA / cuDNN: 9.0 / 7.3.1
Training on CPU/GPU (tried both, facing the same error)
Tensorflow version: 1.12.0
Calamari version: 0.2.4
Installed via 'conda env create -f environment_master_gpu.yml'

Full Command:

calamari-train --files \calamari_training_data*.png --output_dir ...\Checkpoints\c7 --checkpoint_frequency=1000 --weights ...\Calamari\antiqua_modern\4.ckpt --whitelist ...Calamari\whitelist.txt

Complete Error Stack:

Traceback (most recent call last):
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\dsureshkumar\Anaconda3\envs\calamari_gpu\Scripts\calamari-train.exe\__main__.py", line 9, in <module>
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\calamari_ocr\scripts\train.py", line 310, in main
    run(args)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\calamari_ocr\scripts\train.py", line 302, in run
    progress_bar=not args.no_progress_bars
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\calamari_ocr\ocr\trainer.py", line 175, in train
    self._run_train(train_net, test_net, codec, train_start_time, progress_bar)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\calamari_ocr\ocr\trainer.py", line 319, in _run_train
    last_checkpoint = make_checkpoint(checkpoint_params.output_dir, checkpoint_params.output_model_prefix)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\calamari_ocr\ocr\trainer.py", line 264, in make_checkpoint
    f.write(json_format.MessageToJson(checkpoint_params))
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\google\protobuf\json_format.py", line 127, in MessageToJson
    return printer.ToJsonString(message, indent, sort_keys)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\site-packages\google\protobuf\json_format.py", line 178, in ToJsonString
    return json.dumps(js, indent=indent, sort_keys=sort_keys)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\encoder.py", line 201, in encode
    chunks = list(chunks)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\encoder.py", line 430, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\encoder.py", line 437, in _iterencode
    o = _default(o)
  File "c:\users\dsureshkumar\anaconda3\envs\calamari_gpu\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'float32' is not JSON serializable
2019-02-28 10:07:34.188243: W tensorflow/core/kernels/data/generator_dataset_op.cc:78] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
[[{{node PyFunc}} = PyFunc[Tin=[DT_INT64], Tout=[DT_INT64], token="pyfunc_2"]()]]
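
The TypeError at the end of the traceback is what the standard `json` module raises for any value that is not a built-in Python type, which includes NumPy scalars such as `float32`. A minimal sketch of the usual conversion is shown below. It uses a stand-in class so that it runs without NumPy; `FakeFloat32`, `to_builtin` and the key `learning_rate` are illustrative names only. NumPy scalars do provide an `.item()` method that returns the matching built-in value.

```python
import json

def to_builtin(obj):
    """Fallback for json.dumps: convert scalar-like objects via .item()."""
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError("Object of type %s is not JSON serializable" % type(obj).__name__)

class FakeFloat32:
    """Stand-in for numpy.float32 so the sketch runs without NumPy."""
    def __init__(self, value):
        self._value = value

    def item(self):
        return float(self._value)

print(json.dumps({"learning_rate": FakeFloat32(0.5)}, default=to_builtin))
# {"learning_rate": 0.5}
```

In the traceback the `json.dumps` call is made inside protobuf's `MessageToJson`, so a `default` hook cannot simply be passed in from outside. The equivalent remedy would be to convert such values with `float(...)` before they are stored in the checkpoint parameters. Where exactly that happens in Calamari is not shown here.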

Error after iteration training

os : win10
python : 3.6
train on cpu
tensorflow version: 1.12.0
install through 'pip install calamari'

I tried to train my own model using the following command :
"calamari-train --files img1.png --max_iters=10" and the following error occurred.

Total time 15.09201717376709s for 9 iterations.
2018-11-13 11:55:35.169246: W tensorflow/core/kernels/data/generator_dataset_op.cc:78] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
[[{{node PyFunc}} = PyFunc[Tin=[DT_INT64], Tout=[DT_INT64], token="pyfunc_2"]()]]

Training with GPU

I tried to train a new model on GPU with

calamari-train --files data/*.png --batch_size=64

I installed calamari-ocr using pip and got the following error:

Found 181 files in the dataset
Loading Dataset: 100%|██████████████████████████████████████████████████████████████████████████████████████| 181/181 [00:00<00:00, 242.80it/s]
Text Preprocessing: 100%|█████████████████████████████████████████████████████████████████████████████████| 181/181 [00:00<00:00, 14412.59it/s]
Data Preprocessing: 100%|████████████████████████████████████████████████████████████████████████████████████| 181/181 [00:03<00:00, 56.37it/s]
CODEC: ['', ' ', "'", '-', '.', '1', '2', 'ء', 'آ', 'أ', 'ؤ', 'ئ', 'ا', 'ب', 'ة', 'ت', 'ث', 'ج', 'ح', 'خ', 'د', 'ذ', 'ر', 'ز', 'س', 'ش', 'ص', 'ض', 'ط', 'ظ', 'ع', 'غ', 'ف', 'ق', 'ك', 'ل', 'م', 'ن', 'ه', 'و', 'ى', 'ي', 'ً', 'َ']
/home/ubuntu/anaconda3/envs/python3/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
2018-08-08 05:28:36.538017: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-08 05:28:36.622038: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-08-08 05:28:36.622469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-08-08 05:28:36.622503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-08-08 05:28:36.912184: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-08 05:28:36.912241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-08-08 05:28:36.912255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-08-08 05:28:36.912544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10761 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)
Using CUDNN LSTM backend on GPU
Using CUDNN LSTM backend on GPU
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/python3/bin/calamari-train", line 11, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/scripts/train.py", line 233, in main
    run(args)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/scripts/train.py", line 226, in run
    trainer.train(progress_bar=not args.no_progress_bars)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/ocr/trainer.py", line 163, in train
    test_net = backend.create_net(restore=None, weights=self.weights, graph_type="test", batch_size=checkpoint_params.batch_size)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_backend.py", line 26, in create_net
    model = TensorflowModel(self.network_proto, self.graph, self.session, graph_type, batch_size, reuse_weights=not self.first_model)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py", line 44, in __init__
    self.create_network(self.inputs, self.input_seq_len, self.dropout_rate, reuse_variables=reuse_weights)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py", line 162, in create_network
    time_major_outputs = gpu_cudnn_lstm_backend(time_major_inputs, lstm_layers[0].hidden_nodes)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py", line 156, in gpu_cudnn_lstm_backend
    time_major_outputs, (output_h, output_c) = rnn_lstm(time_major_inputs)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 329, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 696, in __call__
    self.build(input_shapes)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py", line 362, in build
    initializer=opaque_params_t, validate_shape=False)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1328, in get_variable
    constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1090, in get_variable
    constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 427, in get_variable
    return custom_getter(**custom_getter_kwargs)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py", line 294, in _update_trainable_weights
    variable = getter(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 404, in _true_getter
    use_resource=use_resource, constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 761, in _get_single_variable
    "reuse=tf.AUTO_REUSE in VarScope?" % name)
ValueError: Variable cudnn_lstm_1/opaque_kernel does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

So it seems it finds the GPU device, but then fails at:

time_major_outputs, (output_h, output_c) = rnn_lstm(time_major_inputs)

Thanks!

Problem with validation set

I'm training models successfully. Thank you for Calamari!

I am not able to implement early stopping during training through the --validation option. Training with calamari-cross-fold-train gives the same problem: It never scores above 0 on the validation set. To debug, I printed the predictions on the validation set, and they consistently look like the output of a freshly initialised network (like the output at iteration 0 during training). In ocr/trainer.py I can't see where test_net would get any weights other than the randomly initialised ones. Nothing from train_net seems to touch the prediction on the validation set. I guess I might be missing something.

Is this a known regression, or something I'm missing? Thank you for your help.

Freeze graph

I wanted to freeze the .ckpt file into a .pb model,
but I got an error while loading the .ckpt file with the following code:

import tensorflow as tf

# Load the graph definition stored next to the checkpoint
checkpoint_file = "./output/model_00000050.ckpt"
saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))

It returns the following error:

KeyError: 'LSTMBlockCell'

License

Hi @ChWick!

OCR Engine based on OCRopy and Kraken

OCRopy and Kraken (and TensorFlow) are released under the Apache 2.0 license.

I ask you to reconsider the license choice of your project.

I hope this request will not be regarded as chutzpah.

Feature Request: default/auto parameters for training

when training a model ensemble it would be nice if there was an "auto" option to automatically choose somewhat sane parameters, e.g. regarding early stopping. these parameters should depend on the number of GT lines available for training.
based on our experiments i propose the following auto defaults:

  • early_stopping_frequency: about half the number of available GT lines, maybe rounded up to the next hundred.
  • early_stopping_nbest: 5, i think 10 is too high. this is a general "issue" and not specific to this auto functionality.
  • max_iters: maybe 10 epochs? not sure about this one.
  • checkpoint_frequency = early_stopping_frequency.
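The proposed defaults can be sketched as a small helper. This is only an illustration of the arithmetic, not calamari code: the function name `auto_training_params`, its arguments, and the assumption that one epoch equals `ceil(num_gt_lines / batch_size)` iterations are all hypothetical.

```python
def auto_training_params(num_gt_lines, epochs=10, batch_size=1):
    """Derive 'auto' training defaults from the number of GT lines.

    Hypothetical helper: names and the epoch-to-iteration conversion
    are assumptions, not part of calamari.
    """
    if num_gt_lines <= 0 or batch_size <= 0:
        raise ValueError("num_gt_lines and batch_size must be positive")

    # About half the available GT lines ...
    half = (num_gt_lines + 1) // 2
    # ... rounded up to the next hundred (a multiple of 100 stays unchanged).
    early_stopping_frequency = ((half + 99) // 100) * 100

    # Assumption: one epoch is one pass over all GT lines.
    iters_per_epoch = (num_gt_lines + batch_size - 1) // batch_size

    return {
        "early_stopping_frequency": early_stopping_frequency,
        "early_stopping_nbest": 5,
        "max_iters": epochs * iters_per_epoch,
        "checkpoint_frequency": early_stopping_frequency,
    }


if __name__ == "__main__":
    print(auto_training_params(1250))
```

With 1250 GT lines, half is 625, so both frequencies round up to 700; with a batch size of 1, 10 epochs give 12500 iterations.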

Error during training

TypeError: Failed to convert object of type <class 'tensorflow.python.framework.sparse_tensor.SparseTensor'> to Tensor. Contents: SparseTensor(indices=Tensor("Where:0", shape=(?, 2), dtype=int64), values=Tensor("sub:0", shape=(?,), dtype=int32), dense_shape=Tensor("Shape:0", shape=(2,), dtype=int64)). Consider casting elements to a supported type.

What is the reason, please?

Feature Request: fully automatic two step training with data augmentation

our experiments have shown that, when using data augmentation, a two step training approach is sensible:

  1. train on all available lines (real + augmented).
  2. refine the models resulting from 1 by training on real lines only and use the outcome of 1 as a starting point.

it would be nice if these training steps could be automatically performed by calamari without requiring any further input by the user. the auto config proposed in #33 might be helpful :-).
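The control flow of the two steps could look roughly like this. It is a sketch only: `two_step_training` and the `train_fn(lines, init_model)` callable are hypothetical placeholders for whatever actually runs a training, and do not correspond to an existing calamari API.

```python
def two_step_training(real_lines, augmented_lines, train_fn):
    """Run the two training steps in sequence and return the refined model.

    train_fn is a hypothetical callable, train_fn(lines, init_model) -> model,
    where init_model is None for training from scratch.
    """
    real_lines = list(real_lines)
    augmented_lines = list(augmented_lines)

    # Step 1: train on all available lines (real + augmented).
    step1_model = train_fn(real_lines + augmented_lines, None)

    # Step 2: refine on real lines only, starting from the step 1 result.
    return train_fn(real_lines, step1_model)


if __name__ == "__main__":
    def dummy_train(lines, init_model):
        # Stand-in for a real training run; only reports what it was given.
        return {"trained_on": len(lines), "started_from": init_model}

    print(two_step_training(["r1", "r2"], ["a1", "a2", "a3"], dummy_train))
```

Passing the training routine in as a callable keeps the sketch independent of how a single training is launched, so the same wrapper could be applied to each model of an ensemble.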

Key cnn_lstm/B not found in checkpoint

self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key cnn_lstm/B not found in checkpoint
[[{{node save_1/RestoreV2}} = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save_1/Const_0_0, save_1/RestoreV2/tensor_names, save_1/RestoreV2/shape_and_slices)]]

How can I solve this problem?

Originally posted by @hsl20130659 in #11 (comment)
