faustomorales / keras-ocr

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.

Home Page: https://keras-ocr.readthedocs.io/

License: MIT License

Dockerfile 0.39% Makefile 1.07% Python 98.54%

text-detection keras-crnn keras ocr

keras-ocr's Introduction

keras-ocr

This is a slightly polished and packaged version of the Keras CRNN implementation and the published CRAFT text detection model. It provides a high level API for training a text detection and OCR pipeline.

Please see the documentation for more examples, including for training a custom model.

Getting Started

Installation

keras-ocr supports Python >= 3.6 and TensorFlow >= 2.0.0.

# To install from master
pip install git+https://github.com/faustomorales/keras-ocr.git#egg=keras-ocr

# To install from PyPi
pip install keras-ocr

Using

The package ships with an easy-to-use implementation of the CRAFT text detection model from this repository and the CRNN recognition model from this repository.

import matplotlib.pyplot as plt

import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Get a set of three example images
images = [
    keras_ocr.tools.read(url) for url in [
        'https://upload.wikimedia.org/wikipedia/commons/b/bd/Army_Reserves_Recruitment_Banner_MOD_45156284.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/e/e8/FseeG2QeLXo.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/b/b4/EUBanana-500x112.jpg'
    ]
]

# Each list of predictions in prediction_groups is a list of
# (word, box) tuples.
prediction_groups = pipeline.recognize(images)

# Plot the predictions
fig, axs = plt.subplots(nrows=len(images), figsize=(20, 20))
for ax, image, predictions in zip(axs, images, prediction_groups):
    keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)

Comparing keras-ocr and other OCR approaches

You may be wondering how the models in this package compare to existing cloud OCR APIs. We provide some metrics below and the notebook used to compute them using the first 1,000 images in the COCO-Text validation set. We limited it to 1,000 because the Google Cloud free tier is for 1,000 calls a month at the time of this writing. As always, caveats apply:

No guarantees apply to these numbers -- please beware and compute your own metrics independently to verify them. As of this writing, they should be considered a very rough first draft. Please open an issue if you find a mistake. In particular, the cloud APIs have a variety of options that one can use to improve their performance and the responses can be parsed in different ways. It is possible that I made some error in configuration or parsing. Again, please open an issue if you find a mistake!
We ignore punctuation and letter case because the out-of-the-box recognizer in keras-ocr (provided by this independent repository) does not support either. Note that both AWS Rekognition and Google Cloud Vision support punctuation as well as upper and lowercase characters.
We ignore non-English text.
We ignore illegible text.

model	latency	precision	recall
AWS	719ms	0.45	0.48
GCP	388ms	0.53	0.58
keras-ocr (scale=2)	417ms	0.53	0.54
keras-ocr (scale=3)	699ms	0.5	0.59

Precision and recall were computed based on an intersection over union of 50% or higher and a text similarity to ground truth of 50% or higher.
keras-ocr latency values were computed using a Tesla P4 GPU on Google Colab. scale refers to the argument provided to keras_ocr.pipeline.Pipeline(), which determines the upscaling applied to the image prior to inference.
Latency for the cloud providers was measured with sequential requests, so you can obtain significant speed improvements by making multiple simultaneous API requests.
Each of the entries provides a link to the JSON file containing the annotations made on each pass. You can use this with the notebook to compute metrics without having to make the API calls yourself (though you are encouraged to replicate it independently)!
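The matching rule described in these notes can be sketched in a few lines. This is only an illustration under stated assumptions, not the notebook's code: boxes are simplified to axis-aligned (x1, y1, x2, y2) tuples, text similarity is approximated with difflib.SequenceMatcher (the actual similarity measure is not stated here), and the helper names normalize, iou, and precision_recall are hypothetical.

```python
import string
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip punctuation, mirroring the caveats listed above."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, truths, iou_min=0.5, sim_min=0.5):
    """Both arguments are lists of (text, box) tuples for one image."""
    unmatched = list(truths)
    true_positives = 0
    for text, box in predictions:
        for candidate in unmatched:
            similarity = SequenceMatcher(
                None, normalize(text), normalize(candidate[0])
            ).ratio()
            if iou(box, candidate[1]) >= iou_min and similarity >= sim_min:
                # Each ground-truth entry can be matched at most once.
                true_positives += 1
                unmatched.remove(candidate)
                break
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


truths = [("STOP", (0, 0, 10, 10)), ("Exit.", (20, 0, 30, 10))]
predictions = [("stop", (1, 0, 10, 10)), ("taxi", (50, 50, 60, 60))]
print(precision_recall(predictions, truths))  # (0.5, 0.5)
```

Real keras-ocr boxes are four-point polygons, so a faithful version would need a polygon intersection instead of the rectangle arithmetic used here.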

Why not compare to Tesseract? In every configuration I tried, Tesseract did very poorly on this test. Tesseract performs best on scans of books, not on incidental scene text like that in this dataset.

Advanced Configuration

By default, if a GPU is available, TensorFlow tries to grab almost all of the available video memory, which is a problem if you're running multiple models with TensorFlow and PyTorch. Setting any value for the environment variable MEMORY_GROWTH will force TensorFlow to dynamically allocate only as much GPU memory as is needed.

You can also specify a limit per TensorFlow process by setting the environment variable MEMORY_ALLOCATED to a float: the fraction of the total VRAM that the process may use.

To apply these changes, call keras_ocr.config.configure() at the top of your file where you import keras_ocr.
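A minimal sketch of the steps above, assuming the variables are set from Python rather than from the shell. The values "1" and "0.5" are illustrative only, and the try/except guard is there just so the snippet also runs on a machine without keras-ocr installed.

```python
import os

# Set the variable before configure() is called. Per the text above, any
# value for MEMORY_GROWTH enables dynamic GPU memory allocation.
os.environ["MEMORY_GROWTH"] = "1"
# Alternative described above: cap each process at a fraction of total VRAM.
# os.environ["MEMORY_ALLOCATED"] = "0.5"

try:
    import keras_ocr
except ImportError:  # keeps the sketch runnable where keras-ocr is absent
    keras_ocr = None

if keras_ocr is not None:
    # Apply the settings at the top of the file, right after the import.
    keras_ocr.config.configure()
```

Exporting the variables in the shell before starting Python should work equally well, since they are ordinary environment variables.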

Contributing

To work on the project, start by doing the following. These instructions probably do not yet work for Windows but if a Windows user has some ideas for how to fix that it would be greatly appreciated (I don't have a Windows machine to test on at the moment).

# Install local dependencies for
# code completion, etc.
make init

# Build the Docker container to run
# tests and such.
make build

You can get a JupyterLab server running to experiment with using make lab.
To run checks before committing code, you can use make format-check type-check lint-check test.
To view the documentation, use make docs.

To implement new features, please first file an issue proposing your change for discussion.

To report problems, please file an issue with sample code, expected results, actual results, and a complete traceback.

Troubleshooting

This package is installing opencv-python-headless but I would prefer a different opencv flavor. This is due to aleju/imgaug#473. You can uninstall the unwanted OpenCV flavor after installing keras-ocr. We apologize for the inconvenience.

keras-ocr's Issues

End to end training tensorflow.python.framework.errors_impl.ResourceExhaustedError

I encountered the following error when I ran the end-to-end training code.
File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1,64,160,160] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node model_1/upconv3.conv.3/Conv2D (defined at train2.py:140) ]]

train2.py contains a slightly modified version of your end-to-end training script. I had to modify your code to suit my Windows environment (for example, Windows file names cannot contain characters such as colons) and to support re-runs.

The error occurs at the following call:

detector.model.fit_generator(
    generator=detection_train_generator,
    steps_per_epoch=math.ceil(len(background_splits[0]) / detector_batch_size),
    epochs=1000,
    workers=0,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(restore_best_weights=True, patience=5),
        tf.keras.callbacks.CSVLogger(f'{detector_basepath}.csv'),
        tf.keras.callbacks.ModelCheckpoint(filepath=f'{detector_basepath}.h5')
    ],
    validation_data=detection_val_generator,
    validation_steps=math.ceil(len(background_splits[1]) / detector_batch_size)
)

I have attached the file containing the source code below.
train2py.pdf

I have also attached all of the messages displayed at the Windows command line when I ran
python train2.py.

log1.log

I have also attached the output of pip list so that you can see which Python modules and versions I have installed.

pip-list.txt

My GPU is just an NVIDIA GeForce MX150 with 4GB RAM. My PC has 16GB RAM.

Text threshold vs detection threshold

I think you might have mixed up the detection_threshold and text_threshold variables in detection.py, since they differ from the original CRAFT implementation and since there is an incongruity between a comment and the variable name in one instance.

Here you have:

_, text_score = cv2.threshold(textmap,
    thresh=text_threshold,
    maxval=1,
    type=cv2.THRESH_BINARY)

The original implementation uses the low_text parameter here (what you're calling detection_threshold). So I believe text_threshold should be replaced with detection_threshold.

And then a few lines down you have:

# If the maximum value within this connected component is less than
# text threshold, we skip it.
if np.max(textmap[labels == component_id]) < detection_threshold:
     continue

Should detection_threshold be text_threshold as the comment says and as it is in the original implementation?

If so, the fix seems easy. I could open the PR if you'd like. Or you as the maintainer can take it.

Again, thanks for your awesome work on this 👏
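The two-threshold logic this issue describes can be sketched without OpenCV. This is a simplified illustration, not the code from detection.py or from the original CRAFT repository: it works on a small nested list instead of an image array, uses a hand-written 4-connected flood fill in place of OpenCV's connected-components routine, ignores the link map, and the function name find_text_components and the threshold values are assumptions.

```python
from collections import deque


def find_text_components(textmap, low_text, text_threshold):
    """Return a list of components, each a list of (row, col) pixels."""
    rows, cols = len(textmap), len(textmap[0])
    # Step 1: binarize the score map with the LOWER threshold (low_text).
    mask = [[value > low_text for value in row] for row in textmap]
    seen = [[False] * cols for _ in range(rows)]
    accepted = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Step 2: flood-fill one 4-connected component of the mask.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Step 3: skip the component if its peak score is below the
            # HIGHER threshold (text_threshold).
            if max(textmap[y][x] for y, x in component) >= text_threshold:
                accepted.append(component)
    return accepted


textmap = [
    [0.0, 0.5, 0.8, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.5],
]
components = find_text_components(textmap, low_text=0.4, text_threshold=0.7)
print(len(components))  # 1: the right-hand blob never reaches 0.7
```

With the two thresholds swapped, the binarization step would use 0.7 and keep only the single 0.8 pixel, which shows why the order matters for the size of the detected regions.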

Prediction not running on GPU

@faustomorales Prediction always runs on the CPU. Is there an option to switch it to the GPU, or is there a minimum GPU requirement? I only have a Quadro K3000M with 2 GB.

Error when running

Hi,

When I run the program with my business card test image, I get this error:

TypeError: resize_bilinear() got an unexpected keyword argument 'half_pixel_centers'

What does it mean? What am I doing wrong? Are there any example programs I can try?

Thanks,
Suyash

Very slow detector inference speed

It seems the getBoxes method in the detector is very slow and the main bottleneck for good inference speeds. I was able to modify the detection part to work on batches using the tf.data.datasets API, which made the detection part quite fast. I used a batch size of 8 on a V100 GPU.

Time for detection (8660 images 1000x1000): 1032 s
Time for getBoxes of those images: 11 729 s

Any ideas on how to improve the performance of the getBoxes step? I assume it is slow because it has to process each result one at a time, get the connected components, and then process those one at a time.

I'm going to try some things with Numba to speed up the loop over each connected component.

opencv-contrib-python-headless VS opencv-contrib-python

It would be nice if keras-ocr used opencv-contrib-python instead of the headless version, because the headless version does not have any GUI functionality. That way, users would not have to uninstall and reinstall OpenCV packages to get the GUI functionality.
What do you think?

error terminate called after throwing an instance of 'std::bad_alloc'

I used your demo code and my app crashed. It shows:

2019-11-19 16:05:38.112665: W tensorflow/core/framework/allocator.cc:107] Allocation of 3121348608 exceeds 10% of system memory.
2019-11-19 16:05:40.014946: W tensorflow/core/framework/allocator.cc:107] Allocation of 3121348608 exceeds 10% of system memory.
2019-11-19 16:05:40.014951: W tensorflow/core/framework/allocator.cc:107] Allocation of 3121348608 exceeds 10% of system memory.
2019-11-19 16:05:45.262924: W tensorflow/core/framework/allocator.cc:107] Allocation of 3121348608 exceeds 10% of system memory.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

I do not have a GPU; my machine has 32GB of memory.
The code I used:

import matplotlib.pyplot as plt

import keras_ocr

detector = keras_ocr.detection.Detector(pretrained=True)
image = keras_ocr.tools.read('tests/test_image.jpg')

boxes = detector.detect(images=[image])[0]
canvas = keras_ocr.detection.drawBoxes(image, boxes)
plt.imshow(canvas)

How can I run this?

Struggling to train with custom alphabet including spaces...

Hi, Fausto
First of all, thanks for sharing this implementation, I have actually started using your codebase recently because it's much nicer and cleaner than what I did a few months ago! :)

I'm trying to train the model using a custom dataset and a custom alphabet including all letters, some special symbols, and the space ' ' character. Everything works great if I omit the space character from the sequences. When the ' ' (or whatever replacement I'm using) is included, I can't get the model to train. I see two issues arise:

[1] The loss is inf.
[2] A message in the console shows an error in the CTC function (./tensorflow/core/util/ctc/ctc_loss_calculator.h:499] No valid path found.)

[2] happens sometimes during normal training, but it's not very common and the model keeps training correctly.

I've tried freezing the backbone layers with no success... (It actually achieves much better results when training the whole network).

I tried adding a start-of-sequence character '\t' and an end-of-sequence character '\n', and that can be learned at the sentence level, but not at the word level...

I also changed the get_batch_generator function to remove the .strip() calls to each sentence.

Do you have any ideas that I could try or changes to the code that might prevent this issue?

Thank you!
Chepe

Version

Hi, can you please specify the versions of Python, TensorFlow, and Keras that are compatible with this API?

ValueError: operands could not be broadcast together with shapes

Hi, I was just running the old example code from a few versions ago with the current release (this time under Windows):

import matplotlib.pyplot as plt

import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

image = keras_ocr.tools.read('test7.png')

# Predictions is a list of (text, box) tuples.
predictions = pipeline.recognize(image)

# Plot the results.
fig, ax = plt.subplots()
ax.imshow(keras_ocr.tools.drawBoxes(image, predictions, boxes_format='predictions'))
for text, box in predictions:
    ax.annotate(s=text, xy=box[0], xytext=box[0] - 50, arrowprops={'arrowstyle': '->'})
    print(text)
plt.show()

It seems to not work anymore. It errors out on line 12 (predictions = ...) with the error:

File "bla.py", line 12, in
predictions = pipeline.recognize(image)
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\pipeline.py", line 55, in recognize
box_groups = self.detector.detect(images=images, **detection_kwargs)
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\detection.py", line 647, in detect
images = [compute_input(tools.read(image)) for image in images]
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\detection.py", line 647, in
images = [compute_input(tools.read(image)) for image in images]
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\detection.py", line 40, in compute_input
image -= mean * 255
ValueError: operands could not be broadcast together with shapes (1280,6) (3,) (1280,6)

I guess something changed and I have to alter the code a bit. Any idea what's wrong?

PS. It would be nice to have super easy examples. Something like "Detect Text Using Pretrained Model", "Recognize Text Using Pretrained Model" and both together. Just with one image.
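One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the usage example in the README passes a list of images to pipeline.recognize and gets back one prediction list per image, whereas the code in this issue passes a single image. A small wrapper makes the single-image case explicit. The names recognize_one and EchoPipeline are hypothetical; EchoPipeline is only a stand-in so the sketch runs without downloading weights.

```python
def recognize_one(pipeline, image):
    """Run a list-oriented pipeline on a single image.

    `pipeline` is assumed to be any object whose recognize() method takes a
    list of images and returns one list of (text, box) tuples per image.
    """
    prediction_groups = pipeline.recognize([image])
    return prediction_groups[0]


class EchoPipeline:
    """Stand-in for keras_ocr.pipeline.Pipeline, used only for this demo."""

    def recognize(self, images):
        return [[("word", None)] for _ in images]


print(recognize_one(EchoPipeline(), image="placeholder"))  # [('word', None)]
```

With the real library, the same helper would be called as recognize_one(pipeline, keras_ocr.tools.read('test7.png')).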

about python version

Hello,
I wonder which Python versions the project supports.

Script fails with Warning: Allocation of xxx exceeds 10% of system memory.

Issue with memory usage, when running demo script.

Windows 10
RAM: 16GB
CPU: i7-9750H
GPU: GeForce GTX 1660 Ti
GPU computeCapability: 7.5

The following lines of demo code run successfully:

import matplotlib.pyplot as plt
import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Get a set of three example images
images = [
    keras_ocr.tools.read(url) for url in [
        'https://upload.wikimedia.org/wikipedia/commons/b/bd/Army_Reserves_Recruitment_Banner_MOD_45156284.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/e/e8/FseeG2QeLXo.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/b/b4/EUBanana-500x112.jpg'
    ]
]

The following line of code causes an error:

prediction_groups = pipeline.recognize(images)

Error:

Looking for C:\Users\...\.keras-ocr\crnn_kurapan.h5
2020-02-25 16:33:00.010608: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 1006632960 exceeds 10% of system memory.
2020-02-25 16:33:00.356063: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 1006632960 exceeds 10% of system memory.

The script then fails.

I have read (https://www.raspberrypi.org/forums/viewtopic.php?t=242471) that this should be a warning, and shouldn't cause the script to fail.

Minor type errors in code for section "Use the model for inference"

Perhaps these two lines from the section "Use the model for inference"
pipeline = keras_ocr.pipelines.Pipeline(detector=detector, recognizer=recognizer)
image, text, lines = next(image_generators[0])

should be
pipeline = keras_ocr.pipeline.Pipeline(detector=detector, recognizer=recognizer)
image, lines = next(image_generators[0])

build_params is ignored in Recognizer init method

Hi,
I think I found a simple bug, build_params is ignored when calling

Recognizer(weights=None, alphabet=alphabet, build_params=build_params)

Changing the default of the build_params kwarg from None to {} and applying the change below to line 315 would do it, making it possible to pass only some keys and keep the defaults for the rest.

build_params = {
    k: build_params.get(k, DEFAULT_BUILD_PARAMS[k]) for k, v in DEFAULT_BUILD_PARAMS.items()
}

TY
...or a cleaner version of that ;)
Cheers!
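A possible "cleaner version" of the merge proposed in this issue, shown as a standalone sketch. The dictionary contents are placeholder values for illustration; the real DEFAULT_BUILD_PARAMS in keras_ocr/recognition.py may hold different keys and values, and resolve_build_params is a hypothetical helper name.

```python
# Placeholder defaults for illustration only (not copied from the library).
DEFAULT_BUILD_PARAMS = {"height": 31, "width": 200, "dropout": 0.25}


def resolve_build_params(build_params=None):
    """Overlay user-supplied keys on the defaults, ignoring unknown keys."""
    overrides = build_params or {}
    return {
        key: overrides.get(key, default)
        for key, default in DEFAULT_BUILD_PARAMS.items()
    }


print(resolve_build_params({"width": 128}))
# {'height': 31, 'width': 128, 'dropout': 0.25}
```

Keeping None as the default and converting it inside the function avoids a mutable default argument, while still letting callers pass only the keys they want to change.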

Could you share the conversion code?

I'd like to know how to convert a .pth model to an .h5 model. Could you please share the code with me? Thanks.

cairo vs opencv

Hello, thanks for this great tool. The original Keras example used Cairo to generate datasets. I just want to know why this was changed to OpenCV.

Custom fine-tune model

Is there currently any way of fine-tuning a model other than the default one (for example, by passing a .ht file to the recognizer)?

Also is there any way of training the recognizer with multiple gpus?

Thank you :)

ValueError : Too many values to unpack ( OpenCV issue )

The error is related to the OpenCV findContours method call.

My Env:

Python: 3.6.4
OpenCV: 3.4.2.16

Code to reproduce:

image = keras_ocr.tools.read('X00016469612.jpg')

# Predictions is a list of (text, box) tuples.
predictions = pipeline.recognize(image=image)

Error:

ValueError: too many values to unpack (expected 2)

Points to getBoxes method:

contours, _ = cv2.findContours(segmap.astype('uint8'),mode=cv2.RETR_TREE, method=cv2.CHAIN_APPROX_SIMPLE)

Solution:

With OpenCV 3.x (as in the environment above), findContours returns 3 values. Changing the above line to the one below should fix it.

__ , contours , __ = cv2.findContours(segmap.astype('uint8'),mode=cv2.RETR_TREE, method=cv2.CHAIN_APPROX_SIMPLE)
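The three-value unpacking above works on OpenCV 3.x but breaks again on 4.x, which returns two values. A version-agnostic alternative is to index from the end of the returned tuple. The sketch below uses stand-in tuples instead of a real cv2 call so that it runs without OpenCV installed; extract_contours is a hypothetical helper name.

```python
def extract_contours(find_contours_result):
    """Return the contours from a cv2.findContours() return value.

    OpenCV 3.x returns (image, contours, hierarchy), while OpenCV 2.x and
    4.x return (contours, hierarchy). In both layouts the contours are the
    second element from the end.
    """
    return find_contours_result[-2]


# Stand-in tuples; in real code, pass the result of cv2.findContours(...).
print(extract_contours(("image", ["c1", "c2"], "hierarchy")))  # ['c1', 'c2']
print(extract_contours((["c1", "c2"], "hierarchy")))           # ['c1', 'c2']
```

In the library code this would read contours = cv2.findContours(...)[-2], which keeps one code path for every OpenCV major version.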

How to load the recognition model after training?

Hi, I am having the issue of loading the model after training the recogniser.

TensorFlow warnings about unnecessary retracing

The following minimal example (main.py)

import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()

image_urls = [
    "https://i.imgur.com/euIw5Dt.png",
    "https://i.imgur.com/fAT6keX.png",
    "https://i.imgur.com/RlxBrvX.png",
    "https://i.imgur.com/pWBX9z5.png",
    "https://i.imgur.com/tzfitxz.png",
    "https://i.imgur.com/VPPpRJg.png"
]

for image_url in image_urls:
    image = keras_ocr.tools.read(image_url)
    predictions = pipeline.recognize([image])

produces these TensorFlow warnings:

WARNING:tensorflow:5 out of the last 5 calls to <function _make_execution_function.<locals>.distributed_function at 0x7f0284309cb0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
WARNING:tensorflow:6 out of the last 6 calls to <function _make_execution_function.<locals>.distributed_function at 0x7f0284309cb0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

It seems this can negatively impact performance.

The situation can be reproduced by running

docker build -t deleteme .

with the following Dockerfile:

FROM python:3.7

ENV CUDA_VISIBLE_DEVICES="-1"
RUN pip install tensorflow==2.1.0 keras-ocr==0.8.3

# Disable the Docker cache from this stage on, see https://stackoverflow.com/a/58801213/1866775
ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache

ADD ./main.py /
RUN python /main.py

recognition improvement

Hi, how can I improve the recognition accuracy on noisy images and on text that blends into its background? I have attached some examples here.

Hi, I got an Error when I import keras_ocr, could you help me solve the problem?

I have tensorflow 2.0.0 python 3.7
When I import keras_ocr
I got the error:
2020-03-11 08:32:13.540050: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\jim\Anaconda3\Lib\site-packages\keras_ocr\__init__.py", line 1, in
from . import (detection, recognition, tools, data_generation, pipeline, evaluation, datasets,
File "C:\Users\jim\Anaconda3\Lib\site-packages\keras_ocr\detection.py", line 31, in
from . import tools
File "C:\Users\jim\Anaconda3\Lib\site-packages\keras_ocr\tools.py", line 14, in
from shapely import geometry
File "C:\Users\jim\Anaconda3\Lib\site-packages\shapely\geometry\__init__.py", line 4, in
from .base import CAP_STYLE, JOIN_STYLE
File "C:\Users\jim\Anaconda3\Lib\site-packages\shapely\geometry\base.py", line 18, in
from shapely.coords import CoordinateSequence
File "C:\Users\jim\Anaconda3\Lib\site-packages\shapely\coords.py", line 8, in
from shapely.geos import lgeos
File "C:\Users\jim\Anaconda3\Lib\site-packages\shapely\geos.py", line 145, in
lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
File "C:\Users\jim\Anaconda3\lib\ctypes\__init__.py", line 356, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

Question: Purpose of `rnn_steps_to_discard` in recognition

Hi!

Thanks for the good library.

I'm looking at the code for the recognizer, namely the last Lambda layer of the recognition model.
https://github.com/faustomorales/keras-ocr/blob/master/keras_ocr/recognition.py#L285
x = keras.layers.Lambda(lambda x: x[:, rnn_steps_to_discard:])(x)

What is the purpose of the Lambda layer that discards the first few steps of the RNN? I would have thought that all the RNN steps are needed. What are the advantages of ignoring them?

Kind regards indeed, Franco

High Loss while training for Kanji Character

How can the loss be reduced?
Is there a way to make the loss converge and improve the results?

Fine-tuning with custom datasets?

Hi, is it possible to fine-tune the model with a custom dataset after training it on a synthetic dataset?

use recognizer on multiple images

Hello, it would be very useful to be able to use just the recognizer on multiple images, as when using the pipeline.
The use case: I use YOLO to extract and crop some text areas I am interested in, and I want to predict the text using just the recognizer.
Any suggestions or guides?
Thanks.

Unsupported depth of input image

I just ran the example from:

https://keras-ocr.readthedocs.io/en/latest/examples/using_pretrained_models.html

with several png and jpg files and always got the following error:

(keraspypi) retrohelix@retrohelix-P64-HJ-HK1:~/.virtualenvs/test$ python bla.py
Looking for /home/retrohelix/.keras-ocr/craft_mlt_25k.h5
2020-01-04 16:40:15.855346: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-01-04 16:40:15.877424: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2808000000 Hz
2020-01-04 16:40:15.878135: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3dfe2c0 executing computations on platform Host. Devices:
2020-01-04 16:40:15.878149: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /home/retrohelix/.virtualenvs/keraspypi/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py:5783: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
Looking for /home/retrohelix/.keras-ocr/crnn_kurapan.h5
Traceback (most recent call last):
File "bla.py", line 12, in
predictions = pipeline.recognize(image=image)
File "/home/retrohelix/.virtualenvs/keraspypi/lib/python3.6/site-packages/keras_ocr/pipeline.py", line 49, in recognize
**recognition_kwargs)
File "/home/retrohelix/.virtualenvs/keraspypi/lib/python3.6/site-packages/keras_ocr/recognition.py", line 398, in recognize_from_boxes
[cv2.cvtColor(crop, cv2.COLOR_RGB2GRAY)[..., np.newaxis] for crop in crops])
File "/home/retrohelix/.virtualenvs/keraspypi/lib/python3.6/site-packages/keras_ocr/recognition.py", line 398, in
[cv2.cvtColor(crop, cv2.COLOR_RGB2GRAY)[..., np.newaxis] for crop in crops])
cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.simd_helpers.hpp:94: error: (-2:Unspecified error) in function 'cv::impl::{anonymous}::CvtHelper<VScn, VDcn, VDepth, sizePolicy>::CvtHelper(cv::InputArray, cv::OutputArray, int) [with VScn = cv::impl::{anonymous}::Set<3, 4>; VDcn = cv::impl::{anonymous}::Set<1>; VDepth = cv::impl::{anonymous}::Set<0, 2, 5>; cv::impl::{anonymous}::SizePolicy sizePolicy = (cv::impl::::SizePolicy)2u; cv::InputArray = const cv::_InputArray&; cv::OutputArray = const cv::_OutputArray&]'

Unsupported depth of input image:
'VDepth::contains(depth)'
where
'depth' is 4 (CV_32S)

Any idea what I can do about this?

Add ctpn?

Batch Processing

Hi,

Thanks for this work.

Is there any way we can perform batch processing during inference instead of passing single image every time?
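The usage example in the README already passes a list of images to pipeline.recognize, so several images can go through one call. For long lists, a chunking helper keeps each call to a fixed size. This is a sketch under assumptions: recognize_in_batches and the batch size of 8 are hypothetical, and CountingPipeline is a stand-in so the snippet runs without the real models. How the library batches work internally is not covered here.

```python
def recognize_in_batches(pipeline, images, batch_size=8):
    """Yield one prediction list per image, calling recognize() on chunks."""
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        # recognize() is assumed to return one prediction list per image.
        for predictions in pipeline.recognize(batch):
            yield predictions


class CountingPipeline:
    """Stand-in that records how many images each call received."""

    def __init__(self):
        self.batch_sizes = []

    def recognize(self, images):
        self.batch_sizes.append(len(images))
        return [[(f"text-{image}", None)] for image in images]


pipeline = CountingPipeline()
results = list(recognize_in_batches(pipeline, images=list(range(5)), batch_size=2))
print(pipeline.batch_sizes)  # [2, 2, 1]
print(len(results))          # 5
```

A suitable batch size depends on image resolution and available memory, so it is best tuned empirically.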

Image from cv2.VideoCapture not working

I don't know if this is a bug or not but when trying to run the pipeline.recognize on an image coming from a cv2.VideoCapture I get the following error:

Traceback (most recent call last):
File "testing.py", line 106, in
predictions = pipeline.recognize([newimg])[0]
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\pipeline.py", line 58, in recognize
**recognition_kwargs)
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\recognition.py", line 437, in recognize_from_boxes
for row in self.prediction_model.predict(X, **kwargs)
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 909, in predict
use_multiprocessing=use_multiprocessing)
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 715, in predict
x, check_steps=True, steps_name='steps', steps=steps)
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 2472, in _standardize_user_data
exception_prefix='input')
File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py", line 565, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_2 to have 4 dimensions, but got array with shape (0, 1)

The image I get from cap.read is a numpy.ndarray and its shape is the same as when using keras_ocr.tools.read, so I think it should be OK to use. Any idea what's wrong?
Here is the relevant snippet of my code:

cap = cv2.VideoCapture(args.video)
while cap.isOpened():
    # read frame
    ret, frame = cap.read()
    print(frame.shape)
    print(type(frame))
    
    # predictions is a list of (text, box)
    predictions = pipeline.recognize([frame])[0]

Performance on CPU

Thanks for this wonderful work! I ran the sample code (using the pretrained model) on a Mac with an Intel i7 and 16 GB of memory. The results are pretty good, but the pipeline is really slow (over 10 minutes on a 3968x2976 phone image).
Also, every CPU core is fully used. Compared with the models' performance in the two original repositories, I suspect there is a bottleneck in the pipeline.
I wonder if you have tested it on a similar system, @faustomorales.

use augmenter in `convert_image_generator_to_recognizer_input`

Hello @faustomorales, it would be cool to be able to pass an augmenter to convert_image_generator_to_recognizer_input to add some noise to the generated crops. I know it's possible to pass an augmenter to get_image_generator, but that augmenter is applied before the text is added to the background. Most of the time we want to apply it after the text has been added, since real-life images are in that form. If you know an existing way of getting this behavior, please let me know. Otherwise I would be glad to implement it (I have the use case).

unable to install keras-ocr

$ python3 -V
Python 3.6.10

$ lsb_release -a

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.6 LTS
Release:	16.04
Codename:	xenial

$ pip3 install keras-ocr

Defaulting to user installation because normal site-packages is not writeable
Collecting keras-ocr
  Using cached keras-ocr-0.6.3.tar.gz (165 kB)
  WARNING: Generating metadata for package keras-ocr produced metadata for project name unknown. Fix your #egg=keras-ocr fragments.
Requirement already satisfied (use --upgrade to upgrade): unknown from https://files.pythonhosted.org/packages/c9/26/97b09f82ee62d3958bc8cd2745e4ea1120b3a231d4b14ef2ee1cfff23d5f/keras-ocr-0.6.3.tar.gz#sha256=594311d7edd7e261bbc8884aec9b5aa19dfb40d451076f55680ecf2a13d2d044 in /home/stark/.local/lib/python3.6/site-packages
Building wheels for collected packages: unknown, unknown
  Building wheel for unknown (setup.py) ... done
  Created wheel for unknown: filename=UNKNOWN-0.6.3-py3-none-any.whl size=1561 sha256=3a266adc697c74c26294d415166fbf5b78281940a53ce8a39bab868703ad7ec2
  Stored in directory: /home/stark/.cache/pip/wheels/0e/3a/51/59648d8e35c96ef61a1ca90c7024bbc80d3bf533c899ed6762
  Building wheel for unknown (setup.py) ... done
  Created wheel for unknown: filename=UNKNOWN-0.6.3-py3-none-any.whl size=1561 sha256=3a266adc697c74c26294d415166fbf5b78281940a53ce8a39bab868703ad7ec2
  Stored in directory: /tmp/pip-ephem-wheel-cache-iqq8m6ge/wheels/b7/9e/31/a6d40c047ea2a4d8f43c101412ba0c98453486f7591b6900dd
Successfully built unknown unknown

$ python3
Python 3.6.10 (default, Dec 19 2019, 23:04:32) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras_ocr
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'keras_ocr'

Undesired behavior when detecting and recognizing numbers with commas

Hello. First of all, thanks for this library, it is really useful and even with the problem I'm going to describe, I find the results very good overall

I am working with scanned documents.

I have found that in most scenarios where there's a number with a comma (decimal number), the text detection separates the number at the left of the comma and the one at the right of the comma. I have also found that when this happens, the comma either:

Falls in the left text box, and it is usually recognized as a "1", or
Is not detected at all

In this example you can find both behaviors:

Maybe this is because the training data for text detection does not cover these cases?

text detector - weights and test with efficientnet

Thanks for the amazing work.

Did you try the CRAFT text detector with EfficientNet?
I saw that you have implemented the detection part with two backbones (VGG and EfficientNet),

but the code for calling the EfficientNet backbone and its weights are not provided.
Is there any plan to include them in the future?

Is keras-ocr suited for scanned documents?

Hi,
I'm new to the world of OCR and document analysis. To explore it, I'm working on large documents such as scanned invoices with a lot of noise. Do you think using keras-ocr instead of Tesseract would work on this type of data? Or are there other tools built for OCR on such documents?

Jupyter notebook crashing

Whenever Pipeline is instantiated in two different kernels, one of them crashes.

Reproducible example:

import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()

Try adding this cell to 2 different kernels and run them.

Issue with Clean Installation

I have been having an issue with a clean installation.

Supported versions (https://pypi.org/project/keras-ocr/)

keras-ocr supports Python >= 3.6 and TensorFlow >= 2.0.0.

My environment:

Windows 10
Python 3.7.4
Anaconda

Installation steps:

Create a new virtual environment

mkdir venv
cd venv
mkdir project
python -m venv project
project\Scripts\activate.bat

Install keras-ocr, and tensorflow

python -m pip install --upgrade pip
pip install keras-ocr
pip install tensorflow

Pip freeze:

absl-py==0.9.0
astor==0.8.1
cachetools==4.0.0
certifi==2019.11.28
chardet==3.0.4
cycler==0.10.0
decorator==4.4.1
editdistance==0.5.3
efficientnet==1.0.0
essential-generators==0.9.2
fonttools==4.4.0
gast==0.2.2
google-auth==1.11.2
google-auth-oauthlib==0.4.1
google-pasta==0.1.8
grpcio==1.27.2
h5py==2.10.0
idna==2.9
imageio==2.8.0
imgaug==0.4.0
Keras-Applications==1.0.8
keras-ocr==0.6.2
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
Markdown==3.2.1
matplotlib==3.1.3
networkx==2.4
numpy==1.18.1
oauthlib==3.1.0
opencv-python==4.2.0.32
opt-einsum==3.1.0
Pillow==7.0.0
protobuf==3.11.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyclipper==1.1.0.post3
pyparsing==2.4.6
python-dateutil==2.8.1
PyWavelets==1.1.1
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
scikit-image==0.16.2
scipy==1.4.1
Shapely==1.7.0
six==1.14.0
tensorboard==2.1.0
tensorflow==2.1.0
tensorflow-estimator==2.1.0
termcolor==1.1.0
tqdm==4.43.0
urllib3==1.25.8
validators==0.14.2
Werkzeug==1.0.0
wrapt==1.12.0

Run example script (https://pypi.org/project/keras-ocr/):

import matplotlib.pyplot as plt

import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Get a set of three example images
images = [
    keras_ocr.tools.read(url) for url in [
        'https://upload.wikimedia.org/wikipedia/commons/b/bd/Army_Reserves_Recruitment_Banner_MOD_45156284.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/e/e8/FseeG2QeLXo.jpg',
        'https://upload.wikimedia.org/wikipedia/commons/b/b4/EUBanana-500x112.jpg'
    ]
]

# Each list of predictions in prediction_groups is a list of
# (word, box) tuples.
prediction_groups = pipeline.recognize(images)

# Plot the predictions
fig, axs = plt.subplots(nrows=len(images), figsize=(20, 20))
for ax, image, predictions in zip(axs, images, prediction_groups):
    keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)

Error:

(project) (base) C:\Users\...\Documents>python main.py

2020-02-23 15:03:06.027923: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-02-23 15:03:06.031828: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

Traceback (most recent call last):
  File "main.py", line 3, in <module>
    import keras_ocr
  File "C:\Users\...\Documents\venv\project\lib\site-packages\keras_ocr\__init__.py", line 1, in <module>
    from . import (detection, recognition, tools, data_generation, pipeline, evaluation, datasets,
  File "C:\Users\...\Documents\venv\project\lib\site-packages\keras_ocr\detection.py", line 31, in <module>
    from . import tools
  File "C:\Users\...\Documents\venv\project\lib\site-packages\keras_ocr\tools.py", line 14, in <module>
    from shapely import geometry
  File "C:\Users\...\Documents\venv\project\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
    from .base import CAP_STYLE, JOIN_STYLE
  File "C:\Users\...\Documents\venv\project\lib\site-packages\shapely\geometry\base.py", line 18, in <module>
    from shapely.coords import CoordinateSequence
  File "C:\Users\...\Documents\venv\project\lib\site-packages\shapely\coords.py", line 8, in <module>
    from shapely.geos import lgeos
  File "C:\Users\...\Documents\venv\project\lib\site-packages\shapely\geos.py", line 145, in <module>
    _lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "C:\Users\...\Anaconda3\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found

What am I doing wrong?
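The traceback shows shapely building the path to geos_c.dll from sys.prefix. A small diagnostic sketch, reusing only that expression from the traceback, can confirm whether the file exists in the active environment. The helper name `expected_geos_dll` is made up for illustration.

```python
import os
import sys


def expected_geos_dll(prefix=None):
    """Return the DLL path built by the shapely/geos.py line in the traceback."""
    prefix = sys.prefix if prefix is None else prefix
    return os.path.join(prefix, 'Library', 'bin', 'geos_c.dll')


path = expected_geos_dll()
# Report whether the file shapely is looking for is actually there.
print(path, 'exists' if os.path.exists(path) else 'is missing')
```

If the file is missing, installing shapely through conda (for example `conda install -c conda-forge shapely`) is one thing to try; whether that resolves this particular setup is an assumption.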

I got an error, who can help me?

E:\Anaconda\python.exe E:/keras-ocr/keras_ocr_texst.py
Traceback (most recent call last):
  File "E:/keras-ocr/keras_ocr_texst.py", line 1, in <module>
    import keras_ocr
  File "E:\Anaconda\lib\site-packages\keras_ocr\__init__.py", line 1, in <module>
    from . import (detection, recognition, tools, data_generation, pipeline, evaluation, datasets,
  File "E:\Anaconda\lib\site-packages\keras_ocr\detection.py", line 31, in <module>
    from . import tools
  File "E:\Anaconda\lib\site-packages\keras_ocr\tools.py", line 14, in <module>
    from shapely import geometry
  File "E:\Anaconda\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
    from .base import CAP_STYLE, JOIN_STYLE
  File "E:\Anaconda\lib\site-packages\shapely\geometry\base.py", line 18, in <module>
    from shapely.coords import CoordinateSequence
  File "E:\Anaconda\lib\site-packages\shapely\coords.py", line 8, in <module>
    from shapely.geos import lgeos
  File "E:\Anaconda\lib\site-packages\shapely\geos.py", line 145, in <module>
    _lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "E:\Anaconda\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

Process finished with exit code 1

Windows 10
tensorflow-cpu = 2.0
python = 3.7

Combining word boxes into lines or paragraphs

Is there any way (maybe a built-in function) to detect end-of-line (EOL) characters in a large block of text? Or must it be done by the client, by comparing the position vectors of the words?
Thanks in advance.
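For the client-side route, a minimal sketch of comparing word positions is shown below. It assumes predictions are (word, box) tuples where each box is four (x, y) corner points, as in the README example; `group_into_lines` and its `tolerance` parameter are hypothetical and not part of keras-ocr.

```python
def group_into_lines(predictions, tolerance=0.5):
    """Group (word, box) tuples into text lines, top to bottom.

    Each box is four (x, y) corner points. A word joins the current line
    when its vertical center is within `tolerance` box heights of the
    center of the first word on that line.
    """
    def stats(box):
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        # left edge, vertical center, height
        return min(xs), (min(ys) + max(ys)) / 2.0, max(ys) - min(ys)

    # Sort words by vertical center so lines are built top to bottom.
    items = sorted(((word,) + stats(box) for word, box in predictions),
                   key=lambda item: item[2])
    lines = []
    for word, x, y, height in items:
        if lines and abs(y - lines[-1]['y']) <= tolerance * max(height, lines[-1]['h']):
            lines[-1]['words'].append((x, word))
        else:
            lines.append({'y': y, 'h': height, 'words': [(x, word)]})
    # Within each line, order words left to right.
    return [' '.join(word for _, word in sorted(line['words'])) for line in lines]
```

This simple heuristic ignores rotated text and multi-column layouts; the tolerance would likely need tuning per document type.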

Advice on building a dataset for generic OCR

Hi everyone, could you share some guidelines or links on how to generate a dataset for generic OCR? I am mostly concerned with the background part. Also, what would be the "maximum" acceptable mean edit distance for an OCR model used in production?

CUDNN_STATUS_EXECUTION_FAILED

I'm running text detection and recognition on every frame of a video to extract hardcoded subtitles (on Windows). This works quite well, although it's a bit slow. But after letting my program run for some minutes (the time varies), I always get this error: CUDNN_STATUS_EXECUTION_FAILED
I don't think it's a bug in keras-ocr, but I don't have a clue how to resolve this error or where to ask. From what I found by searching the internet, it could be a driver issue... Any ideas?

Here is the full log:

2020-04-09 17:08:52.593789: E tensorflow/stream_executor/dnn.cc:588] CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(1796): 'cudnnRNNForwardTraining( cudnn.handle(), rnn_desc.handle(), model_dims.max_seq_length, input_desc.handles(), input_data.opaque(), input_h_desc.handle(), input_h_data.opaque(), input_c_desc.handle(), input_c_data.opaque(), rnn_desc.params_handle(), params.opaque(), output_desc.handles(), output_data->opaque(), output_h_desc.handle(), output_h_data->opaque(), output_c_desc.handle(), output_c_data->opaque(), workspace.opaque(), workspace.size(), reserve_space.opaque(), reserve_space.size())'
2020-04-09 17:08:52.605393: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at cudnn_rnn_ops.cc:1498 : Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 128, 128, 1, 50, 4, 128]
2020-04-09 17:08:52.612678: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 128, 128, 1, 50, 4, 128]
[[{{node CudnnRNN}}]]
2020-04-09 17:08:52.621696: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Cancelled: [Derived]RecvAsync is cancelled.
[[{{node decode/PadV2/paddings/_78}}]]
[[decode/Shape_1/_76]]
2020-04-09 17:08:52.624991: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Cancelled: [Derived]RecvAsync is cancelled.
[[{{node decode/PadV2/paddings/_78}}]]
Traceback (most recent call last):
  File "VideoSubDetect.py", line 199, in <module>
    recognizedtext = recognizer.recognize_from_boxes([frame], [sorted_box_group])
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\keras_ocr\recognition.py", line 439, in recognize_from_boxes
    for row in self.prediction_model.predict(X, **kwargs)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 909, in predict
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 722, in predict
    callbacks=callbacks)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 393, in model_iteration
    batch_outs = f(ins_batch)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3740, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\eager\function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\eager\function.py", line 1121, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\eager\function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\eager\function.py", line 511, in call
    ctx=ctx)
  File "C:\Users\RetroHelix\Envs\test\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.CancelledError: [Derived]RecvAsync is cancelled.
[[{{node decode/PadV2/paddings/_78}}]] [Op:__inference_keras_scratch_graph_15223]

Non-English Characters Problem

I am trying to train a recognizer model with my custom alphabet:

alphabet = ''.join(
    [ 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
     'Y', 'Z',  '[', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'v', 'w',
     'x', 'y', 'z', '|', 'Ç', 'Ö', 'Ü', 'ç', 'ö', 'ü', 'Ğ', 'ğ', 'İ', 'ı', 'Ş', 'ş'])
recognizer_alphabet = ''.join(sorted(set(alphabet)))

And I got the error below:
for c in ''.join(sentences)), 'Found illegal characters in sentence.'
AssertionError: Found illegal characters in sentence.

And these are sentences:
sentences: ['gıda', 'inş. iç v', 'e dış. ti', 'c. ltd.', 'a', 'zie baina', 'a', 'dr']
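One way to narrow this down, assuming the assertion checks every character of the sentences against the alphabet (as the error line suggests), is to list the characters that are not covered. The sketch below uses only the alphabet and sentences quoted above; `illegal` is just an illustrative variable name.

```python
# Alphabet from the issue, written as a single string.
alphabet = 'ABCDEFGHIJKLMNOPRSTUVWXYZ[_abcdefghiklmnoprstuvwxyz|ÇÖÜçöüĞğİıŞş'
sentences = ['gıda', 'inş. iç v', 'e dış. ti', 'c. ltd.', 'a', 'zie baina', 'a', 'dr']

# Collect every character that appears in a sentence but not in the alphabet.
illegal = sorted({c for c in ''.join(sentences) if c not in alphabet})
print(illegal)  # [' ', '.']
```

Here the characters missing from the alphabet are the space and the period, not the Turkish letters, so adding them to the alphabet (or stripping them from the sentences) would be the first thing to check.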

Weakly-Supervised Training for Detection

The CRAFT authors used a weakly-supervised training method to handle the fact that most datasets don't annotate at the character level. I saw in your docs that a future release will support weakly-supervised training of the detector model, presumably following section 3.2.2 of the original paper. Have you made a start on this and, if so, do you have an idea of when this would be released? I might have time to try this myself, but figured I'd ask first.

Also, kudos and thanks for this cool project!

Multi-GPU Training

Hello, how would someone train on multiple GPUs? With TF2 the recommended way is tf.distribute.Strategy, but that assumes the model definition and compilation happen inside the strategy context. Should we still use multi_gpu_model from Keras on recognizer.training_model, for example: recognizer.training_model = multi_gpu_model(recognizer.training_model, gpus=4)?

Detector training not working

The example of fine-tuning the detector in the docs isn't working with the 0.8.0 release, although other examples, like this one, are working.

Downgrading to 0.6.3 got the example working again (intermediate versions, e.g. 0.7.x, were also failing with the same error, which is detailed below).

To reproduce, create an empty python 3.7.4 conda environment with the following installs on Windows 10:

conda install -c anaconda tensorflow-gpu
pip install keras-ocr
pip install scikit-learn
conda install -c conda-forge shapely

I then copy-pasted that fine-tuning example into train.py and got the following when running it:

python train.py
2020-04-03 13:30:07.403362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Looking for .\icdar2013\Challenge2_Training_Task12_Images.zip
Downloading .\icdar2013\Challenge2_Training_Task12_Images.zip
Looking for .\icdar2013\Challenge2_Training_Task2_GT.zip
Downloading .\icdar2013\Challenge2_Training_Task2_GT.zip
Looking for C:\Users\scottmcallister\.keras-ocr\craft_mlt_25k.h5

...
...<LOTS OF TENSORFLOW GPU MESSAGES>
...

WARNING:tensorflow:sample_weight modes were coerced from
  ...
    to
  ['...']
WARNING:tensorflow:sample_weight modes were coerced from
  ...
    to
  ['...']
Train for 183 steps, validate for 46 steps
Epoch 1/1000
  1/183 [..............................] - ETA: 1:35WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available. Available metrics are:
  1/183 [..............................] - ETA: 2:20Traceback (most recent call last):
  File "train.py", line 67, in <module>
    validation_steps=math.ceil(len(validation) / batch_size)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 1306, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 342, in fit
    total_epochs=epochs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 615, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 497, in _initialize
    *args, **kwds))
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\function.py", line 2389, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\function.py", line 2703, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\function.py", line 2593, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\framework\func_graph.py", line 978, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 439, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 85, in distributed_function
    per_replica_function, args=args)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 763, in experimental_run_v2
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 1819, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 2164, in _call_for_each_replica
    return fn(*args, **kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\autograph\impl\api.py", line 292, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 433, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 312, in train_on_batch
    output_loss_metrics=output_loss_metrics))
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 253, in _process_single_batch
    training=training))
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 171, in _model_loss
    reduction=losses_utils.ReductionV2.NONE)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\keras\utils\losses_utils.py", line 107, in compute_weighted_loss
    losses, sample_weight)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\ops\losses\util.py", line 148, in scale_losses_by_sample_weight
    sample_weight = weights_broadcast_ops.broadcast_weights(sample_weight, losses)
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\ops\weights_broadcast_ops.py", line 167, in broadcast_weights
    with ops.control_dependencies((assert_broadcastable(weights, values),)):
  File "C:\Users\scottmcallister\anaconda3\envs\keras-ocr-test\lib\site-packages\tensorflow_core\python\ops\weights_broadcast_ops.py", line 103, in assert_broadcastable
    weights_rank_static, values.shape, weights.shape))
ValueError: weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(None, None, None). weights.shape=(None,).

The source of the error can probably be uncovered here, likely within detection.py. I'd try to track it down myself, but your familiarity with the source would probably make it quicker.
