
Comments (12)

faustomorales avatar faustomorales commented on July 20, 2024

You are exactly right! I've just pushed 95f6209 to fix this.

from keras-ocr.

Dobiasd avatar Dobiasd commented on July 20, 2024

Thanks a lot for the quick and well-written response. It matches the overall high quality of your project, which is very refreshing compared to many other repositories in the ML zoo. They often are quite cumbersome to understand/use/integrate. keras-ocr, on the other hand, is very convenient to use and well documented. 👍

Yes, the code using batching does eliminate the warnings. The thing is, my use-case does not depend on maximum throughput, but on minimum latency, and the images come in 1-by-1, so I have to process each one immediately.

To measure how much more runtime per image I have to accept for this, I just wrote the following mini benchmark:

import time

import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()

images = [
    keras_ocr.tools.read(url) for url in [
        'https://i.imgur.com/euIw5Dt.png',
        'https://i.imgur.com/fAT6keX.png',
        'https://i.imgur.com/RlxBrvX.png',
        'https://i.imgur.com/pWBX9z5.png',
        'https://i.imgur.com/tzfitxz.png',
        'https://i.imgur.com/VPPpRJg.png'
    ]
]


def measure(f):
    start = time.time()
    f()
    duration = time.time() - start
    print(f'{f.__name__.ljust(10)}: {duration:.2f} s', flush=True)


def one_by_one():
    return [pipeline.recognize([image]) for image in images]


def batch():
    return pipeline.recognize(images)


for _ in range(3):
    measure(one_by_one)
    measure(batch)

Output (on my machine, the same whether run inside Docker or outside of it):

one_by_one: 24.96 s
batch     : 53.37 s
one_by_one: 22.06 s
batch     : 52.83 s
one_by_one: 22.30 s
batch     : 52.68 s

So the one_by_one version actually seems to be faster than the batch version. This, of course, does not match what we expect to see. Do you get similar results (without a GPU, i.e., with CUDA_VISIBLE_DEVICES="-1")?

from keras-ocr.

faustomorales avatar faustomorales commented on July 20, 2024

Thanks so much for reporting this issue!

You are getting this warning because pipeline.recognize uses a two-step process where images are first passed through the detector and then the cropped word boxes are passed through the recognizer. In your code, you are passing images in one at a time, which means that inference takes place as follows:

  • Detect text on image 1
  • Recognize text on boxes from image 1
  • Detect text on image 2
  • Recognize text on boxes from image 2
  • Detect text on image 3
  • Recognize text on boxes from image 3
  • ...

TensorFlow has to flip back and forth between the detector and recognizer with a batch size of 1, which means that every call ends up requiring retracing (the details of this are hazy to me, to be honest, but that's what I've surmised from dealing with this warning in the past).

To be more efficient, pipeline.recognize optimizes inference by performing the first step (detection) for all the images and then performing the second step (recognition) for all the word boxes. You can take advantage of this by batching according to the pattern suggested in the README, which loads a batch of images first and then passes that batch to pipeline.recognize.

This means that, instead of the above, inference can take place as follows:

  • Detect text on images 1, 2, 3, 4, ...
  • Recognize text on boxes for images 1, 2, 3, 4 ...

I believe this altered version of your code will eliminate the warnings (and also get you much better performance).

import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()

image_urls = [
    "https://i.imgur.com/euIw5Dt.png",
    "https://i.imgur.com/fAT6keX.png",
    "https://i.imgur.com/RlxBrvX.png",
    "https://i.imgur.com/pWBX9z5.png",
    "https://i.imgur.com/tzfitxz.png",
    "https://i.imgur.com/VPPpRJg.png"
]

images = [keras_ocr.tools.read(image_url) for image_url in image_urls]
predictions = pipeline.recognize(images)

This resolves the issue on my machine but please do let me know if the warning persists on your end. Again, thanks for reporting the issue and for giving keras-ocr a try!

from keras-ocr.

Dobiasd avatar Dobiasd commented on July 20, 2024

OK, same as with issue 65, the warnings are not a problem caused by keras-ocr, but one coming from TensorFlow. They introduced this problem between versions 2.0.1 and 2.1.0 (see tensorflow/issues/38598). When I use TF 2.0.1, everything is fine, i.e., there are no warnings. 👍

Even with TF 2.0.1, the one_by_one version is still faster than the batch version in the experiment. That is still surprising, but not an actual problem, so I'll close this issue here. Thanks again. 🙂

from keras-ocr.

faustomorales avatar faustomorales commented on July 20, 2024

First, thanks for all the kind words above! And thank you for your patience -- I usually don't have the cycles to provide thoughtful answers to questions until the weekend, so I'm glad you've been able to get most of what you need by other means.

I think the counterintuitive results are caused by the way we do batching (see the snippet below).

images = [
    tools.resize_image(image, max_scale=self.scale, max_size=self.max_size)
    for image in images
]
max_height, max_width = np.array([image.shape[:2] for image, scale in images]).max(axis=0)
scales = [scale for _, scale in images]
images = np.array(
    [tools.pad(image, width=max_width, height=max_height) for image, _ in images])

We first upscale all the images according to the pipeline.scale parameter. Then, because batching requires all images to be the same size, we pad all the images to the size of the largest image. So if you have 5 small images and 1 large one, you end up paying a premium for the smaller images in that batch. On a GPU, the balance comes out in favor of batching. On a CPU, though, it appears to go the other way depending on the relative sizes of the images in the batch.
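A quick back-of-the-envelope sketch of that premium (pure Python, with made-up image sizes, not keras-ocr code): five small images batched with one large image mostly process padding.

```python
# Hypothetical sizes: five small images and one large one, as (height, width).
shapes = [(200, 300)] * 5 + [(1200, 1600)]

# Batching pads everything to the largest height and width in the batch.
max_h = max(h for h, w in shapes)
max_w = max(w for h, w in shapes)

useful = sum(h * w for h, w in shapes)  # pixels that carry actual content
padded = len(shapes) * max_h * max_w    # pixels the batch actually processes

print(f"useful: {useful}, padded: {padded}, waste: {1 - useful / padded:.0%}")
```

With these numbers, roughly four fifths of the batched pixels are padding, which is why the batched path can lose to one-by-one processing on a CPU.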

I've found it difficult to balance all the edge cases and provide the right facility for users to choose the best option for them. If, in your use of the library, you find new / better ways to handle the trade-offs, please don't hesitate to propose them or file a PR. For example, perhaps we should batch images of similar size together somehow? But then that would complicate things somewhat and cause even more surprising behavior. I definitely don't know the right answer. :/

Again, thanks for the feedback and questions!

from keras-ocr.

Dobiasd avatar Dobiasd commented on July 20, 2024

Ah, thanks for the explanation. The padding thing totally makes sense.

I would not try to provide super-smart batching logic in the library. If users know what you just explained, they can decide on their own which images to batch or not. The fact that the best-performing approach depends on the user's system is another indicator that this should be a user-land decision.

from keras-ocr.

Dobiasd avatar Dobiasd commented on July 20, 2024

Then, because batching requires all images to be the same size, we pad all the images to the size of the largest image.

Wouldn't it then make sense to actually run just one predict call on the detection model?

Currently, it seems to be done one-by-one, basically like this:

for image in images:
    [...]
    self.model.predict(image[np.newaxis], [...]),

Instead, I expected something like that:

self.model.predict(images)
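A minimal sketch of that idea (pure Python lists standing in for arrays; `pad_to` and `make_batch` are hypothetical helpers, not keras-ocr API): once all images are padded to a common size, they can be stacked and sent through a single predict call.

```python
def pad_to(image, height, width, fill=0):
    """Pad a 2-D list 'image' (rows of pixels) to (height, width) with 'fill'."""
    padded = [row + [fill] * (width - len(row)) for row in image]
    padded += [[fill] * width for _ in range(height - len(padded))]
    return padded

def make_batch(images):
    """Pad all images to the largest height/width so one predict call suffices."""
    max_h = max(len(im) for im in images)
    max_w = max(len(im[0]) for im in images)
    return [pad_to(im, max_h, max_w) for im in images]

# A 1x2 image and a 2x1 image become a uniform 2x2 batch:
batch = make_batch([[[1, 2]], [[3], [4]]])
```

After `make_batch`, every image has the same shape, so a single `model.predict(batch)` call would be possible instead of one call per image.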

from keras-ocr.

jasdeep06 avatar jasdeep06 commented on July 20, 2024

@faustomorales Training in batches also seems to give this retracing warning. IMHO:
Currently, the images are resized (padded) to the maximum height and width of that particular batch. When the images differ in size across batches (as is the case in my application), this maximum height/width differs from batch to batch. As you can see here:

"On the other hand, TensorFlow graphs require static dtypes and shape dimensions. tf.function bridges this gap by retracing the function when necessary to generate the correct graphs. Most of the subtlety of tf.function usage stems from this retracing behavior."

As the shape of the resized images changes with every batch, TF has to retrace the graph, making model.predict() in the detection module take 10x as long.
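To make this concrete, here is a toy model of the behavior described above (pure Python, no TensorFlow; it only mimics the idea that tf.function traces a new graph per distinct concrete input shape):

```python
class ShapeTraced:
    """Toy stand-in for tf.function: 'traces' a new graph whenever it sees
    a concrete input shape it has not seen before."""
    def __init__(self):
        self.traces = 0
        self._seen = set()

    def predict(self, batch_shape):
        if batch_shape not in self._seen:  # new shape -> expensive retrace
            self._seen.add(batch_shape)
            self.traces += 1

# Each batch is padded to its own max height/width, so shapes keep changing:
varying = ShapeTraced()
for shape in [(8, 640, 480, 3), (8, 800, 600, 3), (8, 1024, 768, 3)]:
    varying.predict(shape)
# -> three traces, one per batch

# With a fixed resize, every batch has the same shape and traces only once:
fixed = ShapeTraced()
for _ in range(3):
    fixed.predict((8, 512, 512, 3))
# -> one trace total
```

The varying-shape run pays the tracing cost on every batch, which matches the 10x slowdown observed in the detection module.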

from keras-ocr.

faustomorales avatar faustomorales commented on July 20, 2024

from keras-ocr.

jasdeep06 avatar jasdeep06 commented on July 20, 2024

@faustomorales If you dig further into the documentation of tf.function here and look at the "Input signatures" section, it states the following:

"An "input signature" can be optionally provided to tf.function to control the graphs traced. The input signature specifies the shape and type of each Tensor argument to the function using a tf.TensorSpec object. More general shapes can be used. This is useful to avoid creating multiple graphs when Tensors have dynamic shapes."

Thus, specifying an input signature with shape [batch_size, None, None, 3] should solve the retracing problem. The tricky part is that, behind the scenes, specifying a tf.function signature makes TensorFlow convert the underlying function to a graph. This graph works best when everything (preprocessing + predict) is a TensorFlow op, which is not true for this repository. (Even if not every op is a TensorFlow op, IMHO the signature would still solve the retracing problem, although it would be comparatively slower.)

Moreover, the tf.function signature needs to sit on the call method of our model, which is only available if the model is written using subclassing. In this repository the functional API is used, which does not expose the call method. I also tried to add a tf.function signature to the predict function, but it was not supported, as mentioned here.

IMHO, rewriting the model using subclassing, adding tf.function with an input signature on the call method, and calling prediction as model(...) instead of model.predict(...) would enable inference on different-sized batches without retracing. Otherwise, an option for a fixed resize could always be provided, taking away the option of multi-sized inference.
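The input-signature idea can be sketched without TensorFlow: a signature like [None, None, None, 3] effectively keys traces on tensor rank (and the fixed channel count) instead of on the concrete sizes, so different-sized batches reuse one trace. A hypothetical illustration:

```python
class RankTraced:
    """Toy illustration of an input signature with dynamic dimensions:
    traces are keyed on rank, not on concrete sizes, mimicking a
    tf.TensorSpec of shape [None, None, None, 3]."""
    def __init__(self):
        self.traces = 0
        self._seen_ranks = set()

    def __call__(self, shape):
        rank = len(shape)
        if rank not in self._seen_ranks:  # only a genuinely new rank retraces
            self._seen_ranks.add(rank)
            self.traces += 1

call = RankTraced()
for shape in [(8, 640, 480, 3), (8, 800, 600, 3), (8, 1024, 768, 3)]:
    call(shape)
# all three differently-sized batches share a single trace
```

This is only a sketch of the mechanism; in real TensorFlow, the signature must be attached to the model's call method as described above.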

from keras-ocr.

faustomorales avatar faustomorales commented on July 20, 2024

Re-opening because we probably ought to fix this so we can get improved inference speed. Not sure when I'll be able to get to it but PRs are welcome!

from keras-ocr.

zaheerbeg21 avatar zaheerbeg21 commented on July 20, 2024

How can we use our own images with this model? I mean images from the local machine or from Google Drive. Kindly suggest something.

from keras-ocr.
