
Comments (4)

faustomorales commented on July 20, 2024

Thanks for taking a close look under the hood! You're right that the variable names don't match. But I believe the values used in the function (and therefore the net effect) do match.

parser.add_argument('--text_threshold', default=0.7, type=float, help='text confidence threshold')
parser.add_argument('--low_text', default=0.4, type=float, help='text low-bound score')
parser.add_argument('--link_threshold', default=0.4, type=float, help='link confidence threshold')

[source]

The variable names are hard for me to keep in my head so I made a table to summarize the differences.

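| purpose                | keras-ocr variable name | keras-ocr variable value | CRAFT-pytorch variable name | CRAFT-pytorch variable value |
|------------------------|-------------------------|--------------------------|-----------------------------|------------------------------|
| threshold the text map | text_threshold          | 0.4                      | low_text                    | 0.4                          |
| threshold the link map | link_threshold          | 0.4                      | link_threshold              | 0.4                          |
| filter out detections  | detection_threshold     | 0.7                      | text_threshold              | 0.7                          |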
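
If it helps when porting settings from one project to the other, the renaming in the table can be written down as a small lookup. This is only a sketch: the dictionary and the helper `craft_args_to_keras_ocr` are hypothetical names made up for illustration and are not part of either library.

```python
# Mapping taken from the table: CRAFT-pytorch argument name -> keras-ocr name.
# Both the dictionary and the helper below are illustrative, not library code.
CRAFT_TO_KERAS_OCR = {
    "low_text": "text_threshold",
    "link_threshold": "link_threshold",
    "text_threshold": "detection_threshold",
}


def craft_args_to_keras_ocr(craft_args):
    """Rename CRAFT-pytorch threshold arguments to their keras-ocr equivalents."""
    return {CRAFT_TO_KERAS_OCR[name]: value for name, value in craft_args.items()}


# The CRAFT-pytorch defaults quoted earlier in this comment.
keras_ocr_kwargs = craft_args_to_keras_ocr(
    {"text_threshold": 0.7, "low_text": 0.4, "link_threshold": 0.4}
)
print(keras_ocr_kwargs)
```

The point to notice is that `text_threshold` exists in both projects but means different things: the CRAFT-pytorch `text_threshold` corresponds to `detection_threshold` here.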

That still leaves us with the question of whether we should change the variable names to match the original implementation. In my humble opinion, the new names are more semantically descriptive. text_threshold and link_threshold are used in similar ways, so I think it makes sense for their variable names to have similar structure. This is in contrast with the low_text / link_threshold naming which, to me, implies that these values are used in different ways. Could you share your thoughts on that?

I'd also appreciate you checking to see if I've made a mistake above -- this can all be a little confusing and I may have misread something.

from keras-ocr.

csmcallister commented on July 20, 2024

You're right, your values do match the original implementation, so there's no impact on post-processing. It was, however, a little confusing when I adjusted the values and wasn't seeing the effects I expected.

I do agree with you about not changing the variable names to match the original ones. Yours are more descriptive, particularly with detection_threshold, which is used with the connected-component labeling that actually does the "detecting". I also like how you added the size_threshold, which was originally just hardcoded.

As such, I think a little more documentation that highlights the meaning and usage of these params is all you need. A docstring in the source would probably suffice, as most users likely won't want, need, or know to alter these for their use case. This discussion in the original repo does a pretty good job of describing each parameter's purpose in plain English.


faustomorales commented on July 20, 2024

I think updating the docstring is a fine idea. Unfortunately, the description in the linked comment defines the variables in terms of their perceived effect as opposed to how they are actually used. Below are what I believe are more accurate definitions that discuss how the values are used in addition to their effect. I'm interested in your feedback on the definitions. If you agree, I'll add them to the docstring for detector.detect. Or, since it was your idea, you're welcome to file a PR. I'd very much like you to get credit!

  • text_threshold: When the text map is processed, it is converted from confidence (float from zero to one) values to classification (0 for not text, 1 for text) using binary thresholding. The text_threshold value determines the breakpoint at which a value is converted to a 1 or a 0. For example, if text_threshold is 0.4 and a value for a particular point on the text map is 0.5, that value gets converted to a 1. The higher this value is, the less likely it is that characters will be merged together into a single word. The lower this value is, the more likely it is that non-text will be detected. Therein lies the balance.
  • link_threshold: This is the same as text_threshold, but is applied to the link map instead of the text map.
  • detection_threshold: We want to avoid including boxes that represent large regions of low-confidence text predictions. To do this, we do a final check for each word box to make sure its maximum confidence value exceeds a detection threshold. This is the threshold used for that check.
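
To make the three definitions concrete, here is a rough pure-Python sketch of how the thresholds could interact. It is not the keras-ocr or CRAFT-pytorch implementation: the function names, the 4-connected flood fill, the way the two masks are combined, and whether each comparison is strict are all assumptions made for illustration.

```python
from collections import deque


def binarize(score_map, threshold):
    """Convert confidence values (0.0 to 1.0) into 0/1 with a fixed threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in score_map]


def detect_regions(text_map, link_map, text_threshold=0.4,
                   link_threshold=0.4, detection_threshold=0.7):
    """Return the connected pixel groups that survive all three thresholds."""
    text_mask = binarize(text_map, text_threshold)
    link_mask = binarize(link_map, link_threshold)
    height, width = len(text_map), len(text_map[0])
    # Assumption: a pixel is a candidate if either binarized map marks it.
    combined = [[text_mask[y][x] | link_mask[y][x] for x in range(width)]
                for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if not combined[y][x] or seen[y][x]:
                continue
            # Flood fill (4-connectivity) to collect one connected component.
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and combined[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Final check: keep the region only if its peak text confidence
            # reaches detection_threshold.
            if max(text_map[py][px] for py, px in pixels) >= detection_threshold:
                regions.append(pixels)
    return regions


# Two "characters" on one row: a confident one on the left, a weak one on the right.
text_map = [[0.9, 0.5, 0.0, 0.5, 0.45]]
no_links = [[0.0, 0.0, 0.0, 0.0, 0.0]]
with_link = [[0.0, 0.0, 0.6, 0.0, 0.0]]

separate = detect_regions(text_map, no_links)
merged = detect_regions(text_map, with_link)
```

Without a link between them, the weak group on the right is dropped because its peak confidence (0.5) is below `detection_threshold`, so `separate` holds one two-pixel region. With a link score above `link_threshold` in the gap, both groups join into a single five-pixel region that passes the check, so `merged` holds one region covering the whole row.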


faustomorales commented on July 20, 2024

I've added this documentation in 717bfcd. Thanks for raising these questions!

