I think you might have mixed up the detection_threshold…

Text threshold vs detection threshold about keras-ocr HOT 4 CLOSED

faustomorales commented on July 20, 2024

Text threshold vs detection threshold

from keras-ocr.

Comments (4)

faustomorales commented on July 20, 2024

Thanks for taking a close look under the hood! You're right that the variable names don't match. But I believe the values used in the function (and therefore the net effect) do match.

parser.add_argument('--text_threshold', default=0.7, type=float, help='text confidence threshold')
parser.add_argument('--low_text', default=0.4, type=float, help='text low-bound score')
parser.add_argument('--link_threshold', default=0.4, type=float, help='link confidence threshold')

[source]

The variable names are hard for me to keep in my head so I made a table to summarize the differences.

purpose	keras-ocr variable name	keras-ocr variable value	CRAFT-pytorch variable name	CRAFT-pytorch variable value
threshold the text map	text_threshold	0.4	low_text	0.4
threshold the link map	link_threshold	0.4	link_threshold	0.4
filter out detections	detection_threshold	0.7	text_threshold	0.7
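
To make the renaming in the table concrete, here is a minimal sketch that translates CRAFT-pytorch style arguments into the keras-ocr names. The dictionary and the `craft_to_keras_ocr` helper are hypothetical and exist only for illustration; neither library ships them.

```python
# Name mapping copied from the table: CRAFT-pytorch name -> keras-ocr name.
# This dict and the helper below are illustrative, not part of either library.
CRAFT_TO_KERAS_OCR = {
    "low_text": "text_threshold",             # threshold the text map
    "link_threshold": "link_threshold",       # threshold the link map
    "text_threshold": "detection_threshold",  # filter out detections
}


def craft_to_keras_ocr(craft_args):
    """Rename CRAFT-pytorch threshold arguments to their keras-ocr equivalents."""
    return {
        CRAFT_TO_KERAS_OCR[name]: value
        for name, value in craft_args.items()
        if name in CRAFT_TO_KERAS_OCR
    }


# Defaults from the CRAFT-pytorch argument parser quoted earlier in the thread.
craft_defaults = {"text_threshold": 0.7, "low_text": 0.4, "link_threshold": 0.4}
keras_ocr_kwargs = craft_to_keras_ocr(craft_defaults)
print(keras_ocr_kwargs)
```

The key point is that CRAFT-pytorch's `text_threshold` (0.7) ends up as `detection_threshold`, while its `low_text` (0.4) ends up as `text_threshold`, so the values line up even though the names do not.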

That still leaves us with the question of whether we should change the variable names to match the original implementation. In my humble opinion, the new names are more semantically descriptive. text_threshold and link_threshold are used in similar ways, so I think it makes sense for their variable names to have similar structure. This is in contrast with the low_text / link_threshold naming which, to me, implies that these values are used in different ways. Could you share your thoughts on that?

I'd also appreciate you checking to see if I've made a mistake above -- this can all be a little confusing and I may have misread something.

csmcallister commented on July 20, 2024

You're right, your values do match the original implementation, so there's no impact on post-processing. It was, however, a little confusing when I adjusted the values and wasn't seeing the effects I expected.

I do agree with you about not changing the variable names to match the original ones. Yours are more descriptive, particularly with detection_threshold, which is used with the connected-component labeling that actually does the "detecting". I also like how you added the size_threshold, which was originally just hardcoded.

As such, I think a little more documentation that highlights the meaning and usage of these params is all you need. A docstring in the source would probably suffice, as most users likely won't want, need, or know to alter these for their use case. This discussion in the original repo does a pretty good job of describing the purpose of each parameter in plain English.

faustomorales commented on July 20, 2024

I think updating the docstring is a fine idea. Unfortunately, the description in the linked comment defines the variables in terms of their perceived effect as opposed to how they are actually used. Below are what I believe are more accurate definitions that discuss how the values are used in addition to their effect. I'm interested in your feedback on the definitions. If you agree, I'll add them to the docstring for detector.detect. Or, since it was your idea, you're welcome to file a PR. I'd very much like you to get credit!

text_threshold: When the text map is processed, its confidence values (floats from zero to one) are converted to classifications (0 for not text, 1 for text) using binary thresholding. The text_threshold value determines the breakpoint at which a value is converted to a 1 or a 0. For example, if text_threshold is 0.4 and a value for a particular point on the text map is 0.5, that value gets converted to a 1. The higher this value is, the less likely it is that characters will be merged together into a single word. The lower this value is, the more likely it is that non-text will be detected. Therein lies the balance.
link_threshold: This is the same as text_threshold, but is applied to the link map instead of the text map.
detection_threshold: We want to avoid including boxes that cover large regions of low-confidence text predictions. To do this, we do a final check for each word box to make sure the maximum text-map confidence inside it reaches the detection threshold. This is the threshold used for that check.
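
The three definitions above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the keras-ocr implementation: the function names are made up, the connected-component labeling is a simple 4-connected breadth-first search written for this example, and the exact boundary handling (`>` for the map thresholds, `>=` for the final check) is an assumption.

```python
from collections import deque

import numpy as np


def binarize(score_map, threshold):
    """Binary thresholding: 1 where the score is above the threshold, else 0."""
    return (score_map > threshold).astype(np.uint8)


def label_components(mask):
    """Label 4-connected regions of a binary mask; 0 means background."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    count = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # already reached from an earlier seed pixel
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                if inside and mask[nr, nc] and not labels[nr, nc]:
                    labels[nr, nc] = count
                    queue.append((nr, nc))
    return labels, count


def filter_components(textmap, linkmap, text_threshold=0.4,
                      link_threshold=0.4, detection_threshold=0.7):
    """Return the label image and the ids of components that pass the final check."""
    text_mask = binarize(textmap, text_threshold)
    link_mask = binarize(linkmap, link_threshold)
    # A pixel belongs to a candidate word if either map marks it.
    combined = np.clip(text_mask + link_mask, 0, 1)
    labels, count = label_components(combined)
    # Keep a component only if its peak text confidence reaches detection_threshold.
    kept = [i for i in range(1, count + 1)
            if textmap[labels == i].max() >= detection_threshold]
    return labels, kept


textmap = np.array([[0.9, 0.5, 0.0, 0.5, 0.6]])

# Without link evidence there are two regions; the right one peaks at 0.6 and is dropped.
labels_unlinked, kept_without_links = filter_components(textmap, np.zeros_like(textmap))

# A link score above link_threshold bridges the gap, so the regions merge into one.
linkmap = np.array([[0.0, 0.0, 0.5, 0.0, 0.0]])
labels_linked, kept_with_links = filter_components(textmap, linkmap)
```

In the toy example, the right-hand region survives only when the link map joins it to the high-confidence region on the left, which is why changing text_threshold alone may not produce the effect one would expect from its name.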

faustomorales commented on July 20, 2024

I've added this documentation in 717bfcd. Thanks for raising these questions!
