
Comments (8)

rafaelpadilla commented on June 9, 2024

Hi @uxdiin ,

I believe the scikit-learn AP is not designed for object detection. In the object detection context, each point of the AP curve shows the precision and recall for a given confidence level. The sorting we do in our code is just an implementation detail for evaluating the whole dataset at every threshold. See the example below:

Let's say we have 3 objects detected with the following confidence levels:
object A: confidence = 0.3
object B: confidence = 0.95
object C: confidence = 0.8

If the confidence threshold is 0.5, the model will miss object A and only detect objects B and C. Thus, when evaluating the model at a confidence threshold of 0.6, what is the point of checking object A again if it already failed at a lower threshold? Got the idea? This is why we sort the confidences (see the sketch below).
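Here is a minimal sketch (not the repository's actual code) of why sorting by confidence is equivalent to sweeping every confidence threshold: after sorting, the top-k detections are exactly those that survive a threshold equal to the k-th confidence, so a single cumulative pass yields every (precision, recall) point. The detections list and ground-truth count below are made up for this example.

```python
# Hypothetical illustration: each detection is (confidence, is_true_positive),
# where is_true_positive would normally come from IoU matching against ground truth.
detections = [(0.95, True), (0.80, True), (0.30, False)]  # objects B, C, A
n_ground_truths = 3  # total ground-truth objects in the dataset (assumed)

# Sort by confidence, highest first: the top-k prefix is exactly the set of
# detections that survive a threshold equal to the k-th confidence value.
detections.sort(key=lambda d: d[0], reverse=True)

tp = fp = 0
for confidence, is_tp in detections:
    if is_tp:
        tp += 1
    else:
        fp += 1
    precision = tp / (tp + fp)
    recall = tp / n_ground_truths
    print(f"threshold={confidence:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```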

Hope it helped


uxdiin commented on June 9, 2024

What about manually setting the thresholds? Let's reuse our example, but now use this list as the thresholds: [0.35, 0.4, ..., 0.7, 0.75]. With these thresholds, the detection with confidence 0.3 will never be evaluated, and the ones with confidence 0.8 and 0.95 will always be detected because the highest threshold is 0.75. Wouldn't this lead to different precision and recall values? Is there any rule for setting the confidence thresholds?


rafaelpadilla commented on June 9, 2024

Hi @uxdiin ,

Let me make a small comment on what you said: when you use thresholds >= 0.35, it does not mean that the detection with confidence 0.3 will never be evaluated. It means that detections with confidence below the lowest threshold (0.35) will be ignored.

Sure, you could set the threshold manually. Actually, this is exactly what must be done once you put your model into production. Nevertheless, AP is a metric that evaluates the performance of a model over all confidence thresholds. It does not aim to find the best operating point (the best confidence threshold).

The rule depends on your application. You can choose an operating point on the Precision x Recall curve that has the best trade-off between these values. If in your application it is more important to retrieve as many objects as possible, even if you detect false ones, you would pick a confidence threshold with a high recall. On the other hand, if you don't mind missing some objects, but the detections you make must be as precise as possible, you would choose a threshold that gives you a high precision. It all depends on your application and the price you are willing to pay for missing objects or detecting wrong ones.
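As an illustration only (the metric itself does not prescribe this, and the numbers below are hypothetical), one common way to pick such an operating point is to take the threshold with the best F1 score, or the lowest threshold that still meets a precision target:

```python
# Hypothetical (precision, recall, threshold) triples taken from a PR curve.
points = [
    (1.00, 0.33, 0.95),
    (1.00, 0.67, 0.80),
    (0.67, 0.67, 0.30),
]

# Option A: threshold with the best precision/recall trade-off (max F1).
best = max(points, key=lambda p: 2 * p[0] * p[1] / (p[0] + p[1]))
print("max-F1 operating point: threshold =", best[2])

# Option B: lowest threshold whose precision still meets a target,
# useful when false positives are expensive.
target_precision = 0.9
candidates = [p for p in points if p[0] >= target_precision]
if candidates:
    print("high-precision operating point: threshold =", min(p[2] for p in candidates))
```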


uxdiin commented on June 9, 2024

I am sorry if I worded it wrong. I know that in production we choose our own confidence threshold. I meant manually creating a list of threshold values for the mAP evaluation, not for production, because I've seen someone do that. But from your answer ("the AP is a metric to evaluate the performance of a model over ALL confidence thresholds") it seems we should use all the threshold values from the detections. Is that right?


rafaelpadilla commented on June 9, 2024

Hi @uxdiin ,

Yes, you should use all confidence threshold values when plotting the Precision x Recall curve. I have never seen anyone do it differently. Please post a reference here to any work that does.

Maybe you are confusing this with how to measure the AP value from the Precision x Recall curve. In that case, there are different interpolation approaches.
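As a rough sketch of one such interpolation approach (the all-point interpolation used, for example, by later PASCAL VOC challenges; the precision/recall values below are hypothetical), each precision is replaced by the maximum precision at any equal-or-higher recall before the curve is integrated:

```python
# Hypothetical precision/recall points, ordered by increasing recall.
recalls = [0.33, 0.67, 0.67, 1.00]
precisions = [1.00, 1.00, 0.67, 0.50]

# All-point interpolation: precision at recall r becomes the maximum precision
# observed at any recall >= r; AP is then the area under that step curve.
interp = []
max_so_far = 0.0
for p in reversed(precisions):
    max_so_far = max(max_so_far, p)
    interp.append(max_so_far)
interp.reverse()

ap = 0.0
prev_recall = 0.0
for r, p in zip(recalls, interp):
    ap += (r - prev_recall) * p
    prev_recall = r
print(f"all-point interpolated AP = {ap:.3f}")
```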

Also, the Precision x Recall curve can be plotted considering different IOU thresholds, which is a totally different concept than the confidence thresholds.
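To make that distinction concrete, here is a small sketch of the IOU computation (boxes are assumed to be in (x1, y1, x2, y2) format): IOU decides whether a detection geometrically matches a ground-truth box at all, while the confidence threshold only filters detections by the model's score.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# A detection counts as a true positive only if its IOU with a ground-truth box
# meets the IOU threshold (e.g., 0.5 for PASCAL VOC; 0.5 to 0.95 for COCO).
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```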

Regards,
Rafael


uxdiin commented on June 9, 2024

https://blog.paperspace.com/mean-average-precision/.
This implementation is a bit similar to sklearn's implementation, which is not interpolated. The difference is that they are not using all confidence thresholds. You said that sklearn's implementation is not meant for object detection, but this popular Faster R-CNN implementation uses it: https://github.com/you359/Keras-FasterRCNN.


rafaelpadilla commented on June 9, 2024

Hi @uxdiin ,

The blog you pointed to seems to select a range of confidence thresholds. Done this way, the results ignore detections outside the threshold range. Maybe the model produces bad detections at lower confidences and the author does not want to show his model's bad performance in that case. This is not the standard way of computing AP. You can check the official PASCAL VOC and COCO codes; you will see they do exactly what I described.

Regarding the second repository, could you point to the file where they use sklearn to evaluate the model? Note that this is not an official Faster R-CNN implementation.

I'm not saying this is the case for the implementations you are referring to, but many open-source implementations (especially those made to evaluate a specific model) modify the official AP metric to obtain better results for their own model. My understanding is that if you want to benchmark your model, you must always use the standard way of computing the metrics. This way, your results can be fairly compared to other works.


uxdiin commented on June 9, 2024

The problem is that many object-detection papers don't even tell us what kind of mAP they were using, just like you mentioned in your paper. That's why I came here to clear my doubt.

You can check measureMap.py; at the bottom they use sklearn's average_precision_score. And there is something weird in the same file: in the get_map function, they treat each undetected ground-truth bounding box as a positive detection with confidence 0, which makes the model look like it performs well at a confidence threshold of 0. I know it is not the official implementation, but it is quite popular and has been forked many times.
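A hedged illustration of that effect (not the referenced repository's actual code, and the numbers are hypothetical): appending each missed ground-truth box as a label-1 "detection" with confidence 0 before calling sklearn's average_precision_score lets the curve reach full recall at threshold 0, which inflates the score compared with keeping the miss only in the recall denominator, as a detection-style AP does.

```python
from sklearn.metrics import average_precision_score

# Hypothetical matched detections: label 1 = matched a ground-truth box, 0 = false positive.
y_true = [1, 1, 0]
y_score = [0.95, 0.80, 0.30]
n_ground_truths = 3  # one ground-truth box was never detected at all

# Padding the missed box in as a confidence-0 "detection", as described above,
# lets the curve reach recall 1.0 at threshold 0:
print(average_precision_score(y_true + [1], y_score + [0.0]))  # ~0.92

# A detection-style AP instead keeps the miss only in the recall denominator,
# so recall is capped at 2/3 and the score is lower:
tp = fp = 0
ap = prev_recall = 0.0
for score, is_tp in sorted(zip(y_score, y_true), reverse=True):
    tp += is_tp
    fp += 1 - is_tp
    recall = tp / n_ground_truths
    ap += (recall - prev_recall) * (tp / (tp + fp))
    prev_recall = recall
print(ap)  # ~0.67
```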

But I think you have cleared my doubt. I will close this issue if you don't have anything to add.
Thanks for all of your answers.

