An example in which output AP is wrong: when (1) data points are ordered as positi

vl_pr and vl_roc are dependent on data point ordering when scores are equal

vlfeat commented on May 5, 2024

vl_pr and vl_roc are dependent on data point ordering when scores are equal

Comments (3)

vedaldi commented on May 5, 2024

Thank you for reporting this issue.

Unfortunately there does not seem to be a clean solution. PR curves are defined by the rank of the labels, and uniform scores do not uniquely specify that.

One could of course return an average of all possible AP scores for all possible rankings compatible with the input scores. However, uniform scores are probably caused by a bug in the application, and this default behaviour would mask that. Furthermore, the reported AP would not correspond to the area of the curve anymore.

Possibly the best solution is to print a warning message when uniform scores are found.
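To make the ordering dependence concrete, here is a minimal sketch in Python. It is not vl_pr's actual implementation: `average_precision` is a hypothetical helper that uses one common definition of AP (the mean of precision at each positive's rank), and labels are assumed to be 1 for positives and 0 for negatives. With four equally scored points, AP depends on which tie-breaking order is chosen, and the average over all compatible orderings sits between the two extremes.

```python
from itertools import permutations


def average_precision(labels_in_rank_order):
    """AP of a ranked list of binary labels (1 = positive), best rank first.

    Hypothetical helper: mean of precision measured at each positive's rank.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels_in_rank_order, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Four points that all share the same score: two positives, two negatives.
labels = [1, 1, 0, 0]

# Two tie-breaking orders that are equally compatible with the scores.
ap_pos_first = average_precision([1, 1, 0, 0])
ap_neg_first = average_precision([0, 0, 1, 1])

# Average AP over every ordering compatible with the (all-equal) scores.
orderings = list(permutations(labels))
ap_mean = sum(average_precision(o) for o in orderings) / len(orderings)

print(ap_pos_first, ap_neg_first, ap_mean)
```

The brute-force average is only practical for tiny tie groups, since the number of orderings grows factorially.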

dineshj1 commented on May 5, 2024

All scores being equal is just one case that I used to make the point. In truth, any two scores being equal will cause the PR curves, and subsequently the APs, to depend on ordering. With algorithms like decision trees, the set of (probability) scores assigned to data points is discrete and quite small, so multiple data points will inevitably be pigeon-holed into the same score. This issue therefore affects evaluation in very real scenarios (as I found out with something I was working on).

I think the simplest solution is to go not by ranking but by the scores themselves, i.e. all data points scoring above a certain operating point are positives, the others are negatives, and the operating point is varied to get the PR/ROC curve. This will produce a discontinuity in the PR/ROC curve every time a score is shared by multiple data points. It is the same as assigning the same rank to all data points with the same score. MATLAB's native perfcurve implementation appears to do something like this.
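The threshold-based idea can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the behaviour of vl_pr or perfcurve: `pr_points_by_threshold` is a hypothetical name, labels are assumed to be 1 for positives and anything else for negatives, and a point counts as predicted positive when its score is greater than or equal to the threshold. Because tied points always cross the threshold together, the result does not depend on the input order.

```python
def pr_points_by_threshold(scores, labels):
    """Return (recall, precision) pairs, one per distinct score threshold.

    Thresholds are visited from highest to lowest. Points with
    score >= threshold are predicted positive, so tied points enter together.
    """
    total_pos = sum(1 for y in labels if y == 1)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y != 1)
        recall = tp / total_pos if total_pos else 0.0
        precision = tp / (tp + fp)  # tp + fp >= 1: some point has this score
        points.append((recall, precision))
    return points


# The same data in two different input orders; two points tie at 0.5.
curve_a = pr_points_by_threshold([0.9, 0.5, 0.5, 0.1], [1, 1, 0, 0])
curve_b = pr_points_by_threshold([0.5, 0.9, 0.1, 0.5], [0, 1, 0, 1])
print(curve_a == curve_b)
```

The quadratic loop keeps the sketch short; a sorted single pass would be the usual choice for large inputs.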

(In reply to Andrea Vedaldi's comment of Thu, Mar 27, 2014, which appears in full above.)

Dinesh Jayaraman,
The University of Texas at Austin

dineshj1 commented on May 5, 2024

In case the above comment wasn't clear, this same issue is discussed in Fawcett, ROC Graphs: Notes and Practical Considerations for Researchers, 2004 (http://binf.gmu.edu/mmasso/ROC101.pdf), section 4.1, together with pseudo-code to handle it. I came across this today and thought it might help make the issue clear.

It turns out my suggestion above would have ended up as one of the "pessimistic"/"optimistic" cases discussed in the paper; the final method proposed there is the "expected" case.
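One way to obtain the "expected" behaviour for ROC curves is to emit a curve point only when the score changes, so that a group of tied points becomes a single diagonal segment. The Python sketch below is written in that spirit and is not a transcription of the paper's pseudo-code. The function names are hypothetical, labels are assumed to be 1 for positives and anything else for negatives, and the input is assumed to contain at least one positive and one negative.

```python
def roc_points_with_ties(scores, labels):
    """Return (FPR, TPR) points, adding a point only when the score changes.

    Tied points are consumed as a group, so the segment across a tie is a
    straight line between the group's start and end counts.
    Assumes at least one positive and one negative label.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    points = []
    tp = fp = 0
    prev_score = None
    for score, label in pairs:
        if score != prev_score:
            # Score changed: record the counts accumulated so far.
            points.append((fp / n_neg, tp / n_pos))
            prev_score = score
        if label == 1:
            tp += 1
        else:
            fp += 1
    points.append((fp / n_neg, tp / n_pos))  # final point, always (1, 1)
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# All scores equal: every label ordering yields the same diagonal curve.
roc_a = roc_points_with_ties([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
roc_b = roc_points_with_ties([0.5, 0.5, 0.5, 0.5], [0, 0, 1, 1])
print(roc_a, auc_trapezoid(roc_a))
```

With uniform scores the curve collapses to the chance diagonal regardless of how the tied points are ordered.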
