
INTRODUCTION

Note: the current releases of this toolbox are beta releases, intended to test working with the Haskell, Python, and R code repositories.


Metrics provides implementations of various supervised machine learning evaluation metrics in the following languages:

  • Python: easy_install ml_metrics
  • R: install.packages("Metrics") from the R prompt
  • Haskell: cabal install Metrics
  • MATLAB / Octave: clone the repo and run setup from the MATLAB command line

For more detailed installation instructions, see the README for each implementation.
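
To give a flavor of what these packages compute, here is a minimal pure-Python sketch of two of the metrics listed in the next section. The formulas are the standard ones; the function names are illustrative and not guaranteed to match the ml_metrics API exactly.

```python
import math

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences of numbers."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))

def log_loss(actual, predicted, eps=1e-15):
    """Mean binary log loss; `actual` holds 0/1 labels, `predicted` probabilities."""
    total = 0.0
    for a, p in zip(actual, predicted):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += a * math.log(p) + (1 - a) * math.log(1 - p)
    return -total / len(actual)

print(rmse([1, 2, 3], [1, 2, 4]))  # sqrt(1/3) ~ 0.5774
```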

EVALUATION METRICS

The toolbox covers the following evaluation metrics; availability varies across the Python, R, Haskell, and MATLAB / Octave implementations:

  • Absolute Error (AE)
  • Average Precision at K (APK, AP@K)
  • Area Under the ROC (AUC)
  • Classification Error (CE)
  • F1 Score (F1)
  • Gini
  • Levenshtein
  • Log Loss (LL)
  • Mean Log Loss (LogLoss)
  • Mean Absolute Error (MAE)
  • Mean Average Precision at K (MAPK, MAP@K)
  • Mean Quadratic Weighted Kappa
  • Mean Squared Error (MSE)
  • Mean Squared Log Error (MSLE)
  • Normalized Gini
  • Quadratic Weighted Kappa
  • Relative Absolute Error (RAE)
  • Root Mean Squared Error (RMSE)
  • Relative Squared Error (RSE)
  • Root Relative Squared Error (RRSE)
  • Root Mean Squared Log Error (RMSLE)
  • Squared Error (SE)
  • Squared Log Error (SLE)
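
For the average-precision family, the following sketch matches the AP@K convention discussed in the apk issues below, where the score is divided by min(len(actual), k). Treat it as illustrative, not as the package's canonical source.

```python
def apk(actual, predicted, k=10):
    """Average precision at k: `actual` is the set of relevant items,
    `predicted` is a ranked list of predictions."""
    if not actual:
        return 0.0
    predicted = predicted[:k]
    score, num_hits = 0.0, 0.0
    for i, p in enumerate(predicted):
        # Count each relevant item at most once, at its first ranked position.
        if p in actual and p not in predicted[:i]:
            num_hits += 1.0
            score += num_hits / (i + 1.0)
    return score / min(len(actual), k)
```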

TO IMPLEMENT

  • F1 score
  • Multiclass log loss
  • Lift
  • Average Precision for binary classification
  • precision / recall break-even point
  • cross-entropy
  • True Pos / False Pos / True Neg / False Neg rates
  • precision / recall / sensitivity / specificity
  • mutual information
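
Several of the items above reduce to counts from a 2x2 confusion matrix. A self-contained sketch (the function name and dictionary layout are illustrative, not part of any existing implementation):

```python
def binary_rates(actual, predicted):
    """Confusion-matrix counts and derived rates for 0/1 label sequences."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # a.k.a. sensitivity / TPR
    specificity = tn / (tn + fp) if tn + fp else 0.0   # a.k.a. TNR
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision, "recall": recall, "specificity": specificity}
```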

HIGHER LEVEL TRANSFORMATIONS TO HANDLE

  • GroupBy / Reduce
  • Weight individual samples or groups
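
These two transformations compose naturally: compute a metric within each group, then combine group scores with weights. A hypothetical sketch (grouped_metric is an illustrative name, not part of the toolbox; mae is a stand-in for any per-group metric):

```python
from collections import defaultdict

def mae(actual, predicted):
    """Mean absolute error, used here as an example per-group metric."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def grouped_metric(metric, actual, predicted, groups, weights=None):
    """Score `metric` within each group, then take the weighted mean of group scores."""
    by_group = defaultdict(lambda: ([], []))
    for a, p, g in zip(actual, predicted, groups):
        by_group[g][0].append(a)
        by_group[g][1].append(p)
    if weights is None:
        weights = {g: 1.0 for g in by_group}
    total = sum(weights[g] for g in by_group)
    return sum(weights[g] * metric(a, p) for g, (a, p) in by_group.items()) / total
```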

PROPERTIES METRICS CAN HAVE

(Non-exhaustive; more will be added in the future)

  • Min or Max (optimize through minimization or maximization)
  • Binary Classification
    • Scores predicted class labels
    • Scores predicted ranking (most likely to least likely for being in one class)
    • Scores predicted probabilities
  • Multiclass Classification
    • Scores predicted class labels
    • Scores predicted probabilities
  • Regression
  • Discrete Rater Comparison (confusion matrix)

PEOPLE

Contributors

benhamner, dan-blanchard, eduardofv, ujjwalkarn, wendykan


ISSUES

Suggested modification to apk function

Note: in the event that the actual vector contains no items, the following modification prevents apk from returning NaN:

if (length(actual) > 0) {
    score <- score / min(length(actual), k)
} else {
    score <- score / k
}

ml_metrics fails to install via pip

$ pip --version
pip 1.4.1 from /usr/local/lib/python2.7/site-packages/pip-1.4.1-py2.7.egg (python 2.7

$ pip install ml_metrics
Downloading/unpacking ml-metrics
Downloading ml_metrics-0.1.3.zip
Running setup.py egg_info for package ml-metrics
Traceback (most recent call last):
File "", line 16, in
File "/Users/ndronen/Source/dissertation/projects/iclr-2014/build/ml-metrics/setup.py", line 6, in
requirements = [x.strip() for x in open("requirements.txt")]
IOError: [Errno 2] No such file or directory: 'requirements.txt'
Complete output from command python setup.py egg_info:
Traceback (most recent call last):

File "", line 16, in

File "/Users/ndronen/Source/build/ml-metrics/setup.py", line 6, in

requirements = [x.strip() for x in open("requirements.txt")]

IOError: [Errno 2] No such file or directory: 'requirements.txt'

Input to the function

What should the input format to the function mapk be? (i.e., in what format should actual and predicted be?)
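
In the Python implementation, actual and predicted for mapk are lists of lists: one inner list per query, with actual[i] holding that query's relevant items and predicted[i] its ranked predictions. The sketch below re-implements apk/mapk rather than importing the package, so treat it as a reconstruction of the expected format rather than the package's own code.

```python
def apk(actual, predicted, k=10):
    """Average precision at k for a single query."""
    if not actual:
        return 0.0
    predicted = predicted[:k]
    score, num_hits = 0.0, 0.0
    for i, p in enumerate(predicted):
        if p in actual and p not in predicted[:i]:
            num_hits += 1.0
            score += num_hits / (i + 1.0)
    return score / min(len(actual), k)

def mapk(actual, predicted, k=10):
    """Mean of apk over queries; both arguments are lists of lists."""
    return sum(apk(a, p, k) for a, p in zip(actual, predicted)) / len(actual)

# One inner list per query: relevant items vs. ranked predictions.
actual = [[1, 2], [3]]
predicted = [[1, 4, 2], [1, 3]]
```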

wrong ap@k

After I run the code in my anaconda3

pip install ml_metrics
Collecting ml_metrics
Requirement already satisfied: numpy in /home/westwood/anaconda3/lib/python3.7/site-packages (from ml_metrics) (1.15.1)
Requirement already satisfied: pandas in /home/westwood/anaconda3/lib/python3.7/site-packages (from ml_metrics) (0.23.4)
Requirement already satisfied: python-dateutil>=2.5.0 in /home/westwood/anaconda3/lib/python3.7/site-packages (from pandas->ml_metrics) (2.7.3)
Requirement already satisfied: pytz>=2011k in /home/westwood/anaconda3/lib/python3.7/site-packages (from pandas->ml_metrics) (2018.5)
Requirement already satisfied: six>=1.5 in /home/westwood/anaconda3/lib/python3.7/site-packages (from python-dateutil>=2.5.0->pandas->ml_metrics) (1.11.0)
Installing collected packages: ml-metrics
Successfully installed ml-metrics-0.1.4

The apk implementation in the installed file (screenshot not preserved) is wrong, and differs from the reference definition of AP@K (screenshot not preserved).

Metrics R package has been orphaned on CRAN

Hi Ben,
I just noticed that the maintainer status of the Metrics R package was changed to "ORPHANED" on April 21, 2017. The CRAN maintainers must have emailed you about issues with the package and, unable to reach you, eventually set the maintainer to "ORPHANED" and incremented the package version number to 0.1.2.

I fixed the CRAN issues, made updates to the documentation, added examples to all the functions, and incremented the version number to 0.1.3. I've pushed the updates, which you can review on my fork here. Are you interested in re-establishing yourself as the maintainer? If so, I'll submit a PR with my changes and you can submit version 0.1.3 to CRAN directly. If not, let me know and I can help you find someone to take over as the maintainer and have them submit version 0.1.3 to CRAN.

CRAN check output from running R CMD check --as-cran Metrics_0.1.3.tar.gz:

* using log directory ‘/Users/me/code/github-myforks/Metrics/Metrics.Rcheck’
* using R version 3.3.2 (2016-10-31)
* using platform: x86_64-apple-darwin13.4.0 (64-bit)
* using session charset: UTF-8
* using option ‘--as-cran’
* checking for file ‘Metrics/DESCRIPTION’ ... OK
* this is package ‘Metrics’ version ‘0.1.3’
* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Ben Hamner <[email protected]>’

Days since last update: 4

New maintainer:
  Ben Hamner <[email protected]>
Old maintainer(s):
  ORPHANED

License components with restrictions and base license permitting such:
  BSD_3_clause + file LICENSE
File 'LICENSE':
  YEAR: 2012-2017
  COPYRIGHT HOLDER: Ben Hamner
  ORGANIZATION: copyright holder

CRAN repository db overrides:
  X-CRAN-Comment: Orphaned and corrected on 2017-04-21 as check errors
    were not corrected despite reminders.
  Maintainer: ORPHANED
CRAN repository db conflicts: ‘Maintainer’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘Metrics’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking examples ... OK
* checking PDF version of manual ... OK
* DONE

Status: 1 NOTE
See
  ‘/Users/me/code/github-myforks/Metrics/Metrics.Rcheck/00check.log’
for details.

requirements.txt missing

Hey Ben,

When I try to install with easy_install or pip, I get:

IOError: [Errno 2] No such file or directory: 'requirements.txt'

setup.m should adjust for pc vs mac/unix

Hi Ben,

your setup.m script in Metrics/MATLAB/setup.m doesn't work well on Windows computers because of the different path delimiters (/ vs \). I realized this when .git paths showed up on my MATLAB path.

It works for both if you replace

thisPathSplit = strread(thisPath,'%s','delimiter','/'); 

with

if ismac || isunix
    thisPathSplit = strread(thisPath, '%s', 'delimiter', '/');
elseif ispc
    thisPathSplit = strread(thisPath, '%s', 'delimiter', '\\');
end

best,
Lukas

Become maintainer of this package

Hi

Do you have any interest in being the maintainer of this package? If not, would you mind if I help revive its status on CRAN?

Thanks
Michael Frasco

Metrics::auc fails due to integer overflow

The auc function cannot support large datasets due to integer overflow. The algorithm that the function uses multiplies the number of positive cases by the number of negative cases. If either of these numbers is large enough, there can be integer overflow.

Would you be open to a pull request that fixed this bug?
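
The overflow arises because the Mann-Whitney rank-sum form of AUC divides by n_pos * n_neg, and in R that product overflows 32-bit integers on large datasets; the usual fix is to coerce the counts to doubles before multiplying. As a language-neutral illustration of the same algorithm (a sketch with explicit float casts mirroring that fix, not the package's code):

```python
def auc(actual, predicted):
    """AUC via the Mann-Whitney rank-sum; float arithmetic avoids integer overflow."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    ranks = [0.0] * len(predicted)
    i = 0
    while i < len(order):
        # Find the block of tied predictions and assign the average 1-based rank.
        j = i
        while j + 1 < len(order) and predicted[order[j + 1]] == predicted[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = avg_rank
        i = j + 1
    n_pos = float(sum(actual))
    n_neg = float(len(actual)) - n_pos
    rank_sum = sum(r for r, a in zip(ranks, actual) if a == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```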

Installation problem

When I tried to install with pip in a virtual environment, it threw the error below.

pip install ml_metrics
Collecting ml_metrics
Using cached ml_metrics-0.1.4.tar.gz (5.0 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error in ml_metrics setup command: use_2to3 is invalid.
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Forgot an `import sys` in `setup.py`?

It's such a small thing I think I may be the one missing something here.

I cloned and ran python setup.py build and ran into

Traceback (most recent call last):
  File "setup.py", line 9, in <module>
    if sys.version_info >= (3,):
NameError: name 'sys' is not defined

Of course, doing an import sys fixes it right up.

Just thought I'd let you know!

wrong ap@k

it should return

    return score / num_hits

rather than

    return score / min(len(actual), k)
