Comments (10)

oegedijk commented on May 18, 2024

That's in the latest release right? So we would have to set sklearn>=0.24 in the requirements. Worth it?

from explainerdashboard.

raybellwaves commented on May 18, 2024

> So we would have to set sklearn>=0.24 in the requirements

Potentially. As far as I can tell, anyone pip installing explainerdashboard today will get version 0.24.0 of scikit-learn.

conda create -n test_env python=3.8 --yes
conda activate test_env
pip install explainerdashboard
conda list
...
scikit-learn              0.24.0                   pypi_0    pypi
...

oegedijk commented on May 18, 2024

I saw at least one library (PyCaret) that pinned scikit-learn<=0.23 due to some breaking change, so I don't want to force 0.24 on users. However, MAPE is still a nice and intuitive metric, and right now there are not many regression metrics in the dashboard, so we could probably come up with a numpy one-liner to calculate it?

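raybellwaves commented on May 18, 2024

import numpy as np

epsilon = np.finfo(np.float64).eps
# Per-sample absolute percentage errors (y: true values, yhat: predictions); average them to get the MAPE.
mape = np.absolute(y - yhat) / np.maximum(np.absolute(y), epsilon)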

Taken from https://github.com/scikit-learn/scikit-learn/blob/7ed972193590c2a11839e15db87fa4818089de1a/sklearn/metrics/_regression.py#L261

oegedijk commented on May 18, 2024

The problem with MAPE is what to do when y is close to or equal to 0. When I define

import numpy as np

def mape_score(y_true, y_pred):
    # Clip the denominator to machine epsilon to avoid dividing by zero.
    epsilon = np.finfo(np.float64).eps
    absolute_percentage_errors = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
    mape = np.average(absolute_percentage_errors)
    return mape

and apply it to the titanic_fare() dataset I get:

>>> mape_score(explainer.y, explainer.preds)
279419869598872.44

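This is because one ticket has price == 0.0: its denominator gets clipped to epsilon (about 2.2e-16), so that single sample dominates the average.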
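
To make the effect concrete, here is a minimal sketch on made-up numbers (not the titanic_fare() data). It reuses the mape_score definition above and compares the score with and without the single zero-valued target; the arrays and the variable names with_zero and without_zero are assumptions for illustration only.

```python
import numpy as np


def mape_score(y_true, y_pred):
    """MAPE with the denominator clipped to machine epsilon."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    epsilon = np.finfo(np.float64).eps
    ape = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
    return np.average(ape)


# Made-up fares: one true value is exactly zero.
y_true = np.array([10.0, 20.0, 0.0, 40.0])
y_pred = np.array([11.0, 18.0, 5.0, 40.0])

# Score on all samples: the zero-valued target divides by epsilon.
with_zero = mape_score(y_true, y_pred)

# Score on the non-zero targets only.
nonzero = y_true != 0
without_zero = mape_score(y_true[nonzero], y_pred[nonzero])

print(f"with zero target:    {with_zero:.2e}")
print(f"without zero target: {without_zero:.3f}")
```

The three non-zero samples are all within 10% of the truth, yet the single zero target pushes the overall score up by many orders of magnitude.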

raybellwaves commented on May 18, 2024

Yes, it's an age-old problem with MAPE. I feel like it still adds useful information alongside the other metrics. From the dashboard's point of view, I wonder if you could use some kind of scientific notation for large numbers:

>>> '{:.2e}'.format(mape_score(explainer.y, explainer.preds))
'2.79e+14'
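
A small formatting helper along these lines might switch to scientific notation only when a value is very large or very small. This is a sketch: format_metric and its large and small cut-offs are hypothetical and not part of explainerdashboard.

```python
def format_metric(value, large=1e4, small=1e-3):
    """Format a metric value for display (hypothetical helper).

    Uses scientific notation for very large or very small magnitudes,
    and three decimals otherwise.
    """
    magnitude = abs(value)
    if magnitude >= large or (0 < magnitude < small):
        return f"{value:.2e}"
    return f"{value:.3f}"


print(format_metric(279419869598872.44))
print(format_metric(0.125))
```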

oegedijk commented on May 18, 2024

Notation aside, that number still doesn't tell you anything about how good a fit the regression gave you, just that there was one y_true very close to or equal to zero. Maybe we could exclude all absolute_percentage_errors > 100? Or make that threshold a parameter?
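
One way the thresholding idea could look is sketched below. The function name mape_score_capped and the parameter max_ape are hypothetical; the default of 100 is taken from the comment above and is on the fraction scale used by mape_score (so 100 means 10,000%). The count of excluded samples is returned so they are not dropped silently.

```python
import numpy as np


def mape_score_capped(y_true, y_pred, max_ape=100.0):
    """Hypothetical MAPE variant that ignores extreme percentage errors.

    Returns (mape, n_dropped), where n_dropped is the number of samples
    whose absolute percentage error exceeded max_ape.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    epsilon = np.finfo(np.float64).eps
    ape = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)

    keep = ape <= max_ape
    n_dropped = int((~keep).sum())
    if not keep.any():
        # Every sample was excluded, so there is nothing to average.
        return float("nan"), n_dropped
    return float(ape[keep].mean()), n_dropped


# Made-up data with one zero-valued target.
score, dropped = mape_score_capped([10.0, 20.0, 0.0, 40.0], [11.0, 18.0, 5.0, 40.0])
print(score, dropped)
```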

raybellwaves commented on May 18, 2024

Could make the inclusion of MAPE a parameter.

ExplainerDashboard(
    explainer,
    hide_mape=False)

Most data scientists (I would like to think) know which metrics they want to look at.

oegedijk commented on May 18, 2024

So I added it to the latest dev branch: https://github.com/oegedijk/explainerdashboard/tree/dev

It throws a warning if MAPE > 2.

It also adds a parameter show_metrics, with which you can pass a list of the metrics that you want to display. You can also pass custom functions in that list, and these also get stored to and loaded from yaml!
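
The behaviour described here (a list mixing metric names with custom functions, plus a warning when MAPE > 2) could be sketched roughly as follows. This is not the explainerdashboard implementation: compute_metrics, BUILTIN_METRICS, rmse and median_abs_error are hypothetical names used only to illustrate the idea, and yaml storage is left out.

```python
import warnings

import numpy as np


def mape_score(y_true, y_pred):
    """MAPE with the denominator clipped to machine epsilon."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    epsilon = np.finfo(np.float64).eps
    return np.average(np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_pred - y_true) ** 2))


# Hypothetical registry mapping metric names to functions.
BUILTIN_METRICS = {"mape": mape_score, "rmse": rmse}


def compute_metrics(y_true, y_pred, show_metrics):
    """Evaluate a list of metric names and/or callables (illustrative only)."""
    results = {}
    for metric in show_metrics:
        if callable(metric):
            # Custom function: use its own name as the label.
            name, func = metric.__name__, metric
        else:
            name, func = metric, BUILTIN_METRICS[metric]
        results[name] = float(func(y_true, y_pred))
    if results.get("mape", 0.0) > 2:
        warnings.warn("MAPE > 2: check for targets close to or equal to zero.")
    return results


def median_abs_error(y_true, y_pred):
    """Example of a user-supplied custom metric."""
    return np.median(np.abs(np.asarray(y_pred, float) - np.asarray(y_true, float)))


# Made-up data without zero targets, so no warning is raised here.
print(compute_metrics([10.0, 20.0, 40.0], [11.0, 18.0, 40.0],
                      ["rmse", "mape", median_abs_error]))
```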

You can give it a try and let me know if it works...

oegedijk commented on May 18, 2024

https://github.com/oegedijk/explainerdashboard/releases/tag/v0.3.2
