MAPE (mean_absolute_percentage_error) was added in scikit-learn 0.24.

Include MAPE in regression statistics (explainerdashboard issue, CLOSED)

oegedijk commented on May 18, 2024

Include MAPE in regression statistics

from explainerdashboard.

Comments (10)

oegedijk commented on May 18, 2024

That's in the latest release right? So we would have to set sklearn>=0.24 in the requirements. Worth it?


raybellwaves commented on May 18, 2024

> So we would have to set sklearn>=0.24 in the requirements

Potentially. As far as I can tell, anyone pip installing explainerdashboard today will get version 0.24.0 of scikit-learn.

conda create -n test_env python=3.8 --y
conda activate test_env
pip install explainerdashboard
conda list
...
scikit-learn              0.24.0                   pypi_0    pypi
...


oegedijk commented on May 18, 2024

I saw at least one library (PyCaret) that pinned scikit-learn<=0.23 due to some breaking change, so I don't want to force 0.24 on users. However, MAPE is still a nice and intuitive metric, and right now there are not many regression metrics in the dashboard, so we could probably come up with a numpy one-liner to calculate it?
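One way to avoid raising the scikit-learn requirement is to import the metric when it is available and fall back to a local implementation otherwise. The following is only a minimal sketch, not code from explainerdashboard: the fallback is written in pure Python for portability, and the alias `mape_score` is an assumption chosen for illustration.

```python
import sys

try:
    # Available from scikit-learn 0.24 onwards
    from sklearn.metrics import mean_absolute_percentage_error as mape_score
except ImportError:
    def mape_score(y_true, y_pred):
        """Fallback MAPE for environments without scikit-learn>=0.24 (illustrative helper)."""
        eps = sys.float_info.epsilon  # same value as np.finfo(np.float64).eps
        errors = [abs(p - t) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
        return sum(errors) / len(errors)

# Each prediction is off by 10% of its target
print(mape_score([100.0, 200.0], [110.0, 180.0]))
```

With this pattern the requirement can stay unpinned: users on older scikit-learn versions get the fallback, and everyone else gets the library implementation.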


raybellwaves commented on May 18, 2024

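import numpy as np
epsilon = np.finfo(np.float64).eps
# Per-sample absolute percentage errors; average these (np.average) to get the MAPE
mape = np.absolute(y - yhat) / np.maximum(np.absolute(y), epsilon)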

Taken from https://github.com/scikit-learn/scikit-learn/blob/7ed972193590c2a11839e15db87fa4818089de1a/sklearn/metrics/_regression.py#L261


oegedijk commented on May 18, 2024

The problem with MAPE is what to do with y close to or equal to 0. When I define

def mape_score(y_true, y_pred):
    epsilon = np.finfo(np.float64).eps
    absolute_percentage_errors = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
    mape = np.average(absolute_percentage_errors)
    return mape

and apply it to the titanic_fare() dataset I get:

>>> mape_score(explainer.y, explainer.preds)
279419869598872.44

This is because one ticket price equals 0.0.
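To see why a single zero dominates the result: the denominator for that sample is clamped to machine epsilon (about 2.2e-16), so its error term is the absolute error multiplied by roughly 4.5e15, and averaging over n samples only divides that by n. A small sketch with made-up numbers (not the titanic_fare data), written in pure Python so it runs without numpy:

```python
import sys

def mape_score(y_true, y_pred):
    """Mean absolute percentage error with the denominator clamped to machine epsilon."""
    eps = sys.float_info.epsilon  # same value as np.finfo(np.float64).eps
    errors = [abs(p - t) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)

# Made-up data: the first target is exactly zero, the rest are each off by 10%
y_true = [0.0, 10.0, 20.0, 40.0]
y_pred = [5.0, 11.0, 18.0, 44.0]

full = mape_score(y_true, y_pred)
nonzero = mape_score(y_true[1:], y_pred[1:])

print(f"{full:.2e}")     # 5.63e+15
print(f"{nonzero:.2f}")  # 0.10
```

The three well-behaved samples contribute 0.1 each, but the one zero target pushes the average up by fifteen orders of magnitude.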


raybellwaves commented on May 18, 2024

Yes, it's an age-old problem with MAPE. I feel like it still adds useful information alongside the other metrics. From the dashboard point of view, I wonder if you could use some kind of scientific notation for large numbers:

>>> '{:.2e}'.format(mape_score(explainer.y, explainer.preds))
'2.79e+14'


oegedijk commented on May 18, 2024

Notation aside, that number still doesn't tell you anything about how good a fit the regression gave you, just that there was one y_true very close to or equal to zero. Could exclude all absolute_percentage_errors > 100 maybe? Or make that a parameter?
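One possible shape for the "exclude or make it a parameter" idea is sketched below. The function name `capped_mape` and the parameter `max_ape` are hypothetical, chosen only for illustration; this is not what explainerdashboard ended up implementing.

```python
import math
import sys

def capped_mape(y_true, y_pred, max_ape=100.0):
    """MAPE over samples whose absolute percentage error is <= max_ape.

    Returns (mape, n_excluded) so the caller can see how many samples were dropped.
    """
    eps = sys.float_info.epsilon
    apes = [abs(p - t) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    kept = [a for a in apes if a <= max_ape]
    n_excluded = len(apes) - len(kept)
    # If every sample was excluded there is nothing to average
    mape = sum(kept) / len(kept) if kept else math.nan
    return mape, n_excluded

mape, n_excluded = capped_mape([0.0, 10.0, 20.0, 40.0], [5.0, 11.0, 18.0, 44.0])
print(f"{mape:.2f} ({n_excluded} sample(s) excluded)")
```

Dropping samples changes what the metric measures, so returning the number of excluded samples alongside the score keeps that visible to the user.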


raybellwaves commented on May 18, 2024

Could make the inclusion of MAPE a parameter.

ExplainerDashboard(
    explainer,
    hide_mape=False)

Most (I would like to think) data scientists know which metrics they want to look at.


oegedijk commented on May 18, 2024

So I added it to the latest dev branch: https://github.com/oegedijk/explainerdashboard/tree/dev

It throws a warning if MAPE > 2.

It also adds a parameter show_metrics, with which you can pass a list of the metrics that you want to display (you can also pass custom functions in the list, which also get stored and loaded from yaml!).

You can give it a try and let me know if it works...
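The general idea of a metric list that mixes named metrics and custom functions, together with a warning for MAPE > 2, could look roughly like the sketch below. This is not explainerdashboard's implementation: `compute_metrics`, `BUILTIN_METRICS`, and the metric names are all hypothetical, and the real `show_metrics` parameter may behave differently.

```python
import sys
import warnings

def mean_absolute_error(y_true, y_pred):
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape_score(y_true, y_pred):
    eps = sys.float_info.epsilon
    apes = [abs(p - t) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    value = sum(apes) / len(apes)
    if value > 2:
        # Threshold taken from the thread; the message wording is an assumption
        warnings.warn("MAPE > 2: some y_true values are probably at or near zero")
    return value

# Hypothetical registry mapping metric names to functions
BUILTIN_METRICS = {"mean-absolute-error": mean_absolute_error, "mape": mape_score}

def compute_metrics(y_true, y_pred, show_metrics):
    """Evaluate each entry of show_metrics, which may be a name or a callable."""
    results = {}
    for metric in show_metrics:
        if callable(metric):
            results[metric.__name__] = metric(y_true, y_pred)
        else:
            results[metric] = BUILTIN_METRICS[metric](y_true, y_pred)
    return results

def max_error(y_true, y_pred):
    """Example of a user-supplied custom metric."""
    return max(abs(p - t) for t, p in zip(y_true, y_pred))

print(compute_metrics([10.0, 20.0, 40.0], [11.0, 18.0, 44.0], ["mape", max_error]))
```

Keying custom metrics by their function name is one simple way to make them displayable next to the built-in ones.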


oegedijk commented on May 18, 2024

https://github.com/oegedijk/explainerdashboard/releases/tag/v0.3.2

