Comments (13)

banderlog avatar banderlog commented on May 5, 2024 4

In my case, I had to manually create an Explanation object with the correct shape:

rfc = RandomForestClassifier()
rfc.fit(train_x, train_y)

# test_x is a np.ndarray with (2936911, 41) shape
df_tmp = pd.DataFrame(test_x, columns=feature_names)

masker = shap.maskers.Independent(df_tmp, max_samples=1000)
explainer = shap.TreeExplainer(rfc, masker)

# shap_values.values.shape ==  (10000, 41, 2)
# shap_values[1].values.shape == (41, 2)
shap_values = explainer(df_tmp.sample(10000, random_state=69))

# tmp.values.shape == tmp.data.shape == (10000, 41)
# Keep every observation and every feature, but only the values for the positive class (index 1)
tmp = shap.Explanation(shap_values[:, :, 1], data=df_tmp, feature_names=feature_names)

# Now both plots work
shap.plots.beeswarm(tmp)
# shap.summary_plot(tmp)

Plot without feature names (NDA and similar stuff):

image

The solutions above did not work:

shap.plots.beeswarm(shap_values)
>>> ValueError: "The beeswarm plot does not support plotting explanations with instances that have more than one dimension!"

shap.plots.beeswarm(shap_values[1])
>>> IndexError: "tuple index out of range"

The strange thing is that, some time ago, shap.summary_plot(shap_values[1], tmp) worked for a different RandomForest model and different data.

PS: shap 0.39.0, numpy 1.20.3
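
The indexing behind the workaround above can be illustrated without a trained model. This is only a sketch: a synthetic NumPy array stands in for `shap_values.values`, and the names `values_3d`, `positive_class` and `one_row` are made up for illustration. It assumes the (samples, features, classes) layout reported in the comment, and it mimics the usual situation for a binary classifier explained in probability space, where the class-0 contributions mirror the class-1 contributions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 5, 3

# Synthetic stand-in for the SHAP values of the positive class (illustrative data only).
class1 = rng.normal(size=(n_samples, n_features))

# Assumed layout: (samples, features, classes), with class 0 mirroring class 1.
values_3d = np.stack([-class1, class1], axis=-1)

# Slicing the last axis keeps one value per (sample, feature) pair,
# which is the 2D shape a beeswarm-style plot needs.
positive_class = values_3d[:, :, 1]

# Plain integer indexing selects a sample, not a class,
# matching the (41, 2) shape noted for shap_values[1] above.
one_row = values_3d[1]

print(values_3d.shape, positive_class.shape, one_row.shape)
```

The last line shows why `shap_values[1]` is not a substitute for `shap_values[:, :, 1]` in this layout: it returns a single observation with both classes still attached.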

from shap.

LiWangSH avatar LiWangSH commented on May 5, 2024 1

Hi @slundberg,

Thank you for the thread!

I was reading about plotting shap.summary_plot(shap_values, X) for random forest and XGB binary classifiers, where shap_values = shap.TreeExplainer(clf).shap_values(X).

Interestingly, for the XGB classifier the summary plot takes shap_values exactly as computed, whereas for the random forest it needs shap_values[1], i.e. only the array for the positive label. I would like to know why there is this discrepancy. Thank you so much!

Below, I included the example implementations for the random forest and XGB classifiers.
RF: https://medium.com/python-in-plain-english/random-forest-classifier-and-shap-how-to-understand-your-customers-and-interpret-a-black-box-model-6166d86820d9
Screen Shot 2020-12-08 at 6 29 18 PM

XGB: https://github.com/slundberg/shap/blob/master/notebooks/tree_explainer/Census%20income%20classification%20with%20XGBoost.ipynb
Screen Shot 2020-12-08 at 6 35 07 PM
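
A likely reason for the difference, stated here as an assumption and not confirmed in this thread, is that a binary XGBoost model exposes a single output (the margin for the positive class), while a scikit-learn random forest classifier exposes one output per class. A small helper can hide that difference before plotting. The function name `positive_class_values` and the toy arrays are hypothetical, and the three layouts handled are the ones mentioned in this thread (a list with one array per class, a 3D array, or a plain 2D array).

```python
import numpy as np


def positive_class_values(shap_values, class_index=1):
    """Return a 2D (samples, features) array from several common layouts."""
    if isinstance(shap_values, list):
        # One array per class, as described for the random forest case.
        return np.asarray(shap_values[class_index])
    arr = np.asarray(shap_values)
    if arr.ndim == 3:
        # (samples, features, classes): keep only the requested class.
        return arr[:, :, class_index]
    if arr.ndim == 2:
        # Single-output model: already one value per (sample, feature).
        return arr
    raise ValueError(f"unsupported SHAP value layout with ndim={arr.ndim}")


# Toy inputs with 4 samples and 3 features (illustrative data only).
rf_style = [np.zeros((4, 3)), np.ones((4, 3))]
xgb_style = np.full((4, 3), 0.5)

rf_2d = positive_class_values(rf_style)
xgb_2d = positive_class_values(xgb_style)
print(rf_2d.shape, xgb_2d.shape)
```

With such a helper, the same plotting call can be used for both model types, at the cost of hard-coding which class is treated as positive.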

mxshen avatar mxshen commented on May 5, 2024 1

When I use shap.TreeExplainer to explain RandomForestRegressor, it is very slow, but it is very fast when using shap.TreeExplainer to explain XGBRegressor. Does anyone have the same issue or know the reasons? Thx!

slundberg avatar slundberg commented on May 5, 2024

XGBoost does do bagging, and has parameters that can make it very similar to a random forest (using the DART parameters). So there is no reason TreeSHAP couldn't apply to any tree model. However, it scales quadratically with tree depth, so it would run a bit slower on random forest models, since they tend to be really deep.
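
To get a feel for the quadratic dependence on depth, here is a back-of-the-envelope sketch. It assumes the commonly quoted TreeSHAP bound of O(T * L * D^2) per explained sample (T trees, L leaves per tree, D maximum depth). The function `treeshap_cost` and all numbers are hypothetical, and the result is a relative count, not a runtime.

```python
def treeshap_cost(n_trees, n_leaves, depth):
    """Relative cost under the assumed O(T * L * D^2) bound (arbitrary units)."""
    return n_trees * n_leaves * depth ** 2


# Hypothetical models: same number of trees and leaves, different depth.
shallow_boosted = treeshap_cost(n_trees=100, n_leaves=64, depth=6)
deep_forest = treeshap_cost(n_trees=100, n_leaves=64, depth=30)

# Five times the depth gives twenty-five times the relative cost.
ratio = deep_forest / shallow_boosted
print(ratio)
```

In practice deep, unpruned trees usually also have many more leaves than shallow boosted trees, so the real gap can be larger than the depth term alone suggests.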

Making a general-purpose tree library in Python or R is challenging: the algorithm involves loops and recursion, which are slow in a typical interpreted language, so a high-performance language such as C++ is needed to get good results.

Hope that helps!

dswatson avatar dswatson commented on May 5, 2024

Very cool, I'd love to see TreeSHAP in action on a random forest. And I'm sure you're right that it would be super slow to do general purpose tree building in Python or R directly. I was thinking more about using those languages as user-friendly frontends to fast lower level implementations, as they do with gbm, ranger, etc. I suspect the algorithm will eventually be incorporated into other packages for tree-based modeling the way it was with XGBoost. Looking forward to that!

pietervosnl avatar pietervosnl commented on May 5, 2024

Hi,

Nice work! Is it possible to provide an example on how to use the SHAP package with a random forest model? That would be much appreciated!

thanks

Pieter

PurenBITeam avatar PurenBITeam commented on May 5, 2024

Hello everybody,

I ran into an issue with a random forest regressor here. The same code that works for an XGB model raises the following error (see below). Can anybody explain why this isn't working with sklearn's RandomForestRegressor?

Error: 'i' format requires -2147483648 <= number <= 2147483647

although I'm not using 'i' anywhere in the code

thanks and br

christoph
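
For context on where such a message can come from even when user code never mentions 'i': the wording matches the error Python's standard `struct` module raises when a number does not fit the 32-bit signed integer format code `"i"`. The sketch below only reproduces that kind of error in isolation. Which internal call triggers it in this report is not established in the thread, so treating it as a size or count exceeding 2**31 - 1 somewhere in serialization is an assumption.

```python
import struct

INT32_MAX = 2**31 - 1

# The largest 32-bit signed value packs fine with the "i" format code.
packed = struct.pack("i", INT32_MAX)

# One more than that no longer fits and raises struct.error.
try:
    struct.pack("i", INT32_MAX + 1)
    raised = False
    message = ""
except struct.error as exc:
    raised = True
    message = str(exc)

print(len(packed), raised, message)
```

If the cause is indeed an oversized object, reducing what is processed in a single call (fewer rows per batch, or a smaller model) is one thing to try, but that is a guess until a full traceback is available.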

slundberg avatar slundberg commented on May 5, 2024

PurenBITeam avatar PurenBITeam commented on May 5, 2024

Dear Scott,
thanks for your fast answer.

Please see the two code snippets below. The first shows the XGB case, which works fine. The second shows the RandomForestRegressor (sklearn) case, which gives the error above.

XGB:
import shap
explainer = shap.TreeExplainer(model)
shap.initjs()
shap_values = explainer.shap_values(PredData, approximate=True)

model: <xgboost.core.Booster at 0x7f641316cdd8>

RF:
import shap
explainer = shap.TreeExplainer(modelrf)
shap.initjs()
shap_values = explainer.shap_values(PredData, approximate=True)

modelrf: RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=35,
max_features=20, max_leaf_nodes=None, min_impurity_decrease=0,
min_impurity_split=None, min_samples_leaf=1,
min_samples_split=5, min_weight_fraction_leaf=0.0,
n_estimators=750, n_jobs=4, oob_score=False,
random_state=201810, verbose=1, warm_start=False)

PredData:
image

In case you need more information, just tell me.

thanks and br

christoph

slundberg avatar slundberg commented on May 5, 2024

@christophgm this runs fine when I use another dataset. Can you provide a full example?

zhihaoyan avatar zhihaoyan commented on May 5, 2024

Hi,
when I apply shap to a random forest regressor, I get an error like
'RandomForestRegressor' object has no attribute 'estimators_'

Can anybody help me with this one?

arturomoncadatorres avatar arturomoncadatorres commented on May 5, 2024

> Hi,
> when I apply shap to a random forest regressor, I get an error like
> 'RandomForestRegressor' object has no attribute 'estimators_'
>
> Can anybody help me with this one?

I think this means that you need to fit your RandomForestRegressor. This is a common error if you create your model and run it through a grid search or a cross-validation, since the model is passed as an argument and thus fitted "internally" in those functions.
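
A minimal sketch of that situation, assuming scikit-learn is installed; the toy dataset and the parameter grid are made up for illustration. `GridSearchCV` fits clones of the estimator it is given, so the original object stays unfitted, while the refitted winner is available as `best_estimator_`.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Small synthetic regression problem (illustrative data only).
X, y = make_regression(n_samples=60, n_features=4, random_state=0)

model = RandomForestRegressor(n_estimators=5, random_state=0)
search = GridSearchCV(model, {"max_depth": [2, 3]}, cv=2)
search.fit(X, y)

# The object passed in was only cloned, so it has no fitted trees.
original_is_fitted = hasattr(model, "estimators_")

# The refitted best model does have them.
best_is_fitted = hasattr(search.best_estimator_, "estimators_")

print(original_is_fitted, best_is_fitted)
```

In this setup, the object to hand to an explainer would be `search.best_estimator_` (or a model fitted directly with `model.fit(X, y)`), not the unfitted `model`.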

condran999 avatar condran999 commented on May 5, 2024

Hi, I would like to know: in a random forest, if a variable/feature is outside the historic range, what will the SHAP value for that feature be? E.g. will it have 0 impact on the force plot, or a higher positive value (if it exceeds the upper limit of the historic data)? SOS!

Note: in my output the SHAP value is +0.25, close to zero; however, I was expecting the SHAP value to be on the higher side. Am I missing something?
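
One possible explanation, offered as an assumption and not confirmed by anyone in this thread: tree models split on thresholds learned from the training data, so a value beyond the training range follows the same path through each tree as a value sitting just past the largest training value. The prediction then plateaus instead of extrapolating, and attributions that depend on those paths would be expected to plateau as well. The sketch below, on made-up data and assuming scikit-learn is installed, checks only the prediction part.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Training data: one feature drawn from [0, 10), with a roughly linear target.
X_train = rng.uniform(0.0, 10.0, size=(200, 1))
y_train = 3.0 * X_train[:, 0] + rng.normal(0.0, 0.1, size=200)

forest = RandomForestRegressor(n_estimators=20, random_state=0)
forest.fit(X_train, y_train)

# Both query points lie above every training value, so every split sends them the same way.
at_edge = forest.predict(np.array([[10.0]]))[0]
far_outside = forest.predict(np.array([[1000.0]]))[0]

print(at_edge, far_outside)
```

Under this reading, a modest SHAP value for an out-of-range input is not surprising: the forest cannot respond more strongly than it does at the edge of the data it was trained on.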
