
Comments (7)

marcotcr commented on August 26, 2024

If you're using a decision tree as an explainer, I would suggest not using any feature selection algorithm, but applying some constraint on the tree to keep it interpretable (e.g. maximum number of leaves, or maximum depth).
I've done this before, with interesting results.
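The constrained-tree idea can be sketched as follows. Everything here is a synthetic stand-in for what lime generates internally (perturbed neighborhood, black-box probabilities, kernel weights), and the kernel width is an arbitrary illustrative choice, not lime's default:

```python
# Sketch: a complexity-constrained decision tree as a local surrogate.
# The neighborhood, probabilities, and weights are synthetic stand-ins.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Instance to explain and a perturbed neighborhood around it.
x = np.zeros(5)
Z = x + rng.normal(scale=0.5, size=(500, 5))

def predict_proba_top(Z):
    # Stand-in for the black box's probability of the top class.
    return 1.0 / (1.0 + np.exp(-(2.0 * Z[:, 0] - Z[:, 3])))

# Exponential kernel: perturbations closer to x count more.
dists = np.linalg.norm(Z - x, axis=1)
weights = np.exp(-(dists ** 2) / 0.75)

# Constrain the tree so the explanation stays interpretable.
surrogate = DecisionTreeRegressor(max_depth=3, max_leaf_nodes=8,
                                  random_state=0)
surrogate.fit(Z, predict_proba_top(Z), sample_weight=weights)
```

The `max_depth` / `max_leaf_nodes` caps are what keep the surrogate readable; the fitted tree itself is then the explanation.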

from lime.

marcotcr commented on August 26, 2024

I have done both, but I think explaining each label at a time is better.
The explanation of a prediction is a local approximation, and most of the time you only want to explain the prediction of the top class anyway.
It is also nice to have the extra signal that comes from regressing on the prediction probabilities of the top class, rather than using the hard label and fitting a locally weighted classifier instead of a regressor.
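A minimal toy contrast between the two setups described above, on synthetic stand-in data rather than lime internals: regressing on the top class's probability keeps the confidence signal, while a classifier fit on thresholded 0/1 labels discards it.

```python
# Toy contrast: regression on the top class's probability vs. a
# classifier on hard 0/1 labels. Data is a synthetic stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(1)
Z = rng.normal(size=(400, 4))

# Stand-in black-box probability for the top class (driven by feature 1).
p_top = 1.0 / (1.0 + np.exp(-3.0 * Z[:, 1]))

# Option A: keep the probability signal and regress on it.
reg = Ridge(alpha=1.0).fit(Z, p_top)

# Option B: threshold to hard labels and classify.
clf = LogisticRegression(max_iter=1000).fit(Z, (p_top > 0.5).astype(int))
```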


marcotcr commented on August 26, 2024

It depends on how you define 'best features'. If you take clf.feature_importances_, you end up with a linear contribution score and lose the interpretability of having a tree.
If you take a restricted tree, you have a tree that was (greedily) optimized to predict (or more precisely, to reduce impurity of predictions) in the neighborhood of the instance you're explaining. The features you get there should be the best (greedy) choices in terms of reducing impurity.

If the tree seems unintuitive, I would maybe try DecisionTreeClassifier instead of DecisionTreeRegressor and using 0-1 labels. You do lose the probability signal and may end up with nonsensical trees (e.g. always predict 1 for neighborhoods that are very positive), but the resulting tree will not care about making mistakes within the predicted class.
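One way to see the caveat above, on synthetic stand-in data: in a neighborhood where the black box is confidently positive, every 0-1 label is 1, so a classifier tree has nothing to split on and collapses to a single leaf, while a regressor on the probabilities can still produce a meaningful tree.

```python
# Sketch of the degenerate case: all-positive neighborhood labels.
# Probabilities are synthetic stand-ins for black-box output.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(2)
Z = rng.normal(size=(300, 3))

# Stand-in probabilities that are always well above 0.5.
p = 0.8 + 0.15 / (1.0 + np.exp(-Z[:, 0]))
y = (p > 0.5).astype(int)  # all ones in this neighborhood

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Z, y)  # one leaf
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(Z, p)   # splits on feature 0
```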


marcotcr commented on August 26, 2024

We do have some other options for feature selection, but honestly I think the regressor that's used in the instance explanation doesn't matter that much, if it is linear.
Do you have a use case for other regressors, or other feature selectors?
Best,


yanghuidong commented on August 26, 2024

I even tried DecisionTreeRegressor (with some adjustments since it has no intercept), and on the example from your tutorial, there seemed to be no difference compared to using Ridge as the explainer.

I was asking because at the end of your LIME paper you mentioned that using a decision tree as the explainer could be interesting, too. So naively, if I'm going to try that just for fun, do you think I should also use a decision tree for feature selection?

Thanks!


yanghuidong commented on August 26, 2024

So I was using DecisionTreeRegressor to make local explanations, and I see that LIME uses a One-vs-Rest approach, i.e. it tries to explain the prediction probability of one label (class value) at a time, hence we have to use a regressor instead of a classifier. Now a naive question: since a decision tree natively handles multi-class problems, do you think it's better to explain all the labels with a single tree (i.e. a DecisionTreeClassifier), instead of using a separate regressor to explain the probability of each label?
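The two options can be sketched side by side on toy stand-in data (the black-box probabilities here are a synthetic softmax, not lime output):

```python
# Option A: one multi-class classifier tree over the predicted labels.
# Option B: one regressor tree per class probability (one-vs-rest style).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(4)
Z = rng.normal(size=(400, 4))

# Synthetic softmax stand-in for black-box class probabilities.
logits = np.stack([Z[:, 0], Z[:, 1], -Z[:, 0] - Z[:, 1]], axis=1)
proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Option A: a single tree explaining all labels at once.
multi = DecisionTreeClassifier(max_depth=3, random_state=0)
multi.fit(Z, proba.argmax(axis=1))

# Option B: one regressor tree per class probability.
per_class = [DecisionTreeRegressor(max_depth=3, random_state=0)
             .fit(Z, proba[:, k]) for k in range(proba.shape[1])]
```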


yanghuidong commented on August 26, 2024

I noticed that if I limit the tree complexity (e.g. max_depth=4, max_leaf_nodes=16), the resulting tree does not seem to use the very top "most important" features (in terms of impurity reduction) reported in clf.feature_importances_ (and those "top weights" seem to be quite similar to those obtained with linear regressors).

So, a naive question again: it seems that a restricted tree does not use the best features. How do you view this? Is it a trade-off?
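The comparison can be sketched on synthetic stand-in data: the top-weighted features of a linear surrogate versus the features a restricted tree actually splits on (internal nodes in `tree_.feature`; leaves are marked with -2).

```python
# Sketch: features used by a restricted surrogate tree vs. the
# top-weighted features of a linear surrogate on the same toy data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
Z = rng.normal(size=(500, 6))

# Stand-in black-box probability driven by features 0, 2 and 5.
p = 1.0 / (1.0 + np.exp(-(2.0 * Z[:, 0] + Z[:, 2] - Z[:, 5])))

linear = Ridge(alpha=1.0).fit(Z, p)
tree = DecisionTreeRegressor(max_depth=4, max_leaf_nodes=16,
                             random_state=0).fit(Z, p)

top_linear = np.argsort(-np.abs(linear.coef_))[:3]      # strongest linear weights
used_by_tree = set(tree.tree_.feature[tree.tree_.feature >= 0])  # split features
```

Because the tree is grown greedily under a node budget, `used_by_tree` need not match `top_linear` beyond the strongest features, which is the mismatch the question describes.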

