
Comments (7)

marcotcr commented on August 26, 2024

If you're using a decision tree as an explainer, I would suggest not using any feature selection algorithm, but applying some constraint on the tree to keep it interpretable (e.g. maximum number of leaves, or maximum depth).
I've done this before, with interesting results.
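The constrained-tree idea can be sketched as follows. Everything here is a synthetic stand-in for what lime generates internally (perturbed neighborhood, black-box probabilities, kernel weights), and the kernel width is an arbitrary illustrative choice, not lime's default:

```python
# Sketch: a complexity-constrained decision tree as a local surrogate.
# The neighborhood, probabilities, and weights are synthetic stand-ins.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Instance to explain and a perturbed neighborhood around it.
x = np.zeros(5)
Z = x + rng.normal(scale=0.5, size=(500, 5))

def predict_proba_top(Z):
    # Stand-in for the black box's probability of the top class.
    return 1.0 / (1.0 + np.exp(-(2.0 * Z[:, 0] - Z[:, 3])))

# Exponential kernel: perturbations closer to x count more.
dists = np.linalg.norm(Z - x, axis=1)
weights = np.exp(-(dists ** 2) / 0.75)

# Constrain the tree so the explanation stays interpretable.
surrogate = DecisionTreeRegressor(max_depth=3, max_leaf_nodes=8,
                                  random_state=0)
surrogate.fit(Z, predict_proba_top(Z), sample_weight=weights)
```

The `max_depth` / `max_leaf_nodes` caps are what keep the surrogate readable; the fitted tree itself is then the explanation.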

from lime.

marcotcr commented on August 26, 2024

I have done both, but I think explaining each label at a time is better.
The explanation of a prediction is a local approximation, and most of the time you only want to explain the prediction of the top class anyway.
It is also nice to have the extra signal that comes from regressing on the prediction probabilities of the top class, rather than using the hard label and fitting a locally weighted classifier instead of a regressor.
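A minimal toy contrast between the two setups described above, on synthetic stand-in data rather than lime internals: regressing on the top class's probability keeps the confidence signal, while a classifier fit on thresholded 0/1 labels discards it.

```python
# Toy contrast: regression on the top class's probability vs. a
# classifier on hard 0/1 labels. Data is a synthetic stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(1)
Z = rng.normal(size=(400, 4))

# Stand-in black-box probability for the top class (driven by feature 1).
p_top = 1.0 / (1.0 + np.exp(-3.0 * Z[:, 1]))

# Option A: keep the probability signal and regress on it.
reg = Ridge(alpha=1.0).fit(Z, p_top)

# Option B: threshold to hard labels and classify.
clf = LogisticRegression(max_iter=1000).fit(Z, (p_top > 0.5).astype(int))
```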


marcotcr commented on August 26, 2024

It depends on how you define 'best features'. If you take clf.feature_importances_, you end up with a linear contribution score and lose the interpretability of having a tree.
If you take a restricted tree, you have a tree that was (greedily) optimized to predict (or more precisely, to reduce impurity of predictions) in the neighborhood of the instance you're explaining. The features you get there should be the best (greedy) choices in terms of reducing impurity.

If the tree seems unintuitive, I would maybe try DecisionTreeClassifier instead of DecisionTreeRegressor and using 0-1 labels. You do lose the probability signal and may end up with nonsensical trees (e.g. always predict 1 for neighborhoods that are very positive), but the resulting tree will not care about making mistakes within the predicted class.
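One way to see the caveat above, on synthetic stand-in data: in a neighborhood where the black box is confidently positive, every 0-1 label is 1, so a classifier tree has nothing to split on and collapses to a single leaf, while a regressor on the probabilities can still produce a meaningful tree.

```python
# Sketch of the degenerate case: all-positive neighborhood labels.
# Probabilities are synthetic stand-ins for black-box output.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(2)
Z = rng.normal(size=(300, 3))

# Stand-in probabilities that are always well above 0.5.
p = 0.8 + 0.15 / (1.0 + np.exp(-Z[:, 0]))
y = (p > 0.5).astype(int)  # all ones in this neighborhood

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Z, y)  # one leaf
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(Z, p)   # splits on feature 0
```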


marcotcr commented on August 26, 2024

We do have some other options for feature selection, but honestly I think the regressor that's used in the instance explanation doesn't matter that much, if it is linear.
Do you have a use case for other regressors, or other feature selectors?
Best,


yanghuidong commented on August 26, 2024

I even tried DecisionTreeRegressor (with some adjustments since it has no intercept), and on the example from your tutorial, there seemed to be no difference compared to using Ridge as the explainer.

I was asking because at the end of your LIME paper you mentioned that using a decision tree as the explainer could be interesting, too. So naively, if I'm going to try that just for fun, do you think I should also use a decision tree for feature selection?

Thanks!


yanghuidong commented on August 26, 2024

So I was using DecisionTreeRegressor to make local explanations, and I see that LIME uses a One-vs-Rest approach, i.e. it tries to explain the prediction probability of one label (class value) at a time, hence we have to use a regressor instead of a classifier. Now a naive question: since a decision tree natively handles multi-class problems, do you think it's better to explain all the labels with a single tree (i.e. a DecisionTreeClassifier), instead of using a separate regressor to explain the probability of each label?
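The two options can be sketched side by side on toy stand-in data (the black-box probabilities here are a synthetic softmax, not lime output):

```python
# Option A: one multi-class classifier tree over the predicted labels.
# Option B: one regressor tree per class probability (one-vs-rest style).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(4)
Z = rng.normal(size=(400, 4))

# Synthetic softmax stand-in for black-box class probabilities.
logits = np.stack([Z[:, 0], Z[:, 1], -Z[:, 0] - Z[:, 1]], axis=1)
proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Option A: a single tree explaining all labels at once.
multi = DecisionTreeClassifier(max_depth=3, random_state=0)
multi.fit(Z, proba.argmax(axis=1))

# Option B: one regressor tree per class probability.
per_class = [DecisionTreeRegressor(max_depth=3, random_state=0)
             .fit(Z, proba[:, k]) for k in range(proba.shape[1])]
```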


yanghuidong commented on August 26, 2024

I noticed that if I limit the tree complexity (e.g. max_depth=4, max_leaf_nodes=16), the resulting tree does not seem to use the very top "most important" features (in terms of impurity reduction) reported in clf.feature_importances_ (and those "top weights" seem to be quite similar to those obtained with linear regressors).

So, a naive question again: it seems that a restricted tree does not use the best features. How do you view this? Is it a trade-off?
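The comparison can be sketched on synthetic stand-in data: the top-weighted features of a linear surrogate versus the features a restricted tree actually splits on (internal nodes in `tree_.feature`; leaves are marked with -2).

```python
# Sketch: features used by a restricted surrogate tree vs. the
# top-weighted features of a linear surrogate on the same toy data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
Z = rng.normal(size=(500, 6))

# Stand-in black-box probability driven by features 0, 2 and 5.
p = 1.0 / (1.0 + np.exp(-(2.0 * Z[:, 0] + Z[:, 2] - Z[:, 5])))

linear = Ridge(alpha=1.0).fit(Z, p)
tree = DecisionTreeRegressor(max_depth=4, max_leaf_nodes=16,
                             random_state=0).fit(Z, p)

top_linear = np.argsort(-np.abs(linear.coef_))[:3]      # strongest linear weights
used_by_tree = set(tree.tree_.feature[tree.tree_.feature >= 0])  # split features
```

Because the tree is grown greedily under a node budget, `used_by_tree` need not match `top_linear` beyond the strongest features, which is the mismatch the question describes.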

