Comments (7)
If you're using a decision tree as an explainer, I would suggest not using any feature selection algorithm, but instead applying some constraint on the tree to keep it interpretable (e.g. a maximum number of leaves, or a maximum depth).
I've done this before, with interesting results.
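A minimal sketch of that constraint on stand-in neighborhood data (the variable names and toy data here are illustrative, not LIME's internals):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X_local = rng.normal(size=(500, 10))          # stand-in perturbed neighborhood
y_local = 1 / (1 + np.exp(-X_local[:, 0]))    # stand-in black-box probabilities

# Cap depth and leaf count so the fitted tree stays small enough to read.
tree = DecisionTreeRegressor(max_depth=3, max_leaf_nodes=8, random_state=0)
tree.fit(X_local, y_local)
print(export_text(tree))
```

With both caps in place, the printed tree has at most 8 leaves, which is what keeps it usable as an explanation.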
from lime.
I have done both, but I think explaining one label at a time is better.
The explanation of a prediction is a local approximation, and most of the time you only want to explain the prediction of the top class anyway.
You also get extra signal from regressing on the prediction probabilities of the top class, rather than fitting a locally weighted classifier on the hard labels.
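A rough sketch of that difference on toy data (the sigmoid and kernel here are stand-ins for the black-box probabilities and LIME's locality weighting, not its actual code):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_nbr = rng.normal(size=(200, 5))                  # perturbed samples
p_top = 1 / (1 + np.exp(-X_nbr[:, 0]))             # stand-in P(top class)
weights = np.exp(-np.sum(X_nbr**2, axis=1) / 2.0)  # locality kernel

# Regressing on the soft probabilities keeps the gradient-like signal...
soft = Ridge(alpha=1.0).fit(X_nbr, p_top, sample_weight=weights)
# ...that thresholding to 0-1 labels throws away.
hard = Ridge(alpha=1.0).fit(X_nbr, (p_top > 0.5).astype(float), sample_weight=weights)
```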
Depends on how you define 'best features'. If you take clf.feature_importances_, you end up with a linear contribution score, and lose the interpretability of having a tree.
If you take a restricted tree, you have a tree that was (greedily) optimized to predict (or more precisely, to reduce impurity of predictions) in the neighborhood of the instance you're explaining. The features you get there should be the best (greedy) choices in terms of reducing impurity.
If the tree seems unintuitive, I would maybe try DecisionTreeClassifier instead of DecisionTreeRegressor and using 0-1 labels. You do lose the probability signal and may end up with nonsensical trees (e.g. always predict 1 for neighborhoods that are very positive), but the resulting tree will not care about making mistakes within the predicted class.
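A sketch of the two options on toy data (DecisionTreeRegressor on probabilities vs. DecisionTreeClassifier on thresholded 0-1 labels; the data-generating step is illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
p = 1 / (1 + np.exp(-2 * X[:, 0] - X[:, 1]))   # stand-in probabilities

# Regressor on the probabilities: keeps the soft signal.
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, p)
# Classifier on thresholded labels: ignores mistakes within a class,
# and can degenerate if the whole neighborhood falls on one side.
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, (p > 0.5).astype(int))
```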
We do have some other options for feature selection, but honestly I think the regressor used in the instance explanation doesn't matter that much, as long as it is linear.
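To illustrate why the choice of linear regressor barely matters, here is a toy sketch (not LIME code) fitting Ridge and plain least squares on the same locality-weighted neighborhood:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 0.8 * X[:, 0] - 0.3 * X[:, 2] + rng.normal(scale=0.05, size=500)
w = np.exp(-np.sum(X**2, axis=1) / 4.0)   # locality kernel

ridge = Ridge(alpha=1.0).fit(X, y, sample_weight=w)
ols = LinearRegression().fit(X, y, sample_weight=w)
# The two locally weighted fits land on essentially the same coefficients.
print(np.round(ridge.coef_, 2))
print(np.round(ols.coef_, 2))
```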
Do you have a use case for other regressors, or other feature selectors?
Best,
I even tried DecisionTreeRegressor (with some adjustments, since it has no intercept), and on the example from your tutorial there seemed to be no difference compared to using Ridge as the explainer.
I was asking because at the end of your LIME paper you mention that using a decision tree as the explainer could be interesting, too. So naively, if I'm going to try that just for fun, do you think I should also use a decision tree for feature selection?
Thanks!
So I was using DecisionTreeRegressor to make local explanations, and I see that LIME uses a One-vs-Rest approach, i.e. it tries to explain the prediction probability of one label (class value) at a time, hence we have to use a regressor instead of a classifier. Now a naive question: since a decision tree natively handles multi-class problems, do you think it's better to explain all the labels with a single tree (i.e. using DecisionTreeClassifier), instead of using a separate regressor to explain the probability of each label?
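A sketch of the two setups on toy softmax probabilities (all names and data here are illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
logits = np.stack([X[:, 0], X[:, 1], -X[:, 0] - X[:, 1]], axis=1)
proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # 3 classes

# One-vs-rest, LIME-style: a separate regressor per class probability.
per_class = [
    DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, proba[:, k])
    for k in range(3)
]
# Single multiclass tree: one classifier fit on the hard argmax labels.
multi = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, proba.argmax(axis=1))
```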
I noticed that if I limit the tree complexity (e.g. max_depth=4, max_leaf_nodes=16), the resulting tree does not seem to use the very top "most important" features (in terms of impurity reduction) as ranked by clf.feature_importances_ (and those "top weights" seem to be quite similar to the ones obtained with linear regressors).
So, a naive question again: it seems that a restricted tree does not use the best features. How do you view this? Is it a trade-off?
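A toy sketch of that comparison (illustrative data; `tree_.feature` lists the feature index used at each internal node, with -2 marking leaves):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))
y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=600)

full = DecisionTreeRegressor(random_state=0).fit(X, y)
small = DecisionTreeRegressor(max_depth=4, max_leaf_nodes=16, random_state=0).fit(X, y)

# Internal nodes have feature >= 0; leaves are marked with -2.
used_by_small = set(small.tree_.feature[small.tree_.feature >= 0])
top_of_full = set(np.argsort(full.feature_importances_)[-4:])
# The greedy restricted tree need not split on the globally most important features.
print(sorted(used_by_small), sorted(top_of_full))
```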