teamhg-memex / eli5
A library for debugging/inspecting machine learning classifiers and explaining their predictions.
Home Page: http://eli5.readthedocs.io
License: MIT License
It seems that in the case of explain_prediction we could call the InvertableHashingVectorizer.fit method automatically.
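For reference, the manual step that could be made implicit looks roughly like this (a sketch; explain_prediction already receives the documents being explained, so it could run the fit itself):

from sklearn.feature_extraction.text import HashingVectorizer
from eli5.sklearn import InvertableHashingVectorizer

docs = ['hello world', 'foo bar baz']
vec = HashingVectorizer(n_features=2 ** 10)
ivec = InvertableHashingVectorizer(vec)
ivec.fit(docs)  # the step that explain_prediction could perform implicitly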
https://github.com/facebookresearch/fastText
There is a Python wrapper (https://github.com/salestock/fastText.py), but I'm not sure I like its API, and it sometimes segfaults for me. Time to create another wrapper? :)
The original LIME code uses linear regression; it'd be nice to have that here as well and compare the results.
We should use StandardScaler or RobustScaler to make the LIME implementation work better when features have different scales.
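A minimal sketch of the idea, not eli5's actual LIME code (the fake black-box target is made up for illustration):

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

rng = np.random.RandomState(42)
# neighbourhood samples around the example being explained;
# the second feature has a much larger scale than the first
X_local = np.column_stack([rng.normal(size=500),
                           rng.normal(scale=100.0, size=500)])
y_local = X_local[:, 0] + 0.01 * X_local[:, 1]  # stand-in for black-box output

scaler = StandardScaler()
local_model = Ridge().fit(scaler.fit_transform(X_local), y_local)
# coefficients are now comparable across features; divide by
# scaler.scale_ to map them back to the original units
print(local_model.coef_)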
Just like we have it for feature weights, this could show the number of remaining features with non-zero feature importance - it can still be lower than the total number of features, I think.
ivec.get_feature_names()[:100]
doesn't work for InvertableHashingVectorizer because FeatureName doesn't handle slice objects.
See also: http://stackoverflow.com/questions/13855288/turn-slice-into-range-in-python; there is also a nice comment there.
Does it help that in Python 3 you can slice a range object to get a new range object? Given that this feature is not super-important, we could decide to support it only in Python 3 :)
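A rough sketch of how slice support could look, using slice.indices() as in the Stack Overflow answers (simplified; the real class stores more than a plain list, and slice.indices actually works in Python 2 as well):

class FeatureNames(object):
    def __init__(self, names):
        self._names = names  # simplified: a plain list in this sketch

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            # slice.indices() clamps start/stop/step to the container length,
            # turning the slice into arguments for range()
            return [self._names[i] for i in range(*idx.indices(len(self._names)))]
        return self._names[idx]

fn = FeatureNames(['f%d' % i for i in range(1000)])
print(fn[:5])  # ['f0', 'f1', 'f2', 'f3', 'f4']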
We need proper documentation.
Currently only classification support is built in; it shouldn't be hard to add regression support.
Currently, in order to change formatting options in an IPython notebook, the user has to do something like this:
from IPython.display import HTML
from eli5 import explain_weights
from eli5.formatters import format_as_html

expl = explain_weights(clf, vec=fe, top=20)
HTML(format_as_html(expl, highlight_spaces=False, horizontal_layout=False))
It'd be nice to reduce this to a one-liner.
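One possible shape for such a one-liner - a hypothetical helper, not an existing eli5 function (reusing the imports above):

def show_weights(clf, **kwargs):
    # route formatter-only options to format_as_html, everything else to explain_weights
    format_kwargs = {k: kwargs.pop(k)
                     for k in ('highlight_spaces', 'horizontal_layout')
                     if k in kwargs}
    return HTML(format_as_html(explain_weights(clf, **kwargs), **format_kwargs))

show_weights(clf, vec=fe, top=20, highlight_spaces=False, horizontal_layout=False)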
When highlighting a feature, we can highlight it regardless of its length (the current behaviour in master), or try to preserve density, coloring longer features with a less intense color. I tried the second approach in the preserve-density branch; here are some screenshots with the master behaviour on top (links to notebooks: https://github.com/TeamHG-Memex/eli5/blob/preserve-density/notebooks/explain_text_prediction.ipynb for words and https://github.com/TeamHG-Memex/eli5/blob/preserve-density/notebooks/explain_text_prediction_char.ipynb for chars).
I tried to add tests for scikit-learn 0.17, but it turns out the compatibility shims in eli5.lime don't work - e.g. KFold has a different API. What do you think about dropping scikit-learn 0.17 support and supporting only 0.18.x? //cc @lopuhin
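For concreteness, the API difference in a nutshell - each variant only runs under its own scikit-learn version, which is what breaks the shims:

import numpy as np
X = np.zeros((100, 2))

# scikit-learn 0.17:
#     from sklearn.cross_validation import KFold
#     cv = KFold(n=len(X), n_folds=3)     # takes the number of samples
#     for train_idx, test_idx in cv: ...  # the object itself is iterable

# scikit-learn 0.18:
from sklearn.model_selection import KFold
cv = KFold(n_splits=3)                    # no sample count at construction
for train_idx, test_idx in cv.split(X):   # folds come from .split()
    pass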
See #21 (comment).
Oversampling is a common strategy for handling imbalanced datasets; we should take a look at how data is over-sampled there - maybe there are ideas to steal. See https://github.com/scikit-learn-contrib/imbalanced-learn.
These notebooks got broken after we switched to @attrs instead of using dicts for explanation results:
I think FeatureUnhasher.get_feature_names should have an option to use nan / None as feature names instead of generated FEATURE[%d] string names. Creating all these strings is the slowest part of this code, and it looks unnecessary, because printing/formatting code can easily generate the missing feature names itself.
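A toy illustration of the cost difference (not eli5 code): building every placeholder string up front versus leaving the gaps as None and formatting lazily.

n_features = 2 ** 20
eager = ['FEATURE[%d]' % i for i in range(n_features)]  # allocates ~1M strings up front
lazy = [None] * n_features  # cheap; formatting code fills in 'FEATURE[%d]' on demand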
I'm not sure what the best solution is here - always return the bias even with feature_re, or add an option?
Just adding features from .transformer_list, possibly with prefixes, should be enough.
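A rough sketch of the idea (assumes every sub-transformer implements get_feature_names; not the final implementation):

from sklearn.pipeline import FeatureUnion
from sklearn.feature_extraction.text import CountVectorizer

union = FeatureUnion([
    ('words', CountVectorizer()),
    ('chars', CountVectorizer(analyzer='char', ngram_range=(2, 2))),
])
union.fit(['abc def', 'def ghi'])

def union_feature_names(union):
    # prefix each sub-transformer's names with its name from .transformer_list
    return ['%s__%s' % (prefix, name)
            for prefix, transformer in union.transformer_list
            for name in transformer.get_feature_names()]

print(union_feature_names(union))  # ['words__abc', ..., 'chars__ab', ...]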
https://travis-ci.org/TeamHG-Memex/eli5/jobs/173112065 - I think this is the same failure I have already seen; I added random_state but it did not help:
=================================== FAILURES ===================================
________________________________ test_fit_proba ________________________________

    def test_fit_proba():
        X = np.array([
            [0.0, 0.8],
            [0.0, 0.5],
            [1.0, 0.1],
            [0.9, 0.2],
            [0.7, 0.3],
        ])
        y_proba = np.array([
            [0.0, 1.0],
            [0.1, 0.9],
            [1.0, 0.0],
            [0.55, 0.45],
            [0.4, 0.6],
        ])
        y_bin = y_proba.argmax(axis=1)
        # fit on binary labels
        clf = LogisticRegression(C=10, random_state=42)
        clf.fit(X, y_bin)
        y_pred = clf.predict_proba(X)[:,1]
        mae = mean_absolute_error(y_proba[:,1], y_pred)
        print(y_pred, mae)
        # fit on probabilities
        clf2 = LogisticRegression(C=10, random_state=42)
        fit_proba(clf2, X, y_proba, expand_factor=200)
        y_pred2 = clf2.predict_proba(X)[:,1]
        mae2 = mean_absolute_error(y_proba[:,1], y_pred2)
        print(y_pred2, mae2)
        assert mae2 * 1.2 < mae
        # let's get 3th example really right
        sample_weight = np.array([0.1, 0.1, 0.1, 10.0, 0.1])
        clf3 = LogisticRegression(C=10, random_state=42)
        fit_proba(clf3, X, y_proba, expand_factor=200, sample_weight=sample_weight)
        y_pred3 = clf3.predict_proba(X)[:,1]
        print(y_pred3)
        val = y_proba[3][1]
        assert abs(y_pred3[3] - val) * 1.5 < abs(y_pred2[3] - val)
>       assert abs(y_pred3[3] - val) * 1.5 < abs(y_pred[3] - val)
E       assert (0.077946544208881308 * 1.5) < 0.10327808741270417
E        +  where 0.077946544208881308 = abs((0.3720534557911187 - 0.45000000000000001))
E        +  and 0.10327808741270417 = abs((0.34672191258729584 - 0.45000000000000001))

tests/test_lime_utils.py:53: AssertionError
----------------------------- Captured stdout call -----------------------------
[ 0.92137462  0.87156298  0.26152978  0.34672191  0.49837953] 0.114698148448
[ 0.99854408  0.90620802  0.1122826   0.31398412  0.59140365] 0.0529117527887
[ 0.9862338   0.94839957  0.23016764  0.37205346  0.59652343]
The signs have been wrong since the beginning (#12).
I think it makes sense to add something like an asdict method to Explanation that will return a JSON-serializable object (it will just call attr.asdict(self)).
And also add a test that checks that it is indeed JSON-serializable (right now it can contain some numpy ints that are not serializable).
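A possible shape for it - just a sketch; the numpy handling is the part that needs care:

import json
import attr
import numpy as np

def asdict(explanation):
    # attr.asdict handles the recursion; the json round-trip coerces
    # numpy scalars buried deep in the structure via `default`
    def default(obj):
        if isinstance(obj, np.generic):  # numpy ints/floats -> Python scalars
            return obj.item()
        raise TypeError('not serializable: %r' % (obj,))
    return json.loads(json.dumps(attr.asdict(explanation), default=default))

# the test could then simply do:
# json.dumps(asdict(expl))  # must not raise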
On the screenshot, the letter 'd' in the word 'medication' has a -0.084 weight in the top example and a -0.079 weight in the bottom example, according to the titles displayed on mouse hover. But the bottom 'd' is brighter. Is it because the weight range is different? If so, does it make sense to use the same weight range for all classes?
Currently it is not possible to show an explanation of an individual sklearn_crfsuite.CRF prediction.
And maybe also some other functions? They are needed if we want to render weights in HTML similar to how it is done in the HTML formatter.
Another option would be to use an object instead of a (name, weight) tuple and add an hsl_color attribute to it. I'm not sure which is better; making the functions public feels less committing.
Hi guys, I really like this tool! I have a pipeline, say:
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline

mlb = MultiLabelBinarizer()
y_train = mlb.fit_transform(y_train)
vec = TfidfVectorizer(ngram_range=(1, 2), stop_words='english')
clf = OneVsRestClassifier(LogisticRegressionCV())
pipeline = make_pipeline(vec, clf)
pipeline.fit(X_train, y_train)
show_prediction works neatly, but I run into "'LogisticRegressionCV' object has no attribute 'classes_'" when calling eli5.show_weights(clf.estimator, vec=vec, target_names=mlb.classes_), or "unsupported class" if I use clf directly.
Is it possible to work around this problem, or do you plan to add support for this soon?
Cheers!
Simon
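A possible workaround until OneVsRestClassifier is supported directly: explain each fitted per-class estimator separately (a sketch reusing mlb, clf and vec from the snippet above; the pipeline fits clf in place, so clf.estimators_ is available after pipeline.fit):

import eli5
from IPython.display import display

for label, est in zip(mlb.classes_, clf.estimators_):
    # clf.estimators_ holds one fitted binary classifier per label
    print(label)
    display(eli5.show_weights(est, vec=vec))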
Sometimes it is useful to check coefficients only for some of the features. For example, here (scroll down to "What are important features?") one may want to check how e.g. query:... features affect the result, without looking at all the other features. This can also be helpful when adding a new feature.
What about adding a 'feature_re' or 'feature_patterns' argument to the explain_weights functions?
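The filtering itself is simple; a plain-Python sketch of what the proposed argument would do:

import re

feature_weights = [('query:foo', 1.2), ('url:bar', -0.3), ('query:baz', 0.7)]
feature_re = re.compile(r'^query:')
print([(name, w) for name, w in feature_weights if feature_re.search(name)])
# [('query:foo', 1.2), ('query:baz', 0.7)]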
Weight tables don't fit the column at http://eli5.readthedocs.io/en/latest/tutorials/sklearn_crfsuite.html, and there is no scrolling - the tables are just truncated.
I was able to fix it locally using custom CSS, but had to revert the fix (153ab4a) because on readthedocs.io the original stylesheet was not loaded for some reason.
A minor issue, but since targets are rendered in separate tables, their headers might get misaligned, for example here: http://eli5.readthedocs.io/en/latest/tutorials/sklearn-text.html
It should be easy to move header rendering out, to the top-level table.
The order in the text output is wrong:
$ py.test tests/test_sklearn_explain_prediction.py::test_explain_linear_regression[reg0] -s
============================================================================== test session starts ===============================================================================
platform darwin -- Python 3.5.1, pytest-3.0.2, py-1.4.31, pluggy-0.3.1
rootdir: /Users/kostia/shub/memex/eli5, inifile:
plugins: hypothesis-3.4.2
collected 25 items
tests/test_sklearn_explain_prediction.py {'estimator': 'ElasticNet(alpha=1.0, copy_X=True, fit_intercept=True, '
              'l1_ratio=0.5,\n'
              ' max_iter=1000, normalize=False, positive=False, '
              'precompute=False,\n'
              " random_state=42, selection='cyclic', tol=0.0001, "
              'warm_start=False)',
 'method': 'linear model',
 'targets': [{'feature_weights': {'neg': [('x10', -19.656206335733643),
                                          ('x12', -16.947217711388856),
                                          ('x9', -3.368443508747657),
                                          ('x7', -0.73147197826808674)],
                                  'neg_remaining': 0,
                                  'pos': [('<BIAS>', 38.96972344614295),
                                          ('x5', 6.8348858609128671),
                                          ('x11', 4.8082096167385444),
                                          ('x8', 1.8485323743243427),
                                          ('x0', 0.23929256935816867)],
                                  'pos_remaining': 0},
              'score': 11.997304333338633,
              'target': 'y'}]}
Explained as: linear model
'y' (score=11.997) top features
----------------
+38.970 <BIAS>
+6.835 x5
+4.808 x11
+1.849 x8
+0.239 x0
-19.656 x10
-16.947 x12
-3.368 x9
-0.731 x7
It might be possible to improve explain_prediction performance for a large number of items by vectorizing predictions and argmax calculations (see for example http://vmprof.com/#/d7e9c13357f50a5ee4d00a69200693f5?id=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0&view=flames). But it's not clear whether this is worth the additional complexity.
Because we include only weights, but there are no weights in this case, and it is not clear what happened.
A follow-up to #10 and #18: when deciding whether a feature should be among the top positive or top negative features, we should take into account the sign of the most popular term, e.g. instead of
(-)people | considered | approximately +1.739
(as it is now)
it would be better to show
people | (-)considered | (-)approximately -1.739
We need a helper to generate examples similar to a given example. Helpful links:
Scrolling in the notebook is noticeably laggy when text with highlighted features is in view. Maybe it would help if we merged spans with the same weight (at least in the case of word analyzers).
Currently, if there are several classes, weights for them are shown separately. I think it could sometimes be helpful to show them all in the same table (or even highlight them all in the text).
A widget might allow changing options, e.g.:
Currently they are tested only via an integration test which checks that they produce reasonable explanations when used with explain_weights. As the logic is not straightforward, it makes sense to add unit tests for them.