# Awesome Interpretable Machine Learning

An opinionated list of resources facilitating model interpretability (introspection, simplification, visualization, explanation).
## Interpretable Models
- Simple decision trees
- Rules
  - Interpretable Classifiers Using Rules and Bayesian Analysis: Building a Better Stroke Prediction Model https://arxiv.org/pdf/1511.01644.pdf
- (Regularized) linear regression
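To make concrete why a small tree counts as interpretable, its entire decision logic can be printed as rules. A minimal sketch using scikit-learn (the dataset and feature names here are illustrative, not tied to any entry above):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a deliberately shallow tree so the rule list stays human-readable.
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text renders the fitted tree as nested if/else rules.
rules = export_text(tree, feature_names=["sepal_len", "sepal_wid",
                                         "petal_len", "petal_wid"])
print(rules)
```

The printed rules are the model: every prediction can be traced to a short chain of threshold tests.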
## Models Offering Feature Ranking
- Random Forest
- Boosted Trees
- Linear regression (with a grain of salt)
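As a concrete example of the feature ranking these models offer, here is a minimal sketch reading impurity-based importances off a random forest (scikit-learn assumed; synthetic data, so column indices are illustrative). The grain-of-salt caveat applies here too: impurity-based scores are biased toward high-cardinality features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data where only the first three features are informative
# (shuffle=False keeps the informative features in columns 0..2).
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances: one non-negative score per feature, summing to 1.
ranking = np.argsort(forest.feature_importances_)[::-1]
print(ranking)
```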
## Good Old Feature Selection
- Filters
- Wrappers
- Embedded methods
- Varia
  - Model Class Reliance: Variable Importance Measures for any Machine Learning Model Class, from the “Rashomon” Perspective
    - Universal (model agnostic) variable importance measures
    - https://arxiv.org/pdf/1801.01489
    - https://github.com/aaronjfisher/mcr
  - Feature Engineering and Selection by Kuhn & Johnson
    - Slightly off-topic, but a very interesting book
    - http://www.feat.engineering/index.html
    - https://bookdown.org/max/FES/
    - https://github.com/topepo/FES
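The model-agnostic variable-importance measures above share a simple core idea: permute one column and measure how much the score drops. A hedged toy sketch of that idea (my own minimal implementation, not the papers' estimators; `sklearn.inspection.permutation_importance` is a production alternative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

def permutation_importance(model, X, y, score, n_repeats=5, seed=0):
    """Importance of column j = drop in score when column j is shuffled."""
    rng = np.random.default_rng(seed)
    base = score(y, model.predict(X))
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the link between column j and y
            drops.append(base - score(y, model.predict(Xp)))
        imp[j] = np.mean(drops)
    return imp

# Noise-free data where only the first two features matter (shuffle=False).
X, y = make_regression(n_samples=300, n_features=4, n_informative=2,
                       shuffle=False, random_state=0)
model = LinearRegression().fit(X, y)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

imp = permutation_importance(model, X, y, r2)
print(imp)
```

Because the measure only needs `predict` and a score, it works for any model class, which is exactly the setting the Model Class Reliance paper formalizes.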
## Model Explanations
### Philosophy
- Magnets by R. P. Feynman https://www.youtube.com/watch?v=wMFPe-DwULM
- The Mythos of Model Interpretability
- The Promise and Peril of Human Evaluation for Model Interpretability
- Towards Rigorous Science of Model Interpretability https://arxiv.org/pdf/1702.08608
- The Book of Why: The New Science of Cause and Effect by Judea Pearl
- Looking Inside the Black Box, presentation by Leo Breiman
### Model Agnostic Explanations
- LIME (Local Interpretable Model-agnostic Explanations)
- SHAP (SHapley Additive exPlanations), generalizing LIME
- Anchors: High-Precision Model-Agnostic Explanations, another improvement over LIME
- Explanations of Model Predictions with live and breakDown Package
- Model Explanation System by Ryan Turner
- Understanding Black-box Predictions via Influence Functions
- Interpretable Machine Learning: A Guide for Making Black Box Models Explainable by Christoph Molnar (a review book)
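The recipe shared by LIME and its descendants can be sketched in a few lines: perturb the instance, weight the samples by proximity, and fit a weighted linear surrogate to the black box's output. An illustrative toy (not the actual lime/shap packages; kernel width and sampling scheme are simplified assumptions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

# A black-box model we want to explain locally.
X, y = make_classification(n_samples=400, n_features=5, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

def local_surrogate(predict_proba, x, n_samples=500, width=1.0, seed=0):
    """LIME-style sketch: sample around x, weight by a Gaussian proximity
    kernel, fit a weighted linear model to the black box's probabilities."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=width, size=(n_samples, x.size))
    weights = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * width ** 2))
    target = predict_proba(Z)[:, 1]
    surrogate = Ridge(alpha=1.0).fit(Z, target, sample_weight=weights)
    return surrogate.coef_  # local, linear feature contributions

coefs = local_surrogate(black_box.predict_proba, X[0])
print(coefs)
```

The surrogate's coefficients are only valid near the explained instance; the methods above differ mainly in how they sample, weight, and regularize this fit.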
### Model Specific Explanations - Neural Networks
- Visualizing and Understanding Convolutional Networks
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Understanding Neural Networks Through Deep Visualization
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
- Generating Visual Explanations
- Rationalizing Neural Predictions
- Pixel entropy can be used to detect relevant picture regions (for ConvNets)
  - High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks
    - See the Visualization section and Fig. 5 of the paper
- SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability
- Visual Explanation by Interpretation: Improving Visual Feedback Capabilities of Deep Neural Networks
- Axiomatic Attribution for Deep Networks
  - Proposes the Integrated Gradients method
  - https://arxiv.org/pdf/1703.01365.pdf
  - Code: https://github.com/ankurtaly/Integrated-Gradients
  - See also: Gradients of Counterfactuals https://arxiv.org/pdf/1611.02639.pdf
- Learning Important Features Through Propagating Activation Differences
  - Proposes the DeepLIFT method
  - https://arxiv.org/pdf/1704.02685.pdf
  - Code: https://github.com/kundajelab/deeplift
  - Videos: https://www.youtube.com/playlist?list=PLJLjQOkqSRTP3cLB2cOOi_bQFw6KPGKML
- The (Un)reliability of saliency methods
  - Reviews failure modes of methods that extract the pixels most important for a prediction
  - https://arxiv.org/pdf/1711.00867.pdf
- Classifier-agnostic Saliency Map Extraction
- The Building Blocks of Interpretability
  - https://distill.pub/2018/building-blocks
  - Has some embedded links to notebooks
  - Uses the Lucid library https://github.com/tensorflow/lucid
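Several attribution methods above, Integrated Gradients in particular, reduce to averaging gradients along a straight path from a baseline to the input. A self-contained NumPy sketch on a toy function with a known gradient (the real method plugs in a network's gradient here); the completeness axiom, attributions summing to f(x) − f(baseline), can then be checked directly:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Integrated Gradients sketch: average the gradient at points along
    the path baseline -> x, then scale by (x - baseline)."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule on [0, 1]
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy differentiable "model" f(x) = sum(x_i^2), with analytic gradient 2x.
f = lambda x: np.sum(x ** 2)
grad_f = lambda x: 2 * x

x = np.array([1.0, 2.0, -1.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(grad_f, x, baseline)
print(attr, attr.sum(), f(x) - f(baseline))
```

For this toy function the attribution of coordinate i works out to x_i², so the attributions sum exactly to f(x) − f(baseline), which is the axiom the paper builds on.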
### Extracting Interpretable Models From Complex Ones
- Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples
- Distilling a Neural Network Into a Soft Decision Tree
## Model Visualization
- Visualizing Statistical Models: Removing the blindfold
- Partial dependence plots
  - http://scikit-learn.org/stable/auto_examples/ensemble/plot_partial_dependence.html
  - pdp: An R Package for Constructing Partial Dependence Plots https://journal.r-project.org/archive/2017/RJ-2017-016/RJ-2017-016.pdf https://cran.r-project.org/web/packages/pdp/index.html
- ggfortify: Unified Interface to Visualize Statistical Results of Popular R Packages
- RandomForestExplainer
- ggRandomForest
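The partial dependence plot behind several of these tools is easy to compute by brute force: fix one feature at each grid value for every row and average the model's predictions. A minimal sketch (scikit-learn assumed for the model and data; `sklearn.inspection.partial_dependence` does this properly, with shortcuts for tree ensembles):

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_friedman1(n_samples=300, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

def partial_dependence(model, X, feature, grid):
    """For each grid value, set the chosen feature to that value for ALL
    rows and average the predictions (brute-force partial dependence)."""
    pd_values = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd_values.append(model.predict(Xv).mean())
    return np.array(pd_values)

# Feature 3 enters the Friedman #1 target linearly, so the curve should rise.
grid = np.linspace(X[:, 3].min(), X[:, 3].max(), 20)
pd_curve = partial_dependence(model, X, 3, grid)
print(pd_curve)
```

Plotting `pd_curve` against `grid` gives the familiar one-feature partial dependence plot; averaging over the data is what marginalizes out the other features.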
## Selected Review Talks
- Tutorial on Interpretable machine learning at ICML 2017
- P. Biecek, Show Me Your Model tools for visualisation of statistical models
- S. Ritchie, Just-So Stories of AI
- C. Jarmul, Towards Interpretable Accountable Models
- I. Ozsvald, Machine Learning Libraries You’d Wish You’d Known About
  - A large part of the talk covers model explanation and visualization
  - Video: https://www.youtube.com/watch?v=nDF7_8FOhpI
  - Associated notebook on explaining regression predictions: https://github.com/ianozsvald/data_science_delivered/blob/master/ml_explain_regression_prediction.ipynb
## Venues
- Interpretable ML Symposium (NIPS 2017) (contains links to papers, slides and videos)
  - http://interpretable.ml/
  - Debate: Interpretability is necessary in machine learning
- 2017 Workshop on Human Interpretability in Machine Learning (WHI) (in conjunction with ICML 2017) (contains links to papers and slides)
## Software
Software related to papers is mentioned along with each publication. Here only standalone software is included.
- ELI5 - Python package dedicated to debugging machine learning classifiers and explaining their predictions
- yellowbrick - visual analysis and diagnostic tools to facilitate machine learning model selection
- lime - R package implementing LIME
- forestmodel - R package visualizing coefficients of different models with the so-called forest plot
- DALEX - Descriptive mAchine Learning EXplanations
- Lucid - a collection of infrastructure and tools for research in neural network interpretability