Papers on Explainable Artificial Intelligence

This is an ongoing attempt to consolidate interesting efforts in the area of understanding / interpreting / explaining / visualizing a pre-trained ML model.


GUI tools

  • DeepVis: Deep Visualization Toolbox. Yosinski et al. 2015 code | pdf
  • SWAP: Generate adversarial poses of objects in a 3D space. Alcorn et al. 2018 code | pdf

Libraries

Surveys

  • Methods for Interpreting and Understanding Deep Neural Networks. Montavon et al. 2017 pdf
  • Visualizations of Deep Neural Networks in Computer Vision: A Survey. Seifert et al. 2017 pdf
  • How convolutional neural network see the world - A survey of convolutional neural network visualization methods. Qin et al. 2018 pdf
  • A brief survey of visualization methods for deep learning models from the perspective of Explainable AI. Chalkiadakis 2018 pdf
  • A Survey Of Methods For Explaining Black Box Models. Guidotti et al. 2018 pdf
  • Understanding Neural Networks via Feature Visualization: A survey. Nguyen et al. 2019 pdf

Definitions of Interpretability

  • The Mythos of Model Interpretability. Lipton 2016 pdf
  • Towards A Rigorous Science of Interpretable Machine Learning. Doshi-Velez & Kim. 2017 pdf
  • Interpretable machine learning: definitions, methods, and applications. Murdoch et al. 2019 pdf

Books

  • Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Molnar 2019 pdf

A. Explaining inner-workings

A1. Visualizing Preferred Stimuli

Synthesizing images / Activation Maximization

  • AM: Visualizing higher-layer features of a deep network. Erhan et al. 2009 pdf
  • DeepVis: Understanding Neural Networks through Deep Visualization. Yosinski et al. 2015 pdf | url
  • MFV: Multifaceted Feature Visualization: Uncovering the different types of features learned by each neuron in deep neural networks. Nguyen et al. 2016 pdf | code
  • DGN-AM: Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. Nguyen et al. 2016 pdf | code
  • PPGN: Plug and Play Generative Networks. Nguyen et al. 2017 pdf | code
  • Feature Visualization. Olah et al. 2017 url
  • Diverse feature visualizations reveal invariances in early layers of deep neural networks. Cadena et al. 2018 pdf

Real images / Segmentation Masks

  • Visualizing and Understanding Recurrent Networks. Karpathy et al. 2015 pdf
  • Object Detectors Emerge in Deep Scene CNNs. Zhou et al. 2015 pdf
  • Understanding Deep Architectures by Interpretable Visual Summaries pdf

A2. Inverting Neural Networks

  • Understanding Deep Image Representations by Inverting Them pdf
  • Inverting Visual Representations with Convolutional Networks pdf
  • Neural network inversion beyond gradient descent pdf

A3. Distilling DNNs into more interpretable models

  • Interpreting CNNs via Decision Trees pdf
  • Distilling a Neural Network Into a Soft Decision Tree pdf
  • Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation. Tan et al. 2018 pdf
  • Improving the Interpretability of Deep Neural Networks with Knowledge Distillation. Liu et al. 2018 pdf

A4. Quantitatively characterizing hidden features

  • TCAV: Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors. Kim et al. 2018 pdf | code
    • Automating Interpretability: Discovering and Testing Visual Concepts Learned by Neural Networks. Ghorbani et al. 2019 pdf
  • SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability. Raghu et al. 2017 pdf | code
  • A Peek Into the Hidden Layers of a Convolutional Neural Network Through a Factorization Lens. Saini et al. 2018 pdf
  • Network Dissection: Quantifying Interpretability of Deep Visual Representations. Bau et al. 2017 url | pdf
    • GAN Dissection: Visualizing and Understanding Generative Adversarial Networks. Bau et al. 2018 pdf
    • Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks. Fong & Vedaldi 2018 pdf

A5. Network surgery

  • How Important Is a Neuron? Dhamdhere et al. 2018 pdf

A6. Sensitivity analysis

  • NLIZE: A Perturbation-Driven Visual Interrogation Tool for Analyzing and Interpreting Natural Language Inference Models. Liu et al. 2018 pdf

B. Decision explanations

B1. Heatmaps

White-box / Gradient-based

  • A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations. Nie et al. 2018 pdf
  • A Taxonomy and Library for Visualizing Learned Features in Convolutional Neural Networks pdf
  • CAM: Learning Deep Features for Discriminative Localization. Zhou et al. 2016 code | web
  • Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Selvaraju et al. 2017 pdf
  • Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks. Chattopadhyay et al. 2017 pdf | code
  • LRP: Beyond saliency: understanding convolutional neural networks from saliency prediction on layer-wise relevance propagation pdf
    • DTD: Explaining NonLinear Classification Decisions With Deep Taylor Decomposition pdf
  • Regional Multi-scale Approach for Visually Pleasing Explanations of Deep Neural Networks. Seo et al. 2018 pdf
  • Interpretable Explanations of Black Boxes by Meaningful Perturbation. Fong et al. 2017 pdf
  • Integrated Gradients: Axiomatic Attribution for Deep Networks. Sundararajan et al. 2018 pdf | code
  • I-GOS: Visualizing Deep Networks by Optimizing with Integrated Gradients. Qi et al. 2019 pdf
  • Visual explanation by interpretation: Improving visual feedback capabilities of deep neural networks. Oramas et al. 2019 pdf

Black-box / Perturbation-based

  • RISE: Randomized Input Sampling for Explanation of Black-box Models. Petsiuk et al. 2018 pdf
  • LIME: Why Should I Trust You?: Explaining the Predictions of Any Classifier. Ribeiro et al. 2016 pdf | blog

Evaluating heatmaps

  • The (Un)reliability of saliency methods. Kindermans et al. 2018 pdf
  • Sanity Checks for Saliency Maps. Adebayo et al. 2018 pdf

B2. Learning to explain

  • Learning how to explain neural networks: PatternNet and PatternAttribution pdf
  • Deep Learning for Case-Based Reasoning through Prototypes pdf
  • Unsupervised Learning of Neural Networks to Explain Neural Networks pdf
  • Automated Rationale Generation: A Technique for Explainable AI and its Effects on Human Perceptions pdf
    • Rationalization: A Neural Machine Translation Approach to Generating Natural Language Explanations pdf
  • Towards robust interpretability with self-explaining neural networks. Alvarez-Melis and Jaakkola 2018 pdf

C. Counterfactual explanations (what would have happened)

  • Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections. Zhang et al. 2018 pdf

D. Unclassified

  • Yang, S. C. H., & Shafto, P. Explainable Artificial Intelligence via Bayesian Teaching. NIPS 2017 pdf
  • Explainable AI for Designers: A Human-Centered Perspective on Mixed-Initiative Co-Creation pdf
  • ICADx: Interpretable computer aided diagnosis of breast masses. Kim et al. 2018 pdf
  • Neural Network Interpretation via Fine Grained Textual Summarization. Guo et al. 2018 pdf
  • LS-Tree: Model Interpretation When the Data Are Linguistic. Chen et al. 2019 pdf
