sfschouten / court-of-xai

11.0 11.0 4.0 50.2 MB

Court of XAI - A Python library for the systematic comparison of feature additive explanation methods.

License: MIT License

Python 63.89% Jsonnet 36.11%

court-of-xai's People

Contributors

Stargazers

Watchers

Forkers

michaeljneely a-lucic ripankundu

court-of-xai's Issues

nan-loss on entmax with Quora dataset.

It seems the entmax function can't handle an alpha <= 1, but it still returns gradients that allow alpha to reach that range during training.
This causes the loss to become NaN.
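
One possible way to keep alpha in the valid range is to learn an unconstrained parameter and map it to a value above 1 before passing it to entmax. The sketch below is not the project's fix; the names `constrained_alpha`, `raw`, and `margin` are assumptions made for illustration, and it uses plain floats so it runs without PyTorch (with tensors, `torch.nn.functional.softplus` would play the same role).

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); never negative."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def constrained_alpha(raw: float, margin: float = 1e-2) -> float:
    """Map an unconstrained parameter to an alpha of at least 1 + margin.

    `raw` (hypothetical name) is what the optimizer would update. The value
    handed to entmax stays above 1 no matter how far a gradient step moves raw.
    """
    return 1.0 + margin + softplus(raw)


# Even a very negative raw parameter yields an alpha above 1.
for raw in (-50.0, 0.0, 3.0):
    print(raw, constrained_alpha(raw))
```

The margin is a design choice: it keeps alpha a small distance away from 1 so the mapping never lands exactly on the problematic boundary.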

Display progress of interpretation

Use tqdm to show how far along the interpretation is for each SaliencyInterpreter, so we have some idea of how much longer it will take.
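
A minimal sketch of what this could look like, assuming each interpreter can be treated as a callable applied to one batch at a time. The function `interpret_all` and the toy interpreters are hypothetical stand-ins, not the project's SaliencyInterpreter API; only `tqdm(iterable, desc=..., unit=...)` is the real library call.

```python
try:
    from tqdm import tqdm
except ImportError:  # fallback so the sketch still runs without tqdm installed
    def tqdm(iterable, **kwargs):
        return iterable


def interpret_all(interpreters, batches):
    """Run every interpreter over every batch, with one progress bar each.

    `interpreters` maps a display name to a callable taking a batch; both are
    hypothetical stand-ins for the project's SaliencyInterpreter objects.
    """
    results = {}
    for name, interpret in interpreters.items():
        results[name] = [
            interpret(batch)
            for batch in tqdm(batches, desc=name, unit="batch")
        ]
    return results


toy_batches = [[1, 2], [3, 4], [5, 6]]
toy_interpreters = {"sum": sum, "max": max}
print(interpret_all(toy_interpreters, toy_batches))
```

Because `batches` is a list, tqdm can read its length and show a percentage and a time estimate per interpreter.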

Instantiate Evaluator from JsonNet file.

Allow for specifying:
- directory of trained model to experiment on;
- which interpreters to include and/or which combinations of interpreters to include;
- whether to run experiments on cuda;
- batch size to use during experiment;
Rename Evaluator to AttentionExperiment[er] to avoid confusion with the evaluate command in AllenNLP.
[Optional] Add an attn_experiment command that starts the experiment from the command line.

We can use Params.from_file and FromParams.from_params like it is done in allennlp/commands/train.py
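
To illustrate the pattern without depending on AllenNLP, here is a dependency-free sketch of a from_params-style constructor. Everything in it is an assumption: the class name follows the rename proposed above, the field names merely mirror the listed options, and the standard-library `json` module stands in for `Params.from_file` (plain JSON is also valid Jsonnet).

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AttentionExperiment:
    # All field names are hypothetical; they mirror the options listed above.
    serialization_dir: str
    interpreters: List[str] = field(default_factory=list)
    cuda: bool = False
    batch_size: int = 32

    @classmethod
    def from_params(cls, params: Dict[str, Any]) -> "AttentionExperiment":
        """Build an experiment from a config dict, rejecting unknown keys."""
        unknown = set(params) - set(cls.__dataclass_fields__)
        if unknown:
            raise ValueError(f"Unknown configuration keys: {sorted(unknown)}")
        return cls(**params)


# Example configuration; the values are placeholders.
config_text = """
{
    "serialization_dir": "outputs/my_model",
    "interpreters": ["attention", "lime"],
    "cuda": false,
    "batch_size": 16
}
"""
experiment = AttentionExperiment.from_params(json.loads(config_text))
print(experiment)
```

Rejecting unknown keys up front means a misspelled option fails loudly instead of silently falling back to a default.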

Upgrade AllenNLP Version

AllenNLP 2.0+ introduces some breaking changes we will need to address to upgrade.

LIME on DistilBERT

The LIME code fails when a datareader concatenates the sequences for DistilBERT. We convert instances back to text, but the concatenated sequence becomes a single string, whereas the datareader expects two separate string sequences to make an Instance.

There's a way to fix this that would also make the code nicer. The implementation of LIME we use assumes unprocessed text, which is why we convert our tokenized Instances back to strings. Instead, we should replace some of the LIME code so that we replace tokens with UNKs ourselves; that way the datareader wouldn't be involved anymore.

As an added benefit, this would eliminate (possible) mistakes occurring when converting back to raw text and then re-tokenizing. I suspect this is introducing at least some weirdness.
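
The token-level replacement could be sketched roughly as follows. This is an illustration only: `perturb_pair` is a hypothetical helper, `"[UNK]"` is an assumed placeholder for the vocabulary's unknown token, and each token is kept or masked by an independent coin flip, which is simpler than the sampling scheme in the LIME library.

```python
import random
from typing import List, Tuple

UNK = "[UNK]"  # assumed placeholder; use whatever unknown token the vocabulary defines


def perturb_pair(
    first: List[str],
    second: List[str],
    num_samples: int,
    rng: random.Random,
) -> List[Tuple[List[int], List[str], List[str]]]:
    """Create LIME-style perturbations of a tokenized sequence pair.

    Each sample is (mask, perturbed_first, perturbed_second), where mask has
    one entry per token across both sequences: 1 = kept, 0 = replaced by UNK.
    The two sequences are never joined into a single string.
    """
    samples = []
    total = len(first) + len(second)
    for _ in range(num_samples):
        mask = [rng.randint(0, 1) for _ in range(total)]
        tokens = [t if keep else UNK for t, keep in zip(first + second, mask)]
        samples.append((mask, tokens[: len(first)], tokens[len(first):]))
    return samples


rng = random.Random(0)
for mask, a, b in perturb_pair(["how", "are", "you"], ["are", "you", "ok"], 3, rng):
    print(mask, a, b)
```

Since the perturbed sequences stay tokenized and separate, they could be fed to the model directly, with no round trip through raw text.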

Add Machine Translation Task

Attention was originally proposed in the context of Machine Translation, so it makes sense to include it in our list of tasks.

Components:

Select dataset
Add Recurrent seq2seq model
Add Transformer seq2seq model

Use Captum Leave One Out Implementation

Our implementation doesn't work on the pair sequence classifier.
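
For reference, the quantity a leave-one-out method computes on a sequence pair can be sketched in a few lines. This sketch does not call Captum and is not the project's implementation: it deletes tokens outright, `leave_one_out` is a hypothetical helper, and `toy_overlap_score` is a made-up stand-in for a pair sequence classifier.

```python
from typing import Callable, List, Tuple


def leave_one_out(
    score: Callable[[List[str], List[str]], float],
    first: List[str],
    second: List[str],
) -> Tuple[List[float], List[float]]:
    """Attribute a pair score to each token by removing that token alone.

    The attribution of a token is the drop in score when only that token is
    removed; the other sequence is passed through untouched.
    """
    base = score(first, second)
    first_attr = [
        base - score(first[:i] + first[i + 1:], second)
        for i in range(len(first))
    ]
    second_attr = [
        base - score(first, second[:j] + second[j + 1:])
        for j in range(len(second))
    ]
    return first_attr, second_attr


def toy_overlap_score(first: List[str], second: List[str]) -> float:
    """Made-up model: counts the distinct tokens shared by both sequences."""
    return float(len(set(first) & set(second)))


print(leave_one_out(toy_overlap_score, ["how", "are", "you"], ["are", "you", "ok"]))
# → ([0.0, 1.0, 1.0], [1.0, 1.0, 0.0])
```

The shared tokens receive attribution 1.0 and the unshared ones 0.0, which is the behaviour one would want any replacement implementation to reproduce on both sequences of a pair.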
