Giter Club home page Giter Club logo

factcheck-acl2021's Introduction

Code for the fact-check rationalization paper @ ACL 2021.

Paper:

Structurizing Misinformation Stories via Rationalizing Fact-Checks
Shan Jiang, Christo Wilson
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2021
Paper available at: https://shanjiang.me/publications/acl21_paper.pdf

Contact:

Shan Jiang ([email protected])

General instructions.

Install required dependencies:

pip install -r requirements.txt

Download and process data following README.md in [DATA_NAME] folder:

cd data/[DATA_NAME]

Train models or analyze rationales with run.py:

python rationalize/run.py --mode=[MODE] --data_name=[DATA_NAME] --config_name=[CONFIG_NAME]

[MODE]:

  • train: train a model.
  • evaluate: evaluate a model.
  • output: output rationales.
  • binarize: binarize rationales to 0/1 (soft rationalization only).
  • vectorize: generate vectors/embeddings for rationales.
  • cluster: cluster rationales and plot figures.

[DATA_NAME]:

  • movie_reviews: the dataset of movie reviews.
  • personal_attacks: the dataset of fact-checks.
  • fact-checks: the dataset of fact-checks.
  • glove: pretrained GloVe embeddings.

[CONFIG_NAME]:

  • e.g., soft_rationalizer or any .config files in [DATA_NAME] folder.

Instructions for replicating results in the paper.

Replicating results for Table 1.

Here is the instruction to replicate the movie_reviews column of Table 1. To replicate another column simply replace movie_reviews to personal_attacks in all the command lines.

First make sure that the dataset and embeddings are prepared:

cd data/movie_reviews
./prepare_data.sh
cd ../glove
./prepare_data.sh
cd ../..

Then, run the following command, each line corresponds to an experiment from h0-h3 and s0-s1:

python rationalize/run.py --mode=train --data_name=movie_reviews --config_name=hard_rationalizer          # h0
python rationalize/run.py --mode=train --data_name=movie_reviews --config_name=hard_rationalizer_w_domain # h1
python rationalize/run.py --mode=train --data_name=movie_reviews --config_name=hard_rationalizer_wo_regu  # h2
python rationalize/run.py --mode=train --data_name=movie_reviews --config_name=hard_rationalizer_w_anti   # h3
python rationalize/run.py --mode=train --data_name=movie_reviews --config_name=soft_rationalizer          # s0
python rationalize/run.py --mode=train --data_name=movie_reviews --config_name=soft_rationalizer_w_domain # s1

To replicate the results for s2-s3, run:

python rationalize/run.py --mode=output --data_name=movie_reviews --config_name=soft_rationalizer_w_domain
python rationalize/run.py --mode=binarize --data_name=movie_reviews --config_name=soft_rationalizer_w_domain

Replicating results for Figures 3-5.

We have logged data to plot Figures 3-5.

To plot Figure 3, run:

python rationalize/run.py --mode=cluster --data_name=fact-checks --config_name=soft_rationalizer_w_domain

The results can be found in data/fact-checks/soft_rationalizer_w_domain.cluster.

To plot Figures 4 and 5, run:

cd data/fact-checks
python result_visualizer.py

The results can be found in data/fact-checks/soft_rationalizer_w_domain.results.

If you would like to train the model from scratch, run the following command in sequence.

cd data/fact-checks
python data_downloader.py     # Download fact-checks.
python data_extractor.py      # Extract text from HTML.
python data_cleaner.py        # Clean fact-checks.
python data_word2vec.py       # Build word2vec.
cd ../..
python rationalize/run.py --mode=train --data_name=fact-checks --config_name=soft_rationalizer_w_domain
python rationalize/run.py --mode=output --data_name=fact-checks --config_name=soft_rationalizer_w_domain
python rationalize/run.py --mode=vectorize --data_name=fact-checks --config_name=soft_rationalizer_w_domain
cd data/fact-checks
python rationale_filterer.py  # Filter vectors.
cd ../..
python rationalize/run.py --mode=cluster --data_name=fact-checks --config_name=soft_rationalizer_w_domain
cd data/fact-checks
python rationale_mapper.py    # Map rationales.
python result_visualizer.py   # Plot results.

factcheck-acl2021's People

Contributors

dependabot[bot] avatar printfoo avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.