analogies_mining's Introduction

🎪 Life is a Circus and We are the Clowns 🤡: Automatically Finding Analogies between Situations and Processes

This repository contains the code for the paper: https://arxiv.org/abs/2210.12197.
Authors: Oren Sultan, Dafna Shahaf, The Hebrew University of Jerusalem, Israel.
Conference: The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022).

Setup

The code is implemented in Python 3.8.12. To run it, install the dependencies from the requirements file:

pip install -r minimalrequirements.txt

Where to start?

Explore the paper_experiments_results folder to reproduce the results of the experiments (each inner folder contains a separate README file).
Run runner.py to run our algorithm on a specific pair of texts.
Note that you don't need to run coreference and QA-SRL, as their output files already exist in the repo. Keep run_coref=False, run_qasrl=False in the analogous_matching_algorithm function unless you use new input text files; in that case, set both to True.
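As a rough illustration of the note above, the sketch below picks the two flags by checking whether a coreference output already exists for each input text. Only the folder name data/coref_text_files and the parameter names run_coref and run_qasrl come from this README. The helper name preprocessing_flags, the assumption that an output file shares its input's file name, and the choice to tie both flags together are hypothetical and not taken from the repository.

```python
from pathlib import Path
from typing import Dict, Iterable


def preprocessing_flags(
    text_names: Iterable[str],
    coref_dir: Path = Path("data/coref_text_files"),
) -> Dict[str, bool]:
    """Return run_coref / run_qasrl flags for a set of input text files.

    Assumption (not confirmed by the repo): a text has already been
    preprocessed if a file with the same name exists in `coref_dir`.
    """
    missing = [name for name in text_names if not (coref_dir / name).is_file()]
    needs_run = bool(missing)
    # Both flags are tied together for simplicity: QA-SRL runs on the
    # coreference output, so a new text needs both steps.
    return {"run_coref": needs_run, "run_qasrl": needs_run}
```

The returned dictionary could then be unpacked as keyword arguments when calling analogous_matching_algorithm, alongside whatever other arguments runner.py passes to it.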

Important folders

paper_experiments_results:
Contains the datasets, the annotators' labels, and the data used to generate the results in the figures and tables of the three experiments. Each inner folder contains a separate README file.

data:
Includes the following folders:
original_text_files -- all the original text files (including the stories and paragraphs from ProPara).
coref_text_files -- all the text files after coreference (including the stories and paragraphs from ProPara).
propara -- data files relevant to the ProPara dataset, output files of the ranking lists for the different models (see Section 4.1 in the paper), and some code files to read and print stats on ProPara and the methods.

s2e-coref:
Contains the implementation code for the coreference model that we used (see Section 3.1 in the paper).

qasrl-modeling:
Contains the implementation code for the QA-SRL model that we used (see Section 3.2 in the paper).

Important .py files

Algorithm's code files

runner.py -- runner of our analogous matching algorithm on given pairs.
find_mappings.py -- run the FMQ method on a given pair of texts (called externally through the generate_mappings function).
find_mappings_verbs.py -- run the FMV method on a given pair of texts (called externally through the generate_mappings function).
sentence_bert.py -- run SBERT on a given pair of texts.
coref.py -- run our coreference implementation on input files.
qa-srl.py -- run our QA-SRL implementation on text files (after coref).

Experiment's code files

run_propara_all_pairs_exp.py -- run experiment 1 (see Section 4.1 in the paper).
analogies_mining_exp_annotators_consistency.py -- compute the annotator-consistency confusion matrix (see Section 4.1 in the paper).
run_mappings_evaluation_exp.py -- run experiment 2 (see Section 4.2 in the paper).
run_robustness_to_paraphrases_exp.py -- run experiment 3 (see Section 4.3 in the paper).
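
For readers unfamiliar with an annotator-consistency confusion matrix, here is a minimal sketch of the idea: count how often two annotators gave each pair of labels to the same item. This is not the code in analogies_mining_exp_annotators_consistency.py; the function name, the label values, and the sample data are made up for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def annotator_confusion_matrix(
    labels_a: Sequence[str],
    labels_b: Sequence[str],
) -> Tuple[List[str], List[List[int]]]:
    """Cross-tabulate two annotators' labels for the same items.

    Returns the sorted label set and a matrix where entry [i][j] counts the
    items labelled labels[i] by annotator A and labels[j] by annotator B.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("Both annotators must label the same items")
    counts = Counter(zip(labels_a, labels_b))
    labels = sorted(set(labels_a) | set(labels_b))
    matrix = [[counts[(a, b)] for b in labels] for a in labels]
    return labels, matrix


# Hypothetical labels, invented for illustration only.
annotator_1 = ["analogy", "analogy", "not_analogy", "not_analogy", "analogy"]
annotator_2 = ["analogy", "not_analogy", "not_analogy", "not_analogy", "analogy"]
labels, matrix = annotator_confusion_matrix(annotator_1, annotator_2)

# Raw agreement is the share of items on the diagonal.
agreement = sum(matrix[i][i] for i in range(len(labels))) / len(annotator_1)
```

The diagonal of the matrix holds the items on which the two annotators agree, so off-diagonal counts show which label pairs are confused most often.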

Cite

@article{sultan2022life,
title={Life is a Circus and We are the Clowns: Automatically Finding Analogies between Situations and Processes},
author={Sultan, Oren and Shahaf, Dafna},
journal={arXiv preprint arXiv:2210.12197},
year={2022}
}

Contact

For inquiries, please send an email to [email protected].
