
SentArg: A Hybrid Doc2Vec/DPH Model with Sentiment Analysis Refinement

Argument retrieval model for Touché @ CLEF 2020 - 1st Shared Task on Argument Retrieval.

Prerequisites

  1. Download and extract the Args.me Corpus
  2. Put args-me.json and topics.xml in your input directory

Tira Runs

  1. $ docker pull mongo:4.4
  2. $ docker build -t argu .
  3. $ chmod +x ./tira_run.sh
  4. $ ./tira_run.sh -s <run-type> -i $inputDataset -o $outputDir
    • Run Types
      • no: No sentiments
      • emotional: Emotional is better
      • neutral: Neutral is better
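For example, a complete TIRA run without sentiment refinement could look like this (the input and output paths are placeholders):

  $ docker pull mongo:4.4
  $ docker build -t argu .
  $ chmod +x ./tira_run.sh
  $ ./tira_run.sh -s no -i /path/to/input -o /path/to/output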

Docker

  1. $ docker build -t argu .
    • Build the image
  2. $ docker run --name argu-mongo -p 27017:27017 -d --rm mongo
    • Starts a MongoDB container
  3. $ docker run -e RUN_TYPE=<run-type> -v <input-dir-path>:/input -v <output-dir-path>:/output --name argu --rm -it --network="host" argu
    • Runs the ArgU container
    • Run Types
      • no: No sentiments
      • emotional: Emotional is better
      • neutral: Neutral is better
    • The input directory must contain args-me.json and topics.xml
    • The output directory will receive the results as run.txt
  4. $ docker stop argu-mongo
    • Stops the MongoDB container; since it was started with --rm, it is removed automatically
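Putting it together, a full Docker run with the neutral sentiment setting could look like this (the host paths are placeholders):

  $ docker build -t argu .
  $ docker run --name argu-mongo -p 27017:27017 -d --rm mongo
  $ docker run -e RUN_TYPE=neutral -v /path/to/input:/input -v /path/to/output:/output --name argu --rm -it --network="host" argu
  $ docker stop argu-mongo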

Command Line

  1. $ pip install -r requirements.txt

  2. $ python argU/preprocessing/mongodb.py -i <input-dir-path>

    • Create a mapping (MongoDB ID <--> argument.id) and store it in MongoDB
    • Store the arguments under the new IDs in MongoDB
    • Clean the arguments and store them as train-arguments in MongoDB
    • Read the sentiments and store them in MongoDB
  3. $ python argU/preprocessing/trec.py -i <input-dir-path>

    • Create .trec files for Terrier (train and queries)
  4. $ python argU/indexing/a2v.py -f

    • Generate CBOW
    • Generate argument embeddings and store them into MongoDB
  5. Install and run Terrier (see Dockerfile)

    • Calculate DPH for queries
    • Copy the result file into resources
  6. $ python -m argU -d

    • Compare given queries with argument embeddings; store Top-N DESM scores into MongoDB
  7. Merge the DESM, Terrier, and sentiment scores to create the final scores (a rough sketch of the idea follows this list)

    1. $ python -m argU -m -s no -o <output-dir-path>
      • R1: No sentiments
    2. $ python -m argU -m -s emotional -o <output-dir-path>
      • R2: Emotional is better
    3. $ python -m argU -m -s neutral -o <output-dir-path>
      • R3: Neutral is better
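The exact merge is implemented in argU; purely as an illustration of the general idea, the sketch below combines min-max-normalized DESM and DPH scores in a weighted sum and shifts the result by a sentiment term. The weights, names, and sentiment handling are assumptions for illustration, not the repository's actual formula:

  def min_max(scores):
      """Min-max normalize {arg_id: score} into [0, 1]."""
      lo, hi = min(scores.values()), max(scores.values())
      span = (hi - lo) or 1.0
      return {k: (v - lo) / span for k, v in scores.items()}

  def merge_scores(desm, dph, sentiments, run_type="no", alpha=0.5, beta=0.1):
      # alpha weights DESM vs. DPH; beta weights the sentiment shift.
      # Both values are illustrative assumptions.
      desm, dph = min_max(desm), min_max(dph)
      merged = {}
      for arg_id in desm.keys() & dph.keys():
          score = alpha * desm[arg_id] + (1 - alpha) * dph[arg_id]
          s = sentiments.get(arg_id, 0.0)  # assumed emotionality in [0, 1]
          if run_type == "emotional":
              score += beta * s            # emotional is better
          elif run_type == "neutral":
              score += beta * (1.0 - s)    # neutral is better
          merged[arg_id] = score
      # Highest combined score first
      return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)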

Submodule Execution

  • For an individual module, cd into its directory and run $ python -m [module-name]

Relevant Information for Subtask (1)

Task

Decision-making processes, be it at the societal or at the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one's stance. Thus, technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. We invite you to participate in the first lab on Argument Retrieval at CLEF 2020, featuring two subtasks:

(1) retrieval in a focused argument collection to support argumentative conversations.

Subtask (1) targets argumentative conversations and is motivated by supporting users who search for arguments directly, e.g., to back up their stance. The task is to retrieve arguments for the 50 given topics, which cover a wide range of controversial issues, from the provided dataset, a focused crawl of content from online debate portals.

Data

Argument topics for subtask (1) and comparative questions for subtask (2) will be sent to each team via email upon completed registration. The topics will be provided as XML files.

Example topic for subtask (1):

<topic>
  <num>1</num>
  <title>Is climate change real?</title>
  <description>You read an opinion piece on how climate change is a hoax and disagree. Now you are looking for arguments supporting the claim that climate change is in fact real.</description>
  <narrative>Relevant arguments will support the given stance that climate change is real or attack a hoax side's argument.</narrative>
</topic>
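A minimal sketch for reading such a topic file with Python's standard library; the file name and the <num> tag are assumptions based on the example above:

  import xml.etree.ElementTree as ET

  # "topics.xml" and the tag names are assumptions based on the example.
  root = ET.parse("topics.xml").getroot()
  for topic in root.iter("topic"):
      num = topic.findtext("num", "").strip()
      title = topic.findtext("title", "").strip()
      print(num, title)  # e.g. "1 Is climate change real?"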

Document collections. To search for relevant arguments, you can build your own index based on the args-me dataset or, for simplicity, use the API of the search engine args.me.

Runs Submission

We encourage participants to use TIRA for their submissions to increase the replicability of the experiments. We provide a dedicated TIRA tutorial for Touché and are happy to walk you through it. You can also submit runs via email. In either case, we will review your submission promptly and provide feedback.

Runs may be either automatic or manual. An automatic run is made without any manual manipulation of the given topic titles; your run is automatic if you do not use the topics' description and narrative fields to develop your approach. A manual run is anything that is not an automatic run. Please indicate which of your runs are manual upon submission.

The submission format for both tasks will follow the standard TREC format:

qid Q0 doc rank score tag

With:

  • qid: The topic number.
  • Q0: Unused, should always be Q0.
  • doc: The document id returned by your system for the topic qid:
    • For subtask (1): Use the official args-me id.
  • rank: The rank the document is retrieved at.
  • score: The score (integer or floating point) that produced the ranking. Scores must be in descending (non-increasing) order, and tied scores must be handled deliberately: trec_eval sorts documents by their score values, not by your rank values.
  • tag: A tag that identifies your group and the method you used to produce the run.

The fields should be separated by whitespace. The column widths are not important, but all columns must be present, with some amount of whitespace between them.

An example run for task 1 is:

1 Q0 10113b57-2019-04-18T17:05:08Z-00001-000 1 17.89 myGroupMyMethod
1 Q0 100531be-2019-04-18T19:18:31Z-00000-000 2 16.43 myGroupMyMethod
1 Q0 10006689-2019-04-18T18:27:51Z-00000-000 3 16.42 myGroupMyMethod
...
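A minimal sketch of producing such a run file in Python, assuming the results for a topic are available as (document id, score) pairs; the file name and tag are placeholders:

  # Append one topic's ranking in TREC run format.
  def write_run(path, qid, results, tag="myGroupMyMethod"):
      # Sort by descending score; trec_eval orders by score, so ties
      # resolve identically however the rank column breaks them.
      ranked = sorted(results, key=lambda r: r[1], reverse=True)
      with open(path, "a") as f:
          for rank, (doc_id, score) in enumerate(ranked, start=1):
              f.write(f"{qid} Q0 {doc_id} {rank} {score:.2f} {tag}\n")

  write_run("run.txt", 1, [("10113b57-2019-04-18T17:05:08Z-00001-000", 17.89)])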

Contributors

luckyos-code, t4khosu, ossama833, dependabot[bot]
