Scripts to convert the Universal Anaphora format to jsonlines
- helper.py contains functions to convert between the UA and jsonlines formats
- preprocess.py contains functions to parse the annotation structure of UA documents
import helper
- UA to jsonlines (identity anaphora)
helper.convert_coref_ua_to_json(UA_PATH, JSON_PATH, MODEL="coref-hoi", SEGMENT_SIZE=512, TOKENIZER_NAME="bert-base-cased")
- jsonlines to UA (identity anaphora)
helper.convert_coref_json_to_ua(JSON_PATH, UA_PATH, MODEL="coref-hoi")
NOTE: Currently, these scripts only support conversion to and from the format used by models that use BERT/SpanBERT embeddings, e.g. coref-hoi.
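
For orientation, a jsonlines file stores one JSON document per line. The sketch below writes one toy record and reads it back, using the field names that BERT-based coreference models such as coref-hoi commonly expect (`doc_key`, `sentences`, `speakers`, `sentence_map`, `subtoken_map`, `clusters`). The field set, the toy values, and the helper names `write_jsonlines` and `read_jsonlines` are assumptions for illustration. They are not taken from helper.py, and the exact output of `helper.convert_coref_ua_to_json` may differ.

```python
import json
import os
import tempfile

# Toy record; field names follow the layout commonly used by BERT-based
# coreference models (an assumption, not verified against helper.py).
record = {
    "doc_key": "example_doc_0",
    # One list of subtokens per segment, with [CLS]/[SEP] markers.
    "sentences": [["[CLS]", "Mary", "saw", "her", "dog", ".", "[SEP]"]],
    "speakers": [["[SPL]", "A", "A", "A", "A", "A", "[SPL]"]],
    "sentence_map": [0, 0, 0, 0, 0, 0, 0],
    "subtoken_map": [0, 0, 1, 2, 3, 4, 4],
    # Each cluster is a list of [start, end] subtoken spans (inclusive).
    "clusters": [[[1, 1], [3, 3]]],
}


def write_jsonlines(path, documents):
    """Write one JSON object per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in documents:
            f.write(json.dumps(doc) + "\n")


def read_jsonlines(path):
    """Read a jsonlines file back into a list of dicts (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Round-trip the toy record through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.jsonlines")
    write_jsonlines(path, [record])
    loaded = read_jsonlines(path)
```

Reading a converted file this way is a quick sanity check that every document has one `subtoken_map` entry per subtoken before passing it to a model.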
- UA to jsonlines (bridging)
helper.convert_bridg_ua_to_json(UA_PATH, JSON_PATH, MODEL="dali_bridging")
- jsonlines to UA (bridging)
helper.convert_bridg_json_to_ua(JSON_PATH, UA_PATH, MODEL="dali-bridging")
NOTE: Currently, these scripts only support conversion to and from the format used by dali-bridging.
- Previous utterance baseline for discourse deixis (anaphors "this" and "that")
helper.discourse_deixis_baseline(IN_UA_PATH, PRED_UA_PATH, MODEL="previous-utterance")
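
The idea behind a previous-utterance baseline can be sketched in a few lines: every occurrence of "this" or "that" is linked to the utterance immediately before it. The function below is a minimal illustration over plain token lists. The name `previous_utterance_baseline`, its input and output shapes, and the toy dialogue are assumptions, not the implementation in helper.py, which reads and writes UA files.

```python
def previous_utterance_baseline(utterances, triggers=("this", "that")):
    """Link each trigger token to the whole preceding utterance.

    `utterances` is a list of token lists. Returns a list of
    (anaphor, antecedent) pairs, where the anaphor is
    (utterance_index, token_index) and the antecedent is the index
    of the previous utterance.
    """
    links = []
    for u_idx, tokens in enumerate(utterances):
        if u_idx == 0:
            continue  # the first utterance has no previous utterance
        for t_idx, token in enumerate(tokens):
            # Naive match: every "this"/"that" is treated as an anaphor,
            # including determiner and complementizer uses.
            if token.lower() in triggers:
                links.append(((u_idx, t_idx), u_idx - 1))
    return links


# Toy dialogue, invented for illustration.
dialogue = [
    ["We", "should", "move", "the", "engine", "to", "Avon", "."],
    ["That", "sounds", "good", "."],
    ["OK", ",", "let", "'s", "do", "this", "."],
]
links = previous_utterance_baseline(dialogue)
```

Because the sketch matches on surface form only, it also links non-anaphoric uses such as "that" in "I think that ..."; a real baseline would likely restrict itself to annotated or predicted anaphor mentions.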
Task | Model | AMI | LIGHT | Persuasion | Swbd | ARRAU (Trains91) |
---|---|---|---|---|---|---|
Identity Anaphora (CoNLL Avg. F1) | coref-hoi | TODO | TODO | TODO | TODO | TODO |
Bridging (Entity F1) | dali-bridging | TODO | TODO | TODO | TODO | TODO |
Discourse Deixis (CoNLL Avg. F1) | prev-utterance | TODO | TODO | TODO | TODO | TODO |
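
"CoNLL Avg. F1" conventionally refers to the unweighted mean of the MUC, B³, and CEAF-e F1 scores. The sketch below shows that arithmetic only. The function name `conll_avg_f1` is hypothetical, and the input values are placeholders rather than results for any model in the table.

```python
def conll_avg_f1(muc_f1, b_cubed_f1, ceaf_e_f1):
    """Unweighted mean of the three coreference F1 scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3.0


# Placeholder inputs for illustration only; not results from any model.
example = conll_avg_f1(60.0, 50.0, 40.0)
```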