Implementation of the translation pipeline, automatic sampling and scoring, human evaluation, and experiments for our NLP4ConvAI@ACL2023 paper, IE-SemParse: Evaluating Inter-Bilingual Semantic Parsing for Indian Languages. To explore the dataset online, visit the dataset page.
Below are the details about the IE_SemParse datasets and the scripts for reproducing the results reported in the NLP4ConvAI@ACL2023 paper.
In this paper we propose a novel task, inter-bilingual semantic parsing, in which the utterance is in an Indic language and the model must generate a logical form with English slot values.
- Approach A: Translate to English then parse to logical form.
- Approach B: Separate parser and dialogue manager for each language.
- Approach C: Inter-bilingual Semantic Parsing.
Inter-bilingual semantic parsing is a good middle ground: it enhances the model's multilingual semantic parsing ability while reducing system latency and redundancy.
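To make the task format concrete, here is a minimal sketch of what an inter-bilingual example could look like. The Hindi utterance, the intent and slot names, and the `extract_slots` helper are all illustrative assumptions and are not taken from the released datasets. The logical form uses the bracketed TOP-style notation, and the regular expression only handles flat (non-nested) slots.

```python
import re

# Hypothetical example for illustration only; not a row from IE-SemParse.
example = {
    # Hindi: "How is the weather in New York"
    "utterance": "न्यूयॉर्क में मौसम कैसा है",
    # TOP-style logical form whose slot value stays in English
    "logical_form": "[IN:GET_WEATHER [SL:LOCATION New York ] ]",
}


def extract_slots(logical_form):
    """Return (slot_name, slot_value) pairs from a flat TOP-style string.

    Assumption: slots are not nested inside each other; nested slots
    would need a proper bracket parser instead of a regex.
    """
    pairs = re.findall(r"\[SL:(\w+)\s+([^\[\]]+)\]", logical_form)
    return [(name, value.strip()) for name, value in pairs]


slots = extract_slots(example["logical_form"])

# The utterance is in an Indic script, while every slot value is plain English.
slot_values_are_english = all(value.isascii() for _, value in slots)
utterance_is_english = example["utterance"].isascii()
```

In this sketch, `slots` holds the single `LOCATION` slot with its English value, while the utterance itself is not ASCII, which is the mismatch the task asks the model to bridge in one step.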
The code is in the translation_notebooks folder, which contains 2 notebooks: dataset_translation.ipynb and post_processing.ipynb.
The datasets IE-mTOP, IE-multilingualTOP and IE-multiATIS++ are available on Hugging Face Datasets; see the dataset page.
We experiment with 4 train-test strategies, which are described in more detail in the paper.
To run all experiments in your setup, run the following:
bash setup.
bash run_tests.sh
All the analysis notebooks are in the analysis_notebooks folder.
All the datasets created as part of this work will be released under a CC-0 license, and all models and code will be released under an MIT license.
@misc{aggarwal2023evaluating,
title={Evaluating Inter-Bilingual Semantic Parsing for Indian Languages},
author={Divyanshu Aggarwal and Vivek Gupta and Anoop Kunchukuttan},
year={2023},
eprint={2304.13005},
archivePrefix={arXiv},
primaryClass={cs.CL}
}