vt-nlp / event_query_extract
License: MIT License
Dear authors:
I am glad to learn about your work. I have a question about the argument detection dataset. In run_arg_detection.py, the training dataset (batch data) used in the code is different from the dataset in run_trigger_detection.py, but I only found save_dataset.py, which saves the trigger detection dataset. Could you tell me how I can store the argument detection dataset in a .pt file? Thank you for your work!
Dear authors:
Hello! I enjoyed reading your paper. I noticed that the .pt file imported by the argument extraction code should be different from the .pt file imported by the event detection code, but save_dataset.py only contains code for saving the event detection dataset. Could you tell me how to save the argument detection data to a .pt file?
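For what it's worth, a .pt file is just the result of torch.save on an arbitrary Python object, so one plausible workaround (this is a sketch, not the authors' code, and the key names below are assumptions rather than the repository's actual schema) is to bundle the argument-detection tensors into a dict and serialize it the same way save_dataset.py does for triggers:

```python
import torch

# Hypothetical sketch: collect whatever tensors the argument-detection
# dataloader needs into one dict and serialize it with torch.save.
# The keys below are illustrative, not the repository's actual schema.
arg_dataset = {
    "input_ids": torch.tensor([[101, 2054, 102]]),
    "arg_labels": torch.tensor([[0, 3, 0]]),
}
torch.save(arg_dataset, "arg_detection_dataset.pt")

# Loading it back mirrors how run_arg_detection.py would read a .pt file.
restored = torch.load("arg_detection_dataset.pt")
print(sorted(restored.keys()))  # ['arg_labels', 'input_ids']
```

Any picklable object works the same way, so matching the 16-value batch layout that run_arg_detection.py unpacks should only require adding the corresponding entries to the dict.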
I am very interested in the paper and have read it several times, hoping to reproduce it, but the dataset and data preprocessing keep confusing me. After processing raw_data with ACE_ERE_Scripts and placing the resulting .csv files into ./data/ace_en/processed_data and ./data/ere_en/processed_data respectively, what other processing is needed before running bash setup.sh?
Dear Author:
When I use the file you provided to process the data, I get this error:
Traceback (most recent call last):
File "./preprocess/save_dataset.py", line 305, in <module>
read_from_source(args)
File "./preprocess/save_dataset.py", line 289, in read_from_source
raw_data = read_data_from(path, config.tokenizer, ace=args.ace)
File "F:\Event_Query_Extract-main.\utils\data_to_dataloader.py", line 120, in read_data_from
output = list(map(partial(unpack_ace_with_vt, tokenizer=tokenizer), output))
File "F:\Event_Query_Extract-main.\utils\data_to_dataloader.py", line 60, in _unpack_ace_with_vt
entities = data[-5].split(' ')
IndexError: list index out of range
I checked the code and guessed at the possible cause of the problem.
After running python preprocess/ace/read_args_with_entity_ace.py, I found that the content of the output file is as follows:
'Text'
'B-I-O TAG'
'xxxxxxxxxxx.csv'
This seems to contradict the function _unpack_ace_with_vt.
Where are the Time and Value arguments, and where is the POS tag?
What should I do?
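The IndexError is consistent with each row having fewer columns than _unpack_ace_with_vt expects: data[-5] needs at least five fields per row, but the output file described above contains only three. A minimal standalone illustration (not the repository's code):

```python
# A row with the three fields the output file actually contains.
row_actual = ["Text", "B-I-O TAG", "xxxxxxxxxxx.csv"]

# _unpack_ace_with_vt apparently assumes at least five fields per row
# (e.g. entities sitting at the fifth-from-last position).
try:
    entities = row_actual[-5]   # needs len(row) >= 5
except IndexError:
    entities = None             # only 3 fields -> index out of range

print(entities)  # None: the entity column is simply missing
```

So the fix is either to make the preprocessing emit the extra columns (POS tags, Time/Value arguments) or to adjust the negative indices in _unpack_ace_with_vt to match the columns that are actually written.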
Dear Author:
I noticed that a trained entity detection model is needed when running eval.py, but there is no file named scripts/run_entity_detection.py. I found that model/entity_dection.py exists, so how should I train the model?
Dear Author:
I ran into a few problems when trying to run this codebase. Could you help with the following questions? Thanks!
1. In run_arg_detection.py, lines 8 and 14 import Write2HtmlArg and FactContainer, which are not included in the codebase.
2. save_dataset.py only provides the dataset for run_trigger_detection.py, not for run_arg_detection.py. In run_arg_detection.py there are 16 values to be unpacked from a batch, but only 7 are preprocessed, so I don't know how to generate the dataset for the EAE task.
3. arg_list only contains the arguments of the last event mention. Since the evaluation code for EAE isn't included in this codebase, as mentioned in 1., I don't know how you manage to align multiple event mentions with only one arg_list in the end.
4. train.doc.txt, dev.doc.txt, and test.doc.txt need to be placed in data/splits/ACE05-E/ before running ./setup.sh, each file representing the document split of the corresponding dataset. The format of each file is that every line contains one document name. I think you could state that in README.md to help.
5. pos_tag, time, and value wouldn't appear if we stick to your preprocessing procedure in the ACE_ERE_scripts repository. So if we want to run save_dataset.py successfully, we need to change _unpack_ace_with_vt as I mentioned in #4.
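The split files mentioned above (train.doc.txt, dev.doc.txt, test.doc.txt in data/splits/ACE05-E/, one document name per line) can be generated with a short script. The document IDs below are placeholders, not real ACE05 file names:

```python
from pathlib import Path

split_dir = Path("data/splits/ACE05-E")
split_dir.mkdir(parents=True, exist_ok=True)

# One document name per line, matching the format described above.
# The document IDs here are placeholders, not real ACE05 file names.
splits = {
    "train.doc.txt": ["doc_0001", "doc_0002"],
    "dev.doc.txt": ["doc_0003"],
    "test.doc.txt": ["doc_0004"],
}
for fname, docs in splits.items():
    (split_dir / fname).write_text("\n".join(docs) + "\n")

print(sorted(p.name for p in split_dir.iterdir()))
```

Replace the placeholder lists with the actual document names from your ACE05 preprocessing output before running ./setup.sh.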
Where can I find requirements.txt?
Thank you for sharing your work! I got stuck in the data preprocessing section while training. What files should be placed in the path, split path, and output directories under the data folder?
Dear Author:
Thanks for sharing your work! I have a question about the usage of trigger information in the arg_detection model.
In your arg_detection model,
https://github.com/VT-NLP/Event_Query_Extract/blob/main/model/argument_detection/arg_detection.py#L205
after getting the trigger embeddings, it seems that you did not use them for further calculations, which doesn't align with the definition of "trigger-aware entity representations" in your paper "2.2 Event Argument Extraction, Multiway Attention".
In other words, I could not find where the trigger information is used in arg_detection models.
Could you help me with this problem if possible? Thanks!
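For concreteness, one common reading of "trigger-aware entity representations" is to condition each entity embedding on the trigger embedding, e.g. by concatenation before the multiway attention. The toy numpy sketch below illustrates that interpretation only; it is an assumption about the intent, not the paper's or the repository's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 8

# Toy contextual embeddings: 5 entity candidates and 1 trigger span.
entity_emb = rng.standard_normal((5, hidden))
trigger_emb = rng.standard_normal((hidden,))

# One common way to make entities "trigger-aware": broadcast the trigger
# embedding and concatenate it onto every entity representation, so each
# entity vector carries the trigger context into the attention layers.
trigger_aware = np.concatenate(
    [entity_emb, np.broadcast_to(trigger_emb, entity_emb.shape)], axis=-1
)
print(trigger_aware.shape)  # (5, 16)
```

If the released arg_detection.py never feeds the trigger embeddings into anything like this, that would indeed not match the description in Section 2.2 of the paper.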