shannonai / corefqa
This repo contains the code for ACL2020 paper "Coreference Resolution as Query-based Span Prediction"
System setup:
tensorflow 1.15
torch 1.2
cuda 10.0
python 3.7.16
use_tpu False
I am getting this error during training:
ERROR:tensorflow:Error recorded from training_loop: GetNext() failed because the iterator has not been initialized.
Ensure that you have run the initializer operation for this iterator before getting the next element.
[[node IteratorGetNext (defined at /home/shantanu/anaconda3/envs/corefqa/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Original stack trace for 'IteratorGetNext':
File "./run/run_mention_proposal.py", line 192, in <module>
tf.app.run()
File "/home/shantanu/anaconda3/envs/corefqa/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/shantanu/anaconda3/envs/corefqa/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/shantanu/anaconda3/envs/corefqa/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "./run/run_mention_proposal.py", line 129, in main
window_size=model_config.window_size, max_num_mention=model_config.max_num_mention, is_training=True, drop_remainder=True), max_steps=num_train_steps)
File "/home/shantanu/anaconda3/envs/corefqa/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
saving_listeners=saving_listeners)
I have no idea about this error.
Please share a working link for the mention proposal module.
You mention in the README that "We plan to release the PyTorch version soon". Can you share your progress here and/or estimated timelines? I am very interested in that effort as TF is so persnickety.
May I know your training time on TPU, and how many TPUs you used? Thanks.
Hello, thanks for sharing your work!
How can I use your approach on a different dataset, either to train it or to fine-tune it?
And how can I see the output document/sentences after coreference resolution?
Thanks in advance
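For viewing the output, one option is to write the predicted clusters back into the text. The sketch below is not part of the repository: the function name `annotate_clusters` and the input format (a token list plus clusters of inclusive `(start, end)` token spans) are assumptions for illustration, and the format the model actually emits may differ.

```python
def annotate_clusters(tokens, clusters):
    """Mark every mention in the text with brackets and its cluster id.

    Assumed (hypothetical) input format:
      tokens   -- list of token strings for one document
      clusters -- list of clusters; each cluster is a list of (start, end)
                  token indices, with `end` inclusive
    """
    prefix = [""] * len(tokens)
    suffix = [""] * len(tokens)
    for cluster_id, mentions in enumerate(clusters):
        for start, end in mentions:
            prefix[start] += "["                  # open the mention
            suffix[end] += "]" + str(cluster_id)  # close it, tagged with its cluster
    return " ".join(p + t + s for p, t, s in zip(prefix, tokens, suffix))


tokens = ["Alice", "said", "she", "was", "tired", "."]
clusters = [[(0, 0), (2, 2)]]
print(annotate_clusters(tokens, clusters))
```

Mentions that share a cluster id refer to the same entity. Nested mentions are bracketed too, although the closing tags of mentions that end on the same token appear in cluster order rather than nesting order.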
While I try to load the pretrained model, I get the following error.
E0903 21:29:51.073421 46912496399232 error_handling.py:75] Error recorded from prediction_loop: Unable to open table file ./trained_models/corefqa_trained/bert_finetune_model_7_1800.bin: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Can you please help?
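One thing worth checking, as an assumption rather than a confirmed diagnosis, is whether the path passed to the restore step actually names a TensorFlow checkpoint. In the default (V2) checkpoint format, a checkpoint is addressed by a prefix and stored as `<prefix>.index` plus `<prefix>.data-XXXXX-of-XXXXX` shards, so a lone `.bin` file would not match that layout. A minimal sketch using only the standard library (the helper names are hypothetical):

```python
import glob
import os


def looks_like_tf_checkpoint(prefix):
    """Return True if `prefix` names a V2-format TensorFlow checkpoint on disk.

    A V2 checkpoint consists of `<prefix>.index` and at least one
    `<prefix>.data-XXXXX-of-XXXXX` shard.
    """
    has_index = os.path.isfile(prefix + ".index")
    has_data = bool(glob.glob(glob.escape(prefix) + ".data-*"))
    return has_index and has_data


def describe_model_path(path):
    """Give a rough guess about what `path` points at."""
    if looks_like_tf_checkpoint(path):
        return "tensorflow checkpoint prefix"
    if os.path.isfile(path):
        return "single file (no .index/.data shards found)"
    return "missing"


print(describe_model_path("./trained_models/corefqa_trained/bert_finetune_model_7_1800.bin"))
```

If the result is "single file", the file may have been saved in another format, and the checkpoint path would need to point at a TensorFlow checkpoint prefix instead.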
I have downloaded the final model and want to use it on a custom sentence to do coreference resolution.
What is the usage? How to achieve this simple need? I don't see a predict.py script
@littlesulley
Hello,
I am trying to replicate the training process in a Colab TPU environment.
In step 1.2, "Or train the mention proposal model yourself",
I am getting the following error:
ValueError Traceback (most recent call last)
[/content/drive/MyDrive/corefQA/code/run/run_mention_proposal.py](https://localhost:8080/#) in <module>()
190 tf.set_random_seed(FLAGS.seed)
191 # start train/evaluate the model.
--> 192 tf.app.run()
193
194
35 frames
[/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/resource_variable_ops.py](https://localhost:8080/#) in _init_from_args(self, initial_value, trainable, collections, caching_device, name, dtype, constraint, synchronization, aggregation, distribute_strategy, shape)
1558 "construct, such as a loop or conditional. When creating a "
1559 "variable inside a loop or conditional, use a lambda as the "
-> 1560 "initializer." % name)
1561 # pylint: enable=protected-access
1562 dtype = initial_value.dtype.base_dtype
ValueError: Initializer for variable Variable/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer
I am using the following command in order to execute the training step:
if train_mention_proposal:
    DATA_DIR = GS_SEMVEVAL_TRFILES
    OUTPUT_DIR = f"{GS_PATH}/models/mention_proposal"
    PRETRAINED_MODEL = GS_SQUAD2_ES_TRAINED_MODEL
    INIT_CHECKPOINT=f"{PRETRAINED_MODEL}/model.ckpt"
    %cd {REPO_PATH}
    %run run/run_mention_proposal.py \
        --output_dir=$OUTPUT_DIR \
        --bert_config_file=$BERT_CONFIG \
        --init_checkpoint=$INIT_CHECKPOINT \
        --vocab_file=$BERT_VOCAB \
        --logfile_path=./train_mention_proposal.log \
        --num_epochs=8 \
        --keep_checkpoint_max=50 \
        --save_checkpoints_steps=500 \
        --train_file=$DATA_DIR/train.overlap.corefqa.es.tfrecord \
        --dev_file=$DATA_DIR/dev.overlap.corefqa.es.tfrecord \
        --test_file=$DATA_DIR/test.overlap.corefqa.es.tfrecord \
        --do_train=True \
        --do_eval=False \
        --do_predict=False \
        --learning_rate=1e-5 \
        --dropout_rate=0.2 \
        --mention_threshold=0.5 \
        --hidden_size=1024 \
        --num_docs=5604 \
        --window_size=384 \
        --num_window=6 \
        --max_num_mention=60 \
        --start_end_share=False \
        --loss_start_ratio=0.3 \
        --loss_end_ratio=0.3 \
        --loss_span_ratio=0.3 \
        --use_tpu=True \
        --tpu_name=$TPU_NAME \
        --seed=2333
Do you have any ideas as to what could be the problem?
Thank you in advance.
Hi, thanks for your great work! Could you tell me how to get the final CorefQA model? The link in the instructions points to an XML file.
When training on a GPU, I get the following error. What is the cause?
tensorflow.python.framework.errors_impl.FailedPreconditionError: GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
Hi, thanks for your contribution. This is great and novel work. I have also implemented your model in PyTorch, but I find it impossible to train even a base model on multiple GPUs. I believe the main reason is that, even with a batch size of 1, the number of generated questions varies, so the questions and their corresponding passages cannot be distributed evenly across GPUs during computation. Have you trained this model on multiple GPUs before, or is it only feasible to train on TPUs?
I hope you can clear up my confusion and correct me if I am wrong.
Thanks
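To make the distribution problem concrete, here is a minimal sketch of splitting a variable-length list of questions into near-equal chunks, one per device. It is not the authors' method and the function name `split_across_devices` is hypothetical; it only illustrates the bookkeeping, not gradient synchronization.

```python
def split_across_devices(questions, num_devices):
    """Split `questions` into `num_devices` contiguous chunks of near-equal size.

    Chunk sizes differ by at most one. When there are fewer questions than
    devices, the trailing chunks are empty.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, extra = divmod(len(questions), num_devices)
    chunks, start = [], 0
    for device in range(num_devices):
        # The first `extra` devices each take one additional question.
        size = base + (1 if device < extra else 0)
        chunks.append(questions[start:start + size])
        start += size
    return chunks


print(split_across_devices(["q0", "q1", "q2", "q3", "q4"], 2))
# → [['q0', 'q1', 'q2'], ['q3', 'q4']]
```

The empty chunks are where the difficulty described above shows up: in synchronous data-parallel training every device is normally expected to run a forward and backward pass, so a device with no questions would need padding or some other special handling.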
XLNet generally achieves much better performance than BERT, and nobody has tried it on coreference resolution:
https://huggingface.co/transformers/model_doc/xlnet.html
I think that's a great opportunity to improve the state of the art even further.
Hello all,
Good morning! My name is Wentseng and I am an intern at mine&make GmbH in Stuttgart, Germany.
We want to build a commercial application using a coreference resolution model, and we are interested in your CorefQA model. However, we could not find a license in this project. Would you mind if we evaluated and integrated your model, and would you consider adding a license to the project?
Thank you very much in advance!
Hi. Thanks for your contribution. It's really meaningful work.
While trying to run the project, I found that it doesn't seem to provide the run_quoref.py mentioned in ./script/model/quoref_tpu.sh. Or maybe I have carelessly misunderstood the documentation?
Thanks