deep-coref's Issues
Memory requirements for training on the CoNLL-2012 corpus
Hi, I am trying to train your model on an AWS p2x instance (with a 12 GB K80 GPU) on the CoNLL-2012 corpus (2,802 documents in the training set). Training consumes all 64 GB of RAM in less than 30% of the first epoch and gets killed before finishing it.
I was wondering what type of machine you trained it on?
Is 64 GB of RAM too small for training on the CoNLL corpus?
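If the exported feature matrices are what exhausts RAM, one hedged mitigation (not part of deep-coref itself; function names and shapes here are purely illustrative) is to stage the large float arrays on disk with `np.memmap` so the OS pages them in lazily:

```python
import numpy as np

# Illustrative sketch: keep big pairwise-feature matrices on disk
# instead of holding every document's features in memory at once.
def write_features(path, array):
    """Write a float32 array to a disk-backed memmap; return its shape."""
    mm = np.memmap(path, dtype=np.float32, mode="w+", shape=array.shape)
    mm[:] = array
    mm.flush()
    return array.shape

def open_features(path, shape):
    """Re-open the matrix read-only; slices are loaded lazily from disk."""
    return np.memmap(path, dtype=np.float32, mode="r", shape=shape)
```

Whether this helps depends on where deep-coref's loader actually allocates memory, so treat it as a direction to investigate rather than a fix.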
Hello clarkkev
I am a master's student at ECNU (China), and my field is coreference resolution. I am reading your papers now, and they are very enlightening to me; thank you. Last week I translated your papers into my slides, and I will explain your paper to my classmates.
If you don't mind, could you give me your email in private? I hope to get to know you and have a chance to communicate with you. My email is [email protected] or [email protected]
Thank you!
Error: A JNI error has occurred, please check your installation and try again
Hi, I'm trying to train a new model.
My console command:
java -Xmx5g -cp stanford-corenlp.jar:stanford-corenlp-models-current.jar:* edu.stanford.nlp.coref.neural.NeuralCorefDataExporter /home/PC/Desktop/Files/CoreNLP/src/edu/stanford/nlp/coref/properties/neural-turkish-conll.properties .
Here is my properties file:
coref.algorithm = neural
coref.conll = true
coref.data = /home/PC/Desktop/Files/CoreNLP/input.txt.conll
coref.conllOutputPath = /home/PC/Desktop/Files/CoreNLP
coref.scorer = /home/PC/Downloads/cc.tr.300.vec
Error Output:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: javax/json/JsonValue
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: javax.json.JsonValue
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
What causes the error? How can I solve this?
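The `ClassNotFoundException: javax.json.JsonValue` means no jar on the classpath provides the `javax.json` API (note that a working invocation later in this thread includes `javax.json.jar` on the `-cp`). A small helper (hypothetical, just for illustration) to check whether a given jar actually ships the missing class:

```python
import zipfile

# Check whether a jar file contains a given class entry, e.g. the
# javax.json.JsonValue class that the stack trace says is missing.
def jar_has_class(jar_path, class_entry="javax/json/JsonValue.class"):
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry in jar.namelist()
```

If no jar on the classpath passes this check, adding one that does (such as `javax.json.jar`) to the `-cp` argument should resolve the `NoClassDefFoundError`.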
example_file.txt file format needed to run the already-trained model
I am trying to run an already-trained model. Can you provide the example_file.txt file format?
java -Xmx5g -cp stanford-corenlp-3.7.0.jar:stanford-corenlp-models-3.7.0.jar:* edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,mention,coref -coref.algorithm neural -file example_file.txt
Error: Attempted to fetch annotator "parse" but the annotator pool does not store any such type!
Hello, I am studying the paper "Deep Reinforcement Learning for Mention-Ranking Coreference Models" and following the steps to train my own model. But in the third step (running the NeuralCorefDataExporter class in the development version of Stanford's CoreNLP), I encounter this error. I cannot find a solution on Google. Thank you!
This is my neural-english-conll.properties:
coref.algorithm = neural
coref.conll = true
coref.data = /home/hengru/deep-coref/conll-2012
coref.conllOutputPath = /home/hengru/deep-coref/data/data_out
coref.scorer = /home/hengru/deep-coref/data/scorer
This is the error:
hengru@dc6:~/CoreNLP/target$ java -Xmx2g -cp javax.json.jar:stanford-corenlp-3.7.0.jar:stanford-corenlp-models-3.7.0.jar:* edu.stanford.nlp.coref.neural.NeuralCorefDataExporter neural-english-conll.properties /home/hengru/deep-coref/data/data_out/
Jul 25, 2017 6:52:56 PM edu.stanford.nlp.coref.docreader.CoNLLDocumentReader
INFO: Reading 1940 CoNLL files from /home/hengru/deep-coref/conll-2012/v4/data/train/data/english/annotations/
Adding annotator lemma
Adding annotator mention
Using mention detector type: rule
Attempted to fetch annotator "parse" but the annotator pool does not store any such type!
Exception in thread "main" java.lang.NullPointerException
at edu.stanford.nlp.coref.md.CorefMentionFinder.parse(CorefMentionFinder.java:646)
at edu.stanford.nlp.coref.md.CorefMentionFinder.findSyntacticHead(CorefMentionFinder.java:536)
at edu.stanford.nlp.coref.md.CorefMentionFinder.findHead(CorefMentionFinder.java:456)
at edu.stanford.nlp.coref.md.RuleBasedCorefMentionFinder.findMentions(RuleBasedCorefMentionFinder.java:100)
at edu.stanford.nlp.pipeline.MentionAnnotator.annotate(MentionAnnotator.java:102)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:648)
at edu.stanford.nlp.coref.data.DocumentMaker.nextDoc(DocumentMaker.java:145)
at edu.stanford.nlp.coref.CorefDocumentProcessor.run(CorefDocumentProcessor.java:40)
at edu.stanford.nlp.coref.CorefDocumentProcessor.run(CorefDocumentProcessor.java:25)
at edu.stanford.nlp.coref.neural.NeuralCorefDataExporter.exportData(NeuralCorefDataExporter.java:182)
at edu.stanford.nlp.coref.neural.NeuralCorefDataExporter.main(NeuralCorefDataExporter.java:189)
Please show me the output of NeuralCorefDataExporter with an example; for me, the output of this class is blank files.
Need a detailed guide on how to train the model
"Run the NeuralCorefDataExporter class in the development version of Stanford's CoreNLP using the neural-coref-conll properties file. This does mention detection and feature extraction on the CoNLL data and then outputs the results as JSON."
How do I do this? Which command, and with which parameters?
ssplit.eolonly raises NullPointerException at edu.stanford.nlp.pipeline.NERCombinerAnnotator
So, basically, we have an already-tokenised corpus with gold sentence segmentation, which we want to preserve. We found these parameters:
tokenize.whitespace = true
ssplit.eolonly = true
They work fine together with tokenize,ssplit,pos,lemma and parse, but if we pass all the annotators needed for coreference resolution
annotators = tokenize,ssplit,pos,lemma,ner,parse,coref
we get a NullPointerException, specifically in the NER annotation part.
Processing file /Users/nikahelicopter/Dropbox/data/new_gold/txt/xx00.txt ... writing to /Users/nikahelicopter/Downloads/stanford-corenlp-full-2018-10-05/xx00.txt.xml
Exception in thread "main" java.lang.NullPointerException
at edu.stanford.nlp.pipeline.NERCombinerAnnotator.annotate(NERCombinerAnnotator.java:322)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:637)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:647)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1226)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1060)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.run(StanfordCoreNLP.java:1326)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1389)
We use stanford-corenlp-full-2018-10-05 version 3.9.2
An example file:
xx01.txt
Parameters:
annotators = tokenize,ssplit,pos,lemma,ner,parse,coref
tokenize.whitespace = true
ssplit.eolonly = true
coref.algorithm = neural
file = /Users/nikahelicopter/Dropbox/data/new_gold/txt/xx00.txt
Calling NeuralCorefDataExporter
Hi there, in the latest CoreNLP, I found this class under edu.stanford.nlp.coref.neural.NeuralCorefDataExporter instead of edu.stanford.nlp.coref.NeuralCorefDataExporter.
Minh
Error: A JNI error has occurred
Can you please help? We are getting the error below.
~/softwares/CoreNLP-master$ java -Xmx2g -cp "stanford-corenlp.jar:stanford-corenlp-models-3.7.0.jar:*" edu.stanford.nlp.coref.neural.NeuralCorefDataExporter coref.properties "/home/development/devendra/deep-coref-master/output"
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: javax/json/JsonValue
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: javax.json.JsonValue
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
This is the text from the properties file:
coref.algorithm = neural
coref.conll = true
coref.data = /home/devendra/Downloads/Internship+MTP/MTP_Stage_II/deep-coref-master/gold_conll/
coref.conllOutputPath = /home/devendra/Downloads/Internship+MTP/MTP_Stage_II/deep-coref-master/output/
Getting "ImportError: cannot import name Graph" while training the model
I am getting the error below while training the model, following the given steps. It seems this code uses Graph, which is no longer supported by newer versions of Keras. Can you please help me fix the problem?
devendra@krishna:~/deep-coref-master_2.7$ python run_all.py
Using Theano backend.
Traceback (most recent call last):
File "run_all.py", line 5, in <module>
import clustering_learning
File "/home/development/devendra/deep-coref-master_2.7/clustering_learning.py", line 2, in <module>
import clustering_models
File "/home/development/devendra/deep-coref-master_2.7/clustering_models.py", line 2, in <module>
import pairwise_models
File "/home/development/devendra/deep-coref-master_2.7/pairwise_models.py", line 9, in <module>
from keras.models import Graph
ImportError: cannot import name Graph
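The failing import is `keras.models.Graph`, an API that was removed around Keras 1.0, when the functional API replaced it (the exact cutoff version is an assumption on my part). A hedged guard that checks the installed version string before attempting the import:

```python
# Plain string check (illustrative helper, not part of deep-coref):
# keras.models.Graph only exists in pre-1.0 Keras releases, so a
# version whose major component is 0 should still provide it.
def keras_supports_graph(version_string):
    """Return True if this Keras version predates 1.0 and so is
    expected to still ship keras.models.Graph."""
    major = int(version_string.split(".")[0])
    return major < 1
```

In practice the usual workaround reported for old Graph-based code is installing an older Keras (e.g. `pip install "keras<1.0"` in a dedicated environment); which exact pre-1.0 version deep-coref expects is something the repo's requirements would have to confirm.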
Exception when training my own models
Hello,
I'm trying to reproduce the training, and unfortunately I'm running into an exception during the process. I extracted the features in JSON using NeuralCorefDataExporter, then ran the Python code with python run_all.py. After a moment I get the following exception:
Loading data
Traceback (most recent call last):
File "run_all.py", line 93, in <module>
train_best_model()
File "run_all.py", line 88, in train_best_model
train_and_test_pairwise(model_properties.MentionRankingProps(), mode='reward_rescaling')
File "run_all.py", line 68, in train_and_test_pairwise
train_pairwise(model_props, mode=mode)
File "run_all.py", line 59, in train_pairwise
pretrain(model_props)
File "run_all.py", line 33, in pretrain
pairwise_learning.train(model_props, n_epochs=150)
File "/opt/deep-coref/pairwise_learning.py", line 313, in train
model_props, with_ids=True)
File "/opt/deep-coref/datasets.py", line 309, in __init__
for ana in range(0, me - ms)])
ValueError: need at least one array to concatenate
After checking the JSON from train, dev and test, apparently they contain empty features like:
{"mentions":{},"labels":{},"pair_feature_names":["same-speaker","antecedent-is-mention-speaker","mention-is-antecedent-speaker","relaxed-head-match","exact-string-match","relaxed-string-match"],"pair_features":{},"document_features":{"type":1,"source":"wb"}}
Which I suppose is normal: if there is no coreference, there is nothing to extract. So I checked the code in datasets.py and printed the content of doc_mentions, and indeed the value me - ms can be 0, as the content looks like:
[[0 117]
[117 305]
[305 522]
...,
[71818 71818]
[71818 71818]
[71818 71818]]
I certainly did something wrong in my process but I don't see what. Any help will be appreciated.
Thanks!
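The empty-document case described above is exactly what produces this error: documents whose mention range is empty (me == ms, like the trailing [71818 71818] rows) contribute no arrays, and calling np.concatenate on an empty list raises "need at least one array to concatenate". A hedged sketch of the guard the loader seems to need (names are illustrative, not the real datasets.py code):

```python
import numpy as np

# Build per-anaphor offset arrays, skipping documents whose mention
# range is empty so np.concatenate never receives an empty list.
def anaphor_offsets(doc_spans):
    """doc_spans: list of (ms, me) mention-index ranges, one per doc."""
    arrays = [np.arange(0, me - ms) for ms, me in doc_spans if me > ms]
    if not arrays:  # every document was empty
        return np.empty(0, dtype=np.int64)
    return np.concatenate(arrays)
```

Whether deep-coref is supposed to filter these documents out earlier (e.g. during export) or tolerate them here is a question for the maintainers; this only illustrates where the crash comes from.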
String matching features?
I'm reading Clark and Manning (2016), and the list of features includes "String Matching Features: Head match, exact string match, and partial string match."
I looked at datasets.py but can't pinpoint where those features are implemented.
Could anybody help me?
Clark, K., & Manning, C. D. (2016). Improving Coreference Resolution by Learning Entity-Level Distributed Representations. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 643–653. http://doi.org/10.18653/v1/P16-1061
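These three features appear in the exported JSON's pair_feature_names ("relaxed-head-match", "exact-string-match", "relaxed-string-match"), which suggests they are computed on the CoreNLP/Java side during export and only consumed by the Python code, which would explain why datasets.py doesn't implement them. An illustrative (not the actual) computation of the three features for one mention pair:

```python
# Hedged sketch of the three string-matching pair features; the real
# logic lives in CoreNLP's Java feature extractor, and details such as
# which determiners "relaxed" matching strips are assumptions here.
def string_match_features(mention, antecedent, head, antecedent_head):
    """mention/antecedent: token lists; head/antecedent_head: head words."""
    lower = lambda tokens: [t.lower() for t in tokens]
    relax = lambda tokens: [t for t in lower(tokens)
                            if t not in {"the", "a", "an"}]
    return {
        "relaxed-head-match": head.lower() == antecedent_head.lower(),
        "exact-string-match": lower(mention) == lower(antecedent),
        "relaxed-string-match": relax(mention) == relax(antecedent),
    }
```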
How to generate the JSON file for training?
java -Xmx2g -cp stanford-corenlp.jar:stanford-corenlp-models-3.7.0.jar:* edu.stanford.nlp.coref.NeuralCorefDataExporter
What is edu.stanford.nlp.coref.NeuralCorefDataExporter?
Should I just download it from the link below?
https://github.com/stanfordnlp/CoreNLP/blob/master/src/edu/stanford/nlp/coref/neural/NeuralCorefDataExporter.java
cannot find edu/stanford/nlp/models/
I just followed the example, and got an error message.
java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
There is no subdirectory "models"
Availability of pre-trained model
Hello @clarkkev ,
First of all, thanks for pushing this code to GitHub. As training this model requires 7 days on a GTX TITAN, can you please share a link to, or push, the pre-trained model so that it can be used for transfer learning?
Thanks.
Training the model for another language
Is it possible to train the model for Russian? If possible, what should I change?
How to use a custom-trained coreference model in Stanford CoreNLP?
Dev CoNLL score converges within the first few epochs
When I train the model with CoNLL-2012 data, the dev CoNLL score is 0.657888797045302 at the first epoch and remains stable around 0.658. This is even better than your reported result of 0.657.
The dev conll score of the first 10 epochs:
(0, 0.657888797045302)
(1, 0.6585094296994586)
(2, 0.6582570233258541)
(3, 0.6582991986114178)
(4, 0.6590045611564873)
(5, 0.6580189901300686)
(6, 0.6585166713504456)
(7, 0.6574807587739219)
(8, 0.660184576001444)
(9, 0.6592850994459925)
(10, 0.6583200324261912)
What I did was just follow the 4 steps in the instructions. Do you have any idea why this happens?
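The observation above ("remains stable around 0.658") can be made precise with a small plateau check (illustrative, not from the repo): has the dev CoNLL score moved less than some tolerance over the last few epochs?

```python
# Heuristic plateau detector: window and tolerance values are arbitrary
# choices for illustration, not anything deep-coref prescribes.
def plateaued(scores, window=5, tol=5e-3):
    """True if the last `window` scores span less than `tol`."""
    if len(scores) < window:
        return False
    recent = scores[-window:]
    return max(recent) - min(recent) < tol
```

By this measure the epoch-by-epoch scores listed above plateau immediately, which is what makes the behaviour surprising for a model that reportedly needs many epochs to train.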