
emnlp2018-jmee's Issues

Can you give an example?

Thanks for sharing!
Hello! I am a beginner. For preprocessing, I would like to know how to transform a sentence into an example in the JSON format you use. Could you give me a detailed example?
Thanks!
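In the absence of an official answer, here is a hedged sketch of what one input line might look like. The top-level keys match the fields the dataset loader reads ("words", "pos-tags", "golden-entity-mentions", "stanford-colcc", "golden-event-mentions"; the loader also reads "all-events" and "all-entities", omitted here), but the inner structure of the mention dicts and the dependency-edge string format are assumptions, not taken from the repository:

```python
import json

# Hypothetical single-sentence example. Top-level keys come from the
# dataset-loading code; the inner structure of each mention dict and the
# "relation/dep=i/gov=j" edge encoding are assumptions for illustration.
example = {
    "words": ["He", "was", "arrested", "yesterday", "."],
    "pos-tags": ["PRP", "VBD", "VBN", "NN", "."],
    "stanford-colcc": ["nsubjpass/dep=0/gov=2", "auxpass/dep=1/gov=2"],
    "golden-entity-mentions": [
        {"entity-type": "PER", "start": 0, "end": 1, "text": "He"}
    ],
    "golden-event-mentions": [
        {"event_type": "Justice:Arrest-Jail",
         "trigger": {"start": 2, "end": 3, "text": "arrested"},
         "arguments": [{"role": "Person", "start": 0, "end": 1, "text": "He"}]}
    ],
}

# The corpus reader parses one JSON object per line (JSON Lines),
# so each sentence is serialized onto a single line:
line = json.dumps(example)
print(line[:40])
```

Each sentence becomes one such object, written on its own line of the data file.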

json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

Hello, I processed the data with stanfordnlp but ran into a json.decoder.JSONDecodeError. Many people say this is a problem with the JSON file, but I hit the same error even with sample.json.
I hope you can help; thanks!
Namespace(batch=128, dev='ace-05-splits/dev.json', device='cpu', earlystop=999999, epochs=9223372036854775807, finetune=None, hps=None, l2decay=0, lr=0.001, maxnorm=3, optimizer='adam', out='out', restart=999999, seed=42, test='ace-05-splits/test.json', train='ace-05-splits/sample.json', webd='word2vec.txt')
Running on cpu
loading corpus from ace-05-splits/sample.json
Traceback (most recent call last):
  File "enet/run/ee/runner.py", line 241, in <module>
    EERunner().run()
  File "enet/run/ee/runner.py", line 99, in run
    keep_events=1)
  File "/home/lhj/coding/EMNLP2018-JMEE-master/enet/corpus/Data.py", line 191, in __init__
    super(ACE2005Dataset, self).__init__(path, fields, **kwargs)
  File "/home/lhj/coding/EMNLP2018-JMEE-master/enet/corpus/Corpus.py", line 20, in __init__
    examples = self.parse_example(path, fields)
  File "/home/lhj/coding/EMNLP2018-JMEE-master/enet/corpus/Data.py", line 202, in parse_example
    jl = json.loads(line, encoding="utf-8")
  File "/home/lhj/anaconda3/envs/jmee/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/home/lhj/anaconda3/envs/jmee/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/lhj/anaconda3/envs/jmee/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
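A plausible cause, given the traceback: the loader calls json.loads on each line of the file, so it expects JSON Lines (one object per line). If the preprocessing output is instead a single JSON array, the first line begins with "[" and decoding fails at column 2, exactly as above. A hedged converter sketch, assuming the file is one JSON array (file names are placeholders):

```python
import json

# Sketch: convert a single JSON-array file into the JSON-Lines layout the
# loader expects (one object per line). File names are placeholders.
def array_to_jsonlines(src="train.json", dst="train.jsonl"):
    # utf-8-sig also strips a BOM, another possible cause of this error
    with open(src, encoding="utf-8-sig") as f:
        data = json.load(f)  # the whole file is one JSON array
    with open(dst, "w", encoding="utf-8") as f:
        for obj in data:
            f.write(json.dumps(obj) + "\n")
```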

CUDA ERROR: out of memory

When I set batch_size = 32, validation after the Epoch 1 training fails with CUDA Error: out of memory. My device is a T100 GPU (16 GB).
Also, when I set batch_size = 16 it runs OK, but it is too slow. Is there some adjustment I can make in your code?
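One common cause of validation-time OOM is that the evaluation pass still builds autograd graphs. Wrapping evaluation in torch.no_grad() avoids storing activations for backprop and often fixes this. A minimal sketch, not the repository's own code; model and dev_iter stand in for the objects in runner.py:

```python
import torch

# Sketch: run evaluation without autograd buffers to cut GPU memory use.
# `model` and `dev_iter` are placeholders for the real runner objects.
def evaluate(model, dev_iter):
    model.eval()
    outputs = []
    with torch.no_grad():  # no graph is recorded, so activations are freed
        for batch in dev_iter:
            outputs.append(model(batch))
    model.train()
    return outputs
```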

Is the printed F1 identification or classification?

This is the best result I've got:
epoch:106|loss: 9.72400|ed_p: 0.75342|ed_r: 0.74728|ed_f1: 0.75034|ae_p: 0.37625|ae_r: 0.31840|ae_f1: 0.34492|lr:0.2621440000
Does ed_f1 mean trigger identification F1 or trigger classification F1?

questions about data

hello,
Did you use all the sentences, including those with and without events, for train/dev/test, or only the sentences containing events? Thank you!

Training on sentences that contain no events

Hi there,

I found that when loading the corpus, JMEE uses the keep_events=1 option to filter out sentences that contain no events, which dramatically decreases the size of the training set.

Is this step necessary? Why not keep all sentences in the training set?

# sentences in the train set must contain at least 1 event
#
train_set = ACE2005Dataset(self.a.train,
                           fields={"words": ("WORDS", WordsField),
                                   "pos-tags": ("POSTAGS", PosTagsField),
                                   "golden-entity-mentions": ("ENTITYLABELS", EntityLabelsField),
                                   "stanford-colcc": ("ADJM", AdjMatrixField),
                                   "golden-event-mentions": ("LABEL", LabelField),
                                   "all-events": ("EVENT", EventsField),
                                   "all-entities": ("ENTITIES", EntitiesField)},
                           keep_events=1)

# sentence in dev set can have no event
#
dev_set = ACE2005Dataset(self.a.dev,
                         fields={"words": ("WORDS", WordsField),
                                 "pos-tags": ("POSTAGS", PosTagsField),
                                 "golden-entity-mentions": ("ENTITYLABELS", EntityLabelsField),
                                 "stanford-colcc": ("ADJM", AdjMatrixField),
                                 "golden-event-mentions": ("LABEL", LabelField),
                                 "all-events": ("EVENT", EventsField),
                                 "all-entities": ("ENTITIES", EntitiesField)},
                         keep_events=0)

# sentence in test set can have no event
#
test_set = ACE2005Dataset(self.a.test,
                          fields={"words": ("WORDS", WordsField),
                                  "pos-tags": ("POSTAGS", PosTagsField),
                                  "golden-entity-mentions": ("ENTITYLABELS", EntityLabelsField),
                                  "stanford-colcc": ("ADJM", AdjMatrixField),
                                  "golden-event-mentions": ("LABEL", LabelField),
                                  "all-events": ("EVENT", EventsField),
                                  "all-entities": ("ENTITIES", EntitiesField)},
                          keep_events=0)

This code is a real trap!!!

If you open-source the code for a paper, you should make it easy to run and reproducible!
The ACE2005 dataset cannot be released because of its license, but that is no excuse for code this badly written and this full of pitfalls!

[closed]

for item, item_ in zip(arguments, arguments_):

Is this function correct? It seems to me that simply comparing items pairwise by position (even after sorting both lists) will not give the real AE performance.

Is the following workaround okay?

                for item in arguments:
                    if item in arguments_:
                        ct += 1
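One caveat about the membership workaround above: it can over-count when the predicted list contains duplicates, because the same predicted item can match several gold items. A multiset intersection avoids this; count_matches below is an illustrative helper, not a function from the repository:

```python
from collections import Counter

# Sketch: count matches as a multiset intersection, so each predicted
# argument is consumed at most once. `arguments` is the gold list and
# `arguments_` the predicted list, as in the loop above.
def count_matches(arguments, arguments_):
    overlap = Counter(arguments) & Counter(arguments_)  # elementwise min count
    return sum(overlap.values())

print(count_matches([("A", 1), ("A", 1), ("B", 2)],
                    [("A", 1), ("C", 3)]))  # -> 1, not 2
```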

Does the code run?

Have you run this code? How did you set the model's parameters?

about data

can you share data preprocessing code ?

1

def train(model, train_set, dev_set, test_set, optimizer_constructor, epochs, tester, parser, other_testsets):
    # build batch on cpu
    train_iter = BucketIterator(train_set, batch_size=parser.batch, train=False,
                                shuffle=True, device=-1,
                                sort_key=lambda x: len(x.POSTAGS))

But when I change train to True and then train, I found that for most training steps the F1, P, and R are close to 100%. What is wrong with it?

questions about evaluation

Regarding the accuracy of the trigger words: why is it correct to evaluate only the B- tags and I- tags separately, rather than parsing the BIO tags into spans?
Looking forward to your response, thank you!

def calculate_report(self, y, y_, transform=True):
    '''
    calculating F1, P, R
    :param y: golden label, list
    :param y_: model output, list
    :return:
    '''
    if transform:
        for i in range(len(y)):
            for j in range(len(y[i])):
                y[i][j] = self.voc_i2s[y[i][j]]
        for i in range(len(y_)):
            for j in range(len(y_[i])):
                y_[i][j] = self.voc_i2s[y_[i][j]]
    return precision_score(y, y_), recall_score(y, y_), f1_score(y, y_)
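To illustrate the distinction the question raises, here is a small sketch (not the repository's metric code) that parses BIO tags into (start, end, type) spans before comparing. With span-level matching, a partially-correct trigger no longer counts, whereas per-tag scoring still credits the matching tags:

```python
# Sketch: parse BIO tags into spans, so a prediction only counts when the
# whole span (boundaries and type) matches the gold span.
def bio_to_spans(tags):
    spans, start, typ = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:           # close any open span
                spans.add((start, i, typ))
                start, typ = None, None
            if tag.startswith("B-"):        # B- always opens a new span
                start, typ = i, tag[2:]
        elif tag.startswith("I-") and typ != tag[2:]:
            # I- tag that does not continue the open span starts a new one
            if start is not None:
                spans.add((start, i, typ))
            start, typ = i, tag[2:]
    return spans

gold = ["B-Attack", "I-Attack", "O"]
pred = ["B-Attack", "O", "O"]
print(bio_to_spans(gold))  # {(0, 2, 'Attack')}
print(bio_to_spans(pred))  # {(0, 1, 'Attack')}
# Token level: 2/3 tags correct; span level: 0 spans correct.
```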

Questions about the paper and code

Hello, I have read your paper and have two questions about the code:

  1. Does this line of code correspond to Equation 4 in the paper? If so, how is the aggregation over the neighbor nodes N(v) in Equation 4 reflected in the code?

  2. In the forward function of Self-Attention, where is Equation 10 of the paper implemented?

Thank you for sharing the code; I hope to hear back from you!

Best wishes!

F1 score dropping to zero

Hello,

I am trying to reproduce the same results using the same parameters, but when I run your code for some time, the loss keeps going down just fine and the accuracy increases but at the same time conll f1 score, precision and recall all drop to zero.
(screenshot omitted)

It seems that the code overfits on the dataset, since it returns all 'O' for the predicted labels. I know the dataset is licensed and therefore cannot be included, but it has a lot of character-offset issues that require some treatment, and that treatment can differ from one person to another. Could you at least include the preprocessing code, so that your pipeline is more end-to-end and we can be sure the input is consistent?

Thanks,

mislabeled data

Hi, I'm trying to reproduce your model, but my result is low. I checked the labels my model predicted and found many tokens predicted as an event sub-type (different from 'O') that are tagged 'O' in the dataset. As a result my precision drops (I only get precision = 62%). Did you encounter this issue? If so, how did you tackle it: did you fix the wrong labels in the test/dev sets, or keep the original data for evaluation?
Hope to see your answer soon! Thank you so much!
Hope to see your answer soon! Thank you so much!

about preprocessing code

Is there no preprocessing code?
Do I need to convert the data in the ACE corpus into the JSON files myself?

How to set parameter "loss_alpha"

Hi, I'm trying to run your code and I found that the value of the parameter "loss_alpha" is not mentioned in your paper.
Could you please give me a value for this parameter?
Thanks!

model hyperparams

How should the model hyperparameters be set? Can you offer a working parameter list?
Thanks

Is the code complete?

Hi,
I am looking for an EE algorithm to adopt in my program. I read your paper and think it is a good solution. In the code, I noticed that several classes, such as BottledXavierLinear, are written as just "pass", yet they are actually used in the model. I guess the code is not complete? If so, could you please upload the complete version together with your parameters? I would appreciate your help.
Thank you very much!

Is the code in the gcn part incomplete?

From what I can see, the GCN part does not seem to be complete; classes such as BottledOrthogonalLinear are not implemented at all. Is there a problem?
