chinese_nre's People

Contributors

dfxliziran, ningding97, zibuyu

chinese_nre's Issues

Pre-trained model?

Is there a pre-trained model checkpoint I can use for testing? Training takes too long because of the batch_size.

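About the dataset

In the test file, is the order of the entities in the first and second columns fixed? Must the entity that appears first in the sentence come first, and the one that appears later come second?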

  File "Chinese_NRE\nn\mglattice.py", line 175, in reset_parameters
    self.weight_hh.data.set_(weight_hh_data)
RuntimeError: set_storage is not allowed on a Tensor created from .data or .detach(). If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset) without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.

Changing
self.weight_hh.data.set_(weight_hh_data)

to
with torch.no_grad():
    self.weight_hh.data.set_(weight_hh_data)

did not help.

runtime error

ValueError: At least 2 points are needed to compute area under curve, but x.shape = 0
Is the AUC curve computation not receiving any data?
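The message above appears to match the check that scikit-learn's `auc` performs when it receives fewer than two points, which would mean the evaluation step produced an empty curve. One possible cause, stated here only as an assumption, is that the evaluated split contains no examples with a relation other than unknown. A minimal sketch of a guard follows; `safe_auc` is a hypothetical helper written in plain Python, not part of the repository or of scikit-learn.

```python
from typing import Optional, Sequence


def safe_auc(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Trapezoidal area under the curve, or None when it is undefined.

    Hypothetical helper: returns None instead of raising when fewer
    than two points are available, so the caller can skip or log the metric.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        return None  # not enough points to form a curve
    points = sorted(zip(x, y))  # integrate from left to right
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

Checking the length of the recall and precision lists before the AUC call should show whether the curve is really empty.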

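The data file contains an error

In the provided data file sense.txt, the embedding on the last line, for "铛铛车", is missing part of its values: that line has a length of only 175, whereas every other line has 201.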
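A check like the one described can be sketched as follows. It assumes each line of the embedding file is whitespace-separated, with the token first and the vector values after it; the function name `find_irregular_rows` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def find_irregular_rows(lines: Iterable[str]) -> Tuple[int, List[Tuple[int, str, int]]]:
    """Return the most common field count and the rows that deviate from it.

    Each reported row is (line_number, first_field, field_count).
    Assumes whitespace-separated fields; blank lines are ignored.
    """
    rows = [(n, line.split()) for n, line in enumerate(lines, start=1) if line.strip()]
    if not rows:
        return 0, []
    expected = Counter(len(fields) for _, fields in rows).most_common(1)[0][0]
    irregular = [(n, fields[0], len(fields)) for n, fields in rows if len(fields) != expected]
    return expected, irregular


# Possible usage on the real file (path is an assumption):
# with open("sense.txt", encoding="utf-8") as fh:
#     expected, irregular = find_irregular_rows(fh)
```

Rows reported as irregular can then be dropped or regenerated before the embeddings are loaded.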

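How do I change the batch_size?

The model involves so many components that I don't know which batch_size to change. GPU memory usage is only 980M. Could you please advise? Thank you very much!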

RuntimeError: set_storage is not allowed on Tensor created from .data or .detach()

Traceback (most recent call last):
  File "D:/pycharm/work/Chinese_NRE-master/main.py", line 187, in <module>
    train(data, configure.savemodel)
  File "D:/pycharm/work/Chinese_NRE-master/main.py", line 97, in train
    model = MGLattice_model(data)
  File "D:\pycharm\work\Chinese_NRE-master\nn\framework.py", line 15, in __init__
    self.encoder = BiLstmEncoder(data)
  File "D:\pycharm\work\Chinese_NRE-master\nn\encoder.py", line 98, in __init__
    self.forward_lstm = LatticeLSTM(lstm_input, lstm_hidden, data.gaz_dropout, data.gaz_alphabet.size(), data.gaz_emb_dim, data.pretrain_gaz_embedding, True, data.HP_fix_gaz_emb, self.gpu)
  File "D:\pycharm\work\Chinese_NRE-master\nn\mglattice.py", line 262, in __init__
    self.rnn = MultiInputLSTMCell(input_dim, hidden_dim)
  File "D:\pycharm\work\Chinese_NRE-master\nn\mglattice.py", line 163, in __init__
    self.reset_parameters()
  File "D:\pycharm\work\Chinese_NRE-master\nn\mglattice.py", line 174, in reset_parameters
    self.weight_hh.data.set_(weight_hh_data)
RuntimeError: set_storage is not allowed on Tensor created from .data or .detach()

About position embeddings

Regarding formula 1 in the paper: why does the code additionally add the maximum sentence length plus 1?

return x + maxlen + 1
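One plausible reading, offered as an assumption rather than the authors' confirmed intent: a relative position can be negative, while an embedding table is indexed from 0, so the offset is shifted into a non-negative range. If the offset is first clipped to [-maxlen, maxlen], then `x + maxlen + 1` lies in [1, 2 * maxlen + 1], which leaves index 0 free, for example for padding. The sketch below assumes that formula 1 defines a signed offset from the entity span; the function names and the clipping step are hypothetical.

```python
def relative_position(i: int, begin: int, end: int) -> int:
    """Signed offset of token i from the entity span [begin, end] (assumed form)."""
    if i < begin:
        return i - begin  # negative: token lies before the entity
    if i > end:
        return i - end    # positive: token lies after the entity
    return 0              # token lies inside the entity


def position_index(i: int, begin: int, end: int, maxlen: int) -> int:
    """Map the signed offset to a positive embedding index."""
    x = max(-maxlen, min(maxlen, relative_position(i, begin, end)))  # assumed clipping
    return x + maxlen + 1  # shift into [1, 2 * maxlen + 1]; 0 stays unused
```

Under these assumptions the position embedding table needs 2 * maxlen + 2 rows.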

I checked and re-labeled the FinRE dataset with these rules

I re-labeled the FinRE dataset as FinRE-v2 using the rules below:

  1. Split 订单 into the relations 订单 and 被下订单, so that the roles of "provider" and "client" can be captured.
  2. Add the relations 砍单 and 被砍单: since 增持 and 减持 both exist, there should be both 订单 and 砍单.
  3. Check and extend the company relations in [交易, 签约, 重组] to capture more specifically what kind of trade is involved, e.g. 买资, 收购, 持股, 增持 or 减持.

The complete relation class schema:

unknown 0
注资 1
拥有 2
纠纷 3
自己 4
增持 5
重组 6
买资 7
签约 8
持股 9
交易 10
入股 11
转让 12
成立 13
分析 14
合作 15
帮助 16
发行 17
商讨 18
合并 19
竞争 20
订单 21
砍单 22
减持 23
合资 24
收购 25
借壳 26
欠款 27
被发行 28
被转让 29
被成立 30
被注资 31
被持股 32
被拥有 33
被收购 34
被帮助 35
被借壳 36
被买资 37
被欠款 38
被增持 39
拟收购 40
被减持 41
被分析 42
被入股 43
被拟收购 44
被重组 45
被下订单 46
被砍单 47
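A schema in this "name id" layout can be loaded into a lookup table as sketched below. The helper names are hypothetical, and pairing a relation with its passive form by the "被" prefix is only a heuristic: it misses irregular pairs such as 订单 and 被下订单.

```python
from typing import Dict


def parse_schema(text: str) -> Dict[str, int]:
    """Parse lines of the form '<relation> <id>' into a relation-to-id map."""
    relation2id: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, idx = parts
        relation2id[name] = int(idx)
    return relation2id


def passive_pairs(relation2id: Dict[str, int]) -> Dict[str, str]:
    """Map each relation to its '被'-prefixed counterpart when one exists."""
    return {name: "被" + name for name in relation2id if "被" + name in relation2id}
```

Relations left unpaired by this heuristic can be reviewed by hand to confirm whether a passive counterpart is intended.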

The re-labeled dataset is available via a Google Drive link in my GitHub repo: https://github.com/A-baoYang/NLP-techniques-chinese/tree/main/NLU/Classification/RelationClassification

Paper

When will the paper be made available? I would like to study it.

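Is the model intended for distant supervision?

Hello, I read your code and noticed that AttClassifier appears to be a bag-level classifier for distant supervision, but the paper does not mention this. Are all the datasets used by the model distantly supervised datasets?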

question about train.txt

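In the first four lines of the training set, the pair "东方航空" and "上航" is labeled both as unknown and as the "合并" relation. How should this case be handled? Thank you very much for your reply.

东方航空 上航 unknown 东方航空AH股临时停牌传将与上航合并
上航 东方航空 unknown 东方航空AH股临时停牌传将与上航合并
东方航空 上航 合并 东方航空AH股临时停牌传将与上航合并
上航 东方航空 合并 东方航空AH股临时停牌传将与上航合并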
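Cases like the one quoted above can be located across the whole file with a short scan. The sketch assumes each line holds four whitespace-separated fields (head entity, tail entity, relation, sentence), as in the quoted lines. `find_conflicts` is a hypothetical helper that only reports conflicts; how to resolve them remains a question for the authors.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[str, str, str]  # (head entity, tail entity, sentence)


def find_conflicts(lines: Iterable[str]) -> Dict[Key, List[str]]:
    """Group instances by (head, tail, sentence) and keep those with several labels."""
    labels: Dict[Key, Set[str]] = defaultdict(set)
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        head, tail, relation, sentence = parts
        labels[(head, tail, sentence.strip())].add(relation)
    return {key: sorted(rels) for key, rels in labels.items() if len(rels) > 1}
```

Each key in the result identifies one entity pair and sentence that carries more than one relation label.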

Can the batch size be larger than 1?

Hello! I found that the default batch size is set to 1. Can we change it to a larger value to speed up training?

Besides, training takes too much time for me, and I wonder whether something is wrong with my setup. Could you share how long training took for you?
