dianbowork / graph4cner
Source code for the paper "Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network"
I am using weiboNER_2nd_conll.train as the training data. Its raw format is:
['科0\tO\n', '技1\tO\n', '全0\tO\n', '方1\tO\n', '位2\tO\n', '资0\tO\n', '讯1\tO\n', '智0\tO\n', '能1\tO\n', ',0\tO\n', '快0\tO\n', '捷1\tO\n', '的0\tO\n', '汽0\tO\n', '车1\tO\n', '生0\tO\n', '活1\tO\n', '需0\tO\n', '要1\tO\n', '有0\tO\n', '三1\tO\n', '屏2\tO\n', '一0\tO\n', '云1\tO\n', '爱2\tO\n', '你0\tO\n', '\n']
The code you provided then converts it to:
['科0', '技0', '全0', '方0', '位0', '资0', '讯0', '智0', '能0', ',0', '快0', '捷0', '的0', '汽0', '车0', '生0', '活0', '需0', '要0', '有0', '三0', '屏0', '一0', '云0', '爱0', '你0']
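After this processing, however, the matched gazs are all empty. So I would like to ask: should the raw data instead be converted to the following during preprocessing?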
['科', '技', '全', '方', '位', '资', '讯', '智', '能', ',', '快', '捷', '的', '汽', '车', '生', '活', '需', '要', '有', '三', '屏', '一', '云', '爱', '你']
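One possible preprocessing step for the question above, as a minimal sketch. It assumes every token in weiboNER_2nd_conll.train is a character followed by a single-digit segmentation index, as in the sample shown, and strips that digit before any lexicon matching. The name `strip_seg_index` is hypothetical and not part of the repository.

```python
from typing import Optional, Tuple


def strip_seg_index(line: str) -> Optional[Tuple[str, str]]:
    """Split one tab-separated CoNLL line into (character, label).

    Assumption: the token is one character followed by a single-digit
    segmentation index. Blank lines (sentence boundaries) return None.
    """
    line = line.rstrip("\n")
    if not line:
        return None
    token, label = line.split("\t")
    # Drop the trailing index only when something precedes it, so a bare
    # one-character token is left untouched.
    if len(token) > 1 and token[-1].isdigit():
        token = token[:-1]
    return token, label


raw = ['科0\tO\n', '技1\tO\n', ',0\tO\n', '\n']
pairs = [strip_seg_index(l) for l in raw]
chars = [p[0] for p in pairs if p is not None]
print(chars)
```

Stripping the index before digit normalization keeps '技1' from turning into '技0', which no lexicon entry would contain.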
Hello, I trained with the data you provided. After I changed the batch size in the shell script, acc stays at 0. What could be the problem?
When reproducing the code, did you need to download any additional files?
Hi,
I got this error when trying to run the code
gat_input = torch.cat((lstm_feature, gaz_feature), dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 8 but got size 9 for tensor number 1 in the list.
Any ideas, please?
Thank you.
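The message says the two tensors disagree in a dimension other than the one being concatenated (8 versus 9). A small helper like the one below can show which axis differs before the `torch.cat` call. It is a plain-Python sketch that works on shape tuples; the name `cat_mismatches` and the example shapes are hypothetical, and it does not explain why the shapes differ in this repository.

```python
from typing import List, Sequence, Tuple


def cat_mismatches(
    shapes: Sequence[Tuple[int, ...]], dim: int
) -> List[Tuple[int, int, int, int]]:
    """Return (tensor_index, axis, expected, got) for every size that would
    make concatenation along `dim` fail. An empty list means compatible."""
    reference = shapes[0]
    problems = []
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(reference):
            raise ValueError(
                f"tensor {idx} has {len(shape)} dims, expected {len(reference)}"
            )
        for axis, (expected, got) in enumerate(zip(reference, shape)):
            # Only the concatenation axis itself is allowed to differ.
            if axis != dim and expected != got:
                problems.append((idx, axis, expected, got))
    return problems


# Made-up shapes for illustration only; in practice pass
# tuple(lstm_feature.shape) and tuple(gaz_feature.shape).
print(cat_mismatches([(8, 30, 50), (9, 12, 50)], dim=1))
```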
Dev F1 is only 50 and test F1 is only 47 after running the full 150 epochs. The dataset comes from https://github.com/hltcoe/golden-horse
Hello, are the self-matched words obtained by matching, via a Trie, against the words in the word embeddings (sgns.merge.word)?
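For readers unfamiliar with the idea behind this question, here is a minimal sketch of Trie-based lexicon matching over a character sequence. It is not the repository's implementation and does not confirm how the authors build self-matched words; the class names, `match_from`, and the toy lexicon are all hypothetical.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def match_from(self, chars: List[str], start: int) -> List[str]:
        """Return every lexicon word that begins at position `start`."""
        node = self.root
        found = []
        for end in range(start, len(chars)):
            node = node.children.get(chars[end])
            if node is None:
                break
            if node.is_word:
                found.append("".join(chars[start:end + 1]))
        return found


lexicon = ["汽车", "生活", "汽车生活"]  # toy lexicon, not sgns.merge.word
trie = Trie()
for w in lexicon:
    trie.insert(w)

sentence = list("快捷的汽车生活")
matches = {i: trie.match_from(sentence, i) for i in range(len(sentence))}
print({i: m for i, m in matches.items() if m})
```

Matching only succeeds when the sentence characters are identical to the lexicon characters, so any suffix left on a character (such as a segmentation digit) makes every lookup fail.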
When running on the Resume dataset, a "Floating point exception (core dumped)" error occurred at epoch 17.
The server reports "Killed". Is our memory too small? How much memory did you use for training?
File "D:\pycode\Graph4CNER\utils\functions.py", line 16, in read_instance
in_lines = open(input_file, 'r').readlines()
UnicodeDecodeError: 'gbk' codec can't decode byte 0x91 in position 2: illegal multibyte sequence
After replacing 'r' with 'rb', I get AttributeError: 'int' object has no attribute 'isdigit'
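A likely cause, stated as an assumption: the data file is UTF-8, while `open` on this Windows setup falls back to the 'gbk' codec, as the traceback shows. Opening with 'rb' returns bytes, and iterating over bytes yields integers, which would explain the `isdigit` error. The sketch below reads with an explicit encoding instead; `read_lines` is a hypothetical helper, and the corresponding change in the repository would be to the `open` call in `read_instance`.

```python
import os
import tempfile
from typing import List


def read_lines(path: str, encoding: str = "utf-8") -> List[str]:
    """Read text lines with an explicit encoding, not the platform default."""
    with open(path, "r", encoding=encoding) as f:
        return f.readlines()


# Demonstration on a temporary UTF-8 file standing in for the real data file.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".train", delete=False
) as tmp:
    tmp.write("科0\tO\n技1\tO\n")
    demo_path = tmp.name

lines = read_lines(demo_path)
os.remove(demo_path)
print(lines)
```

If the file turns out not to be UTF-8, the same helper can be called with a different `encoding` value.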
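I replaced the input character embeddings with BERT character embeddings and left the rest of the architecture unchanged. The results are very poor: F1 is below 50, training takes much longer, and the loss becomes very large (tens of thousands). Has anyone run into this problem? Thanks!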
Hello author, there are no free GPUs left. Can this program be run on the CPU?
Hello, have you run into this? FileNotFoundError: [Errno 2] No such file or directory: './data/embeddings/sgns.merge.word'
I found many variables and functions that start or end with gaz,
which look like word-related things. I don't quite understand functions like seq_gaz
... Can you explain what gaz is?
Hello author, did you use weiboNER.conll.train or weiboNER_2nd_conll.train as the dataset in the paper?
Hello, when I test on the Weibo dataset I only get a dev score at the end and no test score. Do I need to add the test set part myself?
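Hello, I used the parameters you provided, but the results are much worse. I replaced the LSTM with a form of Transformer, and in my training the results are slightly better than before, but still far below the results in your paper.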