zjy-ucas / ChineseNER
A neural network model for Chinese named entity recognition
What does the underscore in def main(_): mean?
If I delete the underscore and change it to main():, the program fails to run.
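A hedged illustration for the question above: a launcher in the style of tf.app.run calls main with the argument list, so main has to accept one positional parameter even if it ignores it. The run function below is a hypothetical stand-in written for this sketch, not TensorFlow's implementation.

```python
import sys


def run(main, argv=None):
    """Hypothetical stand-in for a launcher such as tf.app.run.

    It always calls main with exactly one argument (the argument list).
    """
    argv = sys.argv if argv is None else argv
    return main(argv)


def main(_):
    # "_" is an ordinary parameter name; by convention it marks a value
    # that is received but deliberately ignored.
    return "ran"


def main_without_parameter():
    return "ran"


if __name__ == "__main__":
    print(run(main, ["main.py"]))
    try:
        run(main_without_parameter, ["main.py"])
    except TypeError as err:
        # Calling a zero-parameter function with one argument fails.
        print("TypeError:", err)
```

Under this assumption, removing the underscore breaks the call because the launcher still passes one argument.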
I don't know why you add this. It is never used.
Hi,
I noticed that the pre-trained embedding file is not used in the embedding layer, which just uses a lookup function to generate the character embedding and the seg embedding. The pre-trained embedding appears to be used only when generating char_to_id. Have I misunderstood this? If not, why not use the pre-trained embedding to generate the input? Thanks!
self.seg_lookup = tf.get_variable(
    name="seg_embedding",
    shape=[self.num_segs, self.seg_dim],
    initializer=self.initializer)
What is the purpose of adding these lines to the embedding layer, together with the line embed = tf.concat(embedding, axis=-1)?
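As a rough sketch of what concatenating along the last axis does here: each character's vector and its segmentation-feature vector are joined into one longer vector per step. The plain-Python helpers, table sizes and values below are made up for illustration and are not the repository's code.

```python
def lookup(table, ids):
    """Return one embedding row per id (a stand-in for an embedding lookup)."""
    return [table[i] for i in ids]


def concat_last_axis(a, b):
    """Join two [num_steps, dim] lists step by step along the feature axis."""
    if len(a) != len(b):
        raise ValueError("both inputs need the same number of steps")
    return [row_a + row_b for row_a, row_b in zip(a, b)]


# Toy tables: 3 characters with 4-dim vectors, 4 segment labels with 2-dim vectors.
char_table = [[0.1 * i] * 4 for i in range(3)]
seg_table = [[float(i)] * 2 for i in range(4)]

char_ids = [2, 0, 1]  # one sentence of three characters
seg_ids = [1, 3, 0]   # one segmentation feature per character

embed = concat_last_axis(lookup(char_table, char_ids), lookup(seg_table, seg_ids))
print(len(embed), len(embed[0]))
```

The number of steps must agree between the two inputs; only the feature dimension grows (4 + 2 in this toy case).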
RT
File "/home/PycharmProjects/NER/ChineseNER-master/main.py", line 54, in
assert FLAGS.clip < 5.1, "gradient clip should't be too much"
File "/usr/local/lib/python3.5/dist-packages/absl/flags/_flagvalues.py", line 488, in getattr
raise _exceptions.UnparsedFlagAccessError(error_message)
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --clip before flags were parsed.
Hi:
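The model currently supports three entity types. To extend it to more entity types I need to add the corresponding corpora for training, but as more types are added the training time grows as well. After adding entity types, how can I train incrementally to reduce the training time?
Shouldn't for i in range(len(batch)) be changed to for i in range(len(str_lines))? Otherwise an index-out-of-range error occurs for some short sentences.
Do I need to modify the output in the conlleval file? I found that it is written in Perl, which is a hassle.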
Traceback (most recent call last):
File "", line 1, in
runfile('E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj/main.py', wdir='E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj')
File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj/main.py", line 246, in
train()
File "E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj/main.py", line 192, in train
step, batch_loss = model.run_step(sess, True, batch)
File "E:\【重点代码】ChineseNER-master-bishe\Gradu_Prj\model.py", line 221, in run_step
feed_dict)
File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 895, in run
run_metadata_ptr)
File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1097, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
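After deleting some of the sentences from the three files example.train, example.test and example.dev and saving them as txt documents, an error occurs at runtime.
Hello, I would like to study your code, but on a trial run I hit an error I cannot resolve. Earlier, rnn_cell_impl.LSTMStateTuple was reported as not found, and I fixed that by switching to tf.contrib.rnn.LSTMStateTuple. But whenever execution reaches outputs, final_states = tf.nn.bidirectional_dynamic_rnn(lstm_cell["forward"],lstm_cell["backward"],lstm_inputs,dtype=tf.float32,sequence_length=lengths), it raises NotImplementedError: Abstract method. I cannot find the cause and hope you can help. Thank you.
Hello, sorry to bother you. Most of what I have seen for Chinese uses word2vec word vectors, but the mainstream algorithms for Chinese NER work at the character level. How were your character vectors trained? Is there any material I could refer to?
Is it something like the Word2vec approach? Which corpus is it based on? Thank you very much.
I would like to know the source of your data and whether the dataset is complete. Thanks.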
line = input("请输入测试句子:")
print(line)
result = model.evaluate_line(sess, input_from_line(line, char_to_id), id_to_tag)
Are there any format requirements for the input test sentence?
Chinese input: 北京*** raises an error.
Numeric input: 3232132312 raises an error.
Hi @zjy-ucas
Thank you for sharing.
One small question: how did you obtain your training set, and how accurate is it?
Thanks~
Hi,
Did you encounter a bug like this:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [4341,100] rhs shape= [3637,100]
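when running python main.py?
This error occurs on the dev dataset. It is caused by inconsistent array lengths in the dev data: within one batch, 99 of the lists have length 72 and one has length 73.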
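A batch whose rows differ in length cannot be turned into a rectangular array. The following is a minimal sketch for locating the odd row and padding it; the helper names are hypothetical and not part of the repository.

```python
def find_ragged_rows(batch):
    """Return (index, length) for rows whose length differs from the most common length."""
    lengths = [len(row) for row in batch]
    common = max(set(lengths), key=lengths.count)
    return [(i, n) for i, n in enumerate(lengths) if n != common]


def pad_rows(batch, pad_value=0):
    """Right-pad every row to the longest row so the batch is rectangular."""
    width = max(len(row) for row in batch)
    return [row + [pad_value] * (width - len(row)) for row in batch]


# 99 rows of length 72 and one row of length 73, as in the report above.
batch = [[1] * 72 for _ in range(99)] + [[1] * 73]
ragged = find_ragged_rows(batch)
padded = pad_rows(batch)
print(ragged, {len(row) for row in padded})
```

Padding only hides the symptom; if the row lengths differ because characters and tags are misaligned in the data file, the data itself needs fixing.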
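The model uses the vectors provided in wiki_100. English text such as chanel is split into c, h, a, n, e, l. Is there a way to improve the handling of English input?
Is "char_embedding" if not name else name here
correct? I have not seen this syntax before.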
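The expression in question is Python's conditional expression, value_if_true if condition else value_if_false. A small self-contained sketch (the function name scope_name is made up for illustration):

```python
def scope_name(name=None):
    # Conditional expression: use the default when name is None or empty,
    # otherwise use the caller's name.
    return "char_embedding" if not name else name


print(scope_name())
print(scope_name("my_scope"))
```

It is equivalent to an if/else statement that picks between the default scope name and the one passed in.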
As in the title, the following error appeared while training the model:
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\cloudy\AppData\Local\Temp\jieba.cache
Loading model cost 1.237 seconds.
Prefix dict has been built succesfully.
Found 4313 unique words (979180 in total)
Loading pretrained embeddings from wiki_100.utf8...
Found 13 unique named entity tags
20864 / 0 / 4636 sentences in train / dev / test.
Traceback (most recent call last):
File "main.py", line 225, in <module>
tf.app.run(main)
File "D:\Anaconda3\envs\keras\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "main.py", line 219, in main
train()
File "main.py", line 150, in train
train_manager = BatchManager(train_data, FLAGS.batch_size)
File "C:\Users\cloudy\Desktop\ChineseNER\data_utils.py", line 285, in __init__
self.batch_data = self.sort_and_pad(data, batch_size)
File "C:\Users\cloudy\Desktop\ChineseNER\data_utils.py", line 293, in sort_and_pad
batch_data.append(self.pad_data(sorted_data[i*batch_size: (i+1)*batch_size]))
TypeError: slice indices must be integers or None or have an __index__ method
I tried several approaches myself without success. I hope you can help me solve this; I would be very grateful!
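The flag definitions are not shown in this thread, so the following rests on an assumption: one possible cause of this TypeError is that batch_size reaches the slice as a float (for example 20.0) rather than an int. A simplified sketch of length-sorted batching that guards against this, without the padding step:

```python
import math


def sort_and_batch(data, batch_size):
    """Split data into length-sorted batches (simplified: no padding)."""
    batch_size = int(batch_size)  # guard against a float such as 20.0
    num_batch = int(math.ceil(len(data) / batch_size))
    sorted_data = sorted(data, key=len)
    return [sorted_data[i * batch_size:(i + 1) * batch_size]
            for i in range(num_batch)]


data = ["abc", "a", "ab", "abcd", "abcde"]
batches = sort_and_batch(data, 2.0)
print(batches)
```

Without the int(...) conversion, an expression such as data[0:2.0] raises the same "slice indices must be integers" error.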
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,10,100] vs. shape[1] = [1,6,20]
[[Node: char_embedding/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](char_embedding/embedding_lookup, char_embedding/seg_embedding/embedding_lookup, char_embedding/concat/axis)]]
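What are the respective roles of dev and test under the data folder? Why are the dev data and the test data each evaluated separately during testing?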
For example, when defining the loss function, you expand logits and targets to [self.num_tags + 1, self.num_tags + 1].
def loss_layer(self, project_logits, lengths, name=None):
    """
    calculate crf loss
    :param project_logits: [1, num_steps, num_tags]
    :return: scalar loss
    """
    with tf.variable_scope("crf_loss" if not name else name):
        small = -1000.0
        # pad logits for crf loss
        start_logits = tf.concat(
            [small * tf.ones(shape=[self.batch_size, 1, self.num_tags]), tf.zeros(shape=[self.batch_size, 1, 1])], axis=-1)
        pad_logits = tf.cast(small * tf.ones([self.batch_size, self.num_steps, 1]), tf.float32)
        logits = tf.concat([project_logits, pad_logits], axis=-1)
        logits = tf.concat([start_logits, logits], axis=1)
        targets = tf.concat(
            [tf.cast(self.num_tags*tf.ones([self.batch_size, 1]), tf.int32), self.targets], axis=-1)
        self.trans = tf.get_variable(
            "transitions",
            shape=[self.num_tags + 1, self.num_tags + 1],
            initializer=self.initializer)
        log_likelihood, self.trans = crf_log_likelihood(
            inputs=logits,
            tag_indices=targets,
            transition_params=self.trans,
            sequence_lengths=lengths+1)
        return tf.reduce_mean(-log_likelihood)
But in fact the model works fine with the original logits and targets, as in the following code, so what is the purpose of the padding? Thanks!
def loss_layer(self, project_logits, lengths, name=None):
    self.trans = tf.get_variable(
        "transitions",
        shape=[self.num_tags, self.num_tags],
        initializer=self.initializer)
    log_likelihood, self.trans = crf_log_likelihood(
        inputs=self.logits,
        tag_indices=self.targets,
        transition_params=self.trans,
        sequence_lengths=lengths)
    return tf.reduce_mean(-log_likelihood)
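To make the padding in the first version easier to follow, here is a plain-Python sketch of the same transformation for a single sequence. One possible reading, not confirmed by the author in this thread, is that the extra tag serves as an explicit start state, so the [num_tags + 1, num_tags + 1] transition matrix also holds a score for moving from the start state into each first tag. The function name is hypothetical.

```python
SMALL = -1000.0


def pad_for_start_tag(logits, targets):
    """Add a start step and a start tag to one sequence.

    logits:  [num_steps][num_tags] scores
    targets: [num_steps] gold tag ids
    Returns logits of shape [num_steps + 1][num_tags + 1]
    and targets of length num_steps + 1.
    """
    num_tags = len(logits[0])
    start_id = num_tags  # the extra tag takes the next free id
    # Step 0: every real tag is heavily penalised, only the start tag scores 0.
    start_row = [SMALL] * num_tags + [0.0]
    # Real steps: the start tag is heavily penalised.
    padded = [start_row] + [row + [SMALL] for row in logits]
    return padded, [start_id] + targets


padded_logits, padded_targets = pad_for_start_tag(
    [[1.0, 2.0], [3.0, 4.0]], [0, 1])
print(padded_logits)
print(padded_targets)
```

This also shows why the first version passes lengths+1 as the sequence lengths: every sequence gains one step.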
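Has anyone tried running the program on a GPU? It is very slow, even slower than on the CPU, and I do not know why.
Could someone please clear this up for me? Much appreciated~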
Traceback (most recent call last):
File "F:/yyhaker/software/project/NamedEntityRecognition/src/ChineseNER/main.py", line 225, in
if __name__ == "__main__":
File "D:\perhack\Anaconda3\envs\my_pytorch\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "F:/yyhaker/software/project/NamedEntityRecognition/src/ChineseNER/main.py", line 219, in main
clean(FLAGS)
File "F:/yyhaker/software/project/NamedEntityRecognition/src/ChineseNER/main.py", line 114, in train
# create maps if not exist
NameError: name 'os' is not defined
I have the os module installed, and it otherwise runs correctly. What is wrong here?
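A NameError for os usually means the name was never bound in that file, which is separate from whether the module is available: os ships with Python but still has to be imported in every file that uses it. Whether that is the cause here is an assumption, since the edited main.py is not shown. A minimal sketch with a hypothetical helper:

```python
import os  # without this line, any use of os below raises NameError
import tempfile


def ensure_dir(path):
    """Create path if it does not exist and report whether it exists afterwards."""
    if not os.path.isdir(path):
        os.makedirs(path)
    return os.path.isdir(path)


base = tempfile.mkdtemp()
created = ensure_dir(os.path.join(base, "ckpt"))
print(created)
```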
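After running python main.py, do I just type a sentence at the prompt to see the computed result?
The segmentation results do not look very good. Is there some other way to do this? Thanks!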
请输入测试句子:老张开车去东北玩。
Result:
[{'end': 3, 'start': 1, 'type': 'PER', 'word': '老张开'},
{'end': 4, 'start': 1, 'type': 'PER', 'word': '车'},
{'end': 5, 'start': 4, 'type': 'LOC', 'word': '去'},
{'end': 6, 'start': 5, 'type': 'LOC', 'word': '东'},
{'end': 7, 'start': 6, 'type': 'LOC', 'word': '北'},
{'end': 8, 'start': 7, 'type': 'LOC', 'word': '玩'},
{'end': 9, 'start': 8, 'type': 'LOC', 'word': '。'}]
Using the evaluate_line method in model.py produces the following:
Sentence: 他的检验报告等。
Annotation: “报告”
Position: 4, 6
Could you explain the parameter settings in main.py and what they mean?
In tensorflow 1.10, rnn_cell has been removed from tensorflow.python.ops; the similar functionality is in tensorflow.contrib.rnn. You can change the fourth line of model.py to import tensorflow.contrib.rnn as rnn_cell (a quick-and-dirty workaround).
The argument order of tf.concat() was changed. Every rnn_inputs = tf.concat(2, [rnn_inputs, self.features]) should be changed to rnn_inputs = tf.concat([rnn_inputs, self.features], 2).
tf.batch_matmul() has been removed and should be replaced with tf.matmul().
def create_model(session, Model_class, path, load_vec, config, id_to_char, logger):
    # create model, reuse parameters if exists
    model = Model_class(config)
    ckpt = tf.train.get_checkpoint_state(path)
    if ckpt and tf.train.checkpoint_exists(ckpt.model_checkpoint_path):
        logger.info("Reading model parameters from %s" % ckpt.model_checkpoint_path)
        model.saver.restore(session, ckpt.model_checkpoint_path)
    else:
        logger.info("Created model with fresh parameters.")
        session.run(tf.global_variables_initializer())
        if config["pre_emb"]:
            emb_weights = session.run(model.char_lookup.read_value())
            emb_weights = load_vec(config["emb_file"],id_to_char, config["char_dim"], emb_weights)
            session.run(model.char_lookup.assign(emb_weights))
            logger.info("Load pre-trained embedding.")
    return model
Sorry for the trouble, and thank you.
def input_from_line(line, char_to_id):
    """
    Take sentence data and return an input for
    the training or the evaluation function.
    """
    line = full_to_half(line)
    line = replace_html(line)
    inputs = list()
    inputs.append([line])
    line.replace(" ", "$")
    inputs.append([[char_to_id[char] if char in char_to_id else char_to_id["<UNK>"]
                   for char in line]])
    inputs.append([get_seg_features(line)])
    inputs.append([[]])
    return inputs
line.replace(" ", "$") has no effect: str.replace returns a new string and the result is never assigned, so line is unchanged.
Should it be changed to
line = re.sub(r'\s', '$', line)
?
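A short demonstration of the point above. Python strings are immutable, so str.replace returns a new string and leaves the original untouched. Note that the proposed re.sub with \s is slightly broader than the original call, since it replaces every whitespace character and not only spaces.

```python
import re

line = "a b\tc"

line.replace(" ", "$")  # result discarded; line is unchanged
unchanged = line

replaced = line.replace(" ", "$")          # only the space is replaced
all_whitespace = re.sub(r"\s", "$", line)  # space and tab are both replaced

print(repr(unchanged), repr(replaced), repr(all_whitespace))
```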
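Hello. In the project_layer method in model.py, the comment on line 138 says :param lstm_outputs: [batch_size, num_steps, emb_size]. project_layer sits between the BiLSTM layer and the logits layer, so its input should be the output of bilstm_layer: [batch_size,num_steps,2*lstm]. Is my understanding correct?
I cannot seem to get around tf.app.run(), so I have no way to build an interface that calls the entity recognition functionality in main.
Hello! A beginner question: which tool was used to arrange the training and prediction corpora in origin_data into that format? Could you provide the code? Thanks!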
Traceback (most recent call last):
File "main.py", line 227, in
if __name__ == "__main__":
File "C:\Python35\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "main.py", line 221, in main
clean(FLAGS)
File "main.py", line 187, in train
File "main.py", line 87, in evaluate
ner_results = model.evaluate(sess, data, id_to_tag)
File "C:\pyproject\ChineseNER-master\utils.py", line 66, in test_ner
eval_lines = return_report(output_file)
File "C:\pyproject\ChineseNER-master\conlleval.py", line 284, in return_report
counts = evaluate(f)
File "C:\pyproject\ChineseNER-master\conlleval.py", line 74, in evaluate
for line in iterable:
File "C:\Python35\lib\codecs.py", line 711, in next
return next(self.reader)
File "C:\Python35\lib\codecs.py", line 642, in next
line = self.readline()
File "C:\Python35\lib\codecs.py", line 555, in readline
data = self.read(readsize, firstline=True)
File "C:\Python35\lib\codecs.py", line 501, in read
newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 0: invalid start byte
Hi, could you tell me how the sighan.dev dataset differs from the training set and the test set?
Traceback (most recent call last):
File "E:\python2.7\pycharm\PyCharm 4.5.5\helpers\pydev\pydevd.py", line 2358, in
globals = debugger.run(setup['file'], None, None, is_module)
File "E:\python2.7\pycharm\PyCharm 4.5.5\helpers\pydev\pydevd.py", line 1778, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "E:\python2.7\pycharm\PyCharm 4.5.5\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "ChineseNER-master/main.py", line 225, in
tf.app.run(main)
File "tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "ChineseNER-master/main.py", line 219, in main
train()
File "ChineseNER-master/main.py", line 185, in train
best = evaluate(sess, model, "dev", dev_manager, id_to_tag, logger)
File "ChineseNER-master/main.py", line 85, in evaluate
eval_lines = test_ner(ner_results, FLAGS.result_path)
File "ChineseNER-master\utils.py", line 66, in test_ner
eval_lines = return_report(output_file)
File "ChineseNER-master\conlleval.py", line 282, in return_report
counts = evaluate(f)
File "ChineseNER-master\conlleval.py", line 74, in evaluate
for line in iterable:
File "tensorflow\lib\codecs.py", line 713, in next
return next(self.reader)
File "tensorflow\lib\codecs.py", line 644, in next
line = self.readline()
File "tensorflow\lib\codecs.py", line 557, in readline
data = self.read(readsize, firstline=True)
File "tensorflow\lib\codecs.py", line 501, in read
newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 0: invalid start byte
I am on tensorflow 1.3. Has anyone run into a similar problem? Is there a solution?
The project_layer method at line 135 of model.py defines a hidden layer with dimensions [self.lstm_dim*2, self.lstm_dim] and then defines pred with dimensions [self.lstm_dim, self.num_tags].
Why not define just a single hidden layer with dimensions [self.lstm_dim*2, self.num_tags]?
for word in sentence:
    for char in word:
        pass  # do something
    if word.lower() == word:
        pass  # do something
    if word[0].upper() == word:
        pass  # do something
Thank you in advance!
It says there is no config_file. Where can I get this file?
Why do I get the following error when I train on my own dataset?
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.