huanghuidmml / tfbert

58.0 58.0 11.0 7.63 MB

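Pretrained-model loading built on TensorFlow 1.x, with support for single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Training, evaluation, and prediction can each be run flexibly.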

Python 100.00%

albert bert chinesebert electra ernie ernie-gram mixed-precision tensorflow trainer xla

tfbert's People

Contributors

Stargazers

Watchers

Forkers

823858275 sataliulan holykikyou lazykindman ericdoug-qi starlight-2021 haojiepan1 slidersun haierai chrismii hanguangmic

tfbert's Issues

A problem occurs when I use the chinese_bert_chinese_wwm_L-12_H-768_A-12 model with the run_element_extract.py script.

WARNING:tensorflow:From /opt/tfbert/tfbert/models/layers.py:28: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
Traceback (most recent call last):
  File "/opt/tfbert/run_element_extract.py", line 292, in <module>
    main()
  File "/opt/tfbert/run_element_extract.py", line 253, in main
    args.model_dir if args.pretrained_checkpoint_path is None else args.pretrained_checkpoint_path)
  File "/opt/tfbert/tfbert/trainer.py", line 175, in from_pretrained
    utils.init_checkpoints(ckpt, True)
  File "/opt/tfbert/tfbert/utils.py", line 261, in init_checkpoints
    prefix=prefix)
  File "/opt/tfbert/tfbert/utils.py", line 239, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "/opttensorflow_core/python/training/checkpoint_utils.py", line 97, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "/opttensorflow_core/python/training/checkpoint_utils.py", line 66, in load_checkpoint
    return pywrap_tensorflow.NewCheckpointReader(filename)
  File "/opttensorflow_core/python/pywrap_tensorflow_internal.py", line 873, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern))
  File "/opttensorflow_core/python/pywrap_tensorflow_internal.py", line 885, in __init__
    this = _pywrap_tensorflow_internal.new_CheckpointReader(filename)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /opt/models/bert/chinese_bert_chinese_wwm_L-12_H-768_A-12/publish/bert_model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
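
The error names the .data-00000-of-00001 shard as the file it tried to open. One common cause of this message is passing the shard file itself instead of the checkpoint prefix (here, the path ending in bert_model.ckpt); a corrupted or incomplete download is another. Neither is confirmed by the report, so treat the following as a sketch under that assumption. The helper name to_checkpoint_prefix is hypothetical and not part of tfbert.

```python
import re

# Suffixes that TensorFlow appends to a checkpoint prefix on disk.
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def to_checkpoint_prefix(path):
    """Strip a trailing .index / .meta / .data-XXXXX-of-XXXXX suffix, if any.

    Paths that are already a prefix are returned unchanged.
    """
    return _CKPT_SUFFIX.sub("", path)


if __name__ == "__main__":
    # Hypothetical path, shortened for illustration.
    print(to_checkpoint_prefix("publish/bert_model.ckpt.data-00000-of-00001"))
```

The resulting prefix is what would be passed as the pretrained checkpoint path. If the error persists with the prefix, re-downloading the checkpoint files is the next thing to try.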

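Loss fluctuates too much between batches within a single epoch

When I train on my data with the author's code, the loss still fluctuates heavily after the model has converged. Other code does not show this problem (on an English training corpus).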
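
To judge whether the fluctuation is ordinary per-batch noise or a real training problem, it can help to compare the raw per-batch loss with a smoothed curve. The following is a minimal sketch, not something taken from tfbert: it applies a bias-corrected exponential moving average to a list of logged loss values. The function name smooth_losses and the default beta=0.9 are assumptions chosen for illustration.

```python
def smooth_losses(losses, beta=0.9):
    """Bias-corrected exponential moving average of per-batch losses.

    beta close to 1 smooths more; the division by (1 - beta**step)
    removes the bias toward zero in the first few steps.
    """
    smoothed = []
    avg = 0.0
    for step, loss in enumerate(losses, start=1):
        avg = beta * avg + (1.0 - beta) * loss
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


if __name__ == "__main__":
    raw = [1.0, 3.0] * 10  # artificial, strongly oscillating losses
    for r, s in zip(raw, smooth_losses(raw)):
        print(f"{r:.2f} -> {s:.3f}")
```

If the smoothed curve is flat while the raw values swing, the swings are likely batch-level noise (small batches or uneven batch difficulty); a smoothed curve that itself drifts or oscillates points to something else, such as the learning rate.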

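Model loading problem

On an NER task, loading the saved model and testing scores five points lower than testing directly after training.
If I skip variable initialization after loading the model, an error is raised. Could this be because some parameters were not loaded?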
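
One way to check the "some parameters were not loaded" hypothesis is to compare the variable names in the graph with the keys stored in the checkpoint. The sketch below is pure Python and only does the comparison; the function name split_by_checkpoint and the example variable names are hypothetical. In TensorFlow 1.x the two name lists could be obtained with [v.name for v in tf.global_variables()] and [name for name, _ in tf.train.list_variables(ckpt_path)].

```python
def split_by_checkpoint(graph_var_names, ckpt_var_names):
    """Return (restorable, missing) lists of graph variable names.

    Graph variable names carry an output suffix such as ":0", while
    checkpoint keys do not, so the suffix is dropped before comparing.
    """
    ckpt = set(ckpt_var_names)
    restorable, missing = [], []
    for name in graph_var_names:
        key = name.rsplit(":", 1)[0] if ":" in name else name
        (restorable if key in ckpt else missing).append(name)
    return restorable, missing


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    graph = ["bert/embeddings/word_embeddings:0", "crf/transitions:0"]
    ckpt = ["bert/embeddings/word_embeddings"]
    ok, missing = split_by_checkpoint(graph, ckpt)
    print("restorable:", ok)
    print("missing:", missing)
```

Any name in the missing list would keep its random initial value after loading, which could explain both the initialization error and a drop in test score.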

I got these errors when running run_element_extract.py.

/opt/conda/lib/python3.7/site-packages/numpy/core/_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
return array(a, dtype, copy=False, order=order)
Traceback (most recent call last):
  File "run_element_extract.py", line 277, in <module>
    main()
  File "run_element_extract.py", line 234, in main
    trainer.build_model(model_fn=get_model_fn(config, args))
  File "/opt/tfbert/tfbert/trainer.py", line 621, in build_model
    model_output = model_fn(inputs, True)
  File "run_element_extract.py", line 126, in model_fn
    **inputs
  File "/opt/tfbert/tfbert/models/for_task.py", line 173, in __init__
    compute_type=compute_type
  File "/opt/tfbert/tfbert/models/bert.py", line 152, in __init__
    input_shape = model_utils.get_shape_list(input_ids, expected_rank=2)
  File "/opt/tfbert/tfbert/models/model_utils.py", line 196, in get_shape_list
    assert_rank(tensor, expected_rank, name)
  File "/opt/tfbert/tfbert/models/model_utils.py", line 241, in assert_rank
    (name, scope_name, actual_rank, str(tensor.shape), str(expected_rank)))
ValueError: For the tensor IteratorGetNext:1 in scope ``, the actual rank 1 (shape = (?,)) is not equal to the expected rank `
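
The NumPy warning about ragged nested sequences, together with a rank-1 tensor where rank 2 is expected, suggests that the input sequences have different lengths, so NumPy builds a 1-D object array instead of a 2-D integer array. This is an inference from the log, not something the report confirms. Under that assumption, padding every sequence to a common length before building the dataset would restore the expected shape. The helper pad_sequences below is a hypothetical pure-Python sketch, not a tfbert function.

```python
def pad_sequences(sequences, max_len, pad_id=0):
    """Truncate or right-pad every sequence to exactly max_len items."""
    padded = []
    for seq in sequences:
        seq = list(seq)[:max_len]
        padded.append(seq + [pad_id] * (max_len - len(seq)))
    return padded


if __name__ == "__main__":
    # Hypothetical token ids of unequal length.
    input_ids = [[101, 2769, 102], [101, 102]]
    batch = pad_sequences(input_ids, max_len=4)
    print(batch)  # every row now has 4 items
```

Once all rows share one length, converting the result to an array gives a (batch, max_len) shape, which matches the rank-2 check in get_shape_list.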

tfbert's Issues

Multi-GPU training raises an error

UNILM implementation

Where is the FGM implementation going wrong?

A problem occurs when I use the chinese_bert_chinese_wwm_L-12_H-768_A-12 model with the run_element_extract.py script.

Loss fluctuates too much between batches within a single epoch

Model loading problem

I got these errors when running run_element_extract.py.

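Saving the model when using adversarial training

When pretraining ChineseBERT on 7 GB of data, the dataset is rather large. How can it be converted to TFRecord format?

Is the author looking for a job?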
