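Hello, and thank you for your interest in the 夫子•明察 (fuzi.mingcha) project.
We followed several open-source projects [1][2] and the pretraining approach of other models (such as LLaMA), and used a plain autoregressive generation task rather than fully adopting the BERT-like training tasks used in the GLM paper.
This autoregressive generation task roughly corresponds to Self-Supervised Blank Infilling (95% tokens) in the GLM-130B paper [3].
The format we use is:
[gMASK] <sop> X1 [gMASK] <sop> X2 [gMASK] <sop> X3 ...
Here [gMASK] and <sop> are special tokens in ChatGLM; [gMASK] is the mask placeholder for long-text generation.
For the training corpus, we used publicly available datasets of legal documents and of laws and regulations; see the introduction to the training data for details.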
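As a rough illustration of the sample format described in this reply, here is a minimal sketch that joins already-tokenized documents into one [gMASK] <sop> X1 [gMASK] <sop> X2 ... sequence and derives next-token inputs and labels from it. The token ids `GMASK_ID` and `SOP_ID` and both function names are hypothetical and chosen only for this example; the real ids come from the ChatGLM tokenizer, and the project's actual preprocessing code may differ.

```python
from typing import List, Tuple

# Hypothetical special-token ids, for illustration only.
# The real values come from the ChatGLM tokenizer vocabulary.
GMASK_ID = 1
SOP_ID = 2


def build_sequence(docs: List[List[int]]) -> List[int]:
    """Join tokenized documents as [gMASK] <sop> X1 [gMASK] <sop> X2 ..."""
    sequence: List[int] = []
    for doc in docs:
        # Each document is prefixed with the two special tokens.
        sequence.extend([GMASK_ID, SOP_ID])
        sequence.extend(doc)
    return sequence


def next_token_pairs(sequence: List[int]) -> Tuple[List[int], List[int]]:
    """Plain autoregressive objective: predict token t+1 from tokens up to t."""
    inputs = sequence[:-1]   # everything except the last token
    labels = sequence[1:]    # the same sequence shifted left by one
    return inputs, labels
```

For example, `build_sequence([[10, 11], [20]])` gives `[1, 2, 10, 11, 1, 2, 20]`, and `next_token_pairs` on that result pairs each position with the token that follows it.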
[1] https://github.com/hiyouga/ChatGLM-Efficient-Tuning
[2] https://github.com/hiyouga/LLaMA-Efficient-Tuning
[3] GLM-130B: An Open Bilingual Pre-trained Model, Zeng et al., ICLR 2023