Comments (2)
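Automatic conversion of model parameters will be supported later. For now, here is a conversion script for temporary use. The exact steps depend on the model, so the script has to be adapted; please refer to the notes below:
- Model prefix. PaddleNLP uses the model name as the prefix, for example qwen2. for Qwen2 and llama. for LLaMA, while HF uses model. or transformer.. See base_model_prefix in paddlenlp.transformers.XXX.modeling.XXXPretrainedModel (for example, LlamaPretrainedModel).
- Linear weight transposition. Paddle and torch implement linear differently, so their weights are transposes of each other. Which weights need transposing has to be decided per model; look up the mappings parameter in paddlenlp.transformers.XXX.modeling.XXXPretrainedModel, which marks the weights that must be transposed. For the llama model, this is paddlenlp.transformers.llama.modeling.LlamaPretrainedModel.
Script: https://gist.github.com/DrownFish19/80b43383c9205ee1cf7cf35445009488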
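The two adjustments described above (renaming the prefix and transposing linear weights) can be sketched as follows. This is a minimal illustration, not the linked script: the function name convert_state_dict, the sample keys, and the transpose list are hypothetical, NumPy arrays stand in for real checkpoint tensors, and the torch-to-Paddle direction is an assumption. The real list of weights to transpose should come from the model's mappings.

```python
import numpy as np


def convert_state_dict(src_state, src_prefix, dst_prefix, transpose_suffixes):
    """Rename the model prefix and transpose the listed 2-D linear weights.

    Hypothetical helper for illustration only. `transpose_suffixes` must be
    filled in per model from its mappings; it is not derived automatically.
    """
    suffixes = tuple(transpose_suffixes)
    converted = {}
    for name, weight in src_state.items():
        # Swap the framework-specific prefix, e.g. "model." -> "llama.".
        if name.startswith(src_prefix):
            name = dst_prefix + name[len(src_prefix):]
        # Linear weights are stored transposed between the two frameworks.
        if weight.ndim == 2 and name.endswith(suffixes):
            weight = weight.T
        converted[name] = np.ascontiguousarray(weight)
    return converted


# Toy checkpoint with made-up shapes (torch-style layout: [out, in]).
hf_state = {
    "model.layers.0.self_attn.q_proj.weight": np.zeros((8, 4), dtype="float32"),
    "model.embed_tokens.weight": np.zeros((10, 4), dtype="float32"),
    "lm_head.weight": np.zeros((10, 4), dtype="float32"),
}

paddle_state = convert_state_dict(
    hf_state,
    src_prefix="model.",
    dst_prefix="llama.",
    transpose_suffixes=["q_proj.weight"],  # assumed list; check the model's mappings
)

for key, value in paddle_state.items():
    print(key, value.shape)
```

In this toy run only q_proj.weight is transposed; the embedding keeps its shape, and lm_head.weight keeps its name because it does not start with the source prefix. Whether a given weight such as lm_head needs transposing is model-specific and should be checked against the mappings.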
Thanks!