chatglm-instruct-tuning's Introduction

ChatGLM-Instruct-Tuning

Fine-tunes Tsinghua's ChatGLM-6B in the Alpaca style.

Dataset: Chinese Alpaca

Preparation

Install dependencies

pip install -r requirements.txt

Download the data

cd data
git clone https://github.com/carbonz0/alpaca-chinese-dataset

Data preprocessing

Convert the Alpaca dataset into Instruct-format data stored one record per line

python cover_alpaca2jsonl.py
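
The conversion step can be pictured with a minimal sketch. It assumes the source file is a JSON list of records with the usual Alpaca keys `instruction`, `input`, and `output`. The function name `alpaca_to_jsonl` and the output keys `context` and `target` are hypothetical, chosen for illustration only; the schema that cover_alpaca2jsonl.py actually writes may differ.

```python
import json
from pathlib import Path


def alpaca_to_jsonl(src_path, dst_path):
    """Write each Alpaca record as one JSON object per line; return the record count."""
    records = json.loads(Path(src_path).read_text(encoding="utf-8"))
    with open(dst_path, "w", encoding="utf-8") as out:
        for rec in records:
            # Hypothetical prompt layout: instruction, optional input, then an answer cue.
            context = f"Instruction: {rec['instruction']}\n"
            if rec.get("input"):
                context += f"Input: {rec['input']}\n"
            context += "Answer: "
            line = {"context": context, "target": rec["output"]}
            # ensure_ascii=False keeps Chinese text readable in the output file.
            out.write(json.dumps(line, ensure_ascii=False) + "\n")
    return len(records)
```

Because `json.dumps` escapes newlines inside strings, every record stays on a single physical line, which is what a line-oriented loader needs.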

Then split the data into train.txt and valid.txt and save them under ./data/example/
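
The README does not say how to do the split. One possible approach is a seeded random split of the line-per-record file, sketched below. The function name `split_lines`, the 10% validation ratio, and the seed are assumptions, not settings taken from the repository.

```python
import random


def split_lines(src_path, train_path, valid_path, valid_ratio=0.1, seed=42):
    """Shuffle non-empty lines and write them to a train file and a validation file."""
    with open(src_path, encoding="utf-8") as f:
        # Normalize line endings so a final line without "\n" cannot merge with another.
        lines = [line.rstrip("\n") + "\n" for line in f if line.strip()]
    if not lines:
        raise ValueError("no records found in " + str(src_path))
    random.Random(seed).shuffle(lines)  # seeded, so the split is reproducible
    n_valid = max(1, int(len(lines) * valid_ratio))
    with open(valid_path, "w", encoding="utf-8") as f:
        f.writelines(lines[:n_valid])
    with open(train_path, "w", encoding="utf-8") as f:
        f.writelines(lines[n_valid:])
    return len(lines) - n_valid, n_valid
```

A call such as `split_lines("data/alpaca.jsonl", "data/example/train.txt", "data/example/valid.txt")` would produce the two files; the input file name here is a placeholder.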

Training

bash scripts/finetune.sh

Inference

# First replace "output/your_model_dir" in the file with the actual model path
python infer.py

Contact

If you have questions or suggestions, you are welcome to join our large language model discussion group

chatglm-instruct-tuning's People

Contributors

s65b40, thinksoso

chatglm-instruct-tuning's Issues

Why does running the code fail with ValueError: 150001 is not in list?

Traceback (most recent call last):
File "run_clm.py", line 564, in <module>
main()
File "run_clm.py", line 512, in main
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/opt/conda/envs/GLM_instruct/lib/python3.8/site-packages/transformers/trainer.py", line 1633, in train
return inner_training_loop(
File "/opt/conda/envs/GLM_instruct/lib/python3.8/site-packages/transformers/trainer.py", line 1902, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/opt/conda/envs/GLM_instruct/lib/python3.8/site-packages/transformers/trainer.py", line 2645, in training_step
loss = self.compute_loss(model, inputs)
File "/opt/conda/envs/GLM_instruct/lib/python3.8/site-packages/transformers/trainer.py", line 2677, in compute_loss
outputs = model(**inputs)
File "/opt/conda/envs/GLM_instruct/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/[email protected]/lck/ChatGLM-Instruct-Tuning/modeling_chatglm.py", line 1033, in forward
transformer_outputs = self.transformer(
File "/opt/conda/envs/GLM_instruct/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/[email protected]/lck/ChatGLM-Instruct-Tuning/modeling_chatglm.py", line 836, in forward
mask_position = seq.index(mask_token)
ValueError: 150001 is not in list
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /workspace/[email protected]/lck/ChatGLM-Instruct-Tuning/wandb/offline-run-20230408_040222-i5caj5n7
wandb: Find logs at: ./wandb/offline-run-20230408_040222-i5caj5n7/logs
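
The last frame of the traceback is `seq.index(mask_token)`. Python's `list.index` raises `ValueError` when the value is absent, so at least one input sequence in the batch did not contain token id 150001. A small diagnostic sketch for locating such sequences before they reach the model is given below. The helper name `find_rows_missing_token` is hypothetical, and the possible causes (for example, tokenizer files and modeling_chatglm.py coming from different revisions, or truncation dropping the token) are guesses rather than a confirmed diagnosis.

```python
MASK_TOKEN_ID = 150001  # the id named in the error message


def find_rows_missing_token(batch_input_ids, token_id=MASK_TOKEN_ID):
    """Return the indices of sequences that do not contain token_id."""
    return [i for i, seq in enumerate(batch_input_ids) if token_id not in seq]


# Toy batch: the second sequence lacks the mask token and would trigger the error.
batch = [[5, 150001, 150004], [5, 6, 7]]
print(find_rows_missing_token(batch))  # prints [1]
```

Running this check over the tokenized training data would show whether the failure affects every sample or only some of them, for instance samples cut off by --max_seq_length.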

Out of GPU memory on a 3090 (24 GB)

--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--train_file ./data/train.txt \
--validation_file ./data/valid.txt \
--max_seq_length 256 \

Reducing batch_size still did not help.
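
A rough back-of-the-envelope estimate may explain why a smaller batch size alone does not help. The sketch assumes full-parameter fine-tuning with Adam in mixed precision, which is commonly budgeted at about 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights and the two Adam moments), and a parameter count of roughly 6.2 billion for ChatGLM-6B. Whether scripts/finetune.sh actually trains this way is not stated here, so treat the numbers as illustrative only. The function name is hypothetical.

```python
def full_finetune_memory_gib(n_params, bytes_weights=2, bytes_grads=2, bytes_optimizer=12):
    """Estimate memory for weights, gradients, and optimizer state, excluding activations."""
    total_bytes = n_params * (bytes_weights + bytes_grads + bytes_optimizer)
    return total_bytes / 1024**3


n_params = 6.2e9  # assumed parameter count for ChatGLM-6B
print(f"weights only: {full_finetune_memory_gib(n_params, 2, 0, 0):.1f} GiB")
print(f"full fine-tune state: {full_finetune_memory_gib(n_params):.1f} GiB")
```

Under these assumptions the fixed training state is far larger than 24 GB regardless of batch size, since batch size only affects activation memory.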

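The fine-tuned model produces no output

Hello. Using your code, I fine-tuned the model on my own local data, but when I call the fine-tuned model, every chat reply is an empty string. Is there a special way to call it?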