jackaduma / vicuna-lora-rlhf-pytorch Goto Github PK

A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT but with Vicuna

License: MIT License

Python 100.00%

chatgpt finetune gpt llama llm lora peft ppo pytorch reward-models

vicuna-lora-rlhf-pytorch's People

Contributors

Stargazers

Watchers

Forkers

touchwolf greatheart1000 dty0606 reborm yiranvang codegeek04 yiran-hao integritynoble arielsho andy7166 codezealot samqin123 ai-jie01 tejaswi-kashyap-006 mingkin opencvbaby xinczhang

vicuna-lora-rlhf-pytorch's Issues

CUDA out of memory

你好，我正在使用 T4， 16GB 的 GPU 来做 fine tuning。但是一直遇到 CUDA out of memory error. 请问您是用什么 parameters 来做 fine tuning 的呢？多谢！

MICRO_BATCH_SIZE = 4 # this could actually be 5 but i like powers of 2
BATCH_SIZE = 8
MAX_STEPS = None
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 3 # we don't always need 3 tbh
LEARNING_RATE = 3e-4 # the Karpathy constant
CUTOFF_LEN = 256 # 256 accounts for about 96% of the data
LORA_R = 8
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
VAL_SET_SIZE = args.test_size # 2000
TARGET_MODULES = [
"q_proj",
"v_proj",
]

What is the data format to LoRA-fine-tune Vicuna?

Since https://github.com/lm-sys/FastChat/ does not publish its data, but mentions it "enhanced the training scripts provided by Alpaca to better handle multi-round conversations and long sequences", I looked at ShareGPT Vicuna datasets on Huggingface, and they contain conversations.

Now I see in this repo, data/merge_sample.json is used as data_path for the script supervised_finetune.py, but it contains Aplaca-like instruction, input, output triples.

Can we use supervised_finetune.py to fine-tune on conversations, e.g. in the format as the ShareGPT Vicuna datasets on Huggingface? If so, have you tried such a fine-tuning? If not, do you know of some repo that offers Vicuna fine-tuning based on conversations? Do you think supervised_finetune.py can be adapted easily to allow fine-tuning based on conversations?

大神和原版vicuna仓库对比过效果吗？

我想基于vicuna-7b，在自己的数据集上做微调试一下。
仅仅做指令微调，相当于只做你的第一步supervised_finetune，
但是我不确定这样做的效果，大神有什么建议吗？

大神你的第一步supervised_finetune跑完，特别是代码和数理逻辑，和原版vicuna仓库对比过效果吗？

how to evaluate?

how to evaluate the model?

跑最后一步报这个警告，要怎么改超参数呢

/root/miniconda3/envs/Vicuna/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.00 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters

supervised_finetune.py failed with a wordaround

(gh_Vicuna-LoRA-RLHF-PyTorch) amd00@asus00:~/llm_dev/Vicuna-LoRA-RLHF-PyTorch$ python supervised_finetune.py --data_path './data/merge_sample.json' --output_path 'lora-Vicuna' --model_path './weights/vicuna-7b' --eval_steps 200 --save_steps 200 --test_size 1

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
CUDA SETUP: Loading binary /home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
./weights/vicuna-7b
Overriding torch_dtype=None with torch_dtype=torch.float16 due to requirements of bitsandbytes to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/amd00/llm_dev/Vicuna-LoRA-RLHF-PyTorch/supervised_finetune.py:72 in │
│ │
│ 69 │ device_map = {"": int(os.environ.get("LOCAL_RANK") or 0)} │
│ 70 │ GRADIENT_ACCUMULATION_STEPS = GRADIENT_ACCUMULATION_STEPS // world_size │
│ 71 print(args.model_path) │
│ ❱ 72 model = LlamaForCausalLM.from_pretrained( │
│ 73 │ args.model_path, │
│ 74 │ load_in_8bit=True, │
│ 75 │ device_map=device_map │
│ │
│ /home/amd00/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2740 in │
│ from_pretrained │
│ │
│ 2737 │ │ │ │ │ key: device_map[key] for key in device_map.keys() if key not in modu │
│ 2738 │ │ │ │ } │
│ 2739 │ │ │ │ if "cpu" in device_map_without_lm_head.values() or "disk" in device_map_ │
│ ❱ 2740 │ │ │ │ │ raise ValueError( │
│ 2741 │ │ │ │ │ │ """ │
│ 2742 │ │ │ │ │ │ Some modules are dispatched on the CPU or the disk. Make sure yo │
│ 2743 │ │ │ │ │ │ the quantized model. If you want to dispatch the model on the CP │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError:
Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
the quantized model. If you want to dispatch the model on the CPU or the disk while keeping
these modules in 32-bit, you need to set load_in_8bit_fp32_cpu_offload=True and pass a custom
device_map to from_pretrained. Check
https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
for more details.

(gh_Vicuna-LoRA-RLHF-PyTorch) amd00@asus00:~/llm_dev/Vicuna-LoRA-RLHF-PyTorch$

unable to merge reward adapter into model

while I am doing the second last step
Merge Reward adapter into Model: python merge_peft_adapter.py --model_name ./reward_model_vicuna-7b

I got the following error

Traceback (most recent call last):
  File "/home/xuan/anaconda3/envs/vicuna_lora/lib/python3.9/site-packages/peft/tuners/lora.py", line 382, in __getattr__
    return super().__getattr__(name)  # defer to nn.Module's logic
  File "/home/xuan/anaconda3/envs/vicuna_lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LoraModel' object has no attribute '_get_submodules'

how to solve this problem?

never mind, the problem is solved by PEFT的版本，目前从git上安装的是 0.3.0.dev0 版本，在merge_peft_adapter的时候有问题，需要切换到peft==0.2.0 (0.3.0.dev0 没有 _get_submodules()这个函数) as mentioend in the github page.

Unable to merge reward adapter into model

Calling python merge_peft_adapter.py --model_name ./reward_model_vicuna-7b
yields

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: /home/paperspace/anaconda3/envs/vic310 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.7/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
script_args:  ScriptArguments(model_name='./reward_model_vicuna-7b', output_name=None)
Traceback (most recent call last):
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/peft/utils/config.py", line 106, in from_pretrained
    config_file = hf_hub_download(pretrained_model_name_or_path, CONFIG_NAME, subfolder=subfolder)
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 112, in _inner_fn
    validate_repo_id(arg_value)
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 166, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: './reward_model_vicuna-7b'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/paperspace/Vicuna-LoRA-RLHF-PyTorch/merge_peft_adapter.py", line 33, in <module>
    peft_config = PeftConfig.from_pretrained(peft_model_id)
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/peft/utils/config.py", line 108, in from_pretrained
    raise ValueError(f"Can't find '{CONFIG_NAME}' at '{pretrained_model_name_or_path}'")
ValueError: Can't find 'adapter_config.json' at './reward_model_vicuna-7b'

When calling python -m bitsandbytes, I do get SUCCESS! Installation was successful!.

When calling python merge_peft_adapter.py --model_name ./reward_model_vicuna-7b_100_2e-05/, I get the error

key:  
Traceback (most recent call last):
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/peft/tuners/lora.py", line 278, in __getattr__
    return super().__getattr__(name)  # defer to nn.Module's logic
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LoraModel' object has no attribute '_get_submodules'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/paperspace/Vicuna-LoRA-RLHF-PyTorch/merge_peft_adapter.py", line 60, in <module>
    parent, target, target_name = model.base_model._get_submodules(key) 
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/peft/tuners/lora.py", line 280, in __getattr__
    return getattr(self.model, name)
  File "/home/paperspace/anaconda3/envs/vic310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LlamaForCausalLM' object has no attribute '_get_submodules'. Did you mean: 'get_submodule'?

after loading checkpoint shards.

Can we another format than alpaca-instruct like alpaca-chat instruct format if yes how ?

Does it really work ob RTX2080Ti ?

What optimizations have been made to make it possible to work on 12G-RTX2080Ti ? Reduce accuracy to 2bit ?

SFT with large loss {'loss': 388082722196684.8, 'learning_rate': 0.0, 'epoch': 0.02}

The SFT loss is very large and becomes inf then nan at last.
{'loss': 240023.775, 'learning_rate': 0.0002762320648783531, 'epoch': 0.47}
{'loss': inf, 'learning_rate': 0.0002743605739238927, 'epoch': 0.49}
{'loss': nan, 'learning_rate': 0.00027248908296943227, 'epoch': 0.51}

不能理解为什么注释这行代码？

52行返回的哪个代码，看起来没有逻辑问题

请问如何在training reward model中自定义数据集

reward model中没有任何参数来自定义数据集，可以提供一个修改方案，类似于sft model训练中的--data_path来自定义一个json文件作为reward model的训练数据吗？

另，这个项目是不是已经不再计划维护下去了？本月的若干issue都没有回复

python train_reward_model.py failed

(gh_Vicuna-LoRA-RLHF-PyTorch) amd00@asus00:~/llm_dev/Vicuna-LoRA-RLHF-PyTorch$ python train_reward_model.py --model_name './weights/vicuna-7b' --gradient_accumulation_steps 32 --per_device_train_batch_size 1 --train_subset 100 --eval_subset 10 --local_rank 0 --bf16 False

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
CUDA SETUP: Loading binary /home/amd00/anaconda3/envs/gh_Vicuna-LoRA-RLHF-PyTorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
dataset_name: ./datasets/
device_map: auto
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/amd00/llm_dev/Vicuna-LoRA-RLHF-PyTorch/train_reward_model.py:172 in │
│ │
│ 169 # ) │
│ 170 │
│ 171 if "llama" in script_args.model_name or "vicuna" in script_args.model_name or "Vicuna" i │
│ ❱ 172 │ model = LlamaForSequenceClassification.from_pretrained( │
│ 173 │ │ script_args.model_name, │
│ 174 │ │ num_labels=1, │
│ 175 │ │ load_in_8bit=True, │
│ │
│ /home/amd00/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2740 in │
│ from_pretrained │
│ │
│ 2737 │ │ │ │ │ key: device_map[key] for key in device_map.keys() if key not in modu │
│ 2738 │ │ │ │ } │
│ 2739 │ │ │ │ if "cpu" in device_map_without_lm_head.values() or "disk" in device_map_ │
│ ❱ 2740 │ │ │ │ │ raise ValueError( │
│ 2741 │ │ │ │ │ │ """ │
│ 2742 │ │ │ │ │ │ Some modules are dispatched on the CPU or the disk. Make sure yo │
│ 2743 │ │ │ │ │ │ the quantized model. If you want to dispatch the model on the CP │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError:
Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
the quantized model. If you want to dispatch the model on the CPU or the disk while keeping
these modules in 32-bit, you need to set load_in_8bit_fp32_cpu_offload=True and pass a custom
device_map to from_pretrained. Check
https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
for more details.

(gh_Vicuna-LoRA-RLHF-PyTorch) amd00@asus00:~/llm_dev/Vicuna-LoRA-RLHF-PyTorch$

any plans for adding repo using stable vicuna for conversation .. human: assistant

Hi,
Thanks for the amazing repo, checking if you have any plans to add a repo using stablevicuna.
Morever, if you can share hardware requirements to download the checkpoints, that will be awesome. I ran out of memory using Kaggle notebook.
Looking forward to hearing from you.
Best,
Andy

i tried to create for human bot within a single column message text but still getting error:

%%writefile sft_dataloader.py
def format_prompt(prompt: str) -> str:
    text = f"""
### Human: {prompt}
### Assistant:
    """
    return text.strip()


class SFTDataLoader(object):
    def __init__(self, data, CUTOFF_LEN, VAL_SET_SIZE, tokenizer) -> None:
        super(SFTDataLoader, self).__init__()

        self.data = data
        self.CUTOFF_LEN = CUTOFF_LEN
        self.VAL_SET_SIZE = VAL_SET_SIZE

        self.tokenizer = tokenizer

    def generate_prompt(self, data_point):
        return format_prompt(data_point["message_tree_text"])

    def tokenize(self, prompt):
        # there's probably a way to do this with the tokenizer settings
        # but again, gotta move fast
        return self.tokenizer(
            prompt,
            truncation=True,
            max_length=self.CUTOFF_LEN + 1,
            padding="max_length",
            return_unused_tokens=True,
        )

    def generate_and_tokenize_prompt(self, data_point):
        # This function masks out the labels for the input,
        # so that our loss is computed only on the response.
        user_prompt = format_prompt(data_point["message_tree_text"])
        len_user_prompt_tokens = len(
            self.tokenizer(
                user_prompt,
                truncation=True,
                max_length=self.CUTOFF_LEN + 1,
                padding="max_length",
                return_unused_tokens=True,
            )["input_ids"]
        )
        full_tokens = self.tokenizer(
            user_prompt,
            truncation=True,
            max_length=self.CUTOFF_LEN + 1,
            padding="max_length",
            return_unused_tokens=True,
        )["input_ids"]
        return {
            "input_ids": full_tokens[: len_user_prompt_tokens],
            "labels": full_tokens[len_user_prompt_tokens:],
            "attention_mask": [1] * len(full_tokens),
        }

    def load_data(self):
        train_val = formatted_dataset.train_test_split(
            test_size=self.VAL_SET_SIZE, shuffle=True, seed=42
        )
        train_data = train_val["train"].shuffle().map(
            self.generate_and_tokenize_prompt
        )
        val_data = train_val["test"].shuffle().map(
            self.generate_and_tokenize_prompt
        )

        return train_data, val_data

jackaduma / vicuna-lora-rlhf-pytorch Goto Github PK

vicuna-lora-rlhf-pytorch's People

Contributors

Stargazers

Watchers

Forkers

vicuna-lora-rlhf-pytorch's Issues

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

how to solve this problem?

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

Recommend Projects

Recommend Topics

Recommend Org