I added a 4-bit load after the command LoRA training with ZeRO-3 on two or more GPUs t

How to run QLoRA training with ZeRO-3 on two or more GPUs?

huggingface commented on August 26, 2024

How to run QLoRA training with ZeRO-3 on two or more GPUs?

from alignment-handbook.

Comments (4)

alvarobartt commented on August 26, 2024

Hi @Di-Zayn, note that you will also need to modify the configuration used for DeepSpeed ZeRO-3: the one they share is suited to a VM with 8 x A100 80GB, so to fit your setup you may need to add the flags required to load and train at lower precision.

That said, I'm not sure how to fine-tune with NF4 in that setup, but maybe https://www.deepspeed.ai/tutorials/MoQ-tutorial/#deepspeed-configuration-file is worth checking?
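For reference, a minimal sketch of what a ZeRO-3 configuration with reduced-precision training enabled might look like, written as a Python dict that can be dumped to JSON. The specific values and the output filename are assumptions for illustration, not the configuration shipped with the handbook, and this only enables bf16 training; it does not by itself settle whether NF4-quantized weights work with ZeRO-3.

```python
import json

# Illustrative DeepSpeed ZeRO-3 settings with bf16 enabled.
# All values here are assumptions, not the handbook's shipped config.
ds_config = {
    "bf16": {"enabled": True},  # train in bfloat16 instead of fp32
    "zero_optimization": {
        "stage": 3,  # shard parameters, gradients and optimizer state
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    # "auto" defers these values to the Hugging Face training arguments.
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}


def write_config(path="ds_zero3_bf16.json"):
    """Write the config to a JSON file (hypothetical filename)."""
    with open(path, "w") as f:
        json.dump(ds_config, f, indent=2)
    return path
```

The resulting JSON file could then be referenced from the accelerate or trainer configuration in place of the default ZeRO-3 settings.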

laphang commented on August 26, 2024

I'm getting this issue as well (trying QLoRA with ZeRO-3 on 4 GPUs, same error message). @Di-Zayn, were you able to solve it?

Serega6678 commented on August 26, 2024

I had similar problems, so I switched to the multi_gpu script and set the parameter to use just 2 GPUs, and everything worked fine: https://github.com/huggingface/alignment-handbook/blob/main/recipes/accelerate_configs/multi_gpu.yaml

However, with the ZeRO code the starting loss was around 1.7, versus 1.4 with the multi_gpu script, both when using 1 and 2 GPUs.

I never bothered experimenting further with ZeRO, as I got the results I needed with the multi_gpu script.
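A rough sketch of how such a launch could be assembled, assuming the "param" in question is accelerate's `--num_processes` flag. The training script and recipe paths below are placeholders, not paths taken from this thread.

```python
import shlex


def build_launch_command(num_gpus, script, recipe):
    """Build an `accelerate launch` argv that uses the multi_gpu config."""
    return [
        "accelerate", "launch",
        "--config_file", "recipes/accelerate_configs/multi_gpu.yaml",
        f"--num_processes={num_gpus}",  # assumed to be the parameter meant above
        script,
        recipe,
    ]


# Placeholder paths: substitute the actual training script and QLoRA recipe.
cmd = build_launch_command(2, "path/to/train_script.py", "path/to/qlora_recipe.yaml")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the repository root would start the run; overriding the process count on the command line avoids editing the YAML file itself.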

laphang commented on August 26, 2024

I was keen on sharding the model across GPUs so that larger models would fit.

As an aside, the latest FSDP and QLoRA examples are working for me, which covers my use case:
606d2e9
