Comments (4)
Even xturing, which uses DeepSpeed, fails out of the box: https://github.com/stochasticai/xturing/blob/main/examples/gptj/gptj_lora.py
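For reference, the example boils down to the stock xturing LoRA recipe below (a sketch reconstructed from the linked file and the traceback further down; the dataset path is illustrative):

from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

# Load an Alpaca-style instruction dataset (the path is an assumption).
instruction_dataset = InstructionDataset("../llama/alpaca_data")

# "gptj_lora" selects GPT-J 6B with LoRA adapters; finetune() drives the
# Lightning/DeepSpeed trainer that produces the log below.
model = BaseModel.create("gptj_lora")
model.finetune(dataset=instruction_dataset)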
(env) arno@rippa:/nfs4/llm/xturing/examples/gptj(main)$ python gptj_lora.py
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 121
CUDA SETUP: Loading binary /nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/bitsandbytes-0.37.2-py3.10.egg/bitsandbytes/libbitsandbytes_cuda121.so...
trainable params: 3670016 || all params: 6054552800 || trainable%: 0.060615806339982696
2023-03-28 09:37:14,171 | DEBUG | xturing.models.causal 34 | Finetuning parameters: {'learning_rate': '1e-4', 'gradient_accumulation_steps': 1, 'batch_size': 4, 'weight_decay': 0.01, 'warmup_steps': 50, 'eval_steps': 5000, 'save_steps': 5000, 'max_length': 512, 'num_train_epochs': 3, 'logging_steps': 10, 'max_grad_norm': 2.0, 'save_total_limit': 4, 'optimizer_name': 'adamw', 'output_dir': 'saved_model'}
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/lightning_fabric/connector.py:562: UserWarning: 16 is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!
rank_zero_warn(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:67: UserWarning: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default
warning_cache.warn(
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/configuration_validator.py:72: PossibleUserWarning: You defined a `validation_step` but have no `val_dataloader`. Skipping val loop.
rank_zero_warn(
initializing deepspeed distributed: GLOBAL_RANK: 0, MEMBER: 1/2
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 121
CUDA SETUP: Loading binary /nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/bitsandbytes-0.37.2-py3.10.egg/bitsandbytes/libbitsandbytes_cuda121.so...
trainable params: 3670016 || all params: 6054552800 || trainable%: 0.060615806339982696
2023-03-28 09:39:13,881 | DEBUG | xturing.models.causal 34 | Finetuning parameters: {'learning_rate': '1e-4', 'gradient_accumulation_steps': 1, 'batch_size': 4, 'weight_decay': 0.01, 'warmup_steps': 50, 'eval_steps': 5000, 'save_steps': 5000, 'max_length': 512, 'num_train_epochs': 3, 'logging_steps': 10, 'max_grad_norm': 2.0, 'save_total_limit': 4, 'optimizer_name': 'adamw', 'output_dir': 'saved_model'}
initializing deepspeed distributed: GLOBAL_RANK: 1, MEMBER: 2/2
Enabling DeepSpeed FP16.
You are using a CUDA device ('NVIDIA GeForce RTX 4090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]
Using /home/arno/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Using /home/arno/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Emitting ninja build file /home/arno/.cache/torch_extensions/py310_cu117/utils/build.ninja...
Building extension module utils...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module utils...
Time to load utils op: 0.053357839584350586 seconds
Loading extension module utils...
Time to load utils op: 0.10169410705566406 seconds
Rank: 1 partition count [2] and sizes[(1835008, False)]
Rank: 0 partition count [2] and sizes[(1835008, False)]
Using /home/arno/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
No modifications detected for re-loaded extension module utils, skipping build step...
Loading extension module utils...
Time to load utils op: 0.00026106834411621094 seconds
Using /home/arno/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
No modifications detected for re-loaded extension module utils, skipping build step...
Loading extension module utils...
Time to load utils op: 0.0002510547637939453 seconds
  | Name          | Type                 | Params
-------------------------------------------------------
0 | pytorch_model | PeftModelForCausalLM | 6.1 B
-------------------------------------------------------
3.7 M     Trainable params
6.1 B     Non-trainable params
6.1 B     Total params
24,218.211 Total estimated model params size (MB)
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2365: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.
warnings.warn(
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2365: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.
warnings.warn(
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 64 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Epoch 0: 0%| | 0/6501 [00:00<?, ?it/s]You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2365: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.
warnings.warn(
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2365: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.
warnings.warn(
Epoch 0: 0%| | 7/6501 [00:02<46:06, 2.35it/s, v_num=0, loss=6.350]Traceback (most recent call last):
File "/nfs4/llm/xturing/examples/gptj/gptj_lora.py", line 8, in <module>
model.finetune(dataset=instruction_dataset)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/xturing/models/causal.py", line 62, in finetune
trainer.fit()
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/xturing/trainers/lightning_trainer.py", line 179, in fit
self.trainer.fit(self.lightning_model)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 520, in fit
call._call_and_handle_interrupt(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 92, in launch
return function(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 559, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 935, in _run
results = self._run_stage()
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 978, in _run_stage
self.fit_loop.run()
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 201, in run
self.advance()
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 354, in advance
self.epoch_loop.run(self._data_fetcher)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 133, in run
self.advance(data_fetcher)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 218, in advance
batch_output = self.automatic_optimization.run(trainer.optimizers[0], kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 185, in run
self._optimizer_step(kwargs.get("batch_idx", 0), closure)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 261, in _optimizer_step
call._call_lightning_module_hook(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 142, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 1266, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 158, in step
step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/strategies/ddp.py", line 257, in optimizer_step
optimizer_output = super().optimizer_step(optimizer, closure, model, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 224, in optimizer_step
return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/deepspeed.py", line 92, in optimizer_step
closure_result = closure()
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 140, in __call__
self._result = self.closure(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 126, in closure
step_output = self._step_fn()
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 308, in _training_step
training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 288, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/strategies/ddp.py", line 329, in training_step
return self.model(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 11, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1846, in forward
loss = self.module(*inputs, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/pytorch_lightning/overrides/base.py", line 90, in forward
output = self._forward_module.training_step(*inputs, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/xturing/trainers/lightning_trainer.py", line 73, in training_step
loss = self.model_engine.training_step(batch)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/xturing/engines/causal.py", line 48, in training_step
outputs = self.model(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/peft/peft_model.py", line 529, in forward
return self.base_model(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 852, in forward
transformer_outputs = self.transformer(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 687, in forward
outputs = block(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 308, in forward
attn_outputs = self.attn(
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/nfs4/llm/h2o-llm/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 236, in forward
query = torch.cat([q_rot, q_pass], dim=-1)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 1; 23.69 GiB total capacity; 22.86 GiB already allocated; 13.56 MiB free; 23.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
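As the OOM message itself suggests, one mitigation worth trying before switching frameworks is capping the allocator's split size to reduce fragmentation. A minimal sketch, assuming a 128 MiB cap (an illustrative value to tune, not a recommendation); the variable must be set before torch initializes CUDA:

import os

# Must happen before CUDA is initialized; 128 MiB is an illustrative cap.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the env var on purpose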
https://nn.labml.ai/neox/samples/finetune.html
Should be able to just use FSDP in PyTorch (a sketch follows the link below):
https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/#auto-wrapping
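A minimal sketch of FSDP with auto-wrapping, assuming a torchrun launch and a size-based wrap policy (the 100M-parameter threshold and the optimizer settings are illustrative assumptions, not a drop-in replacement for the xturing trainer):

import functools

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy
from transformers import AutoModelForCausalLM

# Assumes launch via torchrun, which sets RANK/WORLD_SIZE/LOCAL_RANK etc.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Same base model as in the log above.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# Auto-wrap submodules above ~100M parameters so each rank holds only a
# shard of their parameters, gradients, and optimizer state.
wrap_policy = functools.partial(size_based_auto_wrap_policy, min_num_params=int(1e8))
sharded_model = FSDP(model.cuda(), auto_wrap_policy=wrap_policy)

optimizer = torch.optim.AdamW(sharded_model.parameters(), lr=1e-4)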