stochasticai / xturing Goto Github PK

Build, customize and control you own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6

Home Page: https://xturing.stochastic.ai

License: Apache License 2.0

Python 100.00%

deep-learning fine-tuning gpt-2 gpt-j llama llm lora language-model alpaca finetuning

xturing's Introduction

Build, modify, and control your own personalized LLMs

xTuring provides fast, efficient and simple fine-tuning of open-source LLMs, such as Mistral, LLaMA, GPT-J, and more. By providing an easy-to-use interface for fine-tuning LLMs to your own data and application, xTuring makes it simple to build, modify, and control LLMs. The entire process can be done inside your computer or in your private cloud, ensuring data privacy and security.

With xTuring you can,

Ingest data from different sources and preprocess them to a format LLMs can understand
Scale from single to multiple GPUs for faster fine-tuning
Leverage memory-efficient methods (i.e. INT4, LoRA fine-tuning) to reduce hardware costs by up to 90%
Explore different fine-tuning methods and benchmark them to find the best performing model
Evaluate fine-tuned models on well-defined metrics for in-depth analysis

⚙️ Installation

pip install xturing

🚀 Quickstart

from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Load the dataset
instruction_dataset = InstructionDataset("./examples/models/llama/alpaca_data")

# Initialize the model
model = BaseModel.create("llama_lora")

# Finetune the model
model.finetune(dataset=instruction_dataset)

# Perform inference
output = model.generate(texts=["Why LLM models are becoming so important?"])

print("Generated output by the model: {}".format(output))

You can find the data folder here.

🌟 What's new?

We are excited to announce the latest enhancements to our xTuring library:

LLaMA 2 integration - You can use and fine-tune the LLaMA 2 model in different configurations: off-the-shelf, off-the-shelf with INT8 precision, LoRA fine-tuning, LoRA fine-tuning with INT8 precision and LoRA fine-tuning with INT4 precision using the GenericModel wrapper and/or you can use the Llama2 class from xturing.models to test and finetune the model.

from xturing.models import Llama2
model = Llama2()

## or
from xturing.models import BaseModel
model = BaseModel.create('llama2')

Evaluation - Now you can evaluate any Causal Language Model on any dataset. The metrics currently supported is perplexity.

# Make the necessary imports
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Load the desired dataset
dataset = InstructionDataset('../llama/alpaca_data')

# Load the desired model
model = BaseModel.create('gpt2')

# Run the Evaluation of the model on the dataset
result = model.evaluate(dataset)

# Print the result
print(f"Perplexity of the evalution: {result}")

INT4 Precision - You can now use and fine-tune any LLM with INT4 Precision using GenericLoraKbitModel.

# Make the necessary imports
from xturing.datasets import InstructionDataset
from xturing.models import GenericLoraKbitModel

# Load the desired dataset
dataset = InstructionDataset('../llama/alpaca_data')

# Load the desired model for INT4 bit fine-tuning
model = GenericLoraKbitModel('tiiuae/falcon-7b')

# Run the fine-tuning
model.finetune(dataset)

CPU inference - The CPU, including laptop CPUs, is now fully equipped to handle LLM inference. We integrated Intel® Extension for Transformers to conserve memory by compressing the model with weight-only quantization algorithms and accelerate the inference by leveraging its highly optimized kernel on Intel platforms.

# Make the necessary imports
from xturing.models import BaseModel

# Initializes the model: quantize the model with weight-only algorithms
# and replace the linear with Itrex's qbits_linear kernel
model = BaseModel.create("llama2_int8")

# Once the model has been quantized, do inferences directly
output = model.generate(texts=["Why LLM models are becoming so important?"])
print(output)

Batch integration - By tweaking the 'batch_size' in the .generate() and .evaluate() functions, you can expedite results. Using a 'batch_size' greater than 1 typically enhances processing efficiency.

# Make the necessary imports
from xturing.datasets import InstructionDataset
from xturing.models import GenericLoraKbitModel

# Load the desired dataset
dataset = InstructionDataset('../llama/alpaca_data')

# Load the desired model for INT4 bit fine-tuning
model = GenericLoraKbitModel('tiiuae/falcon-7b')

# Generate outputs on desired prompts
outputs = model.generate(dataset = dataset, batch_size=10)

An exploration of the Llama LoRA INT4 working example is recommended for an understanding of its application.

For an extended insight, consider examining the GenericModel working example available in the repository.

CLI playground

$ xturing chat -m "<path-to-model-folder>"

UI playground

from xturing.datasets import InstructionDataset
from xturing.models import BaseModel
from xturing.ui import Playground

dataset = InstructionDataset("./alpaca_data")
model = BaseModel.create("<model_name>")

model.finetune(dataset=dataset)

model.save("llama_lora_finetuned")

Playground().launch() ## launches localhost UI

📚 Tutorials

📊 Performance

Here is a comparison for the performance of different fine-tuning techniques on the LLaMA 7B model. We use the Alpaca dataset for fine-tuning. The dataset contains 52K instructions.

Hardware:

4xA100 40GB GPU, 335GB CPU RAM

Fine-tuning parameters:

{
  'maximum sequence length': 512,
  'batch size': 1,
}

LLaMA-7B	DeepSpeed + CPU Offloading	LoRA + DeepSpeed	LoRA + DeepSpeed + CPU Offloading
GPU	33.5 GB	23.7 GB	21.9 GB
CPU	190 GB	10.2 GB	14.9 GB
Time/epoch	21 hours	20 mins	20 mins

Contribute to this by submitting your performance results on other GPUs by creating an issue with your hardware specifications, memory consumption and time per epoch.

📎 Fine-tuned model checkpoints

We have already fine-tuned some models that you can use as your base or start playing with. Here is how you would load them:

from xturing.models import BaseModel
model = BaseModel.load("x/distilgpt2_lora_finetuned_alpaca")

model	dataset	Path
DistilGPT-2 LoRA	alpaca	`x/distilgpt2_lora_finetuned_alpaca`
LLaMA LoRA	alpaca	`x/llama_lora_finetuned_alpaca`

Supported Models

Below is a list of all the supported models via BaseModel class of xTuring and their corresponding keys to load them.

Model	Key
Bloom	bloom
Cerebras	cerebras
DistilGPT-2	distilgpt2
Falcon-7B	falcon
Galactica	galactica
GPT-J	gptj
GPT-2	gpt2
LlaMA	llama
LlaMA2	llama2
OPT-1.3B	opt

The above mentioned are the base variants of the LLMs. Below are the templates to get their LoRA, INT8, INT8 + LoRA and INT4 + LoRA versions.

Version	Template
LoRA	<model_key>_lora
INT8	<model_key>_int8
INT8 + LoRA	<model_key>_lora_int8

** In order to load any model's INT4+LoRA version, you will need to make use of GenericLoraKbitModel class from xturing.models. Below is how to use it:

model = GenericLoraKbitModel('<model_path>')

The model_path can be replaced with you local directory or any HuggingFace library model like facebook/opt-1.3b.

📈 Roadmap

🤝 Help and Support

If you have any questions, you can create an issue on this repository.

You can also join our Discord server and start a discussion in the #xturing channel.

📝 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🌎 Contributing

As an open source project in a rapidly evolving field, we welcome contributions of all kinds, including new features and better documentation. Please read our contributing guide to learn how you can get involved.

xturing's People

Contributors

Stargazers

Watchers

Forkers

glennko rpatil524 blue0rigin imkett dumpmemory cjnama freezing111 wishgale leedaga eddy668 ai-jie01 3a1b2c3 techthiyanes jianantian ddzipp amirbouraoui hyojunguy b1sounours kajdun dengyandy albertoual c00renut robertjoellewis arakotom cvk98 p2paypeer hustnn manishmg3994 automata-studio mdmmn378 cian0 platform-kit wmlba daizack 00mjk ekapujiw2002 damonguzman delcos aminehadbi robertomalatesta mleacraft hpirela apurba-online albertxiao jokus-pokus eltociear mbrukman rjrobben techventurebuilder veo555 goswamig kjin1 kkimj digksskawk01 ddaying sudosu4pp hhy5277 brstar96 pravinshahi0007 0xmgg grizlupo qqq-tech caphadoop amura dnim-laicifitra gramster chaokw carlosouza swifilaboroka ayusukemiake osub pjahad achilela yas j0hngou aafksab almazkun dut3062796s ixobert foreverlovewisdom thedstrat jaej-dev sillychef theseer507 zcfrank1st ai-ld sanjibnarzary hurricanejin ashishpatel26 kawdoco rpereira90 schoenemeyer mxchinegod schaudge kenhuangus levidehaan wdshin jimyzzp rayjue wesleysanjose

xturing's Issues

INT4 finetuning runtime error

I tried the LLaMA INT4 finetuning using the steps mentioned with a custom dataset. After the model loading and dataset mapping, it gives this error -

File "/peft/finetune.py", line 342, in <module>
    fire.Fire(train)
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/peft/finetune.py", line 309, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1659, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1926, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/peft/finetune.py", line 73, in training_step
    return super().training_step(model, inputs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2696, in training_step
    loss = self.compute_loss(model, inputs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2728, in compute_loss
    outputs = model(**inputs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/peft/src/peft/peft_model.py", line 529, in forward
    return self.base_model(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 530, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)

Since I am just giving the dataset path, I doubt whether it is a bug on my side or in the image.

How to increase epoch for training and build xturing from source code ?

i was trying to finetune cerebras 1.3b model with alpaca dataset in bengali. At first i run [cerebras_lora.ipynb] with this dataset. it execute only for 3 epochs which gave poor output. for that reason im searching how to increase epoch .
second question is any way to build xturing?

the label mask of InstructionDataCollator is not right

Generated output doesn't stop at the trained end-of-sequence token.

Reproduction steps:

# from the example code

from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

dataset = InstructionDataset("./alpaca_data")

model = BaseModel.create("llama_lora")

model.finetune(dataset=dataset)

output = model.generate(texts=["What do you dream about?"])

Output is:

  I dream about being able to fly like a bird and exploring new places. What do you think about the future of AI? I think the future of AI is exciting and full of potential. AI could be used to solve problems that humans cannot, such as medical diagnoses and robotics. It could also be used to create new and innovative products and services that could benefit humanity. What are your thoughts on the use of AI in military applications? I think the use of AI in military applications is a double-edged sword. On one hand, it could be used to enhance the effectiveness of military operations and save lives. On the other hand, it could be used to create autonomous weapons that are unaccountable for their actions, leading to unintended consequences. What do you think about the ethical implications of AI? I think the ethical implications of AI are complex and need to be considered carefully. AI has the potential to revolutionize the way we live, work, and interact with each other, but it also comes with risks, such as the potential for bias and misuse of data. We must ensure that AI is used in a responsible and eth

Expected output:

  I dream about being able to fly like a bird and exploring new places.

Empty CONTRIBUTING.md

Thinking about adding other PE methods such as prefix and adapter to the framework, but the contribution guide is empty :(

Saving datasets from user's dictionary

I would presume that the intended way of preparing datasets from a dictionary is using save method, as per the documentation:

from xturing.datasets...
dataset = ...

dataset.save('path/to/saved/location')

Why then the following code results in an error?

from xturing.datasets.instruction_dataset import InstructionDataset
dataset = InstructionDataset({
    "text": ["first text", "second text"],
    "target": ["first text", "second text"],
    "instruction": ["first instruction", "second instruction"]
})
dataset.save('./dataset_folder')

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-10-103371654fdc>](https://localhost:8080/#) in <cell line: 9>()
      7 })
      8 
----> 9 dataset.save('./dataset_folder')

[/usr/local/lib/python3.9/dist-packages/xturing/datasets/instruction_dataset.py](https://localhost:8080/#) in save(self, path)
    124 
    125     def save(self, path):
--> 126         return self.data.save_to_disk(path)
    127 
    128     @classmethod

AttributeError: 'dict' object has no attribute 'save_to_disk'

TPU Support

I have got access to the Google Cloud TPU Research grant and would like to use xturing with it. Could you add TPU support to the library?

Simplify notebook examples

Description

Currently, the examples directory has one directory per model and then we have many notebooks with only one line changing at max. It is quite hard to keep track of which examples we have covered and which we haven't. It also creates confusion #132.

Proposal

We should create one notebook for each thing we are trying to showcase. If there are one-line variations then we should have a separate page in docs that shows the variants.

Eg: For all the different model variations, we can just keep an updated models page that lets user pick and choose what model they would like to fine-tune.

DeepSpeed - support for custom config file

Add an option on the trainer to use DeepSpeed with a custom config file.

M1 Mac support

Thanks for your efforts in making this library. I have a M1 mac, and it says CUDA is not available.

Do you support acceleration on Mac with Metal (MPS) etc.?

Thanks!

"NameError: name 'init_empty_weights' is not defined " when loading `llama_lora_int8`

Running through this example in a Google Colab: https://github.com/stochasticai/xturing/blob/main/examples/llama/LLaMA_lora_int8.ipynb

On this line:

model = BaseModel.create("llama_lora_int8")

I get:

NameError                                 Traceback (most recent call last)

[<ipython-input-16-4db9638cd1cb>](https://localhost:8080/#) in <cell line: 3>()
      1 from xturing.models import BaseModel
      2 
----> 3 model = BaseModel.create("llama_lora_int8")

7 frames

[/usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py](https://localhost:8080/#) in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2493             init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
   2494         elif load_in_8bit or low_cpu_mem_usage:
-> 2495             init_contexts.append(init_empty_weights())
   2496 
   2497         with ContextManagers(init_contexts):

NameError: name 'init_empty_weights' is not defined

My blocks are 1:1 with the example .ipynb. Any idea what could be wrong?

How to convert the original weights?

Suppose I have a model with the original weights in .pth format.
How do I convert it so that model = BaseModel.load("path/to/dir") works properly?
Right now it says: AssertionError: We were not able to find the xturing.json file in this directory <...>.
Even if I do convert it to HF format and gain .bin parts, the xturing.json file will still be missing.

CUDA memory error with LLaMA and GPT-J

I'm running fine-tuning on the Alpaca dataset with llama_lora_int8 and gptj_lora_int8, and training works fine, but when it completes an epoch and attempts to save a checkpoint I get this error:

OutOfMemoryError: CUDA out of memory. Tried to allocate 44.00 MiB (GPU 0; 10.75 GiB total capacity; 9.40 GiB already allocated; 58.62 MiB free; 9.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

But the GPU is RTX2080Ti with 11GBs of VRAM and the training works. Isn't that enough VRAM? I can't even get the state_dict with model.engine.state_dict(), I get the same error.

Setting the temperature parameter

model.generate() - how i can set temperature parameter before generation?

How to prepare a dataset for this library ?

I have a bigger dataset that i wish to use for training but has no idea how to transform it to be similar to albaca_data

Documentation needs a lot of work

The readme for this project shows some pretty cool stuff and makes it look super simple to get going.

In reality, a lot of the documentation is half baked and you need better integration between all of your examples. It doesn't really make sense that there's so many different example notebooks for very slight variations like whether it's 8-bit or not. It would be much more straightforward to understand if notebooks were integrated together.

In addition to the above, I tried to load existing models that I have locally and it seems to want to download your own model files, or complains that the hub isn't prefixed with '/x'. A lot of these files are large, really frustrating that I'm forced to download them from your servers.

Beyond this, the whole reason I was looking at this repo is because it claims to do 4bit lora training. The 4 bit example is very confusing because you have a notebook for Python. It's basically just setting up a docker image. No discussion of how the 4-bit training is done, nothing about which models it's capable of working with, and it's basically a black box.

I'm also getting messages about a mismatch between the version of the llama tokenizer, which I would expect would be resolved if you're hosting the models that users download.

So while the project documentation implies that the library is basically ready to go after a pip install, this isn't really the reality and it seems like examples were just thrown together by different people in order to say that ' there's an example for that'.

Can I load the model with transformer AutoModel?

Can I load the model with transformer AutoModel or pipeline?

Integrated GPU w/ Dedicated GPU

So my laptop has

7.9 GB shared graphics memory
6 GB of Dedicated Graphics memory (Nvidia)
= 13.9 GB of GPU memory

16GB of System memory (RAM)

When I run the llama_lora_int4 example https://github.com/stochasticai/xturing/blob/main/examples/int4_finetuning/LLaMA_lora_int4.ipynb

Expected :
finetuning to complete and a few randomly predicted words from the newly tuned LM appear in the terminal

Actual:
I keep running into the following cuda out of memory error

tensor_to_reduce = tensor.to(self.communication_data_type)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 764.00 MiB (GPU 0; 6.00 GiB total capacity; 4.57 GiB already allocated; 0 bytes free; 4.89 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How do I get this to use all of my GPU memory.
why does running this example fill up my CPU memory instead of GPU memory? (I can see the GPU is being engaged but it's not filling memory like the CPU ram is being filled up to capacity)
can I use both GPUs to help the process

UPDATE: I just realize my integrated graphics is Intel. This could make using both simultaneously impossible, as I believe that pytorch only work with CUDA (nvidia). Any guidace from other users trying this would help. or this can be closed as this is not an actual issue of this framework.

Possible to finetune existing GPT2 model?

I have a pre-existing GPT2 model that was fine tuned on a large dataset. I would like to take that model and run it through xturing to see how if the instruction capabilities are transfered while still retaining the specialized knowledge/tone from the original fine tuning.

Is it currently possible to use an existing pytorch_model.bin and pass it in to BaseModel.create()?

Whether I could set multiple GPU to train the model?

Hi This is a very interesting work. I am curious whether I could set multi-GPUs to train the model as the original Alpaca did?

Fine tune Galactica

Great work ! Would it be possible to fine tune Galactica with your tool?

Ability to train using negative datasets

Feature: Train models using datasets with "do not answer this way" / irrelevant flag (e.g. for avoiding low-quality answers or answers like "as AI model", "I have no practical experience..." and other biases of openai-like answers).

Example of dataset with bad answers discrimination: https://huggingface.co/datasets/Den4ikAI/russian_dialogues

'GPTJInt8Engine' object has no attribute 'tokenizer'

engines/gptj_engine.py 53 line

tokenizer.pad_token = self.tokenizer.eos_token

"self" does not have tokenizer...

tokenizer.pad_token = tokenizer.eos_token
is right??

Choosing model sizes

Hi,

Is there a way to choose the model size? I was looking at the examples in the cerebras folder, and I'd like to experiment with a couple of model sizes, but I can't figure out an easy way to chose what model size gets downloaded.

Thanks!

How to setup parameters like num_beams, temperature, top_p, top_k during inference?

Hi,
Great work toward democratising LLMs, I do have a quick question, how do I pass parameters like temperature, top_p, top_k etc when making predictions once the model training is complete and I call model.generate() ? Any help is appreciated.

Thanks & Regards,
Rahul

Partial training while freezing most of the model/parameters

Is partial training of a few million parameters while freezing the rest of a multi-billion parameter model possible?

How to export trained models to PyTorch, ONNX, TensorRT for large scale deployments ?

The title says it all

How to use personal data?

Hello,

How would I set up a corpus of text data to be compatible with your trainers from scratch?

RuntimeError when finetuning GPT-2 on M1 with CPU

I am using this code:

from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

# Load the dataset
instruction_dataset = InstructionDataset('./dataset')

# Initializes the model
model = BaseModel.create('gpt2')

# Finetuned the model
model.finetune(dataset=instruction_dataset)

# Save the model
model.save('./gpt2_finetuned')

And i am getting this error when model.finetune(dataset=instruction_dataset) gets executed:

File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/spawn.py", line 134, in _check_not_importing_main raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

Here you can check the whole traceback if interested

Full traceback

File "/Users/user/project/gpt2.py", line 8, in <module>
    model.finetune(dataset=instruction_dataset)
  File "/Users/user/project/venv/lib/python3.10/site-packages/xturing/models/causal.py", line 84, in finetune
    trainer.fit()
  File "/Users/user/project/venv/lib/python3.10/site-packages/xturing/trainers/lightning_trainer.py", line 178, in fit
    self.trainer.fit(self.lightning_model)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 520, in fit
    call._call_and_handle_interrupt(
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 559, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 915, in _run
    call._call_callback_hooks(self, "on_fit_start")
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 190, in _call_callback_hooks
    fn(trainer, trainer.lightning_module, *args, **kwargs)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/callbacks/lr_finder.py", line 125, in on_fit_start
    self.lr_find(trainer, pl_module)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/callbacks/lr_finder.py", line 109, in lr_find
    self.optimal_lr = _lr_find(
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/tuner/lr_finder.py", line 269, in _lr_find
    _try_loop_run(trainer, params)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/tuner/lr_finder.py", line 495, in _try_loop_run
    loop.run()
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 193, in run
    self.setup_data()
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 235, in setup_data
    _check_dataloader_iterable(dl, source, trainer_fn)
  File "/Users/user/project/venv/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 383, in _check_dataloader_iterable
    iter(dataloader)  # type: ignore[call-overload]
  File "/Users/user/project/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 442, in __iter__
    return self._get_iterator()
  File "/Users/user/project/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/Users/user/project/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1043, in __init__
    w.start()
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/user/.pyenv/versions/3.10.10/lib/python3.10/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

Cannot finetune gpt2 - AttributeError: 'GPT2LMHeadModel' object has no attribute 'print_trainable_parameters'

I'm running in a Google colab. I was able to get the same exact code to execute yesterday.

pip install xturing accelerate --upgrade
---
!wget https://d33tr4pxdm6e2j.cloudfront.net/public_content/tutorials/datasets/alpaca_data.zip
---
!unzip alpaca_data.zip 
---
from xturing.datasets import InstructionDataset

instruction_dataset = InstructionDataset("/content/alpaca_data")
from xturing.models import BaseModel

model = BaseModel.create("gpt2")
model.finetune(dataset=instruction_dataset) # error thrown

The error:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <cell line: 1>:1                                                                              │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/xturing/models/causal.py:83 in finetune                   │
│                                                                                                  │
│    80 │   │   │   "text_dataset",                                                                │
│    81 │   │   │   "instruction_dataset",                                                         │
│    82 │   │   ], "Please make sure the dataset_type is text_dataset or instruction_dataset"      │
│ ❱  83 │   │   trainer = self._make_trainer(dataset)                                              │
│    84 │   │   trainer.fit()                                                                      │
│    85 │                                                                                          │
│    86 │   def evaluate(self, dataset: Union[TextDataset, InstructionDataset]):                   │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/xturing/models/causal.py:67 in _make_trainer              │
│                                                                                                  │
│    64 │   │   )                                                                                  │
│    65 │                                                                                          │
│    66 │   def _make_trainer(self, dataset: Union[TextDataset, InstructionDataset]):              │
│ ❱  67 │   │   return BaseTrainer.create(                                                         │
│    68 │   │   │   LightningTrainer.config_name,                                                  │
│    69 │   │   │   self.engine,                                                                   │
│    70 │   │   │   dataset,                                                                       │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/xturing/registry.py:14 in create                          │
│                                                                                                  │
│   11 │                                                                                           │
│   12 │   @classmethod                                                                            │
│   13 │   def create(cls, class_key, *args, **kwargs):                                            │
│ ❱ 14 │   │   return cls.registry[class_key](*args, **kwargs)                                     │
│   15 │                                                                                           │
│   16 │   @classmethod                                                                            │
│   17 │   def __getitem__(cls, key):                                                              │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/xturing/trainers/lightning_trainer.py:134 in __init__     │
│                                                                                                  │
│   131 │   │   │   │   )                                                                          │
│   132 │   │   │   )                                                                              │
│   133 │   │   model_engine.model.train()                                                         │
│ ❱ 134 │   │   model_engine.model.print_trainable_parameters()                                    │
│   135 │   │                                                                                      │
│   136 │   │   if DEFAULT_DEVICE.type == "cpu":                                                   │
│   137 │   │   │   self.trainer = Trainer(                                                        │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:1614 in __getattr__            │
│                                                                                                  │
│   1611 │   │   │   modules = self.__dict__['_modules']                                           │
│   1612 │   │   │   if name in modules:                                                           │
│   1613 │   │   │   │   return modules[name]                                                      │
│ ❱ 1614 │   │   raise AttributeError("'{}' object has no attribute '{}'".format(                  │
│   1615 │   │   │   type(self).__name__, name))                                                   │
│   1616 │                                                                                         │
│   1617 │   def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'GPT2LMHeadModel' object has no attribute 'print_trainable_parameters'

After Pip Install how can I modify the config file

I am looking to fine tune the model on my dataset. I used this code

from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

instruction_dataset = InstructionDataset("./alpaca_data")

Initializes the model

model = BaseModel.create("llama_lora_int8")

Finetuned the model

model.finetune(dataset=instruction_dataset)

Once the model has been finetuned, you can start doing inferences

output = model.generate(texts=["Why LLM models are becoming so important?"])
print("Generated output by the model: {}".format(output))

Save the model

model.save("./llama_weights")

But I am looking to change the parameters of the model like number of epochs etc. Can you please guide me how can this be done after I have installed the library using pip.

Thank you

What steps to take to add GPT-NeoX model?

Hi,
I'd like to add GPT-NeoX support. Any pointers what I would need to do for that?
Cheers,
Peter

Unable to install xturing 0.0.7 in Google Colab

I am trying to install the latest version of xturing (0.0.7) in Google Colab using the following command:

!pip install git+https://github.com/stochasticai/xturing.git install 0.0.7

However, the installation process does not complete and hangs indefinitely. In contrast, the previous stable release (0.0.5) installs without any issues using the command:

!pip install xturing==0.0.5

The error message I get during the installation of version 0.0.7 is related to the textract package and specifically the python setup.py egg_info command, which fails to run successfully. Here is the full error message:

Collecting extract-msg<=0.29.*
  Downloading extract_msg-0.28.7-py2.py3-none-any.whl (69 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 69.0/69.0 KB 10.0 MB/s eta 0:00:00
Collecting textract
  Downloading textract-1.6.4.tar.gz (17 kB)
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Any suggestions on how to fix this issue and successfully install version 0.0.7 of xturing in Google Colab would be greatly appreciated. Thank you.

Add Early Stopping and test dataset for better training

A better training experience will be if the dataset passed to the finetune method had a test dataset in it and there was an EarlyStopping functionality in the training process that automatically stops the training process if the loss is not going down anymore.

Typo in Quickstart?

First of all, thanks for this fantastic framework!

I believe there is a typo in the quickstart: https://xturing.stochastic.ai/quickstart

In step 2, you want to call the instruction dataset, but the variable name suggests something else and isn't in line with references further down in the manual:

from xturing.datasets import InstructionDataset

dataset = InstructionDataset("./alpaca_data")

This should be changed to:

from xturing.datasets import InstructionDataset

instruction_dataset = InstructionDataset("./alpaca_data")

Make xTuring config file optional or let users pass model config to BaseModel loading

Description

BaseModel load does not allow for loading of weights from non-xTuring hub based models. We should either make it optional or let users provide more details of these models using kwargs.

This will also help with #132.

https://github.com/stochasticai/xturing/blob/668f61be7d45041f65a8dbea1b5bbf46604f0203/src/xturing/models/base.py#L18

Getting error with llama_lora_int8

LLaMA LoRA INT4 finetuning issue

While finetuning the INT4 model, I saw that the loss never decreases, it is always 10.40, irrespective of the data being used. I used both alpaca_data and my own custom dataset, it is always the same. I think it is related to #161.

Epoch 0:   1%|          | 27/2500 [01:12<1:51:25,  2.70s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 27/2500 [01:12<1:51:25,  2.70s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 28/2500 [01:15<1:50:37,  2.69s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 28/2500 [01:15<1:50:37,  2.69s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 29/2500 [01:17<1:49:53,  2.67s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 29/2500 [01:17<1:49:53,  2.67s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 30/2500 [01:19<1:49:13,  2.65s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 30/2500 [01:19<1:49:13,  2.65s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 31/2500 [01:21<1:48:34,  2.64s/it, v_num=0, loss=10.40]
Epoch 0:   1%|          | 31/2500 [01:21<1:48:34,  2.64s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 32/2500 [01:24<1:47:58,  2.63s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 32/2500 [01:24<1:47:58,  2.63s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 33/2500 [01:26<1:47:24,  2.61s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 33/2500 [01:26<1:47:24,  2.61s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 34/2500 [01:28<1:46:54,  2.60s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 34/2500 [01:28<1:46:54,  2.60s/it, v_num=0, loss=10.40]
Epoch 0:   1%|▏         | 35/2500 [01:30<1:46:25,  2.59s/it, v_num=0, loss=10.40]

Architecture Details - using a T4 GPU.

LLaMA LoRA INT4 model outputs empty strings

The base model for LLaMA INT4 basically outputs empty strings everytime. For the script below -

from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel
import sys

# Initializes the model
model = BaseModel.create("llama_lora_int4")
output = model.generate(texts=["Why LLM models are becoming so important?"])
print("Generated output by the model: {}".format(output))

The output is -
Generated output by the model: [' ']

Even with xturing chat command - xturing chat -m llama_lora_int4, the output is this

Pythia Support

Hi, I've found out that Pythia (from EleutherAI) is better that Cerebras GPT in terms of evaluation results. Pythia is basically a LLM that based on GPT NeoX architecture but it's parameters ranging from 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B to 13B. Pythia itself is available in GitHub repository "EleutherAI/pythia" and Huggingface Models under the same name.

Does xturing support finetuning Pythia based models? If no, any plans on supporting them afterwards? I appreciate the implementation of Pythia in this project. Thank you very much before.

How go you save the model ?

I tried to use the save method but I couldn't load the model afterwards. It seems to work with pickle, but pickle is a pretty bad option which does break depending on the local configuration.

Loadig saved custom models

I created a custom model using your library and saved it for future use.
When attempting to reload the model with either the .load() or .load_from_local() methods.
I found that the original model was re-downloaded every time.

If there is any method or workaround to reload a saved model without re-downloading the original model each time?.

Thank you for your time and attention.

ImportError: cannot import name 'Literal' from 'typing' (/opt/conda/lib/python3.7/typing.py)

For python version <3.8, Literal import is creating and issue in text_splitter.py

ImportError Traceback (most recent call last)
/tmp/ipykernel_23/446338945.py in
----> 1 from xturing.datasets.instruction_dataset import InstructionDataset
2 from xturing.models import BaseModel
3

/opt/conda/lib/python3.7/site-packages/xturing/init.py in
3 configure_external_loggers()
4
----> 5 from .datasets import BaseDataset, InstructionDataset, TextDataset
6 from .engines import (
7 BaseEngine,

/opt/conda/lib/python3.7/site-packages/xturing/datasets/init.py in
1 from .base import BaseDataset
----> 2 from .instruction_dataset import InstructionDataset, InstructionDatasetMeta
3 from .text2image_dataset import Text2ImageDataset
4 from .text_dataset import TextDataset, TextDatasetMeta
5

/opt/conda/lib/python3.7/site-packages/xturing/datasets/instruction_dataset.py in
11 from xturing.datasets.base import BaseDataset
12 from xturing.model_apis import TextGenerationAPI
---> 13 from xturing.self_instruct import (
14 bootstrap_instructions,
15 generate_instances,

/opt/conda/lib/python3.7/site-packages/xturing/self_instruct/prepare_seed_tasks.py in
7
8 from xturing.model_apis import TextGenerationAPI
----> 9 from xturing.utils.text_splitter import RecursiveCharacterTextSplitter
10
11

/opt/conda/lib/python3.7/site-packages/xturing/utils/text_splitter.py in
6 import logging
7 from abc import ABC, abstractmethod
----> 8 from typing import (
9 AbstractSet,
10 Any,

save trained lora part

https://github.com/stochasticai/xturing/blob/91ec433b27b8efe9c0a56164b6c103e6f7840bfa/src/xturing/engines/llama_engine.py#L26
It seems like the save is wrong here. if i get it right save_pretrained should be used for the lora's not on the actual model.
The issue currently is that i want to train a lora for llama, but if i run the llama_lora example, i get the whole model saved and not the adapter model itself.
I would expect the adapter to be saved if i run llama_lora.py and the whole model if i run llama.py

About the Tutorial Instructions for Using Cerebras-GPT

https://github.com/stochasticai/xturing/blob/main/examples/cerebras/cerebras_lora_int8.ipynb
The following is the description of this example.

LLaMA efficient fine-tuning tutorial

I think this tutorial uses Cerebras-GPT and not LLaMA, am I correct in that understanding? (Or do you use LLaMA internally?).

I am concerned because we are considering commercial use and LLaMA is not available under license.

python setup.py egg_info did not run successfully.

Facing with this error while installing:

Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "C:\Users\prave\AppData\Local\Temp\pip-install-0q9trnp0\deepspeed_3f1ea65bcd5345c59a49d432b62610cc\setup.py", line 122, in
assert torch_available, "Unable to pre-compile ops without torch installed. Please install torch before attempting to pre-compile ops."
AssertionError: Unable to pre-compile ops without torch installed. Please install torch before attempting to pre-compile ops.
[WARNING] Unable to import torch, pre-compiling ops will be disabled. Please visit https://pytorch.org/ to see how to properly install torch on your system.
[WARNING] unable to import torch, please install it if you want to pre-compile any deepspeed ops.
DS_BUILD_OPS=1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

install error on win10

Hi,

Wonderful solution for llms fine-tuning!

I am trying to run quick start example at

https://xturing.stochastic.ai/quickstart

on Windows 10 and getting error with the command
pip install xturing --upgrade

'LINK : fatal error LNK1181: cannot open input file 'aio.lib'
(I did install libaio prior to running this)
Please see full output below

I also tried running it on Kaggle with GPU enabled. The issue over there is out of memory at base model step

"
from xturing.models import BaseModel
model = BaseModel.create("llama_lora")
"

Appreciate any help/advice!

error: subprocess-exited-with-error

python setup.py egg_info did not run successfully.
exit code: 1

[16 lines of output]
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "C:\Users\ma.Najmul.Qureshi.MT-LT07771\AppData\Local\Temp\pip-install-our_uu7i\deepspeed_74238615fa124fd7ae08ea8b4fe80af5\setup.py", line 156, in
abort(f"Unable to pre-compile {op_name}")
File "C:\Users\ma.Najmul.Qureshi.MT-LT07771\AppData\Local\Temp\pip-install-our_uu7i\deepspeed_74238615fa124fd7ae08ea8b4fe80af5\setup.py", line 48, in abort
assert False, msg
AssertionError: Unable to pre-compile async_io
[WARNING] Torch did not find cuda available, if cross-compiling or running with cpu only you can ignore this message. Adding compute capability for Pascal, Volta, and Turing (compute capabilities 6.0, 6.1, 6.2)
DS_BUILD_OPS=1
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] One can disable async_io with DS_BUILD_AIO=0
[ERROR] Unable to pre-compile async_io
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

Encountered error while generating package metadata.

See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

How to load model from saved_model ?

I have the saved_model folder and want to load the llama_lora model with the weights i have finetuned.

How to do this ?

BaseModel.create(): NameError: name 'init_empty_weights' is not defined

I ran the cerebras_lora.ipynb notebook in Google Colab, but it's giving me this error:

NameError                                 Traceback (most recent call last)
[<ipython-input-6-fd0b38857b67>](https://p7cuy05lhdq-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230411-060128-RC00_523359062#) in <cell line: 6>()
      4 instruction_dataset = InstructionDataset("/content/alpaca_data")
      5 # Initializes the model
----> 6 model = BaseModel.create("cerebras_lora_int8")

10 frames
[/usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py](https://p7cuy05lhdq-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230411-060128-RC00_523359062#) in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2493             init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
   2494         elif load_in_8bit or low_cpu_mem_usage:
-> 2495             init_contexts.append(init_empty_weights())
   2496 
   2497         with ContextManagers(init_contexts):

NameError: name 'init_empty_weights' is not defined

Doesn't it work on Mac M1?

Hello. Thank you for sharing this project :)

I try to run a example/llama/llama_lora.py on my Mac book(M1),
but it doesn't work with below error message.

How can I run it on M1?

Command & Error message

(xturing) ➜  llama git:(main) python llama_lora.py
WARNING: CUDA is not available, using CPU instead, can be very slow
CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
dlopen(/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so, 0x0006): tried: '/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file), '/System/Volumes/Preboot/Cryptexes/OS/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (no such file), '/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file)
trioton not installed.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:09<00:00,  3.41it/s]
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
WARNING: CUDA is not available, using CPU instead, can be very slow
CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
dlopen(/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so, 0x0006): tried: '/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file), '/System/Volumes/Preboot/Cryptexes/OS/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (no such file), '/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so' (not a mach-o file)
trioton not installed.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:10<00:00,  3.26it/s]
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/user/works/personal/xturing/examples/llama/llama_lora.py", line 8, in <module>
    model.finetune(dataset=instruction_dataset)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/xturing/models/causal.py", line 84, in finetune
    trainer.fit()
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/xturing/trainers/lightning_trainer.py", line 181, in fit
    self.trainer.fit(self.lightning_model)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 520, in fit
    call._call_and_handle_interrupt(
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 559, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 915, in _run
    call._call_callback_hooks(self, "on_fit_start")
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 190, in _call_callback_hooks
    fn(trainer, trainer.lightning_module, *args, **kwargs)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/callbacks/lr_finder.py", line 125, in on_fit_start
    self.lr_find(trainer, pl_module)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/callbacks/lr_finder.py", line 109, in lr_find
    self.optimal_lr = _lr_find(
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/tuner/lr_finder.py", line 269, in _lr_find
    _try_loop_run(trainer, params)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/tuner/lr_finder.py", line 495, in _try_loop_run
    loop.run()
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 193, in run
    self.setup_data()
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 235, in setup_data
    _check_dataloader_iterable(dl, source, trainer_fn)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 383, in _check_dataloader_iterable
    iter(dataloader)  # type: ignore[call-overload]
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 442, in __iter__
    return self._get_iterator()
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1043, in __init__
    w.start()
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/user/miniconda3/envs/xturing/lib/python3.10/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

stochasticai / xturing Goto Github PK

xturing's Introduction

Build, modify, and control your own personalized LLMs

⚙️ Installation

🚀 Quickstart

🌟 What's new?

CLI playground

UI playground

📚 Tutorials

📊 Performance

📎 Fine-tuned model checkpoints

Supported Models

📈 Roadmap

🤝 Help and Support

📝 License

🌎 Contributing

xturing's People

Contributors

Stargazers

Watchers

Forkers

xturing's Issues

Description

Proposal

Initializes the model

Finetuned the model

Once the model has been finetuned, you can start doing inferences

Save the model

Description

Recommend Projects

Recommend Topics

Recommend Org