
transformerlab-app's Introduction


Logo

Transformer Lab

Download, interact with, and finetune models locally.
Explore the docs »

View Demo · Report Bugs · Suggest Features · Join Discord · Follow on Twitter

Note: Transformer Lab is actively being worked on. Please join our Discord or follow us on Twitter for updates. Questions, feedback and contributions are highly valued!

Download Now


About The Project

Product Screen Shot

Transformer Lab is an app that allows anyone to experiment with Large Language Models.

Transformer Lab allows you to:

  • 💕 One-click Download of Hundreds of Popular Models:
    • Llama3, Phi3, Mistral, Mixtral, Gemma, Command-R, and dozens more
    • Download any LLM from Huggingface
  • 🎶 Finetune / Train Across Different Hardware
    • Finetune using MLX on Apple Silicon
    • Finetune using Huggingface on GPU
  • ⚖️ RLHF and Preference Optimization
    • DPO
    • ORPO
    • SimPO
    • Reward Modeling
  • 💻 Work with LLMs Across Operating Systems:
    • Windows App
    • macOS App
    • Linux
  • 💬 Chat with Models
    • Chat
    • Completions
    • Preset (Templated) Prompts
    • Chat History
    • Tweak generation parameters
  • 🚂 Use Different Inference Engines
    • MLX on Apple Silicon
    • Huggingface Transformers
    • vLLM
    • Llama.cpp
  • 🧑‍🎓 Evaluate models
  • 📖 RAG (Retrieval Augmented Generation)
    • Drag and Drop File UI
    • Works on Apple MLX, Transformers, and other engines
  • 📓 Build Datasets for Training
    • Pull from hundreds of common datasets available on HuggingFace
    • Provide your own dataset using drag and drop
  • 🔢 Calculate Embeddings
  • 💁 Full REST API
  • 🌩 Run in the Cloud
    • You can run the user interface on your desktop/laptop while the engine runs on a remote or cloud machine
    • Or you can run everything locally on a single machine
  • 🔀 Convert Models Across Platforms
    • Convert from/to Huggingface, MLX, GGUF
  • 🔌 Plugin Support
    • Easily pull from a library of existing plugins
    • Write your own plugins to extend functionality
  • 🧑‍💻 Embedded Monaco Code Editor
    • Edit plugins and view what's happening behind the scenes
  • 📝 Prompt Editing
    • Easily edit System Messages or Prompt Templates
  • 📜 Inference Logs
    • While doing inference or RAG, view a log of the raw queries sent to the LLM

And you can do all of the above through a simple cross-platform GUI.

Getting Started

Click here to download Transformer Lab.

Read this page to learn how to install and use the app.

Built With

  • Electron
  • React
  • HuggingFace

Developers

Building from Scratch

To build the app yourself, clone this repo and follow the steps below:

npm install
npm start

Packaging for Production

To package apps for the local platform:

npm run package

License

Distributed under the AGPL V3 License. See LICENSE.txt for more information.

Reference

If you found Transformer Lab useful in your research or applications, please cite using the following BibTeX:

@software{transformerlab,
  author = {Asaria, Ali},
  title = {Transformer Lab: Experiment with Large Language Models},
  month = {December},
  year = 2023,
  url = {https://github.com/transformerlab/transformerlab-app}
}

Contact

transformerlab-app's People

Contributors

aliasaria, corymsmith, dadmobile, dependabot[bot], rohannair, safiyamak


transformerlab-app's Issues

Preview more of a dataset

Right now you can only see the first 10 or so records in a dataset. It would be nice if there were a better browser for the data, with pagination or scrolling.

[M1 Max/Sonoma 14.2.1] Error When Checking For Local Running Server

When executing step 3 of the local connection wizard, "Check if Server is Running Locally...", I receive an error message:

{"status":"error","code":1}


Application Logs:

[2024-02-21 16:54:13.099] [info]  Checking if server is installed locally at /Users/<REDACTED>/.transformerlab/src/LATEST_VERSION
[2024-02-21 16:54:13.101] [info]  Found version v0.2.0
[2024-02-21 16:54:15.043] [info]  Starting local server at /Users/<REDACTED>/.transformerlab/src/run.sh
[2024-02-21 16:54:15.047] [info]  Local server started with pid 1281
[2024-02-21 16:54:15.302] [info]  child process exited with code 1
[2024-02-21 16:54:15.304] [info]  child process exited with code 1

Notes:

  • Python is installed via Homebrew (version 3.11)
  • Conda (Miniconda) is installed via Homebrew (version 23.11.0)

Error running llama trainer if no NVIDIA GPU

Users al-sadr and dadmobile both got:

ImportError: Using load_in_8bit=True requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://test.pypi.org/simple/ bitsandbytes or pip install bitsandbytes

Can be "fixed" by downgrading transformers to 4.30 although you gotta work through other errors. It looks like probably this:

https://stackoverflow.com/questions/76924239/accelerate-and-bitsandbytes-is-needed-to-install-but-i-did

But the details at the bottom worked for me.

Unable to read training dataset

Regardless of which training dataset I use, whether self-imported or downloaded through the Electron app, it is unable to open the database file to train.

-- RUN 2024-02-22 05:54:49--
Traceback (most recent call last):
  File "/Users/adityasood/.transformerlab/workspace/plugins/mlx_lora_trainer/main.py", line 64, in <module>
    db = sqlite3.connect(llmlab_root_dir + "/workspace/llmlab.sqlite3")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: unable to open database file
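
As a minimal sketch of a friendlier failure mode, assuming the llmlab_root_dir variable from the traceback above, the plugin could check for the database before connecting; the explicit check is an illustration, not the project's actual fix:

import os
import sqlite3

# Hypothetical defensive version of the failing connect call: fail with a
# clearer message than sqlite3's generic "unable to open database file".
db_path = os.path.join(llmlab_root_dir, "workspace", "llmlab.sqlite3")
if not os.path.isfile(db_path):
    raise FileNotFoundError(f"Workspace database not found: {db_path}")
db = sqlite3.connect(db_path)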

Pin MLX to some tested version

We have mlx as a required dependency in all of the MLX plugins, but its updates have broken the app twice. For now, we should pin to a version and deal with upgrading and testing on some cadence.

System Prompt display may not always match what is stored

If you change the system prompt on the inference page, the display sometimes reverts to the old prompt, but (based on the AI's answers) it seems to still be using the new edited prompt. If you change to another model, it will load and show the new prompt.

It's not clear how to start an inference server via API

We have a /worker/start endpoint in the API, but it doesn't allow you to set the inference engine or inference parameters.

It's also not clear what "model_filename" refers to in the app, since we refer to the unique id of a model in several ways across the API (uniqueId, filename, huggingface_id).
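
Purely as an illustration of what a richer endpoint could look like, the sketch below passes the engine and a model identifier explicitly. Only the /worker/start path is confirmed above; the host, port, HTTP library, and parameter names are all assumptions:

import requests

# Hypothetical call; the parameter names ("engine", "model_filename") and the
# host/port are assumptions, not the actual API surface.
response = requests.get(
    "http://localhost:8000/worker/start",
    params={"engine": "mlx", "model_filename": "namespace/repo_name"},
)
print(response.json())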

Have a way to force update of app

We need to add auto-update and force-update to the app.

It could check GitHub for the latest release and pop up an alert, or something else?
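
A minimal sketch of the GitHub check in Python (the releases endpoint and tag_name field are GitHub's public API; how the app would surface the alert is left open):

import json
import urllib.request

# Query GitHub's releases API for the newest published release tag.
url = "https://api.github.com/repos/transformerlab/transformerlab-app/releases/latest"
with urllib.request.urlopen(url) as response:
    latest_tag = json.load(response)["tag_name"]
print(f"Latest release: {latest_tag}")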

Better errors when starting models fails

E.g., you can't run Nous Hermes on the MLX engine: the API throws an error that the model doesn't have safetensors, but the app only returns "Error starting worker process". We should at least be able to show the exception as an error message.

Remember previously used experiment

I would like the application to remember the experiment you used last time and automatically select it when you boot up the application. If you want to continue using that experiment, you need fewer clicks, and if you want to use a different experiment, the number of clicks required stays the same. So it can only be beneficial.

Downloads don't resume or show progress using "Download HuggingFace Model" button

Investigating user al-sabr's report:

The HF download from the UI does not resume when the Wi-Fi gets disconnected, which is a pain, and there is no console status showing what percentage of the download has completed.

I think we are doing the right things so that resuming should work. Double-check, but it would also help if there were some sort of progress indicator (although even in the gallery I don't think it shows whether a download is resuming).

Error starting server

I just downloaded the app hosted on your website (Apple Silicon). I was able to successfully install the server, but step 3 doesn't work. When I click "Start" it alerts back: {"status":"error","code":127}

MLX Training: Adaptor Field is not Made Required

For MLX, it's possible to run a training job without setting the adaptor field; this results in a broken adaptor:

| ERROR | stderr | huggingface_hub.utils.validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/Users/timk/.transformerlab/workspace/models/phi-2/phi-2_'. Use repo_type argument if needed.
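
A sketch of the missing validation, assuming a hypothetical config dict that holds the training template fields:

# Hypothetical pre-flight check: reject an empty adaptor name before training
# so the output path never ends in a bare trailing underscore as above.
adaptor_name = str(config.get("adaptor_name", "")).strip()
if not adaptor_name:
    raise ValueError("The adaptor field is required for MLX training.")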

Deleted Conda environment doesn't allow fresh install

Steps to reproduce:

  • Install Transformer Lab on a Mac using the GUI installer
  • Delete the app and delete your ~/.transformerlab folder
  • Try to install the app again -- it fails

I believe the ~/.conda/ folder remembers the environment, so it isn't created the second time, or conda isn't being cleaned up for some other reason. This should be fixable.

Error when training with very few examples

Loaded train dataset with 39 examples.
...
raise ValueError(f'Unknown split "{split}". Should be one of {list(name2len)}.')
ValueError: Unknown split "test". Should be one of ['train'].

I believe this happens when you have fewer than 100 examples. We should warn users better about this error, or prevent it in the first place.
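
A hedged pre-flight check, assuming the dataset is loaded as a dict-like object keyed by split name (the names here are illustrative):

# Warn before training rather than crashing mid-run: small datasets may only
# produce a "train" split, while the trainer also expects a "test" split.
required_splits = {"train", "test"}
missing = required_splits - set(dataset.keys())
if missing:
    raise ValueError(
        f"Dataset only has splits {sorted(dataset.keys())}; missing {sorted(missing)}. "
        "Datasets with fewer than ~100 examples may not generate a test split."
    )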

Check mlx_lora_trainer for missing parameters

mlx_lora_trainer supports more parameters than are currently accessible in the UI. At least batch_size is an important missing one. Please check whether other parameters are missing as well.

lora.py [-h] [--model MODEL] [--max-tokens MAX_TOKENS] [--temp TEMP]
               [--prompt PROMPT] [--train] [--data DATA]
               [--lora-layers LORA_LAYERS] [--batch-size BATCH_SIZE]
               [--iters ITERS] [--val-batches VAL_BATCHES]
               [--learning-rate LEARNING_RATE]
               [--steps-per-report STEPS_PER_REPORT]
               [--steps-per-eval STEPS_PER_EVAL]
               [--resume-adapter-file RESUME_ADAPTER_FILE]
               [--adapter-file ADAPTER_FILE] [--save-every SAVE_EVERY]
               [--test] [--test-batches TEST_BATCHES] [--seed SEED]
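
For illustration, wiring batch_size through could be as simple as passing the documented flag when the plugin launches the script. Every flag below appears in the usage synopsis above; the model id and data path are placeholders:

import subprocess

# Example training invocation; values are illustrative, flags are documented.
subprocess.run(
    [
        "python", "lora.py",
        "--model", "mlx-community/TinyLlama-1.1B-Chat-v1.0",  # placeholder model
        "--train",
        "--data", "data/",  # placeholder dataset directory
        "--batch-size", "8",
        "--iters", "600",
    ],
    check=True,
)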

Models generated by MLX training won't run

Reproduced using TinyLlama-1.1B-Chat-v1.0.

huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/Users/tony/.transformerlab/workspace/models/TinyLlama-1.1B-Chat-v1.0_jerk2000/TinyLlama-1.1B-Chat-v1.0_jerk2000'. Use repo_type argument if needed.

It looks like the model_path that is getting passed in doubles the model name.
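
Purely as a sketch, one possible guard would detect and drop the duplicated final path component before loading:

import os

# If the last two components of the path are identical (the bug above),
# strip the duplicate; otherwise return the path unchanged.
def dedupe_model_path(model_path: str) -> str:
    head, tail = os.path.split(os.path.normpath(model_path))
    if os.path.basename(head) == tail:
        return head
    return model_path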

Plugins should have versioning

A plugin should have a version id, and if an updated version is available the system should prompt you to reinstall it.

Add epochs to mlx_lora_trainer

For most users it would be much easier to just configure epochs + batch size and auto-calculate the number of iterations based on the amount of training data.

GPT-4 solution:

Definitions

  • Total Number of Examples (N): The total number of examples in your training dataset.
  • Batch Size (B): The number of examples processed in one iteration (or step) of training.
  • Number of Epochs (E): The total number of times the training process will work through the entire dataset.
  • Total Number of Iterations (I): The total number of iterations (or steps) needed to complete the specified number of epochs.

Formula to Calculate Total Number of Iterations
The total number of iterations needed to complete the training can be calculated with the following formula:

I = (N × E) / B

This formula works under the assumption that N/B divides evenly. If N/B does not divide evenly (i.e., if there is a remainder), the actual number of iterations will be slightly higher, as the last batch of each epoch will be smaller than the specified batch size but still counts as a separate iteration.

Example Calculation
Suppose you have the following:

Total Number of Examples (N): 682
Batch Size (B): 32
Number of Epochs (E): 2
The calculation would be:

I = (682 × 2) / 32 = 1364 / 32 = 42.625

The calculation results in 42.625 total iterations, indicating that you would need 43 iterations to complete 2 epochs, as you can't have a fraction of an iteration. The presence of a fractional part (.625) indicates that the last batch in each epoch will be smaller than the specified batch size of 32.

This means, to complete 2 epochs with a batch size of 32 over a dataset of 682 examples, you would conduct 43 iterations, where the last iteration of each epoch processes fewer examples to cover the entire dataset.
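
In code, the calculation above is a one-liner:

import math

def iterations_for_epochs(num_examples: int, batch_size: int, epochs: int) -> int:
    # ceil((N * E) / B): round up so the final partial batch still counts
    # as an iteration.
    return math.ceil(num_examples * epochs / batch_size)

print(iterations_for_epochs(682, 32, 2))  # 43, matching the example above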

Change standard value for temperature

Currently the default temperature setting is 0.9, which is very creative for most models. Something in the range of 0.5 to 0.7 is probably more reasonable.

If you reinstall Transformer Lab, Plugins need to be reinstalled

The MLX plugin, for example, installs pip dependencies. If you delete the transformerlab conda environment and then reinstall, the MLX plugin will think it is installed but won't be able to find its dependencies.

Perhaps, upon reinstalling transformerlab dependencies, we need to deactivate all plugins to reset their install state. Or we can loop through each one and reinstall it.

Make Maximum Length in Interact a slider

I temporarily set the Maximum Length input to a field instead of a slider because we need a way to dynamically set the maximum to whatever the context length of the model is.

In this temporary state, you can enter any number, but if you go too high you get an error (that helpfully explains what the maximum is).

Let's return this to a slider once we have a way to fetch the max context length on screen load.


Error reinstalling mlx_lora_trainer - mlx_examples dependency

fatal: destination path 'mlx-examples' already exists and is not an empty directory.

It is cloning the repo, which already exists, so perhaps this is OK... but it means everybody has a different combination of plugin version and mlx-examples repo version.

Several options:

  • just say that having any version of mlx-examples is fine
  • clone if it doesn't exist, pull if it does (sketched below)
  • pull from a specific version of mlx-examples
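
A minimal sketch of the second option, assuming the upstream repo is Apple's ml-explore/mlx-examples:

import os
import subprocess

dest = "mlx-examples"
if os.path.isdir(dest):
    # Repo already cloned: update it in place instead of failing.
    subprocess.run(["git", "-C", dest, "pull"], check=True)
else:
    subprocess.run(
        ["git", "clone", "https://github.com/ml-explore/mlx-examples.git", dest],
        check=True,
    )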

Llama trainer looking in wrong directory for custom datasets

Data from user al-sabr on Ubuntu:

-- RUN 2024-03-09 22:26:07--
/home/doumbo/miniconda3/envs/transformerlab/lib/python3.11/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/doumbo/miniconda3/envs/transformerlab/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
Arguments:
Namespace(input_file='/home/doumbo/.transformerlab/workspace/temp/plugin_input_46.json')

Training on TinyDolphin model throws errors in MLX training script

Model: mlx-community/TinyDolphin-2.8-1.1b-4bit-mlx
Dataset: samsum
Plugin: mlx-lora-trainer

Loading pretrained model

Fetching 7 files: 0%| | 0/7 [00:00<?, ?it/s]
Fetching 7 files: 100%|██████████| 7/7 [00:00<00:00, 75475.91it/s]
Traceback (most recent call last):
  File "/Users/timk/.transformerlab/workspace/plugins/mlx_lora_trainer/mlx-examples/lora/lora.py", line 321, in <module>
    model, tokenizer, _ = lora_utils.load(args.model)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/timk/.transformerlab/workspace/plugins/mlx_lora_trainer/mlx-examples/lora/utils.py", line 171, in load
    model.load_weights(list(weights.items()))
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/mlx/nn/layers/base.py", line 167, in load_weights
    raise ValueError(f"Missing parameters: {missing}.")
ValueError: Missing parameters: lm_head.biases lm_head.scales.
Finished training.

AI made approachable — Today at 4:07 AM
I don't think the error with running models created by mlx_lora_trainer is fixed. Still got this today:
Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/Users/timk/.transformerlab/workspace/models/TinyLlama-1.1B-Chat-v1.0_test/TinyLlama-1.1B-Chat-v1.0_test'

Test each model that can work in MLX

We have some models that are supposed to support MLX but don't work when we load them.

Let's test each one and then remove the ones from the gallery that do not work. We can store notes in the issue here.

Make it optional to fuse the adapter/lora

When creating a training template, add a configuration setting to decide whether to fuse the adapter/LoRA into the base model after training, instead of always doing this (a sketch follows).
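
As a sketch, such a flag could gate the fuse step; the names below are hypothetical, not the plugin's actual API:

# Hypothetical template setting: only fuse the trained LoRA into the base
# model when the user opted in; otherwise keep the adaptor separate.
if training_template.get("fuse_adaptor", True):
    fuse_adaptor_into_base_model(base_model_path, adaptor_path)  # hypothetical helper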
