minLoRA

A minimal but versatile PyTorch re-implementation of LoRA. In only ~100 lines of code, minLoRA supports the following features:

Features

  • Functional, no need to modify the model definition
  • Works everywhere, as long as your model uses torch.nn.Module
  • PyTorch native, uses PyTorch's torch.nn.utils.parametrize to do all the heavy lifting (see the sketch after this list)
  • Easily extendable, you can add your own LoRA parametrization
  • Supports training, inference, and inference with multiple LoRA models
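
To illustrate the parametrize-based approach, here is a rough sketch of the idea. This is a simplified stand-in, not minLoRA's actual implementation; the class name, rank, and alpha values are made up for illustration:

import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

class LoRASketch(nn.Module):
    # Simplified LoRA parametrization: the layer reports its weight as
    # W + scaling * (B @ A); the original W is kept untouched under
    # module.parametrizations.weight.original.
    def __init__(self, fan_out, fan_in, rank=4, alpha=1.0):
        super().__init__()
        self.lora_A = nn.Parameter(torch.randn(rank, fan_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(fan_out, rank))
        self.scaling = alpha / rank

    def forward(self, W):
        # parametrize passes in the original weight; we return the adapted one
        return W + self.scaling * (self.lora_B @ self.lora_A)

layer = nn.Linear(5, 3)
parametrize.register_parametrization(layer, "weight", LoRASketch(3, 5))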

Demo

  • demo.ipynb shows the basic usage of the library
  • advanced_usage.ipynb shows how you can add LoRA to other layers such as embedding, and how to tie weights

Examples

Library Installation

If you want to import minlora into your project:

git clone https://github.com/cccntu/minLoRA.git
cd minLoRA
pip install -e .

Usage

import torch
from minlora import (
    add_lora, apply_to_lora, disable_lora, enable_lora, get_lora_params,
    get_lora_state_dict, load_multiple_lora, merge_lora, name_is_lora,
    remove_lora, select_lora,
)

Training a model with minLoRA

model = torch.nn.Linear(in_features=5, out_features=3)
# Step 1: Add LoRA to the model
add_lora(model)

# Step 2: Collect the LoRA parameters and pass them to the optimizer

parameters = [
    {"params": list(get_lora_params(model))},
]
optimizer = torch.optim.AdamW(parameters, lr=1e-3)

# Step 3: Train the model
# ...

# Step 4: Export the LoRA parameters
lora_state_dict = get_lora_state_dict(model)
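
Step 3 above is elided. Continuing from the snippet, a minimal fine-tuning loop might look like the following sketch; the input x, target y, and loop length are made-up toy values, not part of minLoRA:

x = torch.randn(16, 5)  # toy inputs matching Linear(in_features=5, ...)
y = torch.randn(16, 3)  # toy targets matching out_features=3
for step in range(100):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
# Only the LoRA parameters are updated, because only they were handed to the
# optimizer; the base weights may still receive gradients unless you freeze them.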

Loading and running inference with minLoRA

# Step 1: Add LoRA to your model
add_lora(model)

# Step 2: Load the LoRA parameters
_ = model.load_state_dict(lora_state_dict, strict=False)

# Step 3: Merge the LoRA parameters into the model
merge_lora(model)
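
After merge_lora, the LoRA update is folded into the base weights, so inference is just an ordinary forward pass. A minimal sketch, assuming the Linear(in_features=5, out_features=3) model from the training example above:

x = torch.randn(1, 5)  # hypothetical input for the Linear(5, 3) example
with torch.no_grad():
    y = model(x)       # plain forward pass; no LoRA-specific code needed after merging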

Running inference with multiple LoRA models

# to avoid re-adding LoRA to the model when re-running the cell, remove LoRA first
remove_lora(model)
# Step 1: Add LoRA to your model
add_lora(model)

# Step 2: Load the LoRA parameters

# load three sets of LoRA parameters
lora_state_dicts = [lora_state_dict_0, lora_state_dict_1, lora_state_dict_2]

load_multiple_lora(model, lora_state_dicts)


# Step 3: Select which LoRA to use at inference time
Y0 = select_lora(model, 0)(x)
Y1 = select_lora(model, 1)(x)
Y2 = select_lora(model, 2)(x)

TODO

  • A notebook to show how to configure LoRA parameters
  • Real training & inference examples

minlora's Issues

Specify Layers By Name, not Type?

Is it possible to specify layers in the lora_config by name rather than type?

For instance, suppose I only wanted to apply LoRA to layers named qkv rather than to all nn.Linear layers. How would I do that?
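
One possible approach, not a built-in minLoRA feature: since add_lora works on any nn.Module, you could walk named_modules() and apply it only to submodules whose qualified name matches. The helper name and the "qkv" fragment below are hypothetical:

from minlora import add_lora

def add_lora_by_name(model, name_fragment="qkv"):
    # Apply LoRA only to submodules whose qualified name contains name_fragment.
    for name, module in model.named_modules():
        if name_fragment in name:
            add_lora(module)  # parametrizes only this submodule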

Understanding the forward operation

I noticed that you multiply lora_A and lora_B together, and then sum the result with the input.

I think the result of multiplying lora_A and lora_B has to be added to the original weights, or am I wrong?

Could you also explain the scaling factor?

Thanks.
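
A side note on the mechanics (my reading of the parametrize-based design, not a maintainer reply): the tensor passed into the LoRA parametrization is the original weight handed over by torch.nn.utils.parametrize, not the layer input, so B @ A really is added to the original weights; the scaling factor plays the role of the LoRA paper's alpha / r. A quick numerical sketch of the equivalence, with hypothetical shapes:

import torch

W = torch.randn(3, 5)   # original weight (out_features x in_features)
A = torch.randn(4, 5)   # lora_A, rank 4
B = torch.randn(3, 4)   # lora_B
x = torch.randn(5)      # layer input
scaling = 1.0 / 4       # plays the role of alpha / r in the LoRA paper

# Adding scaling * (B @ A) to the weight is equivalent to adding
# scaling * (B @ A) @ x to the layer output.
print(torch.allclose((W + scaling * (B @ A)) @ x,
                     W @ x + scaling * (B @ A) @ x))  # True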

Freeze manually

Hi, thank you for your great work.

I want to use your library for my experiment.

I understand that get_lora_params() collects the parameters to pass to the optimizer, but if the base model's parameters still require gradients, won't the model still compute gradients for them?

Would freezing the model be enough to use minlora without get_lora_params?

Also, when merging LoRA into the model in order to add another LoRA module, do I have to set requires_grad=False on lora_A and lora_B before merging?

Thank you.
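
A minimal sketch of freezing everything except the LoRA parameters, using the name_is_lora helper from the import list above; this assumes name_is_lora takes a fully qualified parameter name and is a reader-level sketch, not an official recipe:

from minlora import name_is_lora

# Freeze every parameter that is not a LoRA parameter, so the base model
# no longer accumulates gradients during training.
for name, param in model.named_parameters():
    param.requires_grad = name_is_lora(name)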

FSDP

Can this be used with FSDP? I haven't seen any examples of using torch.nn.utils.parametrize with FSDP.

Doesn't work with DataParallel

Minimal example

import torch
import timm
from torch import nn
from minlora import add_lora, get_lora_params, get_lora_state_dict


model_timm = timm.create_model("vit_large_patch14_clip_336.openai", pretrained=True, num_classes=0, global_pool='avg')
add_lora(model_timm)
model_timm = nn.DataParallel(model_timm, device_ids=[0,1]).cuda()

with torch.no_grad():
    asdf = model_timm(torch.randn(2, 3, 336, 336).cuda())

This raises:

  File "/home/anaconda3/envs/face/lib/python3.8/site-packages/minlora/model.py", line 39, in lora_forward
    return X + torch.mm(*self.swap((self.lora_B, self.dropout_fn(self.lora_A)))).view(X.shape) * self.scaling
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
