
sd-mecha's Introduction

Hello!

I'm ljleb, I make model merging tools and Stable Diffusion Web UI extensions. Check out the pinned repos below!

Discord server for sharing and discussing stable diffusion experiments like unconventional merge methods or training schemes. Join in!

If my work has been useful to you and you want to say thanks, consider checking out my Patreon.

sd-mecha's People

Contributors

ljleb

Forkers

enferlain

sd-mecha's Issues

name collisions when composing recipes using the same parameters

Currently, it is possible to compose recipes together. If two recipes use the same parameter name and one is composed into the other, then two parameters that may have been intended to represent different values become indistinguishable. I think this should be disallowed by default and only allowed explicitly through a click option.

SDXL Support

These parts of the code need to be upgraded for this to work:

  • sd_mecha/weight.py currently relies exclusively on SD1.5 keys. We can add another mapping for SDXL (see the sketch below this list) and then new block-weight/class-weight methods for the two text encoders and the UNet.
  • the key triage in sd_mecha/merge_scheduler.py using is_passthrough_key and is_merge_key relies solely on SD1.5 key matching. This needs to change depending on whether the entire merge is SD1.5- or SDXL-based.

I think that covers every location.
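
For the weight.py side, here is a minimal sketch of what an SDXL mapping could look like, assuming the SGM checkpoint layout; the component names ("unet", "txt", "txt2") and the helper are placeholders, not existing sd-mecha code:

# hypothetical SDXL key mapping for sd_mecha/weight.py
SDXL_COMPONENT_PREFIXES = {
    "unet": "model.diffusion_model.",
    "txt": "conditioner.embedders.0.transformer.",   # CLIP-L text encoder
    "txt2": "conditioner.embedders.1.model.",        # OpenCLIP-G text encoder
}

def classify_sdxl_key(key: str) -> str | None:
    # return the component a state dict key belongs to, or None if unknown
    for component, prefix in SDXL_COMPONENT_PREFIXES.items():
        if key.startswith(prefix):
            return component
    return None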

offload device

We need to be able to offload intermediate merges to another device when there isn't enough space on the storage device. This could be another card or a temporary file on disk.
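
A minimal sketch of the idea, assuming torch tensors as the intermediate results (the function names are placeholders):

import os
import tempfile

import torch

def offload_tensor(t: torch.Tensor, device: str | None = None):
    # move an intermediate merge result off the working device:
    # either to another torch device (e.g. "cpu" or "cuda:1"),
    # or to a temporary file on disk when no device is given
    if device is not None:
        return t.to(device)
    fd, path = tempfile.mkstemp(suffix=".pt")
    os.close(fd)
    torch.save(t, path)
    return path

def reload_tensor(handle, device: str = "cuda:0") -> torch.Tensor:
    # bring an offloaded tensor back onto the working device
    if isinstance(handle, torch.Tensor):
        return handle.to(device)
    return torch.load(handle, map_location=device)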

Add support for pix2pix and inpainting models

Merging a pix2pix or inpainting model with a normal model does not work yet. We need to catch dimension mismatches in the shapes of the tensors and take appropriate action. We should look around on GitHub; I'm pretty sure there is already code for this somewhere.
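
As a rough sketch of what the fallback could look like, assuming the mismatch is only along the input-channel dimension (the inpainting input convolution has 9 input channels and pix2pix has 8, vs the usual 4):

import torch

def weighted_sum_mismatched(a: torch.Tensor, b: torch.Tensor, alpha: float) -> torch.Tensor:
    # weighted sum of two tensors whose shapes differ only along dim 1;
    # the overlapping channels are merged, the extra channels are kept
    # from whichever model has them
    if a.shape == b.shape:
        return (1 - alpha) * a + alpha * b
    if a.dim() != b.dim() or a.shape[0] != b.shape[0] or a.shape[2:] != b.shape[2:]:
        raise ValueError(f"cannot reconcile shapes {a.shape} and {b.shape}")
    if a.shape[1] < b.shape[1]:
        a, b, alpha = b, a, 1 - alpha  # make `a` the wider tensor
    n = b.shape[1]
    merged = a.clone()
    merged[:, :n] = (1 - alpha) * a[:, :n] + alpha * b
    return merged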

Recipe free variables

This is still a little experimental in my mind, but I was considering adding free parameters to recipes. The point of this is to take two different recipes and plug the output of one into the input of the other. This would let us compose recipes in very complex ways.
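
Purely as an illustration of the idea (plain dicts here, no sd-mecha API implied): a recipe with a free variable is just a recipe tree containing a named placeholder, and composing two recipes means substituting one tree for that placeholder.

def free(name: str) -> dict:
    # a named placeholder standing in for "some recipe to be provided later"
    return {"free": name}

def substitute(recipe, name: str, value):
    # replace every occurrence of the free variable `name` with `value`
    if isinstance(recipe, dict):
        if recipe.get("free") == name:
            return value
        return {k: substitute(v, name, value) for k, v in recipe.items()}
    if isinstance(recipe, list):
        return [substitute(v, name, value) for v in recipe]
    return recipe

# example: plug recipe_b into the free slot "x" of recipe_a
recipe_a = {"operation": "weighted_sum", "args": [free("x"), "modelB"], "alpha": 0.5}
recipe_b = {"operation": "add_difference", "args": ["modelC", "modelD", "modelE"]}
composed = substitute(recipe_a, "x", recipe_b)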

Recipe serialization

We want to be able to serialize merge recipes into a human-readable and editable format, and also deserialize them back into instances of MergeRecipe. I was thinking we could represent a recipe as a simple list in a file with the extension .mecha.

I suggest reducing the contents of the file to a single list of instructions that can be performed to merge:

model "ghostmix_v20Bakedvae"
model "dreamshaper_332BakedVaeClipFix"
call "weighted_sum" &0 &1 alpha=0.5

Here, weighted_sum &0 &1 alpha=0.5 means that weighted_sum has 2 positional arguments: the result of line 0 and the result of line 1; and 1 named parameter: alpha with the float value 0.5.

Each line is an expression. The nth line is associated with index n, and indices start at 0. The first word in a line is the "operation" to perform (either a merge method, a basic action like loading a model or a lora, or even the definition of hyperparameters), followed by its space-separated arguments. The last expression is the final output of the recipe.

To reconstruct a recipe in Python from a .mecha file, we can simply read it from top to bottom, mapping the first word of each line to its operation and using a dictionary to keep track of existing nodes. We can then return the last node from this function.
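
A minimal sketch of such a parser (nodes are plain dicts here; a real implementation would build MergeRecipe nodes instead and handle quoted strings containing spaces):

def parse_mecha(text: str):
    nodes = {}  # line index -> node
    last = None
    lines = [line for line in text.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        operation, *words = line.split()
        args, kwargs = [], {}
        for word in words:
            if word.startswith("&"):
                args.append(nodes[int(word[1:])])  # reference to an earlier line
            elif "=" in word:
                key, value = word.split("=", 1)
                kwargs[key] = float(value)  # assumes numeric hyperparameters
            else:
                args.append(word.strip('"'))
        last = {"operation": operation, "args": args, "kwargs": kwargs}
        nodes[index] = last
    return last  # the last expression is the output of the recipe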

SDXL lora

SDXL loras are not yet supported. We need to write a dict mapping the keys of the model state dict to the lora state dict and/or the other way around; then we can reuse pretty much all of the existing SD1.5 lora code.
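
Something along these lines might be enough for the key side, assuming kohya-style lora naming (component prefix plus the remaining module path with dots replaced by underscores); the exact prefixes, and especially the second text encoder mapping, should be verified against real SDXL loras before relying on this:

SDXL_LORA_PREFIXES = {
    "model.diffusion_model.": "lora_unet_",
    "conditioner.embedders.0.transformer.": "lora_te1_",
    "conditioner.embedders.1.model.": "lora_te2_",
}

def model_key_to_lora_key(model_key: str) -> str | None:
    for model_prefix, lora_prefix in SDXL_LORA_PREFIXES.items():
        if model_key.startswith(model_prefix):
            module_path = model_key[len(model_prefix):].removesuffix(".weight")
            return lora_prefix + module_path.replace(".", "_")
    return None  # key has no lora counterpart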

Make repo compatible with comfyui

I think it's possible to turn the repo itself into a comfyui extension by adding an __init__.py at the root of the repo. comfyui would load it, and we would then consider this file the entry point of the comfy extension.

We can modify the @convert_to_recipe decorator to automatically register comfyui nodes whenever a new method is created. We can then dynamically add the right number of models and hyperparameters for a given node. When a method takes a variable number of models as input, we have to make it so that when there are no free input connections left on a comfy node, the node automatically adds another one. This might be tricky to implement, so it's okay to start with something like 5 optional connections.
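
ComfyUI discovers custom nodes through a NODE_CLASS_MAPPINGS dict exported from the extension's __init__.py, and each node class declares INPUT_TYPES, RETURN_TYPES, FUNCTION and CATEGORY. A rough sketch of a factory that @convert_to_recipe could call (the "MECHA_RECIPE" type name and the factory itself are assumptions):

NODE_CLASS_MAPPINGS = {}

def make_comfy_node(merge_method, model_count: int, hyper_names: list[str]):
    class MechaMergeNode:
        @classmethod
        def INPUT_TYPES(cls):
            required = {f"model_{i}": ("MECHA_RECIPE",) for i in range(model_count)}
            required |= {
                name: ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0})
                for name in hyper_names
            }
            return {"required": required}

        RETURN_TYPES = ("MECHA_RECIPE",)
        FUNCTION = "build_recipe"
        CATEGORY = "sd-mecha"

        def build_recipe(self, **kwargs):
            models = [kwargs.pop(f"model_{i}") for i in range(model_count)]
            return (merge_method(*models, **kwargs),)

    NODE_CLASS_MAPPINGS[merge_method.__name__] = MechaMergeNode
    return MechaMergeNode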

Make repo compatible with the webui

We can add a python script under scripts/mecha_extension.py which would serve as the entry point for the webui extension. I outline here what I think we want to do, but I'm open to any suggestion really.

Must have:

  • select a .mecha recipe in the checkpoints dropdown to load a recipe instead of a model. The recipe is merged on the spot. This saves a lot of disk space while also making it very easy to explore highly complex alternative model weight compositions

Should have:

  • create recipes from within the webui and run them. I'm not sure what interface to use for this; maybe we can just have a textbox in which we hand-craft them (see the stub below)
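
A minimal stub of what the entry point could look like, using the webui's standard Script interface; the actual recipe handling is left out, and how the merged weights get swapped into the loaded model is the hard part:

import gradio as gr
from modules import scripts

class MechaScript(scripts.Script):
    def title(self):
        return "sd-mecha recipe"

    def ui(self, is_img2img):
        # a plain textbox for hand-crafting a .mecha recipe, as described above
        recipe_text = gr.Textbox(label="Recipe (.mecha)", lines=8)
        return [recipe_text]

    def run(self, p, recipe_text):
        # hypothetical: parse the recipe, merge it on the spot and swap the
        # merged weights into the currently loaded model before processing p
        raise NotImplementedError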

Larger buffer optimization

Currently, if I try to merge 8 models together, it works and the memory consumption is strikingly good, but it's actually pretty slow. According to GPT-4, it might help to read slightly larger chunks of memory at a time per file to reduce seek time. This seems like a valid suggestion IIUC, as reading more from a single file at a time (i.e. 10 keys in advance) would mean seeking 10 times less often for the same amount of data.

Of course, this incurs the tradeoff of using more memory at a time. I was thinking we could parameterize MergeScheduler with a new kwarg buffer_size, which would in turn be forwarded to the InSafetensorsDict instances it creates. The dicts would then allocate memory in advance and load data from the file into it. Whenever we try to load data that is outside of the buffer, we read from the file again to fill the buffer with fresh data.

If buffer_size is not explicitly passed, we should try to find an appropriate value for it. Another option is to simply fall back on the current seek-heavy behavior, but I'm really not a fan of that if larger buffer sizes turn out to help a lot.
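
A sketch of the read-ahead idea, assuming the safetensors safe_open API (the class name and how it would relate to InSafetensorsDict are assumptions):

from safetensors import safe_open

class BufferedSafetensorsDict:
    def __init__(self, path: str, buffer_size: int = 10, device: str = "cpu"):
        self._file = safe_open(path, framework="pt", device=device)
        self._keys = list(self._file.keys())
        self._buffer_size = buffer_size
        self._buffer = {}

    def __getitem__(self, key: str):
        if key not in self._buffer:
            self._fill_buffer_from(key)
        return self._buffer[key]

    def _fill_buffer_from(self, key: str):
        # read `buffer_size` consecutive keys starting at `key`,
        # discarding whatever was buffered before
        start = self._keys.index(key)
        self._buffer = {
            k: self._file.get_tensor(k)
            for k in self._keys[start:start + self._buffer_size]
        }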

Save lora

Loading lora models has been implemented. However, it's not yet possible to save a merge result as a lora. While it wouldn't make sense to do this for full models, if the result happens to be in delta space, we can apply an SVD algorithm to the appropriate keys. I was thinking we could add a new method merge_and_save_lora that only accepts recipes that result in delta space. This way, we can catch mistakes when a recipe doesn't do what it's intended to do, and we can keep the current merge_and_save method as-is.
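
For reference, a sketch of the SVD step on a single 2D delta weight; conv weights would need to be flattened to 2D first, and the rank and the split of the singular values between the two factors are choices, not a fixed convention:

import torch

def delta_to_lora_pair(delta: torch.Tensor, rank: int = 32):
    # approximate a 2D delta weight as lora_up @ lora_down via truncated SVD
    u, s, vh = torch.linalg.svd(delta.float(), full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]
    lora_up = u * s.sqrt()               # (out_features, rank)
    lora_down = s.sqrt()[:, None] * vh   # (rank, in_features)
    return lora_up, lora_down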

Merge method cache

Rotate is a very expensive method to run. To speed it up, I found that we can precompute the orthogonal matrix and its eigendecomposition. It is possible that only the eigendecomposition is necessary.
https://github.com/s1dlx/meh/blob/81515bda3cba52edd263964c5517f4713faad86e/sd_meh/merge_methods.py#L245

In any case, rotate is not the only method that can benefit from this. When a recipe is called multiple times in a row on the same models but with different hyperparameters each time, precomputing some portion of the more complex merge methods involved could speed up the process after the first iteration, at the cost of higher memory usage.
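
A sketch of how the cache could be threaded through, assuming the scheduler owns one dict per state dict key and passes it down to the merge method (all names here are placeholders):

def merge_key_with_cache(merge_method, key: str, inputs, hypers, caches: dict):
    # one sub-dict per state dict key; only valid as long as the input
    # models stay the same across calls with different hyperparameters
    key_cache = caches.setdefault(key, {})
    return merge_method(*inputs, **hypers, cache=key_cache)

# inside an expensive method such as rotate, the hyperparameter-independent
# part would be computed once and reused on later iterations:
#     if "eigendecomposition" not in cache:
#         cache["eigendecomposition"] = torch.linalg.eig(rotation_matrix)
#     eigenvalues, eigenvectors = cache["eigendecomposition"]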

block merging

sd-meh already implements block merging. There just needs to be a path from the new API to the original code.
