segmind / distill-sd

Segmind Distilled diffusion

Home Page: https://discord.gg/p2MdJqZXnb

License: Other

Python 100.00%

distillation inference knowledge-distillation stable-diffusion

distill-sd's Introduction

Installation

pip3 install -r requirements.txt

Code style

Python

We adopt PEP8 as the preferred code style.

We use the following tools for linting and formatting:

flake8: linter
yapf: formatter
isort: sort imports

Style configurations of yapf and isort can be found in setup.cfg.

We use a pre-commit hook that runs flake8, yapf, and isort, removes trailing whitespace, fixes end-of-file newlines, and sorts requirements.txt automatically on every commit. The configuration for the pre-commit hook is stored in .pre-commit-config.

After you clone the repository, you will need to install and initialize the pre-commit hook.

pip install -U pre-commit

From the repository folder

pre-commit install

After this, the linters and formatter will be enforced on every commit.

Run locally

cd .

pip install virtualenv

virtualenv venv

source ./venv/bin/activate

pip install -r requirements.txt

pip install -e .

distill-sd's Issues

ControlNet?

Hi, will you be training ControlNets with this approach? When I first tried ControlNet with SD 1.5 many months ago, I found that you get much more interesting outputs when you guide the composition than when you let SD handle it from the raw prompt. In the same spirit, I think this could improve the performance of small models by allowing them to focus on low-frequency details.

Getting broadcast error when trying to run distill_training.py

When I try to run the accelerate training script, I get a broadcast error.

The log is as follows:

08/19/2023 05:24:35 - INFO - __main__ - ***** Running training *****
08/19/2023 05:24:35 - INFO - __main__ -   Num examples = 20072
08/19/2023 05:24:35 - INFO - __main__ -   Num Epochs = 12
08/19/2023 05:24:35 - INFO - __main__ -   Instantaneous batch size per device = 1
08/19/2023 05:24:35 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 16
08/19/2023 05:24:35 - INFO - __main__ -   Gradient Accumulation steps = 4
08/19/2023 05:24:35 - INFO - __main__ -   Total optimization steps = 15000
Steps:   0%|                                                                          | 0/15000 [00:01<?, ?it/s, lr=1e-5, step_loss=44.7]
Traceback (most recent call last):
  File "/home/shkulkarni/segmind/distill-sd/distill_training.py", line 1208, in <module>
    main()
  File "/home/shkulkarni/segmind/distill-sd/distill_training.py", line 1100, in main
    ema_unet.step(unet.parameters())
  File "/opt/conda/envs/segmind/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/segmind/lib/python3.10/site-packages/diffusers/training_utils.py", line 194, in step
    s_param.sub_(one_minus_decay * (s_param - param))
RuntimeError: output with shape [320, 320, 1, 1] doesn't match the broadcast shape [320, 320, 3, 3]

@Gothos I suspect this might be related to the LoRA version or its extensions. The error seems to originate from the line ema_unet.step(unet.parameters()). Could you check this?

Thanks in advance.
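
The shape mismatch in the traceback suggests that the EMA copy and the UNet being trained do not hold the same parameter list, for example if the EMA model was built from the full architecture while the student has blocks removed. That is a guess, not a confirmed diagnosis. A minimal sketch for locating where the two lists diverge, assuming you collect the shapes yourself (find_shape_mismatches is a hypothetical helper, not part of diffusers):

```python
from itertools import zip_longest


def find_shape_mismatches(ema_shapes, model_shapes):
    """Return (index, ema_shape, model_shape) for every pair that differs.

    Both arguments are lists of shape tuples in parameter order. An EMA
    update pairs parameters by position, so the first mismatch shows where
    the two lists drift apart. A missing entry is reported as None.
    """
    mismatches = []
    for i, (ema, model) in enumerate(zip_longest(ema_shapes, model_shapes)):
        if ema != model:
            mismatches.append((i, ema, model))
    return mismatches


# With PyTorch, the inputs could be collected like this (assumption: the EMA
# object exposes its copies as `shadow_params`):
#   ema_shapes = [tuple(p.shape) for p in ema_unet.shadow_params]
#   model_shapes = [tuple(p.shape) for p in unet.parameters()]
# The shapes below are illustrative; only the last pair is taken from the log.
ema_shapes = [(320, 4, 3, 3), (320,), (320, 320, 1, 1)]
model_shapes = [(320, 4, 3, 3), (320,), (320, 320, 3, 3)]
print(find_shape_mismatches(ema_shapes, model_shapes))
```

If the first mismatch appears early and every later pair is also off, the two models most likely have different block layouts rather than a single bad layer.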

Thank You, and Typo Correction Request for Hugging Face Blog

Hi!

I’m the author of BK-SDM, and I would like to express my appreciation for utilizing our work and releasing the code and models. I believe your research definitely contributes to the community of small generative models. We have also mentioned your work in our GitHub repository :)

I have just noticed the blog post, https://huggingface.co/blog/sd_distillation, written by Yatharth Gupta @Warlord-K.
There is a typo in the sentence below; could you please revise it?

Original: Image taken from the paper “On Architectural Compression of Text-to-Image Diffusion Models” by Shinkook. et. al

Please revise the authorship mention (by Shinkook. et. al) to one of the following:

(1) by Nota AI
(2) by Kim et al. (Nota AI)

We prefer (1), but (2) would also be appreciated. Thank you for considering this request.

How much gpu ram needed to train and inference?

How much GPU RAM is needed for training and inference? Could you add a comparison between SDXL and distill-sd to the README?
Many developers' GPUs are limited to about 12 GB or 16 GB of RAM.

multi-gpu training

Hi, I'm impressed by your work.

Does distill_training.py support multi-GPU training?

Can this be used for SDXL?

Can this be used for SDXL, which is a much larger and more VRAM-intensive model?

distill_training error

Hi,

When I ran distill_training.py following your command, I encountered this error:

Traceback (most recent call last):
  File "/home/user/sdm-kd/distill_training.py", line 1199, in <module>
    main()
  File "/home/user/sdm-kd/distill_training.py", line 999, in main
    cast_hook(unet,KD_student,args.distill_level,False)
  File "/home/user/sdm-kd/distill_training.py", line 992, in cast_hook
    unet.down_blocks[i].register_forward_hook(getActivation(dicts,'d'+str(i),True))
  File "/home/user/anaconda3/envs/uvit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1265, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DistributedDataParallel' object has no attribute 'down_blocks'

Could you check this?

Thanks in advance.
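
The AttributeError arises because DistributedDataParallel wraps the network and keeps the original model in its .module attribute, so down_blocks has to be reached through the wrapped model. A minimal sketch of the unwrapping step, using stand-in classes so it runs without PyTorch (unwrap, FakeUNet, and FakeDDP are hypothetical names; with accelerate, Accelerator.unwrap_model serves the same purpose):

```python
def unwrap(model):
    """Return the wrapped module if `model` is a wrapper, else `model` itself.

    torch.nn.parallel.DistributedDataParallel keeps the original network in
    its `.module` attribute; a plain model normally has no such attribute.
    """
    return getattr(model, "module", model)


# Stand-in classes so the sketch runs without PyTorch (hypothetical names).
class FakeUNet:
    def __init__(self):
        self.down_blocks = ["block0", "block1", "block2"]


class FakeDDP:
    def __init__(self, module):
        self.module = module


wrapped = FakeDDP(FakeUNet())
plain = FakeUNet()

print(hasattr(wrapped, "down_blocks"))   # the wrapper itself has no down_blocks
print(len(unwrap(wrapped).down_blocks))  # the inner model does
print(unwrap(plain) is plain)            # unwrapped models pass through unchanged
```

In the training script this would mean registering the hooks on unwrap(unet).down_blocks[i] rather than on unet.down_blocks[i]; whether that is the fix the authors intend is an assumption.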

Can I use refine and lora?

Can I use the LoRA from minimaxir, for example like this?

pipe.load_lora_weights("minimaxir/sdxl-wrong-lora")

Discord link is expired

Just that really, I'd love to join, can it be updated please?

Running this in automatic1111?

Is there any possibility of getting this to run in automatic1111 or ComfyUI?

poor result

I use the model from https://huggingface.co/segmind/small-sd, but most of the generated results are poor.
What is wrong with my usage?

import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "segmind/small-sd",
    torch_dtype=torch.float16)
pipeline.to("cuda")
# pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=False)

pipeline("A man staring ahead at the camera with a neutral expression",
         num_inference_steps=20,
         guidance_scale=7.5).images[0].save("my_image.png")

And this is the result, which does not look natural at all.

distillation on img2img or inpainting model

Hi, thank you for your cool work!

Do you have any plans to implement distillation for inpainting models?

Encountered a very confusing issue while resuming from checkpoint.

When I resumed training from a checkpoint, I encountered an error with load_state_dict() indicating that the loaded checkpoint is incompatible with the current UNet model structure, causing the loading process to fail. However, I am certain that the saved checkpoint is sd_small, and the UNet model used for resuming training is also sd_small, so this is really a very confusing issue.

The log is as follows:
Traceback (most recent call last):
  File "distill_training.py", line 1140, in <module>
    main()
  File "distill_training.py", line 899, in main
    accelerator.load_state(os.path.join(args.output_dir, path))
  File "/opt/anaconda3/lib/python3.7/site-packages/accelerate/accelerator.py", line 2347, in load_state
    hook(models, input_dir)
  File "distill_training.py", line 656, in load_model_hook
    model.load_state_dict(load_model.state_dict())
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1498, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel:
Unexpected key(s) in state_dict: "down_blocks.0.attentions.1.norm.weight", "down_blocks.0.attentions.1.norm.bias",

Could you check this?

Thanks in advance.
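
"Unexpected key(s)" means the state dict being loaded contains parameter names that the receiving model does not define, so the two models differ in structure even if both are expected to be sd_small. A minimal sketch for comparing the two key sets before calling load_state_dict (diff_state_dict_keys is a hypothetical helper; the model_keys list is illustrative, only the checkpoint keys come from the log):

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Compare parameter names from a checkpoint with those a model defines.

    "unexpected" are names only the checkpoint has; "missing" are names only
    the model has. Both lists are sorted for stable output.
    """
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    return {
        "unexpected": sorted(checkpoint_keys - model_keys),
        "missing": sorted(model_keys - checkpoint_keys),
    }


# With PyTorch the inputs would be `loaded.state_dict().keys()` and
# `model.state_dict().keys()`.
checkpoint_keys = [
    "down_blocks.0.attentions.0.norm.weight",
    "down_blocks.0.attentions.1.norm.weight",
    "down_blocks.0.attentions.1.norm.bias",
]
model_keys = ["down_blocks.0.attentions.0.norm.weight"]

report = diff_state_dict_keys(checkpoint_keys, model_keys)
print(report["unexpected"])
print(report["missing"])
```

If the unexpected keys all belong to blocks that the distilled UNet is supposed to drop, the checkpoint is probably being read into a model built with the full architecture, or the reverse; this is an assumption to verify against the script's load hook.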

Hi, any plan on stablediffusion XL version?

It would be much better if there were a distilled XL version.

How to infer using the trained model?

Hi, thanks for your great work! I want to test the model trained with distill_training.py. The inference code is as follows.

import torch
from diffusers import DiffusionPipeline
from diffusers import DPMSolverMultistepScheduler
from torch import Generator


path = "sd-laion-art/"
# Insert your prompt below.
prompt = "Faceshot Portrait of pretty young (18-year-old) Caucasian wearing a high neck sweater, (masterpiece, extremely detailed skin, photorealistic, heavy shadow, dramatic and cinematic lighting, key light, fill light), sharp focus, BREAK epicrealism"
# Insert negative prompt below. We recommend using this negative prompt for best results.
negative_prompt = "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck" 

torch.set_grad_enabled(False)
torch.backends.cudnn.benchmark = True

# Below code will run on gpu, please pass cpu everywhere as the device and set 'dtype' to torch.float32 for cpu inference.
with torch.inference_mode():
    gen = Generator("cuda")
    gen.manual_seed(1674753452)
    pipe = DiffusionPipeline.from_pretrained(path, torch_dtype=torch.float16, safety_checker=None, requires_safety_checker=False)
    pipe.to('cuda')
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe.unet.to(device='cuda', dtype=torch.float16, memory_format=torch.channels_last)

    img = pipe(prompt=prompt,negative_prompt=negative_prompt, width=512, height=512, num_inference_steps=25, guidance_scale = 7, num_images_per_prompt=1, generator = gen).images[0]
    img.save("image.png")

However, the following error occurs:

ValueError: Cannot load <class 'diffusers.models.unet_2d_condition.UNet2DConditionModel'> from sd-laion-art/unet because the following keys are missing: 
 down_blocks.2.resnets.1.conv2.bias, up_blocks.2.resnets.2.norm2.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight, down_blocks.t...

Could you please tell me how to run inference with the model trained by distill_training.py? Thanks!
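
Missing keys on load mean that the model built from the config in sd-laion-art/unet expects more layers than the saved weights contain. One possible explanation, offered as an assumption, is that the distilled weights were saved next to a config describing the full UNet. Grouping the missing names by block shows which parts of the architecture the weights lack. A minimal sketch using the complete key names quoted in the error (count_keys_by_block is a hypothetical helper):

```python
from collections import Counter


def count_keys_by_block(keys):
    """Count parameter names by their first two dotted components."""
    return Counter(".".join(key.split(".")[:2]) for key in keys)


# The complete key names from the error message; the truncated one is omitted.
missing = [
    "down_blocks.2.resnets.1.conv2.bias",
    "up_blocks.2.resnets.2.norm2.weight",
    "up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias",
    "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight",
    "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight",
]

for block, count in sorted(count_keys_by_block(missing).items()):
    print(block, count)
```

Run on the full list of missing keys, this gives a quick picture of whether whole blocks are absent, which would point to a config and weights mismatch rather than a corrupted file.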
