
layout-guidance's People

Contributors

silent-chen

layout-guidance's Issues

subject-driven issue

Dear author,

I have discovered an issue when attempting to generate a specific subject with a model fine-tuned via DreamBooth. The variable "attn_num_head_channels" is assigned a list, which results in the error "TypeError: unsupported operand type(s) for //: 'int' and 'list'". Could you please provide guidance on a potential solution?

Thank you.
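For anyone hitting the same error: in newer diffusers configs the head-count field can be a per-block list rather than a single int, which breaks expressions like `dim // attn_num_head_channels`. A minimal defensive sketch (the function name and all values below are illustrative assumptions, not code from this repo):

```python
def heads_for_block(attn_num_head_channels, block_index):
    """Return the head count for one UNet block, accepting int or list.

    Newer configs may store a per-block list (e.g. [8, 8, 8, 8]) while
    older ones store a single int; indexing the list restores the
    int // int division the original code expects.
    """
    if isinstance(attn_num_head_channels, (list, tuple)):
        return attn_num_head_channels[block_index]
    return attn_num_head_channels

# Illustrative values, not taken from any real checkpoint:
dim = 320
heads = heads_for_block([8, 8, 8, 8], 0)
print(dim // heads)  # 40
```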

Diffusers based Training-Free Layout Control with Cross-Attention Guidance

Hello Minghao Chen,

Thanks for your amazing work. After reviewing your project, I have added a diffusers-based pipeline with the aim of expanding its accessibility to a wider community.

I would greatly appreciate your review and feedback on the changes I have made. Your insight and expertise would be invaluable in ensuring that the project remains of the highest quality.

You can find the modified work at the following link:

https://github.com/nipunjindal/diffusers-layout-guidance

Setting Problems

Thank you for the exciting work!

I tried to run the demo following the instructions, but the models (CLIP, UNet, etc.) cannot be downloaded automatically in my environment. Could you share the exact settings for these models so that I can download them manually?

I'm looking forward to hearing back from you.
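In case it helps others with the same problem, a hedged sketch of loading the usual Stable Diffusion components manually with `from_pretrained` (the checkpoint id is an assumption; this repo may pin a different one):

```python
def load_sd_components(model_id="runwayml/stable-diffusion-v1-5"):
    """Download/load the tokenizer, text encoder, VAE, and UNet.

    Imports are local so this file can be read without the libraries
    installed; the default checkpoint id is an assumption.
    """
    from transformers import CLIPTextModel, CLIPTokenizer
    from diffusers import AutoencoderKL, UNet2DConditionModel

    tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
    text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
    vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
    unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
    return tokenizer, text_encoder, vae, unet
```

Once downloaded (or placed in a local folder), pointing `model_id` at that folder should let the demo run offline.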

Out-of-memory issue

Dear authors,

Thanks for your exciting work. I'm trying to run inference.py to see some generated images, but I get an OOM error even though I'm using an Nvidia 2080 Ti. How much GPU memory is appropriate for inference? Is there anything I can adjust to decrease the memory demand, such as changing the size of the generated images?
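As a rough, back-of-envelope answer to the image-size question (all constants below are typical Stable Diffusion values, assumed rather than measured from this repo): the cross-attention maps that the guidance inspects scale with the number of latent positions, so halving the image side cuts that memory by about 4x.

```python
def cross_attn_map_floats(image_side, heads=8, text_tokens=77, batch=2):
    """Rough float count of one cross-attention map at a given image size.

    Assumptions: the VAE downsamples by 8, the batch is doubled for
    classifier-free guidance, and each latent position attends to 77
    text tokens.
    """
    latent_side = image_side // 8
    spatial = latent_side * latent_side  # number of query positions
    return batch * heads * spatial * text_tokens

print(cross_attn_map_floats(512) / cross_attn_map_floats(256))  # 4.0
```

So generating at 256x256 instead of 512x512, or running the models in half precision, are the usual first things to try on an 11 GB card.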

Complete VISOR results

Hi @silent-chen , thanks for the great work!
Would you also be able to share the complete results of your method on the VISOR metric? Currently the paper reports OA and VISOR (Unconditional and Conditional), but it does not mention VISOR-{1,2,3,4}.
Thanks again!

Real Image Editing

Dear author,

First, thank you for your awesome work!
I wonder whether there is any plan for this code to support real image editing.
If not, could you share the implementation details of real image editing with layout guidance?

Thank you!

A little question about the compute_ca_loss

Hi, thanks for the nice work!

When computing the backward guidance loss, I noticed that your code uses only the second half of the batch of attention maps via

attn_map = attn_map_integrated.chunk(2)[1]

def compute_ca_loss(attn_maps_mid, attn_maps_up, bboxes, object_positions):
    loss = 0
    object_number = len(bboxes)
    if object_number == 0:
        return torch.tensor(0).float().cuda() if torch.cuda.is_available() else torch.tensor(0).float()
    for attn_map_integrated in attn_maps_mid:
        attn_map = attn_map_integrated.chunk(2)[1]   # why chunk here?
...

I am a little bit confused, could you please give me some hint? thank you so much!
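For context (this is my reading, not an authoritative answer): with classifier-free guidance the UNet runs on a doubled batch of [unconditional, conditional] inputs, so the stored attention maps come back stacked the same way, and `chunk(2)[1]` selects the conditional half — the one that actually attends to the prompt tokens. A minimal pure-Python analogue of that slicing:

```python
def chunk(seq, n):
    """Minimal analogue of torch.Tensor.chunk along the batch dim."""
    size = (len(seq) + n - 1) // n
    return [seq[i * size:(i + 1) * size] for i in range(n)]

# The doubled classifier-free-guidance batch: first half unconditional,
# second half conditional (placeholder strings stand in for tensors).
attn_maps = ["uncond_map_0", "cond_map_0"]
print(chunk(attn_maps, 2)[1])  # ['cond_map_0']
```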

Variable Issue

Dear author,

Hello. I encountered an issue while running your code. The variable "attn_num_head_channels" is set as a list, which results in the error message "TypeError: unsupported operand type(s) for //: 'int' and 'list'".

about forward guidance

I have read the code, but I couldn't find where the forward guidance is implemented. Could you please point out where it is? Thanks a lot!

Word Drop

Hi @silent-chen , thanks for your brilliant work!

I'm interested in Word Drop. As you mentioned in the paper, images generated with padding (i.e., non-word) token embeddings closely follow both the semantics and the layout of the image generated from the full text prompt.

Would you also be able to share the code of Word Drop? I can't wait to know more details about it. 😊

Thanks again!
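While waiting for the official code, here is a minimal sketch of how I read the word-drop idea from the paper: replace the embedding of the dropped word with the padding-token embedding before running the model. Every name below is an assumption for illustration, not the authors' implementation.

```python
def word_drop(token_embeddings, drop_indices, pad_embedding):
    """Replace selected token embeddings with the padding embedding."""
    out = list(token_embeddings)
    for i in drop_indices:
        out[i] = pad_embedding
    return out

# Strings stand in for embedding vectors, purely for illustration:
tokens = ["<bos>", "a", "cat", "on", "grass", "<pad>"]
print(word_drop(tokens, [2], "<pad>"))
# ['<bos>', 'a', '<pad>', 'on', 'grass', '<pad>']
```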
