multi-subject-render

Generate multiple complex subjects all at once!

Made as a script for the AUTOMATIC1111/stable-diffusion-webui repository.

00165-603508287-DDIM-64-7 5-ac07d41f-20221122154627

Miaouuuuuuuuu!

Jump to examples!

💥 Installation 💥

Copy the URL of this repository into the Extensions tab:

image

OR copy this repository into your extensions folder:

image

You might need to restart the whole UI. Maybe twice.

The look

image

OK I know that's a big screenshot

How the hell does this work?

First it creates your background image, then your foreground subjects. Then it runs a depth analysis on the foregrounds, cuts out their backgrounds, pastes them onto your background and does an img2img pass for a smooth blend!

It will cut around that lady with scissors made of code.

image
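The cut-and-paste part of that pipeline can be sketched roughly like this. This is not the extension's actual code. It assumes a depth map already normalized to 0-255 where higher values mean closer to the camera, it uses plain nested lists instead of real images, and `cut_by_depth` and `paste` are hypothetical names.

```python
def cut_by_depth(depth, threshold):
    """Build an alpha mask from a depth map (rows of 0-255 values).

    Assumption: higher depth values are closer to the camera.
    Pixels strictly nearer than `threshold` stay opaque (255),
    everything else becomes transparent (0).
    """
    return [[255 if d > threshold else 0 for d in row] for row in depth]


def paste(background, foreground, alpha, left, top):
    """Copy the opaque foreground pixels onto a copy of the background."""
    out = [row[:] for row in background]
    for y, row in enumerate(foreground):
        for x, pixel in enumerate(row):
            by, bx = top + y, left + x
            inside = 0 <= by < len(out) and 0 <= bx < len(out[0])
            if alpha[y][x] and inside:
                out[by][bx] = pixel
    return out


# Tiny 2x2 "foreground": left column is far away, right column is the subject.
depth = [[10, 200], [30, 220]]
foreground = [["x", "f"], ["x", "f"]]
background = [["b", "b", "b"], ["b", "b", "b"]]

alpha = cut_by_depth(depth, 95)
collage = paste(background, foreground, alpha, left=1, top=0)
```

Here only the right column of the foreground ends up on the background. The img2img pass that follows is what hides the hard edges left by this kind of cut.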

Explanations of the different UI elements

I will only explain the not-so-obvious things because I have already spent enough time making this thing.

  • First off, your usual UI will be for your initial background. So your normal prompt will probably be something like "a beach, sunny sky" etc.

For my example I decided to generate a bowling alley at 512x512 pixels :

00158-2629831387-Euler a-22-7 5-ac07d41f-1233221312123132

  • Your foreground subjects will be described in that text box.
  • You can use wildcards.
  • If you only use the first line, that line will be used for every foreground subject that will be generated.
  • If you use multiple lines, each line will be used for its own foreground subject.
  • The negative prompt is carried over to everything.

image

Note: if you use multiple lines, you will need as many lines as foreground images generated.

For my example I made three penguins:

image
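The rule for matching prompt lines to foreground subjects could look like this in code. It is a minimal sketch, not the script itself: `foreground_prompts` is a made-up helper, and raising an error when there are too few lines is an assumption (the script may behave differently).

```python
def foreground_prompts(text, count):
    """Return one prompt per foreground subject.

    One line: reuse it for every subject.
    Several lines: one line per subject, so at least `count` lines are needed.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("at least one foreground prompt line is required")
    if len(lines) == 1:
        return lines * count
    if len(lines) < count:
        # Assumption: too few lines is treated as an error in this sketch.
        raise ValueError(f"need {count} prompt lines, got {len(lines)}")
    return lines[:count]


same = foreground_prompts("a penguin", 3)
different = foreground_prompts("a penguin\na kitten\na robot", 3)
```

`same` holds the single line three times, while `different` keeps one line per subject in the order they were written.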

  • That's how much the seed will be incremented at each image. If you set it to 0 you will get the same foregrounds every time. Probably not what you want unless you use the Extra option in your main UI and "Variation strength".

image
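As a quick illustration of the seed increment, here is a sketch under the assumption that the first foreground starts from some base seed and each following one adds the increment. `foreground_seeds` and its parameters are hypothetical names, not the script's own.

```python
def foreground_seeds(first_seed, count, increment):
    """Seed for each foreground subject, stepping by `increment` each time."""
    return [first_seed + i * increment for i in range(count)]


identical = foreground_seeds(100, 3, 0)  # increment 0: same seed every time
stepped = foreground_seeds(100, 3, 5)    # increment 5: a new seed per subject
```

With an increment of 0 every foreground gets the same seed, which is why they all come out identical unless something else (such as the variation strength) changes between them.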

  • You can use a different sampler for the foregrounds, as well as a different CLIP value.

image

  • The final blend is there to either make a smooth pass over your collage or to make something more complex / add details to your combination.
  • You can use different settings and samplers for your final blend. Do as you wish. The CLIP value will be the one you've set in your settings tab, not the one for the foregrounds. So you can decide if you prefer one way or the other.

00162-2629838387-Euler a-92-7 5-ac07d41f-20221124054727

They are not really bowling because you need fingers for that. They're just here for trouble.

  • An important part is setting the final blend width. Your initial background will be stretched to that size, so you don't really need to make it big initially. Your foreground subjects will be pasted onto your stretched background before the final blend. If it's not wide enough, you will end up with too many characters in the same spot.

image

The scary miscellaneous options :

  • The foreground distance from center multiplier brings your characters closer together with a lower value and spreads them further apart with a higher one. I usually stay between 0.8 and 1.0.

  • Foreground Y shift: the center character will always be at the same height. Then you multiply the value of that slider by the position of the foreground subject from the center. That gives you how many pixels lower they will be. Think about some superhero movie poster with the sidekicks slightly lower. That's what this slider does.

  • Foreground depth cut threshold is the scary one. At 0 the backgrounds of your foreground subjects will be opaque. At 255 the entire foreground will be transparent. The best values are between 50 and 60 for cartoon-like characters and between 90 and 100 for photorealistic subjects. Too much and they lose their heads, not enough and you get some rock that they were sitting on in your final blend.

  • Random superposition: the default is to have the center character in front. If you enable this option, that might not be the case anymore. That's a cool option depending on what you want to do.

  • The center character will be behind the others. If you use the previous option this one becomes useless.

  • Face correction is only for the final blend. If you want it on every foreground subject, set it in your main UI. I think it's best to enable both if you make photorealistic stuff.

image
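The placement options above can be sketched as follows. Only the Y shift rule (slider value times the position from the center) comes straight from the description. The horizontal spacing formula, the way even counts are centered, and the names `layout` and `paste_order` are assumptions made for illustration, so the real script may compute things differently.

```python
import random


def layout(count, blend_width, fg_width, distance_multiplier=0.9, y_shift=0):
    """Return (left, drop) in pixels for each foreground subject.

    Assumption: subjects are spaced by fg_width * distance_multiplier
    around the horizontal center of the stretched background.
    """
    center = (count - 1) / 2
    positions = []
    for i in range(count):
        offset = i - center  # signed position from the center subject
        x_center = blend_width / 2 + offset * fg_width * distance_multiplier
        left = round(x_center - fg_width / 2)
        drop = round(abs(offset) * y_shift)  # pixels lower than the center
        positions.append((left, drop))
    return positions


def paste_order(count, center_behind=False, random_order=False, rng=random):
    """Order in which subjects are pasted; the last one ends up in front."""
    center = (count - 1) / 2
    # Default: outermost subjects first, so the center is pasted last.
    order = sorted(range(count), key=lambda i: abs(i - center), reverse=True)
    if random_order:
        order = list(range(count))
        rng.shuffle(order)
    elif center_behind:
        order.reverse()
    return order


spots = layout(3, blend_width=1024, fg_width=256,
               distance_multiplier=1.0, y_shift=20)
# spots == [(128, 20), (384, 0), (640, 20)]
```

In this sketch a lower multiplier shrinks the spacing, so with a blend width that is too small the subjects overlap in the same spot. As in the description, random order takes priority over the center-behind option.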

Tips and tricks :

  • Using (bokeh) and (F1.8:1.2) will make blurry backgrounds, which makes it easier for the depth analysis to do a clean cut of the backgrounds.
  • "wide angle" in your prompt will give you a better chance of getting characters that aren't cropped.
  • "skin details" or "detailed skin" raises the chances of having close-ups. I prefer to avoid.
  • Not enough denoising/steps on your final blend will make it look like you used scissors on your mom's Vogue catalogue and pasted the ladies onto your dad's favorite Lord of the Rings poster. Don't do that.
  • Too much denoising/steps might make the characters all look the same. It's all about finding the right middle value for your needs.
  • Making your foreground subjects have less height than the final image might make them look cropped.
  • You can now use the "Mask foregrounds in blend" checkbox to get something that looks more like a collage, then use this in img2img with my other extension if you want your foreground subjects to be less altered by the img2img blend.

Known issues

  • Only the final blend is rendered to the UI. To get the other images you have to save them (in the settings, just don't uncheck that "save all images" checkbox and you're good).
  • "List index out of range" might be barfed into your terminal if you interrupt a generation. I missed a state interrupt somewhere. It does not cause any problem by itself.
  • Do not use the "high res fix" checkbox. The blend size slider at the end will trigger it if you use a higher resolution than your background image. So keep your normal UI size sliders near 512*512.
  • There can be bugs.
  • AttributeError: 'StableDiffusionProcessingTxt2Img' object has no attribute 'sampler_name': you need to update your webui (run "git pull" from a command line in your webui folder).
  • Do check the other issues before opening a new one and try to give as many details as possible.
  • Use the discussion tab if it's not about a bug.
  • Make sure to run the latest webui version by doing "git pull" from your webui folder!

Credits

Thanks to thygate for letting me blatantly copy-paste some of his functions for the depth analysis integration in the webui.

This repository runs with MiDaS.

@ARTICLE {Ranftl2022,
    author  = "Ren\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun",
    title   = "Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer",
    journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
    year    = "2022",
    volume  = "44",
    number  = "3"
}
@article{Ranftl2021,
	author    = {Ren\'{e} Ranftl and Alexey Bochkovskiy and Vladlen Koltun},
	title     = {Vision Transformers for Dense Prediction},
	journal   = {ICCV},
	year      = {2021},
}

A few more examples

An attempt at recreating the "Distracted boyfriend" meme. Without influencing the directions in which they are looking. 100% txt2img.

00241-2439212203-Euler a-100-7 5-ac07d41f-20221124151538 00287-2439212203-Euler a-100-7 5-ac07d41f-20221124151832 00123-60606195-DDIM-74-7 5-ac07d41f-20221124144302 00133-1894928239-DDIM-74-7 5-ac07d41f-20221124144525

I messed up the order on the last one.

00129-603508287-DDIM-64-7 5-ac07d41f-20221122153921

Aren't they cute without oxygen?

00051-3908280031-DPM++ 2M-74-7 5-ac07d41f-20221122145842

Of course you can make a harem just for yourself.

00165-603508287-DDIM-64-7 5-ac07d41f-20221122154627

MOAR KITTENS

Now a few more groups of "super heroes" from the same batch as the first image here. Except maybe for the portraits.

00290-1347027509-DDIM-69-7 5-579c005f-20221123193425

Wrong settings examples

00145-2998285171-DDIM-92-7 5-ac07d41f-20221124054225

This is what too low denoising on the final blend looks like. Yuk!

00254-1268283421-Euler a-68-7 5-ac07d41f-20221124060832

Same issue here. Looks like a funny kid collage. Grandma will love it because you typed your prompts with love and she knows it.

Contributors

extraltodeus, kamilowaty122, martok88
