

ComfyUI jank HiDiffusion

Janky experimental implementation of HiDiffusion for ComfyUI.

See the changelog for recent user-visible changes.

Description

See the official HiDiffusion repository (linked in the credits below) for a proper description. The following is just my understanding and may or may not be correct:

As far as I understand it, the RAU-Net part is essentially Kohya Deep Shrink (AKA PatchModelAddDownscale): the idea is to scale down the image at the start of generation so the model can establish major details (like how many legs a character has), then let it refine and add detail once the scaling effect ends. The main difference is the downscale method: RAU-Net uses strided/dilated convolution and average pooling, while Deep Shrink usually uses bicubic downscaling. Where the scaling occurs may also be important. It does seem to work noticeably better than Deep Shrink, at least for SD 1.5.
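
To illustrate the difference in downscale approach, here's a rough torch sketch (purely illustrative, not the actual node code) of the three methods mentioned above:

    import torch
    import torch.nn.functional as F

    latent = torch.randn(1, 4, 128, 128)  # batch, channels, height, width

    # Deep Shrink style: bicubic interpolation down to half size.
    deep_shrink = F.interpolate(latent, scale_factor=0.5, mode="bicubic")

    # Average pooling, similar to what the RAU-Net downsampler patch relies on.
    pooled = F.avg_pool2d(latent, kernel_size=2, stride=2)

    # Strided convolution: the block's existing conv is applied with a larger
    # stride so the downscale happens inside the model itself.
    conv = torch.nn.Conv2d(4, 4, kernel_size=3, stride=2, padding=1)
    strided = conv(latent)

    print(deep_shrink.shape, pooled.shape, strided.shape)  # all (1, 4, 64, 64)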

I'm not sure how to describe MSW-MSA attention. It seems like a big performance boost for SD 1.5 at high resolutions and also appears to improve quality. Note that it does not enable high-res generation by itself.

Caveats

I'm not an expert on diffusion stuff and I'm not sure I fully understood the HiDiffusion code, so my implementation may or may not work correctly. If you experience issues, please don't blame HiDiffusion unless you can also reproduce the problem with their implementation.

I mainly use SD 1.5 models, and these nodes generally work well with SD 1.5. I've found SDXL doesn't tolerate these Deep Shrink-type effects as well as SD 1.5, and it is also less tested. If you're using SDXL, your mileage may vary.

Important: The advanced node default values are for SD 1.5 and won't work well for other models (like SDXL). This stuff probably doesn't work at all for more exotic models like Cascade.

  • Not all aspect ratios work with the MSW-MSA attention node (this may also be true of the original implementation). Try to use resolutions that are multiples of 64 or 128.
  • The RAUNet component may not work properly with ControlNet while the scaling effect is active.
  • The MSW-MSA attention node doesn't seem to help performance with SDXL much.
  • I may not have implemented the cross-attention block part correctly. As far as I could tell, it was just patching a normal block, not actual cross-attention (so almost exactly like Deep Shrink). My version is implemented that way and does not use an actual attention patch.
  • Customizable, but not very user-friendly. You get to figure out the blocks to patch!
  • The list of caveats is too long, and it's probably not even complete. Yikes!

Use with ControlNet

First: RAUNet is used to help the model construct major details like how many legs a creature has when working at resolutions above what it was trained on. With ControlNet guidance, you very likely don't need RAUNet and similar effects. Don't use it unless you actually need to.

If you do use RAUNet and ControlNet concurrently, I recommend adjusting the RAUNet parameters to only apply the effect for a short time - the minimum necessary. For example, if you'd normally use an end time of 0.5 and a CA end time of 0.3, then with ControlNet you may want to use an end time of 0.3 and just disable the CA effect entirely (or apply it very briefly, something like CA end time 0.15).

I now try to apply a workaround to scale the ControlNet conditioning when the RAUNet effect is active. This is probably better than nothing but likely still incorrect. When it's working, you'll see messages like this in your log:

* jankhidiffusion: Scaling controlnet conditioning: torch.Size([24, 24]) -> torch.Size([12, 12])
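
Roughly speaking, the workaround resizes the ControlNet residual to match the downscaled latent, something like this simplified sketch (dummy tensors; the real code in raunet.py may use different interpolation options):

    import torch
    import torch.nn.functional as F

    ctrl = torch.randn(2, 1280, 24, 24)  # hypothetical ControlNet residual
    h = torch.randn(2, 1280, 12, 12)     # hypothetical downscaled hidden state

    # Resize the ControlNet output to the spatial size the model actually has.
    ctrl = F.interpolate(ctrl, size=h.shape[-2:], mode="bicubic")
    print(ctrl.shape)  # torch.Size([2, 1280, 12, 12])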

If you find the workaround is causing issues, set the environment variable JANKHIDIFFUSION_DISABLE_CONTROLNET_WORKAROUND to any value.

Ancestral samplers seem to work a lot better than the non-ancestral ones when using RAUNet and ControlNet simultaneously. I recommend using the ancestral version if possible.

As for MSW-MSA attention, it seems fine with ControlNet and no special handling is required. Enable it or not according to your preference.

Simple Nodes

First: I strongly recommend at least skimming the Use case and Compatibility note sections of the advanced nodes so you know when to use them and what problems to avoid. That information won't be repeated here. Also, I still haven't found the best combination of settings, so the preset parameters for these nodes will likely change in the future.

When the nodes activate, they will output some information to your log like this:

** ApplyRAUNetSimple: Using preset SD15 high:
  upscale bicubic,
  in/out blocks [3 / 8],
  start/end percent 0.0/0.5  |
  CA upscale bicubic,
  CA in/out blocks [1 / 11],
  CA start/end percent 0.0/0.35

** ApplyMSWMSAAttentionSimple: Using preset SD15:
  in/mid/out blocks [1,2 /  / 11,10,9],
  start/end percent 0.2/1.0

(Example split into multiple lines for readability.)

If you want reproducible generations, take note of those settings: you can enter them into the advanced nodes.

ApplyMSWMSAAttentionSimple

Simplified version of the MSW-MSA attention node. Use the SD15 setting for SD 2.1 as well.

ApplyRAUNetSimple

Simplified version of the ApplyRAUNet node. All the same caveats apply. Use the SD15 preset for SD 2.1 as well.

Note: This node just chooses a preset, so it's not necessarily important for your resolution to match the res_mode setting.

Advanced Nodes

Common inputs:

Time mode: Sets the range during which the node is active. The default is percent (1.0 is 100%) - note that this is based on sampling percentage completed, not percentage of steps completed. You can also specify times as timesteps or raw sigma values (I recommend sticking with percentages normally). Start and end times should be entered in whichever mode you choose. Note: Time mode only controls the format you use to enter start/end times, it doesn't change the behavior of the nodes at all. If you don't know what timesteps or sigmas are, just use percent.
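
If you're curious, converting a percent into a sigma generally goes through ComfyUI's model sampling object; a minimal sketch (not necessarily this node's exact code):

    # Assumes a ComfyUI ModelPatcher instance; percent_to_sigma is provided by
    # ComfyUI's model sampling helpers.
    def percent_range_to_sigmas(model, start_percent, end_percent):
        ms = model.get_model_object("model_sampling")
        # Sigmas decrease as sampling progresses, so the start sigma is larger.
        return ms.percent_to_sigma(start_percent), ms.percent_to_sigma(end_percent)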

Blocks: A comma-separated list of block numbers. Input blocks are also known as down blocks, and output blocks are also known as up blocks; SD 1.5 and SDXL also have a single middle block. To visualize when blocks are active, imagine a simple model with three input blocks, three output blocks and one middle block. Evaluating the model would look like:

 start       end
   |          ^
   v          |
input 0    output 2
   |          |
input 1    output 1  <- now you know why they call it a u-net.
   |          |
input 2    output 0
   |          |
   \ middle 0 /

This is important because if you downscale input 0, you'd want to reverse the operation in the corresponding block which would not be output 0 (it would be output 2).
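In other words, corresponding blocks are counted from opposite ends. A tiny hypothetical helper (not part of the nodes) expressing that rule:

    def corresponding_output_block(input_block: int, blocks_per_side: int) -> int:
        # The mirroring output block is counted from the other end of the U-Net.
        return blocks_per_side - 1 - input_block

    # The toy model above has 3 blocks per side: input 0 mirrors output 2.
    print(corresponding_output_block(0, 3))   # 2
    # SD 1.5 has 12 blocks per side: input 3 mirrors output 8.
    print(corresponding_output_block(3, 12))  # 8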

ApplyMSWMSAAttention

Use case: Performance improvement for SD 1.5, may improve generation quality at high res for both SD 1.5 and SDXL.

Applies MSW-MSA attention. Note that this probably won't work with other attention modifications like perturbed attention, self-attention guidance, nearsighted attention, etc. It is a performance boost for SD 1.5 and it seems like it may also reduce artifacts, at least at high res (subjective, not scientific opinion). I made a small change compared to the reference implementation: I ensure that the shifts used are different each step.
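
As a mental model, MSW-MSA splits the latent tokens into windows, applies a per-step shift and lets each window attend only to itself, which is where the speedup comes from. A minimal sketch (illustrative only; the node's real implementation differs):

    import torch

    def windowed_attention(x, h, w, attn, shift=(0, 0)):
        # x: (batch, h*w, channels) attention input, attn: ordinary self-attention.
        b, n, c = x.shape
        x = x.reshape(b, h, w, c)
        if shift != (0, 0):
            x = torch.roll(x, shifts=shift, dims=(1, 2))
        wh, ww = h // 2, w // 2
        # Fold a 2x2 grid of windows into the batch dim so attention stays local.
        windows = x.reshape(b, 2, wh, 2, ww, c).permute(0, 1, 3, 2, 4, 5).reshape(b * 4, wh * ww, c)
        out = attn(windows)
        out = out.reshape(b, 2, 2, wh, ww, c).permute(0, 1, 3, 2, 4, 5).reshape(b, h, w, c)
        if shift != (0, 0):
            out = torch.roll(out, shifts=(-shift[0], -shift[1]), dims=(1, 2))
        return out.reshape(b, n, c)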

The default block values are for SD 1.5.

SD 1.5:

Type            Attention Blocks
Input (down)    1, 2, 4, 5, 7, 8
Middle          0
Output (up)     3, 4, 5, 6, 7, 8, 9, 10, 11

Recommended SD 1.5 settings: input 1, 2, output 9, 10, 11.

SDXL:

Type            Attention Blocks
Input (down)    4, 5, 7, 8
Middle          0
Output (up)     0, 1, 2, 3, 4, 5

Recommended SDXL settings: input 4, 5, output 4, 5.

Note: This doesn't seem to help performance much with SDXL. Also at very extreme resolutions (over 2048) you may need to set MSW-MSA attention to start a bit later. Try starting at 0.2 or after other scaling effects end.

Compatibility note: If you run into tensor size mismatch errors, try using image sizes that are multiples of 32, 64 or 128 (you may need to experiment). Known to work with ELLA, FreeU (V2), CFG rescaling effects, SAG and PAG. Likely does not work with HyperTile, Deep Cache, Nearsighted/Slothful attention or other attention patches that affect the same blocks (SAG/PAG normally target the middle block, which is fine).


Input blocks downscale and output blocks upscale, so the biggest effect on performance comes from applying this to input blocks with a low block number and output blocks with a high block number.

ApplyRAUNet

Use case: Helps avoid artifacts when generating at resolutions significantly higher than what the model normally supports. Not beneficial when generating at low resolutions (and actually likely harms quality). In other words, only use it when you have to.

As above, the default block values are for SD 1.5.

CA blocks are (maybe?) cross attention. The blocks you can target are the same as the self-attention blocks listed above.

Non-CA blocks are used to target upsampler and downsampler blocks. When setting an input block, you must use the corresponding output block. For example, if you're using SD 1.5 and you set input 3 then you must set output 8. This also applies when setting CA blocks. SD 1.5 has 12 blocks on each side of the middle block, SDXL has 9.

SD 1.5:

Input (down) Block    Output (up) Block
3                     8
6                     5
9                     2

Recommended SD 1.5 settings:

  1. input 3, output 8, CA input 4, CA output 8, start 0.0, end 0.45, CA start 0.0, CA end 0.3 - I believe this is close to what the official implementation uses.
  2. input 3, output 8, CA input 1, CA output 11, start 0.0, end 0.6, CA start 0.0, CA end 0.35 - Seems to work better than the above for me at least when generating at fairly high resolutions (~2048x2048).

Example workflow: Image with embedded SD1.5 workflow

SDXL:

Input downsample block    Output upsample block
3                         5
6                         2

Recommended SDXL settings: In general I haven't seen amazing results with SDXL. You can try using input 3, output 5 and disabling CA (set the ca_start_time to 1.0) or setting CA input 2, CA output 7 and disabling the upsampler/downsampler patch (set start_time to 1.0). I don't recommend leaving both enabled at the same time, but feel free to experiment. SDXL seems very sensitive to these settings. Also I don't recommend enabling RAUNet at all unless you are generating at a resolution significantly higher than what the model supports. Using an ancestral or SDE sampler seems to work best with SDXL and RAUNet.

Why does setting input 2 correspond with output 7? I actually have no idea, I would have expected it to be 6.

Example workflow: Image with embedded SDXL workflow


For upscale mode, good old bicubic may be best; the second best alternative is probably bislerp. Two-step upscale does half of the upscale with nearest-exact and the remaining half with the upscale method you selected. The difference seems very minor and I am not sure which setting is better.
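
A rough sketch of the two-step idea (illustrative only, not the node's actual code):

    import torch
    import torch.nn.functional as F

    latent = torch.randn(1, 4, 64, 64)
    # First half of the upscale with nearest-exact...
    intermediate = F.interpolate(latent, size=(96, 96), mode="nearest-exact")
    # ...then the rest with the selected method (bicubic here).
    final = F.interpolate(intermediate, size=(128, 128), mode="bicubic")
    print(final.shape)  # torch.Size([1, 4, 128, 128])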

If you have my ComfyUI-bleh nodes active, there will be more upscale options. The random upscale method seems to work pretty well, possibly also random+renoise1 (which adds a small amount of gaussian noise after upscaling).
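
The renoise idea is roughly this (assumed behavior, not ComfyUI-bleh's actual code; the 0.05 strength is made up):

    import torch
    import torch.nn.functional as F

    latent = torch.randn(1, 4, 64, 64)
    upscaled = F.interpolate(latent, scale_factor=2.0, mode="bicubic")
    # Mix a small amount of gaussian noise back in after the upscale.
    upscaled = upscaled + torch.randn_like(upscaled) * 0.05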

This node works well with restart sampling — you may need to manually adjust the restart segments. Generally you don't want to restart back into the scaling effect, rather right after it ends to give the model a chance to clean up artifacts. Using the a1111 preset will probably work best if you don't want to manually set segments.

Compatibility note: Should be compatible with the same effects as MSW-MSA attention. Likely won't work with other scaling effects that target the same blocks (i.e. Deep Shrink). By itself, I think it should be fine with HyperTile and Deep Cache, though I haven't actually tested that. May not work properly with ControlNet.

Credits

Code based on the HiDiffusion original implementation: https://github.com/megvii-research/HiDiffusion

RAUNet backend refactored by pamparamm to avoid the need for monkey patching ComfyUI's Upsample/Downsample blocks.

Thanks!


comfyui_jankhidiffusion's Issues

Your ApplyRAUNet freezes gen on DirectML (probably on MPS too) *Solved below*

It's the same problem that happened on IpAdapter and, I think, on base ComfyUI too:

cubiq/ComfyUI_IPAdapter_plus#109 (comment)

I didn't even need to debug your code; I just went ahead and altered the following lines in raunet.py:

Line 26:
    sigma = sigma.max().item()
to:
    sigma = sigma.max().detach().cpu().numpy()

Lines 263 and 277:
    sigma = extra_options["sigmas"].max().item()
to:
    sigma = extra_options["sigmas"].max().detach().cpu().numpy()

and the problem is fixed.

ApplyRAUNetSimple and ControlNet

So, after the recent updates I'm getting a lengthy error message saying type object 'HDState' has no attribute 'controlnet_scale_args' at the KSampler step if the Apply ControlNet node is on AND ApplyRAUNetSimple is set to "high" or "ultra". If "res_mode" is set to "low" or one of the two nodes is off, everything works fine.
(PS. I'm using SDXL with xinsir/controlnet-union-sdxl-1.0 controlnet model)

The full error message:

Error occurred when executing KSampler:

type object 'HDState' has no attribute 'controlnet_scale_args'

File "E:\AI\ComfyUI\ComfyUI\execution.py", line 316, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 191, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 168, in _map_node_over_list
process_inputs(input_dict, i)
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 157, in process_inputs
results.append(getattr(obj, func)(**inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\nodes.py", line 1429, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\nodes.py", line 1396, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
return original_sample(*args, **kwargs) # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\sampling.py", line 116, in acn_sample
return orig_comfy_sample(model, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\utils.py", line 116, in uncond_multiplier_check_cn_sample
return orig_comfy_sample(model, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\sample.py", line 43, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes\smZNodes.py", line 1447, in KSampler_sample
return _KSampler_sample(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 829, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes\smZNodes.py", line 1470, in sample
return sample(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 729, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 716, in sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 695, in inner_sample
samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 600, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\k_diffusion\sampling.py", line 160, in sample_euler_ancestral
denoised = model(x, sigmas[i] * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 299, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes\smZNodes.py", line 993, in call
return self.predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes\smZNodes.py", line 1043, in predict_noise
out = super().predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 685, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 279, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\samplers.py", line 228, in calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\utils.py", line 68, in apply_model_uncond_cleanup_wrapper
return orig_apply_model(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\model_base.py", line 145, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 868, in forward
h = apply_control(h, control, 'middle')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\custom_nodes\comfyui_jankhidiffusion\py\raunet.py", line 81, in hd_apply_control
ctrl = F.interpolate(ctrl, size=h.shape[-2:], **cls.controlnet_scale_args)
^^^^^^^^^^^^^^^^^^^^^^^^^

How to get decent results with SDXL

Seems to be quite a tricky task, but HiDiffusion eclipsed every solution I found for 1-step generating high-res pictures.
In my experience the ca_start_time and end time make things extremely messy, and it somehow ends up generating a lot of patterns (can't generate a person with a single-color background, or a clean cityscape).
Adding "trash" or busyness to the composition solves this (e.g. prompting for an autumn forest), but I found we can get rid of it by lowering the ca_ values, and I managed to generate some very good backgrounds with the following settings:
(settings screenshot)
I mainly use 2048x2048, or fractions of it; for portraits I would go with 1536x2048. Should be generally compatible with every model - I mainly used DreamShaper Lightning.
Interestingly, people lying down will lead to the same type of body horror as SD3. It also has a problem with prompt following, so only simple prompts and compositions will work.

I went over the original HiDiffusion paper and I'm pretty sure the examples were cherry-picked. I'm also sure my settings can be improved, so please try them and experiment. If someone is up for it, a script could be created that generates all the possible setting combinations and sorts the results by aesthetic score.
Any ideas for improvements?

error with controlNet

I get an error when using it with ControlNet.

\raunet.py", line 60, in hd_forward_timestep_embed
transformer_options = args[1] if args else {}

Error:Cannot import /ComfyUI/custom_nodes/comfyui_jankhidiffusion module for custom nodes: invalid syntax (msw_msa_attention.py, line 92)

os:linux

ComfyUI-Manager: installing dependencies done.

** ComfyUI startup time: 2024-05-10 03:43:36.985964
** Platform: Linux
** Python version: 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21)
[GCC 12.3.0]
** Python executable: /opt/conda/envs/comfyui/bin/python
** Log path: /home/ubuntu/MHF/AIGC/ComfyUI/comfyui.log

Prestartup times for custom nodes:
0.1 seconds: /home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/ComfyUI-Manager

Total VRAM 22516 MB, total RAM 63609 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA A10G : cudaMallocAsync
VAE dtype: torch.bfloat16
Using pytorch cross attention

Loading: ComfyUI-Manager (V2.30)

ComfyUI Revision: 2169 [0fecfd2b] | Released on '2024-05-09'

Traceback (most recent call last):
File "/home/ubuntu/MHF/AIGC/ComfyUI/nodes.py", line 1867, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 850, in exec_module
File "", line 228, in _call_with_frames_removed
File "/home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/comfyui_jankhidiffusion/init.py", line 1, in
from .py import *
File "/home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/comfyui_jankhidiffusion/py/init.py", line 1, in
from .msw_msa_attention import *
File "/home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/comfyui_jankhidiffusion/py/msw_msa_attention.py", line 92
match shift:
^
SyntaxError: invalid syntax

Cannot import /home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/comfyui_jankhidiffusion module for custom nodes: invalid syntax (msw_msa_attention.py, line 92)

Import times for custom nodes:
0.0 seconds: /home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/websocket_image_save.py
0.0 seconds (IMPORT FAILED): /home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/comfyui_jankhidiffusion
0.1 seconds: /home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/ComfyUI-Manager

Starting server

To see the GUI go to: http://127.0.0.1:8188
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
FETCH DATA from: /home/ubuntu/MHF/AIGC/ComfyUI/custom_nodes/ComfyUI-Manager/extension-node-map.json

Confetti problem in SDXL

Ok so first of all this is real game changer, works insanely well!

Only problem I have is when I run it in SDXL with settings from the example workflow. There's a lot of "confetti" (noise) on pictures, especially with dpmpp_2m at low steps. 3m at high steps is okay-ish, but it's still visible. So I started tweaking the numbers randomly and these settings seem to do the job (0 confetti):

(settings screenshot)

Not really an issue ;) can be closed

QoL suggestion

Why not have the RAUNetSimple node pick the res_mode value based on the size of the latent image? Or at least based on the width/height input values.
