Comments (28)
Getting the same issue on Linux installed using this guide/script.
Playground runs fine for me.
Yeah that's prob due to gradio issues with debug.
I'm out of ideas, it's really hard to debug without an error entirely.
The best I can do is suggest to drop random print("GOT HERE") or sys.exit() calls and see if you get to that point or not, to try and figure out if it's possible to load these things inside docker.
@hlky any other ideas?
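To illustrate the bisection idea, a minimal sketch (the markers and their placement are hypothetical, not actual webui.py code):

import sys

print("GOT HERE: before load_model_from_config", flush=True)
# model = load_model_from_config(config, opt.ckpt)  # the suspected heavy step
print("GOT HERE: model loaded", flush=True)
sys.exit(0)  # bail out early once you know you reached this point

Whichever marker never prints tells you where the process is dying.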
Good run from scratch on Ubuntu 20.04 below with latest pull. Maybe this can help you debug.
sd | entrypoint.sh: Launching...
sd | python -u scripts/webui.py --no-verify-input --optimized-turbo
sd | Downloading: "https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth" to /opt/conda/envs/ldm/lib/python3.8/site-packages/facexlib/weights/detection_Resnet50_Final.pth
sd |
100%|██████████| 104M/104M [00:05<00:00, 19.3MB/s]
sd | Downloading: "https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth" to /opt/conda/envs/ldm/lib/python3.8/site-packages/facexlib/weights/parsing_parsenet.pth
sd |
100%|██████████| 81.4M/81.4M [00:04<00:00, 19.6MB/s]
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | UNet: Running in eps-prediction mode
sd | CondStage: Running in eps-prediction mode
sd | Downloading: "https://github.com/DagnyT/hardnet/raw/master/pretrained/train_liberty_with_aug/checkpoint_liberty_with_aug.pth" to /root/.cache/torch/hub/checkpoints/checkpoint_liberty_with_aug.pth
100%|██████████| 5.10M/5.10M [00:00<00:00, 17.1MB/s]
Downloading: 100%|██████████| 939k/939k [00:00<00:00, 7.05MB/s]
Downloading: 100%|██████████| 512k/512k [00:00<00:00, 4.68MB/s]
Downloading: 100%|██████████| 389/389 [00:00<00:00, 482kB/s]
Downloading: 100%|██████████| 905/905 [00:00<00:00, 1.04MB/s]
Downloading: 100%|██████████| 4.31k/4.31k [00:00<00:00, 5.01MB/s]
Downloading: 100%|██████████| 1.59G/1.59G [01:32<00:00, 18.4MB/s]
sd | FirstStage: Running in eps-prediction mode
sd | making attention of type 'vanilla' with 512 in_channels
sd | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
sd | making attention of type 'vanilla' with 512 in_channels
sd | Running on local URL: http://localhost:7860/
sd |
sd | To create a public link, set `share=True` in `launch()`.
Maybe set this up on your system without docker to determine whether it's actually a docker issue.
Can you try running webui.py directly, not through relauncher.py, and post the stack trace you get with the error?
Running python -u scripts/webui.py has provided two results for me:
Traceback (most recent call last):
File "/sd/scripts/webui.py", line 3, in <module>
from frontend.frontend import draw_gradio_ui
ModuleNotFoundError: No module named 'frontend'
That one may have been a badly timed clone of the repo.
Loaded GFPGAN
Loaded RealESRGAN with model RealESRGAN_x4plus
Loading model from models/ldm/stable-diffusion-v1/model.ckpt
Global Step: 470000
LatentDiffusion: Running in eps-prediction mode
Killed
Yeah, we split the frontend into its own module before, so the first issue is fixed by pulling the latest.
The second one doesn't throw any errors? Just Killed?
I am getting the same issue in a docker running on linux.
@yourjelly could be a gradio issue with ports and such inside docker.
if you don't mind testing, download webui_playground.py from https://github.com/hlky/stable-diffusion-webui, put it in the same directory, run python webui_playground.py, and see if that crashes as well?
Yep, likewise, playground runs fine. Although, I can only see it from the public link not my internal one. That might be a misconfiguration on my end though.
In my case (and I assume the same for the others as well) it's because it's getting killed by the kernel OOM killer:
[1888018.327103] Out of memory: Killed process 2510239 (python) total-vm:20822252kB, anon-rss:6211836kB, file-rss:127360kB, shmem-rss:10240kB, UID:1000 pgtables:16556kB oom_score_adj:0
It's trying to allocate ~20GB of RAM, and I only have about 6GB available.
I have observed this too, eating RAM before being killed.
I followed it to model = load_model_from_config(config, opt.ckpt) in webui.py.
That leads to util.py, where get_obj_from_str(string, reload=False) runs twice before it dies.
There might be more it gets through, but that's how close I've got so far.
I'm gonna stick another 16GB of RAM in my server and see if it gets further.
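For reference, get_obj_from_str in ldm/util.py is essentially a thin importlib lookup (paraphrased, so check your checkout for the exact code):

import importlib

def get_obj_from_str(string, reload=False):
    # Resolve "pkg.module.ClassName" to the class object itself
    module, cls = string.rsplit(".", 1)
    if reload:
        importlib.reload(importlib.import_module(module))
    return getattr(importlib.import_module(module, package=None), cls)

The lookup itself is cheap, so the RAM spike presumably comes afterwards, when the resolved class is instantiated and the checkpoint weights are loaded.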
Yep, I got past that point now. It's just trying to eat too much RAM. Hovering around 10GB of usage afterwards.
@yourjelly have you tried running it with the --turbo mode or the --optimized-turbo flags on and see if it's better?
cc @toboshii
Will do, because my GPU ran out of memory when I tried txt2img:
!!Runtime error (txt2img)!!
CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 7.79 GiB total capacity; 5.64 GiB already allocated; 1008.06 MiB free; 5.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
exiting...calling os._exit(0)
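Side note: the max_split_size_mb hint from that message can be tried by setting PyTorch's allocator config before launching; 128 below is just an example value:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python -u scripts/webui.py --optimized-turbo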
Optimized turbo works, using 98% of GPU RAM.
Ok, progress! It doesn't crash without a notification anymore, which is a good thing :)
I guess we found the issue; now the question is, given --turbo, should we close this issue?
Getting the same issue on Linux installed using this guide/script.
Script creator here, I'm really not sure what to make of this issue? It's not associated with my script itself, but might be tied back to a faulty conda env? Make sure you're using the newest version of my script and make sure it pulls in the latest updates from the repo. Choose no on the previous parameters question, then yes on the do you want to update screen. If that fails, delete the conda env (conda env remove -n lsd), run the script again, select no on the previous parameters and let it generate a new one.
If it still fails after that, then it is definitely something tied back either to your conda installation itself or a niche bug inside of the python code, most likely the latter at that point since it will at least partially run.
[1888018.327103] Out of memory: Killed process 2510239 (python) total-vm:20822252kB, anon-rss:6211836kB, file-rss:127360kB, shmem-rss:10240kB, UID:1000 pgtables:16556kB oom_score_adj:0
@toboshii I haven't seen that error.
@yourjelly have you tried running it with the --turbo mode or the --optimized-turbo flags on and see if it's better?
I guess we found the issue; now the question is, given --turbo, should we close this issue?
Is that a Docker compose flag? Or could it be included where it says "Relaunch count"?
If the problem can't be optimised away, then this solution should at least be mentioned in the wiki.
I haven't tried the solution yet (or searched about it, just woke up), but I will check it out as soon as I can!
I believe it should be added here: python -u scripts/webui.py --optimized-turbo in the entrypoint.sh file for dockers.
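That is, matching the launch line from the log earlier in this thread, something like:

python -u scripts/webui.py --no-verify-input --optimized-turbo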
Script creator here, I'm really not sure what to make of this issue?
I don't think it's anything off with your script, I only mentioned that to make it clear I was running on bare metal and was using a "supported" method of installation (I hadn't just hacked stuff together myself). This was in a clean miniconda install; I tried rebuilding the env, same issue.
have you tried running it with the --turbo mode or the --optimized-turbo flags on and see if it's better?
--turbo doesn't seem to exist, and --optimized-turbo made no difference in my case.
@toboshii I haven't seen that error.
Did you look in the kernel logs? If it's the same issue you should be able to find it there using dmesg (assuming WSL provides for that; not sure, I haven't used Windows in like 12 years).
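For example, something like:

sudo dmesg | grep -i "out of memory"    # needs root on most distros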
All in all I'm not really sure this is a "bug" or "issue"; I think in my and others' cases we just greatly underestimated the amount of memory needed to load the models. As @yourjelly did, I moved to trying it on another machine with 32GB free and had no issues. The original machine I'm trying it on (my desktop) has 16GB and generally around 6-8GB free. From what I see on both machines it needs a minimum of about 26GB free to load the model initially and then idles around 10GB as @yourjelly mentioned, which seems pretty odd given the model is ~4GB, but honestly this is my first foray into AI stuff outside of Colabs etc., so maybe this is expected.
dmesg in WSL: dmesg: read kernel buffer failed: Operation not permitted
I'll be monitoring the Windows logs, just to confirm the hypothesis.
@toboshii @altryne meant --optimized or --optimized-turbo.
Ideally you need 8GB+ VRAM and 16GB+ RAM.
The --optimized option is designed for 4GB VRAM and --optimized-turbo for 6-8GB VRAM, and both will increase RAM usage compared to running without either of those options.
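If you're not sure what your card has, one quick way to check from inside the ldm env (standard PyTorch API):

python -c "import torch; p = torch.cuda.get_device_properties(0); print(p.name, p.total_memory // 2**20, 'MiB')"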
My laptop has 16 GB RAM installed and 8 GB video memory. I do not want to compare, because I understand they are different projects with different goals, but with that hardware I have been able to start another Stable Diffusion project (the one other I have tried).
EDIT: --optimized-turbo did not solve it for me.
Just read this #134. I will try again.
Sadly it still crashed.
sd | saved RealESRGAN_x4plus_anime_6B.pth
sd | entrypoint.sh: Launching...
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | UNet: Running in eps-prediction mode
sd | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd | /sd/entrypoint.sh: line 89: 559 Killed python -u scripts/webui.py --optimized-turbo
sd | entrypoint.sh: Launching...
sd | Relaunch count: 1
Slight progress. The UI is now starting in --optimized, but with optimizedSD.ddpm.UNet removed. This of course means the UI is not working, but now I know it is only optimizedSD.ddpm.UNet (under --optimized) that is causing problems for me.
Are there other models I can switch it out for? Any tips on how to do so?
I have seen that Docker has an argument for disabling the out-of-memory watcher (--oom-kill-disable). I'm trying to get it to work and will reply later.
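If it helps, in a compose file that option would look roughly like this (the sd service name is taken from the logs above, the 16g limit is an arbitrary example, and whether this actually helps here is untested; --oom-kill-disable generally needs a memory limit set alongside it):

services:
  sd:
    oom_kill_disable: true
    mem_limit: 16g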