Comments (6)
This should work fine on most 1.5 models. Can you try with the latest updates?
from dreambooth-stable-diffusion.
I am having the same issue, even though I am using the recommended model.
When I start training, I get the following error:
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.70 GiB total capacity; 22.15 GiB already allocated; 16.38 MiB free; 22.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
I tried setting the environment variable, but it still does not work:
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"
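One possible reading of the numbers in the error above: the hint about `max_split_size_mb` applies when reserved memory is much larger than allocated memory, but here the two are almost equal (22.38 GiB reserved vs. 22.15 GiB allocated), so fragmentation is probably not the cause and the card is simply full. The sketch below only parses the quoted message to make that gap explicit; `parse_oom_message` and `fragmentation_slack_gib` are hypothetical helper names, not part of PyTorch or this repo. Separately, as far as I know the allocator setting is only picked up when CUDA is first initialized, so the variable should be set before anything touches the GPU (in a notebook this may mean restarting the kernel).

```python
import re

# The message quoted in the comment above, trimmed to the part with the figures.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.70 GiB total capacity; "
    "22.15 GiB already allocated; 16.38 MiB free; 22.38 GiB reserved in total by PyTorch)"
)

# Conversion factors from the units used in the message to GiB.
_UNITS_IN_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom_message(message):
    """Return the sizes quoted in a PyTorch OOM message, converted to GiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * _UNITS_IN_GIB[match.group(2)]
    return sizes


def fragmentation_slack_gib(sizes):
    """Memory held by the caching allocator but not currently used by tensors."""
    return sizes["reserved"] - sizes["allocated"]


sizes = parse_oom_message(OOM_MESSAGE)
slack = fragmentation_slack_gib(sizes)
print(f"reserved - allocated = {slack:.2f} GiB of {sizes['total']:.2f} GiB total")
```

If the slack were several GiB, tuning `max_split_size_mb` would be worth trying; with a gap this small, a different image or lower memory use during training seems the more likely fix.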
Tested the JoePenna repo on the RunPod and Vast.ai templates.
Vast.ai definitely seems more robust.
RUNPOD

| Docker image | Training works? |
| --- | --- |
| runpod/pytorch:3.10-2.0.0-117 | No (out of memory error) |
| runpod/pytorch-3.10-1.13.1-116 | Yes |
| runpod/pytorch-3.9-1.13.1-116 | No (ModuleNotFoundError: No module named 'taming') |
| runpod/pytorch-latest (python=3.7, torch=1.12.0) | No (AttributeError: 'str' object has no attribute 'name' in cell "Dreambooth Training Environment Setup") |

VAST.AI

| Docker image | Training works? |
| --- | --- |
| pytorch:latest (python=3.10.8, torch=1.13.1) | Yes |
| pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime (python3.10.9) | Yes |
| pytorch/pytorch:1.13.1-cuda11.6-cudnn8-runtime (python3.10.8) | Yes |
| pytorch/pytorch:1.13.0-cuda11.6-cudnn8-runtime (python3.9.2) | Yes |
| pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime (python3.7.13) | Yes |
| pytorch/pytorch:1.12.0-cuda11.3-cudnn8-runtime (python3.7.13) | Yes |
| pytorch/pytorch:1.11.0-cuda11.3-cudnn8-runtime (python3.8.12) | Yes |
| pytorch/pytorch:1.10.0-cuda11.3-cudnn8-runtime (python3.7.11) | Yes |
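Since the outcome above depends on the Python and torch versions inside each image, it may help to include them when reporting a result. The following is a minimal sketch, not something from the repo: `describe_environment` is a hypothetical helper, and it assumes nothing beyond the standard library plus torch if it happens to be installed.

```python
import importlib
import platform


def describe_environment():
    """Collect the version details that distinguish the images listed above."""
    info = {"python": platform.python_version()}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment; report that and stop.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # CUDA version torch was built with
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["gpu"] = props.name
        info["gpu_total_gib"] = round(props.total_memory / 1024**3, 2)
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Pasting this output next to a "works" or "fails" report makes it easier to compare against the image tags listed above.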
Training seems to work when the Docker image is set to runpod/pytorch, as recommended in the README.md.
"Training seems to work when the Docker image is set to runpod/pytorch, as recommended in the README.md"
runpod/pytorch produces the same environment as runpod/pytorch:latest (torch 1.12.0, python 3.7.13) and produces the same error, "AttributeError: 'str' object has no attribute 'name'", in the Training Setup cell.
runpod/pytorch-3.10-1.13.1-116, however, does seem to work.
This applies to the most recently updated notebook. If you are running a different or older version, results may differ.
runpod/pytorch-3.10-1.13.1-116 works for me!