Comments (22)
Looks like something is wrong with your CUDA installation. Have you installed requirements.txt?
from kohya_ss.
Did you install CUDA 11.8 as per the README instructions?
from kohya_ss.
I installed not only Kohya from setup.bat, but also cuDNN, bitsandbytes, CUDA 11.8 (cuda_11.8.0_522.06_windows to be precise), and the files required in the sd-scripts folder. I also updated pip from the venv and redid all of that... I'm not sure what else I need to do.
from kohya_ss.
I'm having a similar issue. I've been trying to get kohya to work for a few days, and I see a tangentially related error:
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(1, 15744, 1, 512) (torch.float32)
key : shape=(1, 15744, 1, 512) (torch.float32)
value : shape=(1, 15744, 1, 512) (torch.float32)
attn_bias : <class 'NoneType'>
p : 0.0
Same as @Deejay85: CUDA 11.8, cuDNN... ran everything from setup.bat.
from kohya_ss.
I was hoping that, since someone else is having the same problem I am, an answer would be forthcoming, but apparently not. I tried installing the new version that was released today, and even reinstalled all the packages for Kohya just in case, but that didn't fix it either... I'm getting the same messages as before. I really don't know if it's something on my end or on Kohya's end, but I really wish the dev would take a look at it, so that I at least know where to start in fixing the problem.
from kohya_ss.
Didn't want to make a new thread, so I decided to bump the old one I made two weeks ago. I tried copying the newest release into Kohya, but that didn't make any difference, so even after two releases, I'm still having the same problems I did before.
from kohya_ss.
I ended up uninstalling anything related to Python, CUDA, NVIDIA, and Microsoft development (C++ redistributables), then reinstalled, and it fixed all of my issues. Before, I had both CUDA 11.8 and 12.x installed, and I'm guessing something went wrong there, so I stuck with CUDA 11.8 this time. I'm not really sure, though. Basically, uninstalling and reinstalling fixed everything.
from kohya_ss.
Yeah, so many things can break within the software stack… This was the best thing to do. Glad it fixed things for you.
from kohya_ss.
I am having the exact same issue. Did anyone find a solution that does not involve reinstalling everything?
from kohya_ss.
I uninstalled everything as listed by Machineminded, and mine is still producing the same exact problems as before. Should I paste the entire log just to verify?
from kohya_ss.
Downloaded the newest version of Kohya, did a fresh install to a new directory, installed everything, and here are the results I got when I tried to train something:
07:34:51-795074 INFO Kohya_ss GUI version: v24.1.4
fatal: not a git repository (or any of the parent directories): .git
07:34:52-077285 ERROR Error during Git operation: Command '['git', 'submodule', 'update', '--init', '--recursive',
'--quiet']' returned non-zero exit status 128.
07:34:52-081194 INFO nVidia toolkit detected
07:34:53-412725 INFO Torch 2.1.2+cu118
07:34:53-437137 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8905
07:34:53-439117 INFO Torch detected GPU: NVIDIA GeForce RTX 4090 VRAM 24564 Arch (8, 9) Cores 128
07:34:53-444947 INFO Python version is 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit
(AMD64)]
07:34:53-447878 INFO Verifying modules installation status from requirements_pytorch_windows.txt...
07:34:53-450808 INFO Verifying modules installation status from requirements_windows.txt...
07:34:53-451785 WARNING Package wrong version: bitsandbytes 0.41.2.post2 required 0.43.0
07:34:53-453735 INFO Installing package: bitsandbytes==0.43.0
07:34:58-070392 INFO Verifying modules installation status from requirements.txt...
07:35:06-749071 INFO headless: False
07:35:06-783250 INFO Using shell=True when running external commands...
Running on local URL: http://127.0.0.1:7861
To create a public link, set `share=True` in `launch()`.
IMPORTANT: You are using gradio version 4.26.0, however version 4.29.0 is available, please upgrade.
--------
Exception in thread Thread-5 (_do_normal_analytics_request):
Traceback (most recent call last):
File "M:\z\venv\lib\site-packages\httpx\_transports\default.py", line 69, in map_httpcore_exceptions
yield
File "M:\z\venv\lib\site-packages\httpx\_transports\default.py", line 233, in handle_request
resp = self._pool.handle_request(req)
File "M:\z\venv\lib\site-packages\httpcore\_sync\connection_pool.py", line 216, in handle_request
raise exc from None
File "M:\z\venv\lib\site-packages\httpcore\_sync\connection_pool.py", line 196, in handle_request
response = connection.handle_request(
File "M:\z\venv\lib\site-packages\httpcore\_sync\connection.py", line 101, in handle_request
return self._connection.handle_request(request)
File "M:\z\venv\lib\site-packages\httpcore\_sync\http11.py", line 143, in handle_request
raise exc
File "M:\z\venv\lib\site-packages\httpcore\_sync\http11.py", line 95, in handle_request
self._send_request_body(**kwargs)
File "M:\z\venv\lib\site-packages\httpcore\_sync\http11.py", line 166, in _send_request_body
self._send_event(event, timeout=timeout)
File "M:\z\venv\lib\site-packages\httpcore\_sync\http11.py", line 175, in _send_event
self._network_stream.write(bytes_to_send, timeout=timeout)
File "M:\z\venv\lib\site-packages\httpcore\_backends\sync.py", line 133, in write
with map_exceptions(exc_map):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "M:\z\venv\lib\site-packages\httpcore\_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.WriteTimeout: The write operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "M:\z\venv\lib\site-packages\gradio\analytics.py", line 63, in _do_normal_analytics_request
httpx.post(url, data=data, timeout=5)
File "M:\z\venv\lib\site-packages\httpx\_api.py", line 319, in post
return request(
File "M:\z\venv\lib\site-packages\httpx\_api.py", line 106, in request
return client.request(
File "M:\z\venv\lib\site-packages\httpx\_client.py", line 827, in request
return self.send(request, auth=auth, follow_redirects=follow_redirects)
File "M:\z\venv\lib\site-packages\httpx\_client.py", line 914, in send
response = self._send_handling_auth(
File "M:\z\venv\lib\site-packages\httpx\_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
File "M:\z\venv\lib\site-packages\httpx\_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
File "M:\z\venv\lib\site-packages\httpx\_client.py", line 1015, in _send_single_request
response = transport.handle_request(request)
File "M:\z\venv\lib\site-packages\httpx\_transports\default.py", line 232, in handle_request
with map_httpcore_exceptions():
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "M:\z\venv\lib\site-packages\httpx\_transports\default.py", line 86, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.WriteTimeout: The write operation timed out
07:35:38-075101 INFO Loading config...
07:35:43-538559 INFO Start training LoRA Standard ...
07:35:43-540511 INFO Validating lr scheduler arguments...
07:35:43-541488 INFO Validating optimizer arguments...
07:35:43-542464 INFO Validating M:/kohya_ss/Sampleimages/log existence and writability... SUCCESS
07:35:43-543440 INFO Validating M:/kohya_ss/Sampleimages/model existence and writability... SUCCESS
07:35:43-544417 INFO Validating M:/StableDiffusion/models/Stable-diffusion/SDXL/sd_xl_base_1.0.safetensors
existence... SUCCESS
07:35:43-545393 INFO Validating M:/kohya_ss/Sampleimages/Images existence... SUCCESS
07:35:43-546370 INFO Folder 4_giganticbreasts: 4 repeats found
07:35:43-547347 INFO Folder 4_giganticbreasts: 115 images found
07:35:43-548324 INFO Folder 4_giganticbreasts: 115 * 4 = 460 steps
07:35:43-551252 INFO Regulatization factor: 1
07:35:43-553205 INFO Total steps: 460
07:35:43-553205 INFO Train batch size: 1
07:35:43-554183 INFO Gradient accumulation steps: 1
07:35:43-555159 INFO Epoch: 40
07:35:43-556136 INFO Max train steps: 1600
07:35:43-556136 INFO stop_text_encoder_training = 0
07:35:43-557111 INFO lr_warmup_steps = 160
07:35:43-559066 INFO Saving training config to
M:/kohya_ss/Sampleimages/model\giganticbreasts_20240512-073543.json...
07:35:43-561017 INFO Executing command: M:\z\venv\Scripts\accelerate.EXE launch --dynamo_backend no --dynamo_mode
default --gpu_ids 10de268488e21043 --mixed_precision bf16 --num_processes 1 --num_machines 1
--num_cpu_threads_per_process 2 M:/z/sd-scripts/sdxl_train_network.py --config_file
M:/kohya_ss/Sampleimages/model/config_lora-20240512-073543.toml
07:35:43-564925 INFO Command executed.
2024-05-12 07:35:51 INFO Loading settings from train_util.py:3744
M:/kohya_ss/Sampleimages/model/config_lora-20240512-073543.toml...
INFO M:/kohya_ss/Sampleimages/model/config_lora-20240512-073543 train_util.py:3763
2024-05-12 07:35:51 INFO prepare tokenizers sdxl_train_util.py:134
2024-05-12 07:35:53 INFO update token length: 75 sdxl_train_util.py:159
INFO Using DreamBooth method. train_network.py:172
INFO prepare images. train_util.py:1572
INFO found directory M:\kohya_ss\Sampleimages\Images\4_giganticbreasts train_util.py:1519
contains 115 image files
INFO 460 train images with repeating. train_util.py:1613
INFO 0 reg images. train_util.py:1616
WARNING no regularization images train_util.py:1621
INFO [Dataset 0] config_util.py:565
batch_size: 1
resolution: (1024, 1024)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 64
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "M:\kohya_ss\Sampleimages\Images\4_giganticbreasts"
image_count: 115
num_repeats: 4
shuffle_caption: True
keep_tokens: 1
keep_tokens_separator:
secondary_separator: None
enable_wildcard: False
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: giganticbreasts
caption_extension: .txt
INFO [Dataset 0] config_util.py:571
INFO loading image sizes. train_util.py:853
100%|██████████| 115/115 [00:00<00:00, 39272.51it/s]
INFO make buckets train_util.py:859
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is train_util.py:876
set, because bucket reso is defined by image size automatically
INFO number of images (including repeats) train_util.py:905
INFO bucket 0: resolution (576, 832), count: 4 train_util.py:910
INFO bucket 1: resolution (576, 960), count: 4 train_util.py:910
INFO bucket 2: resolution (640, 640), count: 4 train_util.py:910
INFO bucket 3: resolution (704, 960), count: 4 train_util.py:910
INFO bucket 4: resolution (704, 1280), count: 4 train_util.py:910
INFO bucket 5: resolution (704, 1344), count: 4 train_util.py:910
INFO bucket 6: resolution (704, 1408), count: 4 train_util.py:910
INFO bucket 7: resolution (768, 704), count: 4 train_util.py:910
INFO bucket 8: resolution (768, 1152), count: 12 train_util.py:910
INFO bucket 9: resolution (768, 1216), count: 4 train_util.py:910
INFO bucket 10: resolution (768, 1344), count: 4 train_util.py:910
INFO bucket 11: resolution (832, 768), count: 4 train_util.py:910
INFO bucket 12: resolution (832, 896), count: 4 train_util.py:910
INFO bucket 13: resolution (832, 1024), count: 4 train_util.py:910
INFO bucket 14: resolution (832, 1088), count: 36 train_util.py:910
INFO bucket 15: resolution (832, 1152), count: 68 train_util.py:910
INFO bucket 16: resolution (832, 1216), count: 44 train_util.py:910
INFO bucket 17: resolution (896, 832), count: 4 train_util.py:910
INFO bucket 18: resolution (896, 1024), count: 16 train_util.py:910
INFO bucket 19: resolution (896, 1088), count: 40 train_util.py:910
INFO bucket 20: resolution (896, 1152), count: 40 train_util.py:910
INFO bucket 21: resolution (960, 960), count: 16 train_util.py:910
INFO bucket 22: resolution (960, 1024), count: 36 train_util.py:910
INFO bucket 23: resolution (1024, 896), count: 4 train_util.py:910
INFO bucket 24: resolution (1024, 960), count: 4 train_util.py:910
INFO bucket 25: resolution (1024, 1024), count: 36 train_util.py:910
INFO bucket 26: resolution (1088, 832), count: 8 train_util.py:910
INFO bucket 27: resolution (1088, 896), count: 4 train_util.py:910
INFO bucket 28: resolution (1152, 832), count: 8 train_util.py:910
INFO bucket 29: resolution (1152, 896), count: 8 train_util.py:910
INFO bucket 30: resolution (1216, 832), count: 12 train_util.py:910
INFO bucket 31: resolution (1280, 704), count: 4 train_util.py:910
INFO bucket 32: resolution (1344, 768), count: 8 train_util.py:910
INFO mean ar error (without repeats): 0.012568990147454271 train_util.py:915
WARNING clip_skip will be unexpected sdxl_train_util.py:343
INFO preparing accelerator train_network.py:225
accelerator device: cpu
INFO loading model for process 0/1 sdxl_train_util.py:30
INFO load StableDiffusion checkpoint: sdxl_train_util.py:70
M:/StableDiffusion/models/Stable-diffusion/SDXL/sd_xl_base_1.0.safete
nsors
INFO building U-Net sdxl_model_util.py:192
2024-05-12 07:35:54 INFO loading U-Net from checkpoint sdxl_model_util.py:196
2024-05-12 07:36:06 INFO U-Net: <All keys matched successfully> sdxl_model_util.py:202
INFO building text encoders sdxl_model_util.py:205
INFO loading text encoders from checkpoint sdxl_model_util.py:258
INFO text encoder 1: <All keys matched successfully> sdxl_model_util.py:272
2024-05-12 07:36:10 INFO text encoder 2: <All keys matched successfully> sdxl_model_util.py:276
INFO building VAE sdxl_model_util.py:279
INFO loading VAE from checkpoint sdxl_model_util.py:284
INFO VAE: <All keys matched successfully> sdxl_model_util.py:287
INFO Enable xformers for U-Net train_util.py:2660
Traceback (most recent call last):
File "M:\z\sd-scripts\sdxl_train_network.py", line 185, in <module>
trainer.train(args)
File "M:\z\sd-scripts\train_network.py", line 242, in train
vae.set_use_memory_efficient_attention_xformers(args.xformers)
File "M:\z\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 262, in set_use_memory_efficient_attention_xformers
fn_recursive_set_mem_eff(module)
File "M:\z\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 258, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "M:\z\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 258, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "M:\z\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 258, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "M:\z\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 255, in fn_recursive_set_mem_eff
module.set_use_memory_efficient_attention_xformers(valid, attention_op)
File "M:\z\venv\lib\site-packages\diffusers\models\attention_processor.py", line 260, in set_use_memory_efficient_attention_xformers
raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Traceback (most recent call last):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "M:\z\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
File "M:\z\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "M:\z\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "M:\z\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['M:\\z\\venv\\Scripts\\python.exe', 'M:/z/sd-scripts/sdxl_train_network.py', '--config_file', 'M:/kohya_ss/Sampleimages/model/config_lora-20240512-073543.toml']' returned non-zero exit status 1.
07:36:12-768539 INFO Training has ended.
Keyboard interruption in main thread... closing server.
In short, the same old song and dance. 😩 Any advice?
from kohya_ss.
ValueError: torch.cuda.is_available() should be True but is False.
Not sure what happened but you should be able to install torch+cu118 and resolve this. Check this link:
https://pytorch.org/get-started/locally/
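For anyone who wants to confirm what PyTorch actually sees before reinstalling anything, here is a minimal sketch of a check to run with the venv's python.exe. The helper name cuda_report is made up for illustration; torch.cuda.is_available(), torch.version.cuda and torch.cuda.get_device_name() are standard PyTorch calls. It also reports CUDA_VISIBLE_DEVICES, on the assumption that a bad value there can hide the GPU from the process.

```python
import importlib.util
import os


def cuda_report():
    """Return a small dict describing what PyTorch can see in this environment."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "device": None,
        # An invalid value here can make the GPU invisible to this process.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    if importlib.util.find_spec("torch") is None:
        return report  # torch is not installed in this interpreter

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None on CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back True here but the training log still says "accelerator device: cpu", the torch install is probably fine and the launch settings are the more likely culprit.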
from kohya_ss.
@Deejay85
This part of your log indicates the most likely problem: --gpu_ids 10de268488e21043
The GPU IDs option on your config seems to be junk text. (It's an option located under the Accelerate launch category)
Leave it blank so it resembles the screenshot below, and the training should be able to run.
from kohya_ss.
I might add an input validator and log a message if it does not match the expected pattern
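A rough sketch of what such a validator could look like, assuming the field should be either empty, the word all, or a comma-separated list of GPU indices such as 0 or 0,1. The function name is_valid_gpu_ids and the exact pattern are assumptions for illustration, not the GUI's actual code.

```python
import re

# Assumed accepted forms: "all", or indices like "0" / "0,1,2".
_GPU_IDS_PATTERN = re.compile(r"all|\d+(,\d+)*")


def is_valid_gpu_ids(value):
    """Return True if value looks like a usable GPU IDs setting."""
    text = value.strip()
    if text == "":
        return True  # blank means the option is simply not passed
    return _GPU_IDS_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["", "all", "0", "0,1", "10de268488e21043"]:
        print(repr(candidate), is_valid_gpu_ids(candidate))
```

With this pattern, the value from the log above (10de268488e21043) is rejected because it contains letters, so the GUI could log a warning instead of passing it through to the launcher.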
from kohya_ss.
I tried leaving it blank, with spaces, with dashes, and as two blocks of text separated only by a hyphen... none of that worked. I'm only using one graphics card, by the way, because 4090s don't grow on trees, you know?
from kohya_ss.
Supposing you left GPU IDs blank, does the log still show the same error as before (ValueError: torch.cuda.is_available() should be True but is False), or was it a different error?
from kohya_ss.
Same error. If you want I could copy/paste the new log.
from kohya_ss.
Sure, post your log output.
from kohya_ss.
18:54:24-292864 INFO Kohya_ss GUI version: v24.1.4
fatal: not a git repository (or any of the parent directories): .git
18:54:24-530151 ERROR Error during Git operation: Command '['git', 'submodule', 'update', '--init', '--recursive',
'--quiet']' returned non-zero exit status 128.
18:54:24-535034 INFO nVidia toolkit detected
18:54:25-865018 INFO Torch 2.1.2+cu118
18:54:25-885525 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8905
18:54:25-888454 INFO Torch detected GPU: NVIDIA GeForce RTX 4090 VRAM 24564 Arch (8, 9) Cores 128
18:54:25-892360 INFO Python version is 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit
(AMD64)]
18:54:25-895289 INFO Verifying modules installation status from requirements_pytorch_windows.txt...
18:54:25-898219 INFO Verifying modules installation status from requirements_windows.txt...
18:54:25-900172 INFO Verifying modules installation status from requirements.txt...
18:54:31-930997 INFO headless: False
18:54:31-969079 INFO Using shell=True when running external commands...
IMPORTANT: You are using gradio version 4.26.0, however version 4.29.0 is available, please upgrade.
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Exception in thread Thread-5 (_do_normal_analytics_request):
Traceback (most recent call last):
File "M:\kohya_ss\venv\lib\site-packages\httpx\_transports\default.py", line 69, in map_httpcore_exceptions
yield
File "M:\kohya_ss\venv\lib\site-packages\httpx\_transports\default.py", line 233, in handle_request
resp = self._pool.handle_request(req)
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\connection_pool.py", line 216, in handle_request
raise exc from None
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\connection_pool.py", line 196, in handle_request
response = connection.handle_request(
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\connection.py", line 101, in handle_request
return self._connection.handle_request(request)
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\http11.py", line 143, in handle_request
raise exc
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\http11.py", line 95, in handle_request
self._send_request_body(**kwargs)
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\http11.py", line 166, in _send_request_body
self._send_event(event, timeout=timeout)
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_sync\http11.py", line 175, in _send_event
self._network_stream.write(bytes_to_send, timeout=timeout)
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_backends\sync.py", line 133, in write
with map_exceptions(exc_map):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "M:\kohya_ss\venv\lib\site-packages\httpcore\_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.WriteTimeout: The write operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "M:\kohya_ss\venv\lib\site-packages\gradio\analytics.py", line 63, in _do_normal_analytics_request
httpx.post(url, data=data, timeout=5)
File "M:\kohya_ss\venv\lib\site-packages\httpx\_api.py", line 319, in post
return request(
File "M:\kohya_ss\venv\lib\site-packages\httpx\_api.py", line 106, in request
return client.request(
File "M:\kohya_ss\venv\lib\site-packages\httpx\_client.py", line 827, in request
return self.send(request, auth=auth, follow_redirects=follow_redirects)
File "M:\kohya_ss\venv\lib\site-packages\httpx\_client.py", line 914, in send
response = self._send_handling_auth(
File "M:\kohya_ss\venv\lib\site-packages\httpx\_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
File "M:\kohya_ss\venv\lib\site-packages\httpx\_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
File "M:\kohya_ss\venv\lib\site-packages\httpx\_client.py", line 1015, in _send_single_request
response = transport.handle_request(request)
File "M:\kohya_ss\venv\lib\site-packages\httpx\_transports\default.py", line 232, in handle_request
with map_httpcore_exceptions():
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "M:\kohya_ss\venv\lib\site-packages\httpx\_transports\default.py", line 86, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.WriteTimeout: The write operation timed out
18:54:53-247570 INFO Loading config...
18:55:04-303432 INFO Save...
18:55:07-492658 INFO Start training LoRA Standard ...
18:55:07-493635 INFO Validating lr scheduler arguments...
18:55:07-495588 INFO Validating optimizer arguments...
18:55:07-496565 INFO Validating M:/kohya_ss/Sampleimages/log existence and writability... SUCCESS
18:55:07-497541 INFO Validating M:/kohya_ss/Sampleimages/model existence and writability... SUCCESS
18:55:07-498521 INFO Validating M:/StableDiffusion/models/Stable-diffusion/SDXL/sd_xl_base_1.0.safetensors
existence... SUCCESS
18:55:07-499494 INFO Validating M:/kohya_ss/Sampleimages/Images existence... SUCCESS
18:55:07-500471 INFO Folder 4_giganticbreasts: 4 repeats found
18:55:07-501447 INFO Folder 4_giganticbreasts: 115 images found
18:55:07-502424 INFO Folder 4_giganticbreasts: 115 * 4 = 460 steps
18:55:07-504377 INFO Regulatization factor: 1
18:55:07-505353 INFO Total steps: 460
18:55:07-508283 INFO Train batch size: 1
18:55:07-512189 INFO Gradient accumulation steps: 1
18:55:07-516095 INFO Epoch: 40
18:55:07-517072 INFO Max train steps: 1600
18:55:07-518049 INFO stop_text_encoder_training = 0
18:55:07-519025 INFO lr_warmup_steps = 160
18:55:07-520978 INFO Saving training config to
M:/kohya_ss/Sampleimages/model\giganticbreasts_20240526-185507.json...
18:55:07-521953 INFO Executing command: M:\kohya_ss\venv\Scripts\accelerate.EXE launch --dynamo_backend no
--dynamo_mode default --mixed_precision bf16 --num_processes 1 --num_machines 1
--num_cpu_threads_per_process 2 M:/kohya_ss/sd-scripts/sdxl_train_network.py --config_file
M:/kohya_ss/Sampleimages/model/config_lora-20240526-185507.toml
18:55:07-526836 INFO Command executed.
2024-05-26 18:55:14 INFO Loading settings from train_util.py:3744
M:/kohya_ss/Sampleimages/model/config_lora-20240526-185507.toml...
INFO M:/kohya_ss/Sampleimages/model/config_lora-20240526-185507 train_util.py:3763
2024-05-26 18:55:14 INFO prepare tokenizers sdxl_train_util.py:134
2024-05-26 18:55:15 INFO update token length: 75 sdxl_train_util.py:159
INFO Using DreamBooth method. train_network.py:172
INFO prepare images. train_util.py:1572
INFO found directory M:\kohya_ss\Sampleimages\Images\4_giganticbreasts train_util.py:1519
contains 115 image files
INFO 460 train images with repeating. train_util.py:1613
INFO 0 reg images. train_util.py:1616
WARNING no regularization images train_util.py:1621
INFO [Dataset 0] config_util.py:565
batch_size: 1
resolution: (1024, 1024)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 64
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "M:\kohya_ss\Sampleimages\Images\4_giganticbreasts"
image_count: 115
num_repeats: 4
shuffle_caption: True
keep_tokens: 1
keep_tokens_separator:
secondary_separator: None
enable_wildcard: False
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: giganticbreasts
caption_extension: .txt
INFO [Dataset 0] config_util.py:571
INFO loading image sizes. train_util.py:853
100%|██████████| 115/115 [00:00<00:00, 39262.92it/s]
INFO make buckets train_util.py:859
WARNING min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is train_util.py:876
set, because bucket reso is defined by image size automatically
INFO number of images (including repeats) train_util.py:905
INFO bucket 0: resolution (576, 832), count: 4 train_util.py:910
INFO bucket 1: resolution (576, 960), count: 4 train_util.py:910
INFO bucket 2: resolution (640, 640), count: 4 train_util.py:910
INFO bucket 3: resolution (704, 960), count: 4 train_util.py:910
INFO bucket 4: resolution (704, 1280), count: 4 train_util.py:910
INFO bucket 5: resolution (704, 1344), count: 4 train_util.py:910
INFO bucket 6: resolution (704, 1408), count: 4 train_util.py:910
INFO bucket 7: resolution (768, 704), count: 4 train_util.py:910
INFO bucket 8: resolution (768, 1152), count: 12 train_util.py:910
INFO bucket 9: resolution (768, 1216), count: 4 train_util.py:910
INFO bucket 10: resolution (768, 1344), count: 4 train_util.py:910
INFO bucket 11: resolution (832, 768), count: 4 train_util.py:910
INFO bucket 12: resolution (832, 896), count: 4 train_util.py:910
INFO bucket 13: resolution (832, 1024), count: 4 train_util.py:910
INFO bucket 14: resolution (832, 1088), count: 36 train_util.py:910
INFO bucket 15: resolution (832, 1152), count: 68 train_util.py:910
INFO bucket 16: resolution (832, 1216), count: 44 train_util.py:910
INFO bucket 17: resolution (896, 832), count: 4 train_util.py:910
INFO bucket 18: resolution (896, 1024), count: 16 train_util.py:910
INFO bucket 19: resolution (896, 1088), count: 40 train_util.py:910
INFO bucket 20: resolution (896, 1152), count: 40 train_util.py:910
INFO bucket 21: resolution (960, 960), count: 16 train_util.py:910
INFO bucket 22: resolution (960, 1024), count: 36 train_util.py:910
INFO bucket 23: resolution (1024, 896), count: 4 train_util.py:910
INFO bucket 24: resolution (1024, 960), count: 4 train_util.py:910
INFO bucket 25: resolution (1024, 1024), count: 36 train_util.py:910
INFO bucket 26: resolution (1088, 832), count: 8 train_util.py:910
INFO bucket 27: resolution (1088, 896), count: 4 train_util.py:910
INFO bucket 28: resolution (1152, 832), count: 8 train_util.py:910
INFO bucket 29: resolution (1152, 896), count: 8 train_util.py:910
INFO bucket 30: resolution (1216, 832), count: 12 train_util.py:910
INFO bucket 31: resolution (1280, 704), count: 4 train_util.py:910
INFO bucket 32: resolution (1344, 768), count: 8 train_util.py:910
INFO mean ar error (without repeats): 0.012568990147454271 train_util.py:915
WARNING clip_skip will be unexpected sdxl_train_util.py:343
INFO preparing accelerator train_network.py:225
accelerator device: cpu
INFO loading model for process 0/1 sdxl_train_util.py:30
INFO load StableDiffusion checkpoint: sdxl_train_util.py:70
M:/StableDiffusion/models/Stable-diffusion/SDXL/sd_xl_base_1.0.safete
nsors
2024-05-26 18:55:16 INFO building U-Net sdxl_model_util.py:192
INFO loading U-Net from checkpoint sdxl_model_util.py:196
2024-05-26 18:55:28 INFO U-Net: <All keys matched successfully> sdxl_model_util.py:202
INFO building text encoders sdxl_model_util.py:205
INFO loading text encoders from checkpoint sdxl_model_util.py:258
INFO text encoder 1: <All keys matched successfully> sdxl_model_util.py:272
2024-05-26 18:55:32 INFO text encoder 2: <All keys matched successfully> sdxl_model_util.py:276
INFO building VAE sdxl_model_util.py:279
INFO loading VAE from checkpoint sdxl_model_util.py:284
INFO VAE: <All keys matched successfully> sdxl_model_util.py:287
INFO Enable xformers for U-Net train_util.py:2660
Traceback (most recent call last):
File "M:\kohya_ss\sd-scripts\sdxl_train_network.py", line 185, in <module>
trainer.train(args)
File "M:\kohya_ss\sd-scripts\train_network.py", line 242, in train
vae.set_use_memory_efficient_attention_xformers(args.xformers)
File "M:\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 262, in set_use_memory_efficient_attention_xformers
fn_recursive_set_mem_eff(module)
File "M:\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 258, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "M:\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 258, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "M:\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 258, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "M:\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 255, in fn_recursive_set_mem_eff
module.set_use_memory_efficient_attention_xformers(valid, attention_op)
File "M:\kohya_ss\venv\lib\site-packages\diffusers\models\attention_processor.py", line 260, in set_use_memory_efficient_attention_xformers
raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Traceback (most recent call last):
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Ande\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "M:\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
File "M:\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "M:\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "M:\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['M:\\kohya_ss\\venv\\Scripts\\python.exe', 'M:/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', 'M:/kohya_ss/Sampleimages/model/config_lora-20240526-185507.toml']' returned non-zero exit status 1.
18:55:35-725043 INFO Training has ended.
from kohya_ss.
Can you look in the folder C:\Users\yourname\.cache\huggingface\accelerate? If you see a file called default_config.yaml, delete it and see if that fixes the problem.
from kohya_ss.
Surprisingly, it did. Now if only I knew which value was messing it up.
from kohya_ss.
It's probably the gpu_ids setting in that file. The default value is all, and I'll assume it wasn't at the default, which caused the problems here.
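If you'd rather see what the file contains before removing it, something like the sketch below might help. It assumes the config sits at the default location under the user's home directory (the path moves if HF_HOME or XDG_CACHE_HOME is set) and that gpu_ids is a plain top-level key. The helper names read_gpu_ids and backup_config are made up for illustration, and the backup step renames the file instead of deleting it so it can be restored.

```python
from pathlib import Path

# Assumed default location of the accelerate config (not confirmed for every setup).
DEFAULT_CONFIG = (
    Path.home() / ".cache" / "huggingface" / "accelerate" / "default_config.yaml"
)


def read_gpu_ids(config_text):
    """Return the value of a top-level `gpu_ids:` key, or None if it is absent."""
    for line in config_text.splitlines():
        if line.startswith("gpu_ids:"):
            # Keep everything after the first colon and drop optional quotes.
            return line.split(":", 1)[1].strip().strip("'\"")
    return None


def backup_config(path=DEFAULT_CONFIG):
    """Rename the config to <name>.bak; return the new path, or None if no file."""
    if not path.is_file():
        return None
    backup = path.with_name(path.name + ".bak")
    path.replace(backup)  # overwrites an older backup with the same name
    return backup


if __name__ == "__main__":
    if DEFAULT_CONFIG.is_file():
        text = DEFAULT_CONFIG.read_text(encoding="utf-8")
        print("gpu_ids:", read_gpu_ids(text))
    else:
        print("No default_config.yaml found at", DEFAULT_CONFIG)
```

Anything other than all or a list of small integers in that field would be a good reason to call backup_config() and let the launcher fall back to its defaults.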
from kohya_ss.