
vqgan-clip's People

Contributors

caladri, drjkl, microraptor, nerdyrodent, sgimmini, thehappydinoa


vqgan-clip's Issues

What are the args.augment parameters for?

I pulled the new code and found that many new parameters have appeared. I would like to know what these parameters do. I am looking forward to your reply.

This error occurred when I adjusted the code so that the output path could be something other than output.png, but I could not solve it. The same error also occurred with a freshly downloaded copy of the source code, so I do not know where I went wrong.
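
For reference, a quick way to see what every flag does, assuming generate.py builds its options with argparse (the args.* names in the tracebacks on this page suggest it does), is to print the built-in help:

python generate.py -h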

Multi GPU ?

Could not make this run on multiple GPUs. Would love some help!

Magic Key Words

Are there any good resources of key words that work well with VQGAN+CLIP?

I compiled some I have heard so far (example usage after the list):

  • unreal engine | hyperrealistic | vray
  • trending on artstation
  • photorealistic
  • render
  • psychedelic | surreal | weird
  • pencil art sketch
  • drawn by a child
  • in the style of xxx
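
As a sketch of how these might be combined, using the | multi-prompt syntax seen in other commands on this page (the prompt itself is made up):

python generate.py -p "A lighthouse on a cliff | trending on artstation | unreal engine"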

Since this isn't really an issue, perhaps opening up GitHub Discussions in this repo would be better for these kinds of topics.

RuntimeError: cusolver error: CUSOLVER_STATUS_INTERNAL_ERROR, when calling `cusolverDnCreate(handle)`

Hey!

Thanks for this, I am so ready to create bizarreness.

Hardware:
Ryzen 7 3700X
32GB RAM
RTX 2070 Super

OS: Windows 10 Pro

I'm getting the below error when running generate.py:

python generate.py -p "Yee"

Output:
(vqgan) PS C:\Users\andre\anaconda3\envs\vqgan\VQGAN-CLIP> python generate.py -p "Yee"
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips\vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\torchvision\transforms\transforms.py:280: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  warnings.warn(
Using device: cuda:0
Optimising using: Adam
Using text prompts: ['Yee']
Using seed: 329366907029900
0it [00:01, ?it/s]
Traceback (most recent call last):
  File "C:\Users\andre\anaconda3\envs\vqgan\VQGAN-CLIP\generate.py", line 461, in <module>
    train(i)
  File "C:\Users\andre\anaconda3\envs\vqgan\VQGAN-CLIP\generate.py", line 444, in train
    lossAll = ascend_txt()
  File "C:\Users\andre\anaconda3\envs\vqgan\VQGAN-CLIP\generate.py", line 423, in ascend_txt
    iii = perceptor.encode_image(normalize(make_cutouts(out))).float()
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\andre\anaconda3\envs\vqgan\VQGAN-CLIP\generate.py", line 241, in forward
    batch = self.augs(torch.cat(cutouts, dim=0))
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
    input = module(input)
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\kornia\augmentation\base.py", line 245, in forward
    output = self.apply_func(in_tensor, in_transform, self._params, return_transform)
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\kornia\augmentation\base.py", line 210, in apply_func
    output[to_apply] = self.apply_transform(in_tensor[to_apply], params, trans_matrix[to_apply])
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\kornia\augmentation\augmentation.py", line 684, in apply_transform
    return warp_affine(
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\kornia\geometry\transform\imgwarp.py", line 192, in warp_affine
    dst_norm_trans_src_norm: torch.Tensor = normalize_homography(M_3x3, (H, W), dsize)
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\kornia\geometry\transform\homography_warper.py", line 380, in normalize_homography
    src_pix_trans_src_norm = _torch_inverse_cast(src_norm_trans_src_pix)
  File "C:\Users\andre\anaconda3\envs\vqgan\lib\site-packages\kornia\utils\helpers.py", line 48, in _torch_inverse_cast
    return torch.inverse(input.to(dtype)).to(input.dtype)
RuntimeError: cusolver error: CUSOLVER_STATUS_INTERNAL_ERROR, when calling `cusolverDnCreate(handle)`

Error when trying to generate image (noob) any help would be appreciated

(vqgan) D:\art\VQGAN-CLIP>python generate.py -p "A painting of an apple in a fruit bowl"
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips\vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Traceback (most recent call last):
  File "D:\art\VQGAN-CLIP\generate.py", line 546, in <module>
    model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
  File "D:\art\VQGAN-CLIP\generate.py", line 520, in load_vqgan_model
    model.init_from_ckpt(checkpoint_path)
  File "D:\art\VQGAN-CLIP\taming-transformers\taming\models\vqgan.py", line 52, in init_from_ckpt
    self.load_state_dict(sd, strict=False)
  File "D:\ana3\envs\vqgan\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VQModel:
        size mismatch for loss.discriminator.main.8.weight: copying a param with shape torch.Size([1, 256, 4, 4]) from checkpoint, the shape in current model is torch.Size([512, 256, 4, 4]).
        size mismatch for quantize.embedding.weight: copying a param with shape torch.Size([16384, 256]) from checkpoint, the shape in current model is torch.Size([1024, 256]).
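
A mismatch like this (16384 vs 1024 codebook entries) usually means the config .yaml and the .ckpt come from two different models. Assuming the script exposes flags matching args.vqgan_config and args.vqgan_checkpoint from the traceback, passing a matching pair explicitly may be worth a try:

python generate.py -p "A painting of an apple in a fruit bowl" --vqgan_config checkpoints/vqgan_imagenet_f16_16384.yaml --vqgan_checkpoint checkpoints/vqgan_imagenet_f16_16384.ckpt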

Error on taming.models

Please consider adding to your setup instructions:

Was receiving an error:
File "generate.py", line 18, in
from taming.models import cond_transformer, vqgan
ModuleNotFoundError: No module named 'taming'

Solution:
pip3 install taming-transformers

Simple zoom option zooms into corner?

Excuse my tech illiteracy if this is obvious. I'm on Windows. I'm trying to combine a zoom with the "storyboard" mode, with multiple sequential text inputs. Currently I only know how to do this with the simple built-in zoom option (as opposed to zoom.sh), but it zooms into the bottom-right corner by, apparently, displacing the entire image up and left 5 pixels per iteration. Is there a solution to this?

Most likely unimportant, but here's what I use:
python generate.py -p "Roses|photo:-1 ^ Sunflowers ^ Daisies ^ Daffodils" -cpe 1500 -zvid -i 6000 -zse 10 -vl 20 -zsc 1.005 -opt Adagrad -lr 0.15 -se 6000 -s 250 250

which CUDA version is required for pytorch here?

I'm getting UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:115.). I do have a GPU, but I'm using CUDA version 8 (it's a shared lab machine).

Is the old CUDA version why I get the above error? Any way to fix this, apart from setting up a brand new system?
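
One quick sanity check, assuming a standard PyTorch install, is to print the CUDA version PyTorch was built against and whether it can see the GPU:

python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"

PyTorch wheels bundle their own CUDA runtime, but they still need a reasonably recent NVIDIA driver; a driver from the CUDA 8 era is most likely too old for the torch builds used here.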

not an issue -

On your YouTube videos you have an Ubuntu desktop showing some heads-up display.
Does it show the GPU? What's its name?

Can't go past iteration 1500

Temporary solution is to pass in a 'cpe' argument that is any value greater than the 'i' argument. As in:

python generate.py -i 2000 -p "vase" -cpe 3000

The error message points to generate.py line 620: the code tries to go to the next story step even when the prompt is a single sentence.
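
For anyone patching locally, the shape of the fix is presumably just a bounds check before advancing. A toy illustration (the variable names are made up, not generate.py's):

phrases = ["vase"]              # a single-sentence prompt yields only one phrase
counter = 0
if counter + 1 < len(phrases):  # only advance when another phrase actually exists
    counter += 1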

No module named 'CLIP'

After following your video—with the conda approach, making the environment, updating it with the .yml and getting torch==1.9.0—I am getting the following error from generate.py:

ModuleNotFoundError: No module named 'CLIP'

I even tried installing the CLIP repo via pip before re-installing torch and everything else, but it didn't work...

I am sure this is a silly issue
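
Judging by the tracebacks elsewhere on this page (paths like VQGAN-CLIP/CLIP/clip/model.py), generate.py expects the OpenAI CLIP repository checked out as a sibling directory rather than a pip-installed clip module, so one thing to verify is that the clone step ran from inside the VQGAN-CLIP directory (this is an inference from the paths, not a confirmed diagnosis):

git clone https://github.com/openai/CLIP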

Can't generate video

I get this error when using -vid

Traceback (most recent call last):
  File "C:\Users\vanceagher\vqgan\generate.py", line 669, in <module>
    p = Popen(['ffmpeg',
  File "C:\Users\vanceagher\anaconda3\envs\vqgan\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\vanceagher\anaconda3\envs\vqgan\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
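
On Windows this error usually means ffmpeg is not on the PATH of the environment running the script. One option worth trying is installing it into the conda environment from conda-forge:

conda install -c conda-forge ffmpeg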

Why do I get different outputs when using the same input on a different repo?

I got an amazing output on the Google Colab notebook and am trying to replicate it at a larger scale running locally on my 3090. For some reason the outputs appear to have a different style from those on Colab (I'm using the same model, prompt, seed and save interval...).

Has something been altered? Is it to do with the optimizer or learning rate? (These can't be specified in the Colab notebook.)

Thanks a lot for this bit of software; it has given me hours of experimenting and fun!

Saving each iteration to create a video

Is there a way I can save each image along the way rather than just the final output, then use ffmpeg to combine the images into an animation? I've got it working on my PC! Just interested in that feature, as I can't get 900x900 on Colab. Thanks!
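
Assuming the frames get saved as sequentially numbered PNGs (for example via a save-every option like the -se flag used in other commands here; the directory and filename pattern below are made up), a standard ffmpeg invocation can stitch them into a video:

ffmpeg -framerate 30 -i steps/%04d.png -c:v libx264 -pix_fmt yuv420p output.mp4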

How to save generated art?

Hi there! I'm completely new to this.
Where are the images saved after generation? Sorry if this question is stupid.

Error message about conda activate

Error message about conda activate:

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash

But if I try to run conda init I get this:

$ conda init
no change     /root/anaconda3/condabin/conda
no change     /root/anaconda3/bin/conda
no change     /root/anaconda3/bin/conda-env
no change     /root/anaconda3/bin/activate
no change     /root/anaconda3/bin/deactivate
no change     /root/anaconda3/etc/profile.d/conda.sh
no change     /root/anaconda3/etc/fish/conf.d/conda.fish
no change     /root/anaconda3/shell/condabin/Conda.psm1
no change     /root/anaconda3/shell/condabin/conda-hook.ps1
no change     /root/anaconda3/lib/python3.8/site-packages/xontrib/conda.xsh
no change     /root/anaconda3/etc/profile.d/conda.csh
no change     /root/.bashrc
No action taken.

And conda activate vqgan still does not work.
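
A common workaround, assuming a bash shell and the Anaconda path shown above, is to source conda's shell hook directly and then activate:

source /root/anaconda3/etc/profile.d/conda.sh
conda activate vqgan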

Fixed

Hi there, just trying to generate video using the -vid argument but getting the following error

_pickle.UnpicklingError: invalid load key, 'm'.

I installed it yesterday on my work machine and it worked just as it should.

Today I tried to install it on my home machine, but I get the following error:

vstil@DESKTOP-R251CM7 MINGW64 ~/VQGAN-CLIP (main)
$ python generate.py -se 1 -p "a cat"
C:\Users\vstil\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\distutils_patch.py:25: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
  warnings.warn(
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips\vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Traceback (most recent call last):
  File "C:\Users\vstil\VQGAN-CLIP\generate.py", line 546, in <module>
    model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
  File "C:\Users\vstil\VQGAN-CLIP\generate.py", line 520, in load_vqgan_model
    model.init_from_ckpt(checkpoint_path)
  File "taming-transformers\taming\models\vqgan.py", line 45, in init_from_ckpt
    sd = torch.load(path, map_location="cpu")["state_dict"]
  File "C:\Users\vstil\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\vstil\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'm'.

I already tried re-downloading the VQGAN-CLIP repo, to no avail...

Any help would be greatly appreciated!
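
An invalid load key often indicates the checkpoint on disk is not actually a torch file, e.g. a partial download or an error page saved under the .ckpt name. A small, self-contained check (the path is an assumption):

import pathlib

p = pathlib.Path("checkpoints/vqgan_imagenet_f16_16384.ckpt")
print(p.stat().st_size, "bytes")   # a valid checkpoint should be hundreds of MB
with p.open("rb") as f:
    print(f.read(16))              # readable ASCII here suggests a bad download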

No such file or directory: 'ffmpeg'

When I run the generate.py script, I get the following error when the video is being made.

Generating video...
Traceback (most recent call last):
  File "/home/ubuntu/VQGAN-CLIP/generate.py", line 581, in <module>
    p = Popen(['ffmpeg',
  File "/home/ubuntu/anaconda3/envs/vqgan/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/ubuntu/anaconda3/envs/vqgan/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'ffmpeg'
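
On Linux the fix is simply to install ffmpeg somewhere on the PATH; assuming Ubuntu, as the paths above suggest, either of these should work:

sudo apt install ffmpeg
conda install -c conda-forge ffmpeg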

memory issues

I've been trying to generate larger resolution images, but no matter what size GPU I use I get a message like the one below, where PyTorch seems to be reserving a massive amount of the available memory. Any advice on how to go about creating larger images?

GPU 0; 31.75 GiB total capacity; 29.72 GiB already allocated; 381.00 MiB free; 29.94 GiB reserved in total by PyTorch
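
The two settings that tend to dominate memory use in VQGAN+CLIP are the output size and the number of CLIP cutouts. Assuming the -s size flag seen in other commands on this page, and a cutout-count flag (its exact name in generate.py is an assumption here), something along these lines reduces the footprint:

python generate.py -p "A painting of an apple in a fruit bowl" -s 512 512 -cuts 16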

problems with video generation

When I try running the example line python generate.py -p "The inside of a sphere" -zvid -i 4500 -zse 20 -vl 10 -zsc 0.97 -opt Adagrad -lr 0.15 -se 4500, I receive an error that states:

  File "C:\Users\user\VQGAN-CLIP\generate.py", line 808, in <module>
    p = Popen(['ffmpeg',
  File "C:\Users\user\anaconda3\envs\vqgan\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\user\anaconda3\envs\vqgan\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

I'm not really sure where to go with this error, and I don't know if this is a problem on my side or something with the program.

Error when running in CPU mode

Bug

I get RuntimeError: "softmax_lastdim_kernel_impl" not implemented for 'Half' when running this against my CPU.

To reproduce

$ python generate.py -p "A painting of an apple in a fruit bowl" -cd cpu

Gives

Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Traceback (most recent call last):
  File "/home/daniel/repos/vqgan-clip/generate.py", line 633, in <module>
    embed = perceptor.encode_text(clip.tokenize(txt).to(device)).float()
  File "/home/daniel/repos/vqgan-clip/CLIP/clip/model.py", line 344, in encode_text
    x = self.transformer(x)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/daniel/repos/vqgan-clip/CLIP/clip/model.py", line 199, in forward
    return self.resblocks(x)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/daniel/repos/vqgan-clip/CLIP/clip/model.py", line 186, in forward
    x = x + self.attention(self.ln_1(x))
  File "/home/daniel/repos/vqgan-clip/CLIP/clip/model.py", line 183, in attention
    return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/modules/activation.py", line 1031, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/functional.py", line 5082, in multi_head_attention_forward
    attn_output, attn_output_weights = _scaled_dot_product_attention(q, k, v, attn_mask, dropout_p)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/functional.py", line 4828, in _scaled_dot_product_attention
    attn = softmax(attn, dim=-1)
  File "/home/daniel/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/nn/functional.py", line 1679, in softmax
    ret = input.softmax(dim)
RuntimeError: "softmax_lastdim_kernel_impl" not implemented for 'Half'

Expected behavior

No error; generate an output image.

Additional notes

  • I followed the setup described in the readme (kudos - it's very thorough!)
  • Image generation using my GPU works fine, i.e. without the -cd cpu parameter
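
One hedged workaround, if the failure really comes from half-precision CLIP weights on the CPU path: force the perceptor to fp32 after loading by adding .float() to the load line quoted in the traceback. This is speculation, not a confirmed fix:

perceptor = clip.load(args.clip_model, jit=False)[0].eval().float().requires_grad_(False).to(device)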

Environment

Collecting environment information...
PyTorch version: 1.9.0+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1 
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.9 (64-bit runtime)
Python platform: Linux-5.4.0-88-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 11.4.120
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080 Ti
Nvidia driver version: 470.57.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] pytorch-lightning==1.4.9
[pip3] pytorch-ranger==0.1.1
[pip3] torch==1.9.0+cu111
[pip3] torch-optimizer==0.1.0
[pip3] torchaudio==0.9.0
[pip3] torchmetrics==0.5.1
[pip3] torchvision==0.10.0+cu111
[conda] numpy                     1.21.2                   pypi_0    pypi
[conda] pytorch-lightning         1.4.9                    pypi_0    pypi
[conda] pytorch-ranger            0.1.1                    pypi_0    pypi
[conda] torch                     1.9.0+cu111              pypi_0    pypi
[conda] torch-optimizer           0.1.0                    pypi_0    pypi
[conda] torchaudio                0.9.0                    pypi_0    pypi
[conda] torchmetrics              0.5.1                    pypi_0    pypi
[conda] torchvision               0.10.0+cu111             pypi_0    pypi

I keep getting a traceback trying to make a video

At line 988: "AttributeError: 'int' object has no attribute 'stdin'"

ffmpeg command failed - check your installation
0%| | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\Caleb\VQGAN-CLIP\generate.py", line 988, in <module>
    im.save(p.stdin, 'PNG')
AttributeError: 'int' object has no attribute 'stdin'

Maybe I'm trying to make a video wrong, but the issue persists even with the provided example of the telephone box.

custom filename and location for video is giving errors

The -o flag works properly for image generation, but there is no specific information available on how to create a video with a custom name. Providing a file name with any extension results in the following error:

ValueError: unknown file extension: .png

On Windows we cannot use the zoom.sh script in the conda prompt, so I am using the command:

python generate.py -p "An apple in a bowl" -zvid -i 2000 -vl 10 -o "output/test.mp4"

Central Park - exploration

Prompts tried (each had a result image attached in the original issue):

  • central park concept art
  • central park photoillustration
  • central park watercolor
  • I think this was "nyc,drone,tiltshift,behance hd"
  • "central park, 35mm film"

Quite impressed by prompting a location and passing in "drone": "vernazza,italy drone"

Requirements

Are there any specific requirements to get this working? ie. Do I need an NVIDIA GPU / CUDA?

attempting to spit out an image in the style of....


I use this:

python generate.py --image_prompts '/home/jp/Desktop/7531b6c8513bbbbdd5913cb396f5f221.png' --prompts "dream" -i 1000

but the results are not quite there.

I'll have a play with the iw parameter and see if I can get a look and feel like the original image.

requires_grad_ is not supported on ScriptModules

Using CUDA 11.2; I built torch from source.

Traceback (most recent call last):
  File "/home/julianallchin/github/VQGAN-CLIP/generate.py", line 548, in <module>
    perceptor = clip.load(args.clip_model, jit=jit)[0].eval().requires_grad_(False).to(device)
  File "/home/julianallchin/anaconda3/envs/vqgan/lib/python3.9/site-packages/torch/jit/_script.py", line 915, in fail
    raise RuntimeError(name + " is not supported on ScriptModules")
RuntimeError: requires_grad_ is not supported on ScriptModules
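
The error means CLIP was loaded as a TorchScript module (jit=True) under this torch build, and ScriptModules don't support requires_grad_. A plausible workaround is to force eager loading, mirroring the load line in the traceback:

perceptor = clip.load(args.clip_model, jit=False)[0].eval().requires_grad_(False).to(device)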

RuntimeError: Error(s) in loading state_dict for VQModel

So I'm trying to be brave and set this up on my Windows 10 machine running Conda, since my Titan RTX GPU is in that box. I was able to install everything without any issues, but when I try to run the example it bails out. Not 100% sure what the error is.

(vqgan) PS C:\Users\stiet\Desktop\Work\AIStuff\VQGAN-CLIP> python generate.py -p "A painting of an apple in a fruit bowl"
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips\vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Traceback (most recent call last):
  File "C:\Users\stiet\Desktop\Work\AIStuff\VQGAN-CLIP\generate.py", line 546, in <module>
    model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
  File "C:\Users\stiet\Desktop\Work\AIStuff\VQGAN-CLIP\generate.py", line 520, in load_vqgan_model
    model.init_from_ckpt(checkpoint_path)
  File "C:\Users\stiet\anaconda3\envs\vqgan\lib\site-packages\taming\models\vqgan.py", line 48, in init_from_ckpt
    self.load_state_dict(sd, strict=False)
  File "C:\Users\stiet\anaconda3\envs\vqgan\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VQModel:
        size mismatch for loss.discriminator.main.8.weight: copying a param with shape torch.Size([1, 256, 4, 4]) from checkpoint, the shape in current model is torch.Size([512, 256, 4, 4]).
        size mismatch for quantize.embedding.weight: copying a param with shape torch.Size([16384, 256]) from checkpoint, the shape in current model is torch.Size([1024, 256]).
(vqgan) PS C:\Users\stiet\Desktop\Work\AIStuff\VQGAN-CLIP> ls


    Directory: C:\Users\stiet\Desktop\Work\AIStuff\VQGAN-CLIP


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         9/30/2021   3:52 PM                checkpoints
d-----         9/30/2021   3:23 PM                CLIP
d-----         9/30/2021   3:19 PM                samples
d-----         9/30/2021   3:54 PM                taming
d-----         9/30/2021   3:23 PM                taming-transformers
-a----         9/30/2021   3:19 PM            190 .gitignore
-a----         9/30/2021   3:19 PM           5277 download_models.sh
-a----         9/30/2021   3:19 PM          42380 generate.py
-a----         9/30/2021   3:19 PM           1095 LICENSE
-a----         9/30/2021   3:19 PM           1592 opt_tester.sh
-a----         9/30/2021   3:19 PM           1474 random.sh
-a----         9/30/2021   3:19 PM          13240 README.md
-a----         9/30/2021   3:19 PM           1187 requirements.txt
-a----         9/30/2021   3:19 PM           1544 video_styler.sh
-a----         9/30/2021   3:19 PM           2376 vqgan.yml
-a----         9/30/2021   3:19 PM           1444 zoom.sh

CUDA out of memory.

How can I fix this?

"CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 2.00 GiB total capacity; 1.13 GiB already allocated; 0 bytes free; 1.16 GiB reserved in total by PyTorch)"

I understand that I need to allocate more memory or change the batch parameters. But in which file should I change that? Or what command should I use? I'm a newbie, btw...
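
With a 2 GiB card the practical lever is the output size, and it is set on the command line rather than in a file. For example, using the -s flag seen in other commands on this page:

python generate.py -p "A painting of an apple in a fruit bowl" -s 256 256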

RuntimeError: requires_grad_ is not supported on ScriptModules

I don't know what happened. I had a working setup, then I was tinkering with Facebook's faiss and gcc, and now I hit this problem.

python generate.py -p "The fashion of tomorrow"
/home/jp/Documents/gitWorkspace/VQGAN-CLIP/CLIP/clip/clip.py:23: UserWarning: PyTorch version 1.7.1 or higher is recommended
  warnings.warn("PyTorch version 1.7.1 or higher is recommended")
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Traceback (most recent call last):
  File "generate.py", line 361, in <module>
    perceptor = clip.load(args.clip_model, jit=jit)[0].eval().requires_grad_(False).to(device)
  File "/home/jp/miniconda3/lib/python3.8/site-packages/torch/jit/_script.py", line 919, in fail
    raise RuntimeError(name + " is not supported on ScriptModules")
RuntimeError: requires_grad_ is not supported on ScriptModules

I'm on 1.10 nightly build of pytorch.

>>> print(torch.__version__)
1.10.0.dev20210715+cu111
>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()

Model Not Loading

What do these lines mean and why aren't they working?

FileNotFoundError                         Traceback (most recent call last)

<ipython-input> in <module>()
      3 #@markdown Once this has been run successfully you only need to run parameters and then the program to execute with new parameters
      4 device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
----> 5 model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
      6 perceptor = clip.load(args.clip_model, jit=False)[0].eval().requires_grad_(False).to(device)
      7

/usr/local/lib/python3.7/dist-packages/omegaconf/omegaconf.py in load(file_)
    181
    182     if isinstance(file_, (str, pathlib.Path)):
--> 183         with io.open(os.path.abspath(file_), "r", encoding="utf-8") as f:
    184             obj = yaml.load(f, Loader=get_yaml_loader())
    185     elif getattr(file_, "read", None):

FileNotFoundError: [Errno 2] No such file or directory: '/content/vqgan_imagenet_f16_16384.yaml'

Problem unidentified by a newbie (me)

Hello, I followed your video (thanks a lot, by the way; it seems like I did not follow it well, actually).
Maybe you'll understand what I can do at this point:

(vqgan) C:\Users\Milaj\github\VQGAN-CLIP>python generate.py -p "A painting of an apple in a fruit bowl"
Traceback (most recent call last):
  File "C:\Users\Milaj\github\VQGAN-CLIP\generate.py", line 466, in <module>
    model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
  File "C:\Users\Milaj\github\VQGAN-CLIP\generate.py", line 436, in load_vqgan_model
    config = OmegaConf.load(config_path)
  File "C:\Users\Milaj\anaconda3\envs\vqgan\lib\site-packages\omegaconf\omegaconf.py", line 183, in load
    with io.open(os.path.abspath(file_), "r", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\Milaj\github\VQGAN-CLIP\checkpoints\vqgan_imagenet_f16_16384.yaml'

Thank you in advance; tell me if you need more info.
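
The traceback just says the config file is missing: checkpoints\vqgan_imagenet_f16_16384.yaml (and its matching .ckpt) must exist before generate.py can run. The repo ships a download_models.sh (visible in the directory listings elsewhere on this page), which is presumably the intended way to fetch them; on Windows it needs a bash-like shell such as Git Bash or WSL:

bash download_models.sh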

About code implementation: feedback on the example

  1. I particularly like this example, which is a great discovery. Can you use code to realize this example? I'm running under Windows, but I can't run zoom.sh.
  2. Is there a way to generate text prompts automatically? I wonder if I can generate them myself, to replace random.sh.
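
Regarding point 1: a zoom video doesn't strictly require zoom.sh; other reports on this page use the built-in -zvid option, which runs from a plain conda prompt on Windows, e.g. (flag values copied from an example elsewhere in these issues, not verified here):

python generate.py -p "The inside of a sphere" -zvid -i 4500 -zse 20 -vl 10 -zsc 0.97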

Improvement: Add cog files

https://github.com/replicate/cog makes it easy to build Docker containers for machine learning. A cog.yaml has to be configured and the interface code written, which looks pretty straightforward. The project could probably also be added here: https://replicate.ai/explore
Anyone who has Docker installed could then run it on their system as easily as executing something like this:

docker run -d -p 5000:5000 r8.im/nerdyrodent/VQGAN-CLIP@sha256:fe8d040a80609ff5643815e28bc3c488faf8870d968f19e045c4d0e043ffae59
curl http://localhost:5000/predict -X POST -F p="A painting of an apple in a fruit bowl"

seed argument on random.sh

When I use '--seed 42' on generate.py it performs as expected, but when using random.sh it doesn't appear to be using seed 42, or at least the print command isn't listing the same value. It doesn't make sense that it's not behaving the same. Any ideas?

requirements.txt

Hey! Would you mind adding a requirements.txt? I'm really just looking for the version #s of the relevant repos that are used here. It should be straightforward to extract from the output of "pip freeze". Thanks in advance!

Tensor is not a torch image

Hi. Thanks for the repo. I was just trying to test it, but I keep running into this:

Traceback (most recent call last):
  File "/home/paperspace/vqgan-clip/generate.py", line 552, in <module>
    train(i)
  File "/home/paperspace/vqgan-clip/generate.py", line 535, in train
    lossAll = ascend_txt()
  File "/home/paperspace/vqgan-clip/generate.py", line 514, in ascend_txt
    iii = perceptor.encode_image(normalize(make_cutouts(out))).float()
  File "/home/paperspace/anaconda3/envs/vqgan/lib/python3.9/site-packages/torchvision/transforms/transforms.py", line 163, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/home/paperspace/anaconda3/envs/vqgan/lib/python3.9/site-packages/torchvision/transforms/functional.py", line 201, in normalize
    raise TypeError('tensor is not a torch image.')
TypeError: tensor is not a torch image.

Any idea how to fix it? Really appreciate any help.

Models mirrors have been removed

First of all, thank you so much for this notebook. It's my favorite version of the VQGAN + CLIP notebooks out there 😊.

As noted by @nerdyrodent in a previous issue, since a couple of days ago, no matter what model you choose to download you'll get the message Could not resolve host: mirror.io.community.

Wikiart checkpoint issue

If I specify the wikiart_16384 checkpoint, the following error occurs:

Traceback (most recent call last):
  File "C:\Development\ml\VQGAN-CLIP\generate.py", line 364, in <module>
    model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
  File "C:\Development\ml\VQGAN-CLIP\generate.py", line 338, in load_vqgan_model
    model.init_from_ckpt(checkpoint_path)
  File "C:\Development\ml\VQGAN-CLIP\taming-transformers\taming\models\vqgan.py", line 52, in init_from_ckpt
    self.load_state_dict(sd, strict=False)
  File "C:\ProgramData\Miniconda3\envs\vqgan\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VQModel:
        size mismatch for loss.discriminator.main.8.weight: copying a param with shape torch.Size([512, 256, 4, 4]) from checkpoint, the shape in current model is torch.Size([1, 256, 4, 4]).

Is there a way to specify the initial model shape?

yaml.scanner.ScannerError: mapping values are not allowed here

(base) PS C:\Users\Alex\vqgan-clip> python generate.py -p "A painting of an apple in a fruit bowl"
Traceback (most recent call last):
  File "generate.py", line 546, in <module>
    model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
  File "generate.py", line 516, in load_vqgan_model
    config = OmegaConf.load(config_path)
  File "C:\Users\Alex\anaconda3\lib\site-packages\omegaconf\omegaconf.py", line 184, in load
    obj = yaml.load(f, Loader=get_yaml_loader())
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\__init__.py", line 114, in load
    return loader.get_single_data()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\composer.py", line 58, in compose_document
    self.get_event()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\parser.py", line 118, in get_event
    self.current_event = self.state()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\parser.py", line 193, in parse_document_end
    token = self.peek_token()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\scanner.py", line 129, in peek_token
    self.fetch_more_tokens()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\scanner.py", line 223, in fetch_more_tokens
    return self.fetch_value()
  File "C:\Users\Alex\anaconda3\lib\site-packages\yaml\scanner.py", line 577, in fetch_value
    raise ScannerError(None, None,
yaml.scanner.ScannerError: mapping values are not allowed here
  in "C:\Users\Alex\vqgan-clip\checkpoints\vqgan_imagenet_f16_16384.yaml", line 43, column 15
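
A ScannerError at a specific line usually means the file isn't valid YAML at all; a failed model download can leave an HTML error page behind under the .yaml name (compare the mirror issue above). Eyeballing the first few hundred characters will tell (the path is copied from the traceback):

python -c "print(open(r'C:\Users\Alex\vqgan-clip\checkpoints\vqgan_imagenet_f16_16384.yaml').read(200))"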

Idea, if we're being extra arty about videos.

Another change I've made for myself is to break every n iterations (after the check-in) and await user input. If I input Y, it reloads the image from disk and reinitialises the optimiser (the same as you do for a zoom video). This way I can "guide" it quite forcefully: if I want a skull with glowing blue eyes, and the blue eyes are not picked up from the init image (or have dissolved into nothing) by the 50th step, I can paint them in. I can also "promote" features in the output by exaggerating their presence.


Since we're reinitialising the optimiser, we can presumably also switch up the prompts 'in the middle' of the run, when loss has 'stabilised'? Depending on how far you want to take this (and I'll be doing my own experimentation) maybe we can draw up a timeline and construct a video based on prompts that change over time.
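
A self-contained toy version of that loop, using plain PyTorch and PIL rather than the repo's internals, to make the mechanics concrete (everything here is illustrative, not generate.py's actual code):

import numpy as np
import torch
from PIL import Image

target = torch.rand(3, 64, 64)                      # stand-in for the CLIP-guided loss
img = torch.rand(3, 64, 64, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)

for i in range(1, 301):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(img, target)
    loss.backward()
    opt.step()
    if i % 100 == 0:                                # the check-in: save, pause, maybe reload
        arr = (img.detach().clamp(0, 1).numpy().transpose(1, 2, 0) * 255).astype(np.uint8)
        Image.fromarray(arr).save("checkin.png")
        if input(f"step {i}: repaint checkin.png, then 'y' to reload: ").lower() == "y":
            arr = np.asarray(Image.open("checkin.png").convert("RGB"), np.float32) / 255
            img = torch.tensor(arr.transpose(2, 0, 1), requires_grad=True)
            opt = torch.optim.Adam([img], lr=0.05)  # reinitialise the optimiser, as for zoom video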

Accidental multi-GPU?

I have a cut of this code from a week or two ago.

Funnily enough, I also added the option to run it on another GPU. When I choose cuda:1, though, I get 2GB allocated on cuda:0, although that device is not specified anywhere in generate.py. Combined with disabling ECC (nvidia-smi -i 1 -e 0) this is fine, because I can get over 912 kibipixels (1280x720 or 1488x624), but it would be good to understand the what, why and how.
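
For what it's worth, the stray 2GB on cuda:0 is typically a CUDA context created on the default device by some initialisation call. Hiding the other GPU from the process avoids it entirely; assuming the -cd device flag used elsewhere in these issues:

CUDA_VISIBLE_DEVICES=1 python generate.py -p "example" -cd cuda:0

With the mask in place, cuda:0 inside the process is physical GPU 1.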
