
vid2vid-zero's People

Contributors

encounter1997, felix-ky, wxinlong, zideliu


vid2vid-zero's Issues

DDIM inversion and DPM inversion?

Excellent work! I have a question: is DDIM inversion necessary when diffusing the video data? I tried DPM inversion and got seriously wrong results. Which diffusion samplers are available here?
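
For readers hitting the same question, a minimal sketch of what deterministic DDIM inversion does may help explain why other samplers behave differently: the same DDIM update rule is simply run in reverse, so the forward and backward trajectories match step for step. The sketch below is illustrative only (the eps_model callable stands in for the conditioned UNet, and the dummy usage at the end is just to show shapes; this is not the repo's actual inversion code):

import torch
from diffusers import DDIMScheduler

@torch.no_grad()
def ddim_invert(latents, eps_model, scheduler, num_steps=50):
    """Run the DDIM update backwards: clean latents x_0 -> noisy x_T."""
    scheduler.set_timesteps(num_steps)
    step = scheduler.config.num_train_timesteps // num_steps
    x = latents
    for t in scheduler.timesteps.flip(0):            # small t -> large t
        prev_t = t - step                            # the less-noisy level we are leaving
        alpha_prev = (scheduler.alphas_cumprod[prev_t]
                      if prev_t >= 0 else scheduler.final_alpha_cumprod)
        alpha_t = scheduler.alphas_cumprod[t]
        eps = eps_model(x, t)                        # noise prediction (the UNet in practice)
        # predict x_0 from the current sample, then re-noise it to level t
        x0_pred = (x - (1 - alpha_prev).sqrt() * eps) / alpha_prev.sqrt()
        x = alpha_t.sqrt() * x0_pred + (1 - alpha_t).sqrt() * eps
    return x

# toy usage with a dummy predictor, just to show the shapes
sched = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
x0 = torch.randn(1, 4, 8, 64, 64)                    # (batch, channels, frames, h, w)
xT = ddim_invert(x0, lambda x, t: torch.zeros_like(x), sched, num_steps=10)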

Code for Spatial Regularization

Thanks for the good work. In Algorithm 1 there is an attention-mask computation step, and inference is then based on that attention mask, but I cannot find the code that computes the attention mask. Could you help locate the corresponding code? Thanks!
(screenshot of Algorithm 1 attached)
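
Not an official answer, but for anyone looking for what such a step usually looks like: in Prompt-to-Prompt-style editing, a spatial mask is typically obtained by averaging the cross-attention probabilities for a chosen token over heads and layers, upsampling them to the latent resolution, and thresholding. Below is a minimal sketch under those assumptions (attn_probs, token_idx, and the 0.5 threshold are illustrative names and values, not the repo's actual code):

import torch
import torch.nn.functional as F

def attention_mask_from_cross_attn(attn_probs, token_idx, latent_hw, threshold=0.5):
    """attn_probs: list of cross-attention maps, each (heads, q_len, text_len),
    where q_len = h*w of that layer's feature map. Returns a binary (H, W) mask."""
    maps = []
    for probs in attn_probs:
        heads, q_len, _ = probs.shape
        res = int(q_len ** 0.5)                      # assume square feature maps
        m = probs[:, :, token_idx].mean(0)           # average over heads -> (q_len,)
        m = m.reshape(1, 1, res, res)
        m = F.interpolate(m, size=latent_hw, mode="bilinear", align_corners=False)
        maps.append(m)
    avg = torch.stack(maps).mean(0)[0, 0]            # average over layers -> (H, W)
    avg = (avg - avg.min()) / (avg.max() - avg.min() + 1e-8)
    return (avg > threshold).float()

# illustrative usage with random attention maps at two resolutions
probs_16 = torch.rand(8, 16 * 16, 77).softmax(-1)
probs_32 = torch.rand(8, 32 * 32, 77).softmax(-1)
mask = attention_mask_from_cross_attn([probs_16, probs_32], token_idx=5, latent_hw=(64, 64))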

data

I found that child-riding.mp4 is all green. I hope you can update the data folder.
By the way, this is really great work. Congrats!

ValueError: torch.cuda.is_available() should be True but is False.

The result of torch.cuda.is_available() is True in my environment:

import torch
print(torch.cuda.is_available())
# output: True

So what is the reason for the error "ValueError: torch.cuda.is_available() should be True but is False."?

accelerate launch test_vid2vid_zero.py --config configs/car-moving.yaml
D:\software\anaconda3\Lib\site-packages\transformers\utils\generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
The following values were not passed to accelerate launch and had defaults used instead:
--num_processes was set to a value of 0
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
D:\software\anaconda3\Lib\site-packages\transformers\utils\generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.2.0+cu121 with CUDA 1201 (you have 2.2.0+cpu)
Python 3.11.7 (you have 3.11.5)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
D:\software\anaconda3\Lib\site-packages\transformers\utils\generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
03/13/2024 21:03:44 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu

Mixed precision type: no

The config attributes {'scaling_factor': 0.18215} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
{'norm_num_groups'} was not found in config. Values will be initialized to default values.
{'num_class_embeds', 'dual_cross_attention', 'upcast_attention', 'class_embed_type', 'only_cross_attention', 'resnet_time_scale_shift', 'use_linear_projection', 'mid_block_type'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
File "D:\笔记\毕设\vid2vid-zero-main\test_vid2vid_zero.py", line 269, in
main(**OmegaConf.load(args.config))
File "D:\笔记\毕设\vid2vid-zero-main\test_vid2vid_zero.py", line 136, in main
unet.enable_xformers_memory_efficient_attention()
File "D:\software\anaconda3\Lib\site-packages\diffusers\modeling_utils.py", line 215, in enable_xformers_memory_efficient_attention
self.set_use_memory_efficient_attention_xformers(True)
File "D:\software\anaconda3\Lib\site-packages\diffusers\modeling_utils.py", line 203, in set_use_memory_efficient_attention_xformers
fn_recursive_set_mem_eff(module)
File "D:\software\anaconda3\Lib\site-packages\diffusers\modeling_utils.py", line 199, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "D:\software\anaconda3\Lib\site-packages\diffusers\modeling_utils.py", line 199, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "D:\software\anaconda3\Lib\site-packages\diffusers\modeling_utils.py", line 196, in fn_recursive_set_mem_eff
module.set_use_memory_efficient_attention_xformers(valid)
File "D:\笔记\毕设\vid2vid-zero-main\vid2vid_zero\models\attention_2d.py", line 235, in set_use_memory_efficient_attention_xformers
raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in run_code
File "D:\software\anaconda3\Scripts\accelerate.exe_main
.py", line 7, in
File "D:\software\anaconda3\Lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "D:\software\anaconda3\Lib\site-packages\accelerate\commands\launch.py", line 1023, in launch_command
simple_launcher(args)
File "D:\software\anaconda3\Lib\site-packages\accelerate\commands\launch.py", line 643, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\software\anaconda3\python.exe', 'test_vid2vid_zero.py', '--config', 'configs/car-moving.yaml']' returned non-zero exit status 1.
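
Judging from the log above ("you have 2.2.0+cpu", "Device: cpu"), the interpreter that accelerate launches appears to contain a CPU-only PyTorch build, which would explain the mismatch with the other environment where torch.cuda.is_available() returns True. A quick diagnostic one could run with that same interpreter (just a sketch, not part of the repo):

import torch
print(torch.__version__)          # "2.2.0+cpu" indicates a CPU-only build
print(torch.version.cuda)         # None on a CPU-only build
print(torch.cuda.is_available())  # False without a CUDA build and driver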

Cannot modify batch size and video length

Experience

I wanted to process a video of 11 frames, so I modified the config.

...
input_data:
  ...
  n_sample_frames: 11
...
validation_data:
  ...
  video_length: 11
...

But when I ran the script, I got "CUDA out of memory". I tried to modify the code by splitting the data into batches manually (in test_vid2vid_zero.py):

bs = 8  # number of frames per chunk
for step, batch in enumerate(input_dataloader):
    all_pixel_values = batch["pixel_values"]
    n_batch = all_pixel_values.shape[1]  # total number of frames
    samples = []
    # process the frames in chunks of at most `bs`
    for i in range((n_batch - 1) // bs + 1):
        s_id = i * bs
        e_id = min((i + 1) * bs, n_batch)
        pixel_values = all_pixel_values[:, s_id:e_id].to(weight_dtype).to('cuda:0')
        ......

After this modification I still got errors indicating that the shapes of some tensors were not aligned. I found that this is because, when validation_data is passed to validation_pipeline,

                sample = validation_pipeline(
                    prompts,
                    generator=generator,
                    latents=ddim_inv_latent,
                    uncond_embeddings=uncond_embeddings,
                    **validation_data).images

video_length is also passed to this function. To allow batch processing, I had to pop video_length from validation_data and pass the current batch size to the function instead. I did not want to modify the code any further.
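
For reference, the adaptation described above looks roughly like this (a sketch only; it assumes the latents are laid out as (batch, channels, frames, height, width) and that s_id/e_id come from the chunking loop shown earlier):

validation_kwargs = dict(validation_data)
validation_kwargs.pop("video_length", None)     # drop the fixed length from the config
sample = validation_pipeline(
    prompts,
    generator=generator,
    latents=ddim_inv_latent[:, :, s_id:e_id],   # only the frames of this chunk
    uncond_embeddings=uncond_embeddings,
    video_length=e_id - s_id,                   # pass the current chunk size instead
    **validation_kwargs).images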

Summary

Currently, users cannot set the batch size for the test data. All frames are processed in a single batch, which may lead to "CUDA out of memory". I think the Dataset class should be refactored so that the DataLoader manages the batching.
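
To illustrate the suggestion (a sketch only; the real dataset also handles prompts, resizing, and normalization), a frame-level Dataset would let the DataLoader control how many frames go through the model at once:

import torch
from torch.utils.data import Dataset, DataLoader

class FrameDataset(Dataset):
    """Yields one frame at a time so that DataLoader controls the batch size."""
    def __init__(self, frames):                 # frames: (num_frames, c, h, w) tensor
        self.frames = frames
    def __len__(self):
        return self.frames.shape[0]
    def __getitem__(self, idx):
        return {"pixel_values": self.frames[idx]}

frames = torch.randn(11, 3, 512, 512)           # e.g. an 11-frame clip
loader = DataLoader(FrameDataset(frames), batch_size=8, shuffle=False)
for batch in loader:
    print(batch["pixel_values"].shape)          # (8, 3, 512, 512) then (3, 3, 512, 512)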
