
propainter's Introduction

Hi there 👋

  • 👨🏼‍💻 I am a Ph.D. student at MMLab@NTU, Nanyang Technological University (NTU)
  • 🔭 I’m currently working on image/video restoration, enhancement, and editing ...
  • 🚀 Most of my projects are open-sourced at GitHub
  • 🏠 How to reach me: my homepage
  • 📖 Check my publications: google scholar

propainter's People

Contributors

eltociear, luo-yihang, sczhou


propainter's Issues

Still some artifacts after removal

In the demo video there are still artifacts/ghosting after the object is removed. Is there a way to fix this?

Content Misuse for Watermark Removal

What does this mean?
2023.09.24: We remove the watermark removal demo officially to prevent the misuse of our work for unethical purposes.

The video is still available on YouTube and features one of our clips from Shutterstock, demoing watermark removal.
It has also been promoted on x.com with similar content. Simply stating that the removal demo has been deleted isn't enough; all videos showing and promoting watermark removal have to be taken down immediately.

RuntimeError: CUDA out of memory

I encountered an error while running:
RuntimeError: CUDA out of memory. Tried to allocate 1.20 GiB (GPU 0; 12.00 GiB total capacity; 21.33 GiB already allocated; 0 bytes free; 25.16 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I'm using a dedicated graphics card with 12 GB of VRAM and 32 GB of system RAM.
My max_split_size_mb is set to 512, and the batch size is set to 4. When running python inference_propainter.py --video .\inputs\video_completion\my.mp4 --mask .\inputs\video_completion\test_2.png --height 720 --width 1080 --neighbor_length 8 --ref_stride 8 --fp16, I encountered an error.
How can I solve the GPU memory issue without reducing the video size or resolution? A longer runtime would also be acceptable.
Thanks!
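For reference, a lower-memory variant of the same command, using only flags that already appear in this thread (the values below are illustrative assumptions, not recommendations from the authors): fewer local neighbors, a larger reference stride, shorter sub-videos, and fp16 inference all shrink the per-step memory footprint.

python inference_propainter.py --video .\inputs\video_completion\my.mp4 --mask .\inputs\video_completion\test_2.png --height 720 --width 1080 --neighbor_length 4 --ref_stride 16 --subvideo_length 40 --fp16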

High GPU usage

Here are the arguments I set. I have a 10 GB GPU, and a 29-second video (804 frames) already goes OOM. Could you give me suggestions on how to reduce GPU memory usage further? I already set these parameters after reading the memory-efficient inference guidelines in the repo (an equivalent single command line is sketched after the list).
parser.add_argument(
    "--resize_ratio", type=float, default=1.0, help='Resize scale for processing video.')
parser.add_argument(
    '--height', type=int, default=240, help='Height of the processing video.')
parser.add_argument(
    '--width', type=int, default=432, help='Width of the processing video.')
parser.add_argument(
    '--mask_dilation', type=int, default=4, help='Mask dilation for video and flow masking.')
parser.add_argument(
    "--ref_stride", type=int, default=20, help='Stride of global reference frames.')
parser.add_argument(
    "--neighbor_length", type=int, default=5, help='Length of local neighboring frames.')
parser.add_argument(
    "--subvideo_length", type=int, default=10, help='Length of sub-video for long video inference.')

Create Conda Environment and Install Dependencies

When I run conda env create -f environment.yaml, I get:
ninja: build stopped: subcommand failed.
RuntimeError: Error compiling objects for extension
CondaEnvException: Pip failed
How can I solve this problem? Thank you!

How to run inference on my own data?

1. In the example video running_car, you are using a single image as the mask. In that case, is the video completion done using RAFT optical flow? Am I right?

I could see that ProPainter could accomplish:

  1. Video Completion
  2. Object Removal
  3. Video Restoration
  4. Watermark and Logo Removal

In the inference code, I could see only two modes: video_inpainting, video_outpainting

2. What is the difference between these two modes?

3. How can I perform Object Removal with my own video data?

Thanks to the team and NTU for such a wonderful project!

Kudos 🥇
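Regarding question 3 above, a minimal object-removal invocation mirroring the bmx-trees example that appears later in this thread, with the paths swapped for your own data (both paths below are placeholders; per the running_car example, the mask may also be a single static image instead of a folder of per-frame masks):

python inference_propainter.py --video path/to/my_frames_or_video --mask path/to/my_masks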

Maximize the Impact of Your Model with Gradio demo on Hugging Face Hub

Hi 🤗!
Very cool work! It would be nice to create a research demo using Gradio on the Hugging Face Hub!
Some of the benefits of sharing your models through the Hub would be:

  • Wider reach of your work to the ecosystem
  • Seamless integration with popular libraries and frameworks, enhancing usability
  • Real-time feedback and collaboration opportunities with a global community of researchers and developers

Here is a step-by-step guide explaining the process, in case you're interested. 😊 And here are the docs on Community GPU Grants.

I get an installation error for pip install mmcv-full

The error message is as follows:
Pip subprocess error:
ERROR: Command errored out with exit status 1:
command: /home/zhenhuaai/.conda/envs/propainter/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-jalet417/mmcv-full_34951aab882f4e63a328b1c0c23cb237/setup.py'"'"'; file='"'"'/tmp/pip-install-jalet417/mmcv-full_34951aab882f4e63a328b1c0c23cb237/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-x04zc70y
cwd: /tmp/pip-install-jalet417/mmcv-full_34951aab882f4e63a328b1c0c23cb237/
Complete output (13 lines):
Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-install-jalet417/mmcv-full_34951aab882f4e63a328b1c0c23cb237/setup.py", line 403, in
ext_modules=get_extensions(),
File "/tmp/pip-install-jalet417/mmcv-full_34951aab882f4e63a328b1c0c23cb237/setup.py", line 313, in get_extensions
extra_compile_args=extra_compile_args)
File "/home/zhenhuaai/.conda/envs/propainter/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 779, in CUDAExtension
library_dirs += library_paths(cuda=True)
File "/home/zhenhuaai/.conda/envs/propainter/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 869, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
File "/home/zhenhuaai/.conda/envs/propainter/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1783, in _join_cuda_home
raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/02/39/ef8b2c52e73a90df6cbb1e6fbd5eb374ee6d0a1888ed2e758a98098b479c/mmcv-full-1.4.8.tar.gz#sha256=329a68d80367901e68c1a2445beb09a72ed53700d7226e1199bbf5d12e91d506 (from https://pypi.org/simple/mmcv-full/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement mmcv-full==1.4.8 (from versions: 1.0rc0, 1.0rc2, 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 1.2.4, 1.2.5, 1.2.6, 1.2.7, 1.3.0, 1.3.1, 1.3.3, 1.3.4, 1.3.5, 1.3.6, 1.3.7, 1.3.8, 1.3.9, 1.3.10, 1.3.11, 1.3.12, 1.3.13, 1.3.14, 1.3.15, 1.3.16, 1.3.17, 1.3.18, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.7, 1.4.8, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1)
ERROR: No matching distribution found for mmcv-full==1.4.8

failed

CondaEnvException: Pip failed
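For what it's worth, the traceback above already names the root cause: CUDA_HOME is unset, so pip cannot compile the CUDA extensions in mmcv-full. Pointing it at the local CUDA toolkit root before re-running the environment creation is what the error message itself asks for (the path below is only a typical install location, an assumption):

export CUDA_HOME=/usr/local/cuda
conda env create -f environment.yaml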

Wrong setup: environment.yml vs environment.yaml

  • conda env create -f environment.yml
  • conda env create -f environment.yaml

Why is there always flickering?

Hey,

after processing, my result flickered the whole time.
Did I forget to set something?
(I was using SAM and XMem to get the masks.)

My original video:

goddess.mp4

Result:

inpaint_out.mp4

License

Hey,
whom can I contact regarding licensing issues?

I tested fixed-position watermark removal locally, but the results seem a bit worse than yours. Are there any tricks?

Very impressive work! So I tested it locally and found that, on the same type of video, my results look somewhat worse than your demo.
The source video hild-runs-barefoot-on-grass-park-joyful comes from Shutterstock, the same as your demo. The downloaded video is WebM by default, and I converted it to MP4 (libx264) with FFmpeg.

child-runs-barefoot-on-grass-in-park-joyful-kid-running-with-dog-healthy-active-lifestyle-of.mp4

I used your mask directly, since the video dimensions are identical.

However, compared with your demo, the result looks much worse; there are clearly visible blurry patches in the masked region.

The result video:

inpaint_out.mp4

Is there any way to improve this?

ValueError: Unknown CUDA arch (8.6) or GPU not supported

  File "/root/miniconda3/envs/propainter/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/root/miniconda3/envs/propainter/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "/root/miniconda3/envs/propainter/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/root/miniconda3/envs/propainter/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 464, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/root/miniconda3/envs/propainter/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 372, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags))
  File "/root/miniconda3/envs/propainter/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1414, in _get_cuda_arch_flags
    raise ValueError("Unknown CUDA arch ({}) or GPU not supported".format(arch))
ValueError: Unknown CUDA arch (8.6) or GPU not supported
----------------------------------------

ERROR: Command errored out with exit status 1: /root/miniconda3/envs/propainter/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-62dm2xrk/mmcv-full_9023a3a25f6e4603a11a13ceb418fa40/setup.py'"'"'; file='"'"'/tmp/pip-install-62dm2xrk/mmcv-full_9023a3a25f6e4603a11a13ceb418fa40/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-weiiqwr9/install-record.txt --single-version-externally-managed --compile --install-headers /root/miniconda3/envs/propainter/include/python3.7m/mmcv-full Check the logs for full command output.
failed

CondaEnvException: Pip failed

Environment:
Image:
PyTorch 1.11.0
Python 3.8 (Ubuntu 20.04)
CUDA 11.3
GPU:
A40 (48 GB) * 1

Also, how can I run this on two GPUs?
Thank you!
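A hedged note on the traceback above: "Unknown CUDA arch (8.6)" usually means the installed PyTorch build predates Ampere (sm_86) support, so its extension builder rejects the detected compute capability. Installing a PyTorch wheel built for CUDA 11.x typically resolves it; alternatively, TORCH_CUDA_ARCH_LIST can be pinned to an architecture the installed build does recognize before compiling (the value below is an assumption, not taken from this thread):

TORCH_CUDA_ARCH_LIST="8.0" pip install mmcv-full==1.4.8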

A bug or using the wrong package (training)

Hi Mr. Zhou
Thanks for answering my two questions. Got another one.
I came across an issue at line #291, "if flows_f == 'None' or flows_b == 'None':", in trainer_flow_w_edges.py when I tried to run the training code.
The issue is that "flows_f" and "flows_b" are actually lists, so flows_f == 'None' always returns False, even when "load_flow" is set to "0" in the json file.
The training started running after changing that line to "if flows_f[0] == 'None' or flows_b[0] == 'None':"

I'm not sure whether this issue is related to the Python packages I'm using.
Also, are there any potential issues with the change "if flows_f[0] == 'None' or flows_b[0] == 'None':"?

Thanks!
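A minimal standalone sketch of the guard described above, assuming flows_f and flows_b are per-frame lists in which missing flows are stored as the string 'None' (the names follow the issue; this is not the project's code, just a length-agnostic version of the reporter's fix):

def flows_are_missing(flows_f, flows_b):
    # True if any entry in either list is the 'None' placeholder used when load_flow is disabled.
    return any(f == 'None' for f in flows_f) or any(b == 'None' for b in flows_b)

print(flows_are_missing(['None'], ['None']))  # True  -> flows need to be recomputed
print(flows_are_missing([0.1], [0.2]))        # False -> precomputed flows were loaded (values are stand-ins)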

DefaultCPUAllocator: can't allocate memory:

(propainter) [[email protected] ProPainter]# python inference_propainter.py --video ../test-videos/4.\ Ben\'s\ Robot\ Gets\ Dressed.mp4 --mask ../test-videos/remove_subtitle.jpg 
Traceback (most recent call last):
  File "inference_propainter.py", line 232, in <module>
    frames = to_tensors()(frames).unsqueeze(0) * 2 - 1    
  File "/root/miniconda3/envs/propainter/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
    img = t(img)
  File "/opt/vipkid/pro-painter/ProPainter/core/utils.py", line 169, in __call__
    img = img.float().div(255) if self.div else img.float()
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 10230727680 bytes. Error code 12 (Cannot allocate memory)

I have two questions. First, can the above exception be avoided by setting parameters? Second, this exception appears to come from CPU memory allocation; can CUDA be used for this computation instead?
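As a back-of-envelope sketch (assuming float32 RGB storage, consistent with the .float() call in the traceback): the failing line converts every frame of the clip into one tensor up front, so the required CPU memory grows linearly with frame count and resolution. Lowering the processing resolution or trimming the clip shrinks this allocation.

def full_clip_bytes(num_frames, height, width, channels=3, bytes_per_value=4):
    # Estimated size of the whole clip held as a single float32 RGB tensor.
    return num_frames * height * width * channels * bytes_per_value

print(full_clip_bytes(60 * 25, 1080, 1920) / 1e9, "GB")  # a 1-minute 1080p clip at 25 fps: ~37 GB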

--set_size still reports an error

python inference_propainter.py --video inputs/v/video3.mp4 --mask inputs/v/video3.png --set_size --height 480 --width 852

return torch.grid_sampler(input, grid, mode_enum, padding_mode_enum, align_corners)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

I changed the video to the default resolution and the error disappeared.

Not using GPU?

CPU usage is at 80% and GPU at 0%, with 1.2 GB of VRAM in use while running inference_propainter.py; it takes about 20-25 minutes for a 1-second video at 720p.

Inference is very slow, seems no GPU usage

Thanks for your awesome work!

I have deployed the project on my machine, which has an A10 GPU (20 GB of memory).

But I found the inference time is very long (it took about 1.5 h to run the video completion on the running_car demo), and I also noticed that no GPU was used while running the demo.

May I ask why this happened? Did I do something wrong?

Thanks.
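A quick environment sanity check (standard PyTorch calls, not code from this repository): if torch.cuda.is_available() prints False, the environment most likely has a CPU-only PyTorch wheel or a driver/CUDA mismatch, which would also explain the 80% CPU / 0% GPU pattern reported in the previous issue.

import torch

print(torch.__version__)          # a '+cpu' suffix indicates a CPU-only build
print(torch.version.cuda)         # CUDA version the wheel was built against (None for CPU-only builds)
print(torch.cuda.is_available())  # must be True for inference to run on the GPU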

crash when running " python train.py -c configs/train_propainter.json"

Hi Mr. Zhou,
One more training related question, sorry.

The training crashes when I try "python train.py -c configs/train_propainter.json". Here is what I did:

  1. Update the "video_root" path in the json file.
  2. Point to the downloaded flow completion model, i.e. replace
     self.fix_flow_complete = RecurrentFlowCompleteNet('/mnt/lustre/sczhou/VQGANs/CodeMOVI/experiments_model/recurrent_flow_completion_v5_train_flowcomp_v5/gen_760000.pth')
     with
     self.fix_flow_complete = RecurrentFlowCompleteNet('**/weights/recurrent_flow_completion.pth')
  3. Run the training.

But I see this:
    File "/Documents/MyCode/Inpainting/ProPainter/model/propainter.py", line 354, in forward
    _, _, local_feat, _ = self.feat_prop_module(local_feat, ds_flows_f, ds_flows_b, prop_mask_in, interpolation)
    File "
    /anaconda3/envs/propainter3_8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(args, **kwargs)
    File "
    /Documents/MyCode/Inpainting/ProPainter/model/propainter.py", line 183, in forward
    outputs = self.fuse(torch.cat([outputs_b, outputs_f, mask_in], dim=1)) + x.view(-1, c, h, w)
    RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

I haven't had the chance to trace the code yet; just in case, it might be a super easy fix for you :-)
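A minimal standalone illustration (not the project's code) of the failure mode named in the error: .view() requires a contiguous memory layout, whereas .reshape() falls back to a copy when the layout does not permit a view, which is exactly the substitution the message suggests for x.view(-1, c, h, w).

import torch

x = torch.randn(2, 3, 4, 5).permute(0, 2, 1, 3)  # permuting makes the tensor non-contiguous
try:
    x.view(-1, 3, 5)             # fails: view cannot reinterpret this layout
except RuntimeError as err:
    print("view failed:", err)
y = x.reshape(-1, 3, 5)          # works: reshape copies the data when necessary
print(y.shape)                   # torch.Size([8, 3, 5])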

Could not run 'torchvision::deform_conv2d' with arguments from the 'CUDA' backend

NotImplementedError: Could not run 'torchvision::deform_conv2d' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::deform_conv2d' is only available for these backends: [CPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
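A hedged first check (an assumption about the likely cause, not confirmed in this issue): this error typically appears when torchvision is a CPU-only build or does not match the installed torch's CUDA version, so the CUDA kernel for deform_conv2d was never registered. Printing the installed versions narrows it down quickly.

import torch
import torchvision

print(torch.__version__, torch.version.cuda)  # torch build and the CUDA version it was compiled for
print(torchvision.__version__)                # should be the matching CUDA-enabled torchvision build
print(torch.cuda.is_available())              # True is required for the CUDA deform_conv2d kernel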

Why configs/train_e2fgvi.json?

Awesome algorithm! I tested the inference code and it works great!

I have a question on the train.py.
Does your training code need anything from E2FGVI code?
parser = argparse.ArgumentParser(description='E2FGVI')
parser.add_argument('-c',
                    '--config',
                    default='configs/train_e2fgvi.json',
                    type=str)

Is it possible to make the size flexible?

Hey everyone,

I have utilized --size 228 512 to achieve the right dimensions, but an error occurs when I begin processing.

RuntimeError: grid_sampler(): expected grid and input to have the same batch size, but received input with sizes [109504, 1, 64, 29] and grid with sizes [105728, 9, 9, 2]

I'm not sure about the specific rule that should be followed here.

thx
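A hedged illustration only: size-mismatch errors like the two reported above usually mean the requested height/width do not align with the model's internal window or stride partitioning. The exact alignment ProPainter requires is not stated in this thread; the multiple-of-8 value below is an assumption, shown merely as a way to snap a target size onto a coarser grid before passing it in.

def snap_down(value, multiple=8):
    # Round a requested dimension down to the nearest multiple (the alignment value is assumed).
    return max(multiple, (value // multiple) * multiple)

print(snap_down(228), snap_down(512))  # 224 512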

I'd like to know how to use it.

Excuse me, how can I get started? I don't know what I should do... (Could you please walk me through it?)

Metrics on the DAVIS dataset differ from the paper

  • Commands run:

python scripts/compute_flow.py -i datasets/davis/JPEGImages_432_240 -o datasets/davis/Flows_flo
python scripts/evaluate_propainter.py --dataset davis --video_root datasets/davis/JPEGImages_432_240 --mask_root datasets/davis/test_masks --load_flow True --flow_root datasets/davis/Flows_flo

The result is:
PSNR/SSIM/VFID 34.17/0.9771/0.099 | Time 0.1105
This is noticeably worse than the results reported in the paper; both the images and the masks were downloaded following the README.

Remove shadow effects

@sczhou I removed objects from a video successfully, but it is not removing the objects' shadows. Is there any suggestion on how to make this work better?

Can the video resolution not exceed 432*240?

I'd like to ask: is it not possible to remove watermarks or objects from a 1080p video? On a side note, how can I create the mask if I don't know how to use Photoshop? Could someone let me know?

ModuleNotFoundError: No module named 'cv2'

D:\test\propainter\ProPainter>python inference_propainter.py --video inputs/object_removal/bmx-trees --mask inputs/object_removal/bmx-trees_mask
Traceback (most recent call last):
File "D:\test\propainter\ProPainter\inference_propainter.py", line 3, in
import cv2
ModuleNotFoundError: No module named 'cv2'
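For reference, the cv2 module is provided by the opencv-python package, so installing it into the active environment resolves this import error:

pip install opencv-python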

Object Extraction, Not Removal?

I haven't used this amazing model/tool yet, but I have a question about a task capability. When using ProPainter for Object Removal, does it also output the removed object? I would love to use it to make After Effects rotoscoping much easier and with cleaner cuts, by first extracting the object with ProPainter and then using After Effects' rotoscope to further remove the background.

Thanks for your hard work and for releasing this model!
