all-in-one-deflicker's Introduction

Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Chenyang Lei*, Xuanchi Ren*, Zhaoxiang Zhang and Qifeng Chen
CVPR 2023
* indicates equal contribution

[Paper] [ArXiv] [Project Website]




News!

  • Mar 14, 2023: We are working on supporting segmentation masks. Stay tuned!
  • Mar 12, 2023: Inference code and paper are released! The collected dataset will be released soon.
  • Mar 1, 2023: The paper will be public in one week.
  • Feb 28, 2023: Our paper has been accepted to CVPR 2023. Code will be released in two weeks.

Environment & Dependency

We provide an environment with Python 3.10 and PyTorch 1.12 with CUDA 11. If you need PyTorch 1.6 with CUDA 10, please check this env file.

Install environment:

conda env create -f environment.yml 
conda activate deflicker
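
After activating the environment, a quick check like the one below can confirm which Python, PyTorch, and CUDA versions are actually in use. This is only a sketch and not part of the repository: the helper name describe_environment is hypothetical, and it reports None for PyTorch if the package is not installed.

```python
import importlib.util
import sys


def describe_environment():
    """Collect the Python, PyTorch, and CUDA versions seen by this interpreter.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    info = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    # Only import torch if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is None:
        return info
    import torch

    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # CUDA version torch was built with
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

With the provided environment, the output would be expected to show Python 3.10, a 1.12.x PyTorch build, and a CUDA 11 build string.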

Download the pretrained checkpoints:

git clone https://github.com/ChenyangLEI/cvpr2023_deflicker_public_folder
mv cvpr2023_deflicker_public_folder/pretrained_weights ./ && rm -r cvpr2023_deflicker_public_folder

Inference

Put your video or image folder under data/test. For example:

export PYTHONPATH=$PWD
python test.py --video_name data/test/Winter_Scenes_in_Holland.mp4 # for video input
python test.py --video_frame_folder data/test/Winter_Scenes_in_Holland # for image folder input

The result is written to results/$YOUR_DATA_NAME/final/output.mp4.
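
If you have many inputs, the commands above can be assembled programmatically. The sketch below only builds the command line and the expected output path; it does not run anything. It makes two assumptions that are not confirmed by the repository: $YOUR_DATA_NAME is taken to be the input's file or folder name without its extension, and the list of video suffixes is illustrative. The helper name build_test_command is hypothetical.

```python
from pathlib import Path

# Assumed set of video extensions, for illustration only.
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi", ".mkv"}


def build_test_command(input_path):
    """Return (command, expected_output) for one video file or frame folder.

    Hypothetical helper; the flags are the two shown in the example commands.
    """
    path = Path(input_path)
    if path.suffix.lower() in VIDEO_SUFFIXES:
        flag = "--video_name"  # single video file
    else:
        flag = "--video_frame_folder"  # folder of image frames
    command = ["python", "test.py", flag, path.as_posix()]
    # Assumption: $YOUR_DATA_NAME is the input name without its extension.
    expected_output = Path("results") / path.stem / "final" / "output.mp4"
    return command, expected_output


if __name__ == "__main__":
    cmd, out = build_test_command("data/test/Winter_Scenes_in_Holland.mp4")
    print(" ".join(cmd))
    print(out.as_posix())
```

To execute a command built this way, pass it to subprocess.run from the repository root with PYTHONPATH set to that directory, mirroring the export line above.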

Note: our inference code takes only about 3000 MB of GPU memory.

[Figure: all evaluated types of flickering videos]

Suggestions for Choosing the Hyperparameters

If you want to find the best setting for getting an atlas for deflickering, we provide a reference guide here:

  1. (Important) Iteration number: Set this according to the total number of frames in your video and the downsample rate of the image size. For example, we use 10000 iterations for the example video with 80 frames and a downsample rate of 4. If the results are not as expected, try increasing iters_num (for example, to 100000).

  2. Optical flow loss weight: Set optical_flow_coeff according to the intensity of the flicker in your video. For example, we use 500.0 for the sample video. If the video has only minor flickering, you can use 5.0 as the optical_flow_coeff.

  3. Downsample rate: We find that downsampling the resolution of the neural atlas by a factor of 4 makes convergence much faster and only slightly affects quality. You can define your own downsample rate.

  4. Maximum number of frames: We set maximum_number_of_frames to 200. Performance on longer videos has not been evaluated, so it is better to split a long video into several shorter sequences.

  5. Use of segmentation masks: Perfect segmentation masks will increase the quality of the neural atlas, especially for objects with significant motion. However, in most cases the improvement that segmentation brings to the final prediction is not obvious, since neural filtering can filter out the flaws in the atlas. If you want to use segmentation for better results, refer to layered-neural-atlases and use our src/neural_filter_and_refinement.py on top of it. Note that layered-neural-atlases uses Mask R-CNN; you can also try lang-seg or ODISE.
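
For item 4, one possible way to split a long video is to cut it into roughly equal frame ranges that each stay within the 200-frame limit. The following is a minimal sketch under that assumption; the function name split_frame_ranges is hypothetical and not part of this repository, and the choice of balanced, non-overlapping chunks is ours rather than an evaluated recommendation.

```python
def split_frame_ranges(total_frames, max_frames=200):
    """Split [0, total_frames) into balanced ranges of at most max_frames each.

    Returns a list of (start, end) pairs with end exclusive.
    Hypothetical helper for illustration only.
    """
    if total_frames <= 0 or max_frames <= 0:
        raise ValueError("total_frames and max_frames must be positive")
    # Smallest number of chunks that keeps every chunk within the limit.
    n_chunks = -(-total_frames // max_frames)  # ceiling division
    base, extra = divmod(total_frames, n_chunks)
    ranges = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional frame each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


if __name__ == "__main__":
    print(split_frame_ranges(80))   # [(0, 80)]
    print(split_frame_ranges(450))  # [(0, 150), (150, 300), (300, 450)]
```

Each range can then be exported as its own frame folder and processed separately. Balanced chunks avoid a very short final sequence, but the chunk boundaries are not smoothed here, so brightness may differ slightly between consecutive outputs.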

Discussion and Related work

Potential applications: Our model can be applied to all evaluated types of flickering videos. In addition, although our approach is designed for videos, Blind Deflickering could also be applied to other tasks where flickering artifacts exist (e.g., novel view synthesis).

Temporal consistency beyond our scope: Solving the temporal inconsistency of video content is beyond the scope of deflickering. For example, the contents produced by video generation algorithms can differ greatly from frame to frame. Large scratches in old films can destroy the contents and result in unstable videos, which require additional restoration techniques. We leave the study of a general framework for solving these temporally inconsistent artifacts to future work.

Credit

Our code relies heavily on layered-neural-atlases, fast_blind_video_consistency, and pytorch-deep-video-prior.

Others

Although we do not work on this project full-time, please feel free to send us suggestions. We would also appreciate help improving the engineering side of this project.

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Lei_2023_CVPR,
  author    = {Lei, Chenyang and Ren, Xuanchi and Zhang, Zhaoxiang and Chen, Qifeng},
  title     = {Blind Video Deflickering by Neural Filtering with a Flawed Atlas},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
}

or

@article{lei2023blind,
  title={Blind Video Deflickering by Neural Filtering with a Flawed Atlas},
  author={Lei, Chenyang and Ren, Xuanchi and Zhang, Zhaoxiang and Chen, Qifeng},
  journal={arXiv preprint arXiv:2303.08120},
  year={2023}
}
