
ifrnet's Introduction

IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation

The official PyTorch implementation of IFRNet (CVPR 2022).

Authors: Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Xiaoming Huang, Ying Tai, Chengjie Wang, Jie Yang

Highlights

Most existing flow-based frame interpolation methods first estimate or model the intermediate optical flow, and then use flow-warped context features to synthesize the target frame. However, they ignore the mutual promotion between intermediate optical flow and intermediate context features. Also, their cascaded architectures substantially increase inference latency and model parameters, blocking them from many mobile and real-time applications. For the first time, we merge these separate flow estimation and context feature refinement steps into a single encoder-decoder based IFRNet for compactness and fast inference, where the two crucial elements can benefit from each other. Moreover, a task-oriented flow distillation loss and a feature space geometry consistency loss are newly proposed to promote intermediate motion estimation and intermediate feature reconstruction of IFRNet, respectively. Benchmark results demonstrate that our IFRNet not only achieves state-of-the-art VFI accuracy, but also enjoys fast inference speed and a lightweight model size.

YouTube Demos

[4K60p] Utawarerumono: The False Faces OP, frame interpolation + super-resolution (IFRNet and Real-CUGAN)

[4K60p] Tenshin Ranman -LUCKY or UNLUCKY!?- OP (IFRNet and Real-CUGAN)

RIFE vs. IFRNet comparison

IFRNet frame interpolation

Preparation

  1. PyTorch >= 1.3.0 (We have verified that this repository supports Python 3.6/3.7, PyTorch 1.3.0/1.9.1).
  2. Download training and test datasets: Vimeo90K, UCF101, SNU-FILM, Middlebury, GoPro and Adobe240.
  3. Set the right dataset path on your machine.

Download Pre-trained Models and Play with Demos

Figures from left to right are the overlaid input frames, and the 2x and 8x video interpolation results, respectively.

  1. Download our pre-trained models from this link, and then put the checkpoints folder into the root directory.

  2. Run the following scripts to generate the 2x and 8x frame interpolation demos:

$ python demo_2x.py
$ python demo_8x.py
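
For reference, both demos reduce to a single inference call. A minimal 2x sketch, assuming the Model class from models/IFRNet.py and a checkpoint path that follows the naming pattern seen elsewhere in this repository (the image I/O is illustrative; see demo_2x.py for the real pipeline):

import torch
from models.IFRNet import Model

model = Model().cuda().eval()
model.load_state_dict(torch.load('./checkpoints/IFRNet/IFRNet_Vimeo90K.pth'))  # assumed path

img0 = torch.rand(1, 3, 720, 1280).cuda()  # stand-ins for the two input frames, values in [0, 1]
img1 = torch.rand(1, 3, 720, 1280).cuda()
embt = torch.tensor(1/2).view(1, 1, 1, 1).float().cuda()  # t = 0.5, the temporal midpoint

with torch.no_grad():
    imgt_pred = model.inference(img0, img1, embt)  # predicted intermediate frame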

Training on Vimeo90K Triplet Dataset for 2x Frame Interpolation

  1. First, run this script to generate the optical flow pseudo labels:
$ python generate_flow.py
  2. Then, start training by executing one of the following commands with the selected model:
$ python -m torch.distributed.launch --nproc_per_node=4 train_vimeo90k.py --world_size 4 --model_name 'IFRNet' --epochs 300 --batch_size 6 --lr_start 1e-4 --lr_end 1e-5
$ python -m torch.distributed.launch --nproc_per_node=4 train_vimeo90k.py --world_size 4 --model_name 'IFRNet_L' --epochs 300 --batch_size 6 --lr_start 1e-4 --lr_end 1e-5
$ python -m torch.distributed.launch --nproc_per_node=4 train_vimeo90k.py --world_size 4 --model_name 'IFRNet_S' --epochs 300 --batch_size 6 --lr_start 1e-4 --lr_end 1e-5

Benchmarks for 2x Frame Interpolation

To test running time and model parameters, you can run

$ python benchmarks/speed_parameters.py

To test frame interpolation accuracy on Vimeo90K, UCF101 and SNU-FILM datasets, you can run

$ python benchmarks/Vimeo90K.py
$ python benchmarks/UCF101.py
$ python benchmarks/SNU_FILM.py
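
These scripts report PSNR (and SSIM) against the ground-truth frames. For orientation, a minimal PSNR sketch for images scaled to [0, 1] (not necessarily the exact implementation used in benchmarks/):

import torch

def psnr(pred, gt, eps=1e-8):
    # peak signal-to-noise ratio in dB for [0, 1] images
    mse = torch.mean((pred - gt) ** 2)
    return (-10.0 * torch.log10(mse + eps)).item()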

Quantitative Comparison for 2x Frame Interpolation

The proposed IFRNet achieves state-of-the-art frame interpolation accuracy with less inference time and computational complexity. We expect the proposed single encoder-decoder joint refinement based IFRNet to be a useful component for many frame rate up-conversion, video compression and intermediate view synthesis systems. Time and FLOPs are measured at 1280 x 720 resolution.

Qualitative Comparison for 2x Frame Interpolation

Video comparison for 2x interpolation of methods using 2 input frames on SNU-FILM dataset.

Middlebury Benchmark

Results on the Middlebury online benchmark.

Results on the Middlebury Other dataset.

Training on GoPro Dataset for 8x Frame Interpolation

  1. Start training by executing one of the following commands with the selected model:
$ python -m torch.distributed.launch --nproc_per_node=4 train_gopro.py --world_size 4 --model_name 'IFRNet' --epochs 600 --batch_size 2 --lr_start 1e-4 --lr_end 1e-5
$ python -m torch.distributed.launch --nproc_per_node=4 train_gopro.py --world_size 4 --model_name 'IFRNet_L' --epochs 600 --batch_size 2 --lr_start 1e-4 --lr_end 1e-5
$ python -m torch.distributed.launch --nproc_per_node=4 train_gopro.py --world_size 4 --model_name 'IFRNet_S' --epochs 600 --batch_size 2 --lr_start 1e-4 --lr_end 1e-5

Since inter-frame motion in the 8x interpolation setting is relatively small, the task-oriented flow distillation loss is omitted here. Because the GoPro training set is relatively small, we suggest training on your own datasets for better slow-motion generation results.

Quantitative Comparison for 8x Frame Interpolation

Qualitative Results on GoPro and Adobe240 Datasets for 8x Frame Interpolation

Each video has 9 frames, where the first and the last frames are input, and the middle 7 frames are predicted by IFRNet.
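
Conceptually, the 7 intermediate frames come from sweeping the time embedding embt from 1/8 to 7/8; a hedged sketch reusing the inference interface from the 2x demo above (demo_8x.py itself may differ in detail):

preds = []
with torch.no_grad():
    for i in range(1, 8):  # t = 1/8, 2/8, ..., 7/8
        embt = torch.tensor(i / 8).view(1, 1, 1, 1).float().cuda()
        preds.append(model.inference(img0, img1, embt))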

ncnn Implementation of IFRNet

ifrnet-ncnn-vulkan uses the ncnn project as its universal neural network inference framework. The package includes all required binaries and models. It is portable, so no CUDA or PyTorch runtime environment is needed.

Citation

When using any parts of the Software or the Paper in your work, please cite the following paper:

@InProceedings{Kong_2022_CVPR, 
  author = {Kong, Lingtong and Jiang, Boyuan and Luo, Donghao and Chu, Wenqing and Huang, Xiaoming and Tai, Ying and Wang, Chengjie and Yang, Jie}, 
  title = {IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation}, 
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, 
  year = {2022}
}

ifrnet's People

Contributors

ltkong218


ifrnet's Issues

Training code usage

Hello,
Can you provide details about the environment setup required to run the training code?
I am trying to run it on an A10 machine with 2 GPUs. Will I have to make any changes to run the code?

Thank you in advance!!

How to train using a single GPU?

Hello author, thank you for your excellent work.

Your code seems to be set up for multi-GPU distributed training, but I only have one GPU. Could you please tell me what to change for single-GPU training, so that the code can be debugged on a single card?

Looking forward to your reply.
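
One untested possibility, assuming the training script contains no hard-coded multi-GPU logic beyond the torch.distributed.launch setup: launch a single process and set --world_size to 1.

$ python -m torch.distributed.launch --nproc_per_node=1 train_vimeo90k.py --world_size 1 --model_name 'IFRNet' --epochs 300 --batch_size 6 --lr_start 1e-4 --lr_end 1e-5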

GPU memory leak in class ResBlock (line 44 & 46 in IFRNet_S.py) in pytorch 1.13.0

I found a GPU memory leak when the program runs into the following lines (it might be a bug in higher versions of PyTorch).

https://github.com/ltkong218/IFRNet/blob/main/models/IFRNet_S.py#L44
https://github.com/ltkong218/IFRNet/blob/main/models/IFRNet_S.py#L46

official code (which leads to the GPU memory leak):

out = self.conv1(x)
# in-place slice assignment: conv2's output is written back into a view of `out`
out[:, -self.side_channels:, :, :] = self.conv2(out[:, -self.side_channels:, :, :])
out = self.conv3(out)
# second in-place slice assignment on the side channels
out[:, -self.side_channels:, :, :] = self.conv4(out[:, -self.side_channels:, :, :])
out = self.prelu(x + self.conv5(out))
return out

Then I changed the code to the following, and the GPU memory leak disappeared (using concat to build a new feature map after each side convolution):

out = self.conv1(x)
# split off the side channels, convolve them, and rebuild the feature map with cat
side_ft = out[:, :-self.side_channels, :, :]
conv_ft = out[:, -self.side_channels:, :, :]
conv_ft = self.conv2(conv_ft)
out = torch.cat([side_ft, conv_ft], dim=1)
out = self.conv3(out)
side_ft = out[:, :-self.side_channels, :, :]
conv_ft = out[:, -self.side_channels:, :, :]
conv_ft = self.conv4(conv_ft)
out = torch.cat([side_ft, conv_ft], dim=1)
out = self.prelu(x + self.conv5(out))
return out

my specs:
Ubuntu 20.04, Python 3.9, PyTorch 1.13.1+cu117, GPU: V100 (single card)

Doubt about hardware use

I have been using this interpolation model and the results seem quite good, but a question has been growing on me.

Does this model benefit from VRAM or from core frequency? I usually see the GPU core at 0% utilization, while the VRAM does heat up when this model is in use.

The problem encountered in training procedure

I am attempting to build on this excellent work and have encountered some problems during training. I would appreciate it if you could help me solve them. The log is presented below:

2022-07-09 13:40:33:INFO:Namespace(batch_size=6, device=device(type='cuda', index=0), epochs=300, eval_interval=1, local_rank=0, log_path='checkpoint/IFRNet', lr_end=1e-05, lr_start=0.0001, model_name='IFRNet', num_workers=6, resume_epoch=0, resume_path=None, world_size=4)
Distributed Data Parallel Training IFRNet on Rank 2
Traceback (most recent call last):
  File "train_vimeo90k.py", line 188, in <module>
    train(args, ddp_model)
  File "train_vimeo90k.py", line 89, in train
    loss.backward()
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/autograd/__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 32, 112, 112]], which is output 0 of AsStridedBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

(The same traceback is raised on each of the four ranks.)

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 29114) of binary: /home/vm411/miniconda3/envs/ifrnet/bin/python
Traceback (most recent call last):
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/distributed/run.py", line 718, in run
    )(*cmd_args)
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/vm411/miniconda3/envs/ifrnet/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train_vimeo90k.py FAILED

And here is our conda environment:

_libgcc_mutex 0.1 main defaults
_openmp_mutex 5.1 1_gnu defaults
absl-py 0.15.0 pyhd3eb1b0_0 defaults
aiohttp 3.8.1 py37h7f8727e_1 defaults
aiosignal 1.2.0 pyhd3eb1b0_0 defaults
async-timeout 4.0.1 pyhd3eb1b0_0 defaults
asynctest 0.13.0 py_0 defaults
attrs 21.4.0 pyhd3eb1b0_0 defaults
blas 1.0 mkl defaults
blinker 1.4 py37h06a4308_0 defaults
brotli 1.0.9 he6710b0_2 defaults
brotlipy 0.7.0 py37h27cfd23_1003 defaults
bzip2 1.0.8 h7b6447c_0 defaults
c-ares 1.18.1 h7f8727e_0 defaults
ca-certificates 2022.4.26 h06a4308_0 defaults
cachetools 4.2.2 pyhd3eb1b0_0 defaults
cairo 1.16.0 h19f5f5c_2 defaults
certifi 2022.6.15 py37h06a4308_0 defaults
cffi 1.15.0 py37hd667e15_1 defaults
charset-normalizer 2.0.4 pyhd3eb1b0_0 defaults
click 8.0.4 py37h06a4308_0 defaults
cryptography 3.4.8 py37hd23ed53_0 defaults
cudatoolkit 11.3.1 h2bc3f7f_2 defaults
cupy-cuda114 10.6.0 pypi_0 pypi
cycler 0.11.0 pyhd3eb1b0_0 defaults
dataclasses 0.8 pyh6d0b6a4_7 defaults
dbus 1.13.18 hb2f20db_0 defaults
expat 2.4.4 h295c915_0 defaults
fastrlock 0.8 pypi_0 pypi
ffmpeg 4.0 hcdf2ecd_0 defaults
fontconfig 2.13.1 h6c09931_0 defaults
fonttools 4.25.0 pyhd3eb1b0_0 defaults
freeglut 3.0.0 hf484d3e_5 defaults
freetype 2.11.0 h70c0345_0 defaults
frozenlist 1.2.0 py37h7f8727e_0 defaults
giflib 5.2.1 h7b6447c_0 defaults
glib 2.69.1 h4ff587b_1 defaults
google-auth 2.6.0 pyhd3eb1b0_0 defaults
google-auth-oauthlib 0.4.1 py_2 defaults
graphite2 1.3.14 h295c915_1 defaults
grpcio 1.42.0 py37hce63b2e_0 defaults
gst-plugins-base 1.14.0 h8213a91_2 defaults
gstreamer 1.14.0 h28cd5cc_2 defaults
harfbuzz 1.8.8 hffaf4a1_0 defaults
hdf5 1.10.2 hba1933b_1 defaults
icu 58.2 he6710b0_3 defaults
idna 3.3 pyhd3eb1b0_0 defaults
imageio 2.9.0 pyhd3eb1b0_0 defaults
importlib-metadata 4.11.3 py37h06a4308_0 defaults
intel-openmp 2021.4.0 h06a4308_3561 defaults
jasper 2.0.14 hd8c5072_2 defaults
jpeg 9e h7f8727e_0 defaults
kiwisolver 1.4.2 py37h295c915_0 defaults
lcms2 2.12 h3be6417_0 defaults
ld_impl_linux-64 2.38 h1181459_1 defaults
libffi 3.3 he6710b0_2 defaults
libgcc-ng 11.2.0 h1234567_1 defaults
libgfortran-ng 7.5.0 ha8ba4b0_17 defaults
libgfortran4 7.5.0 ha8ba4b0_17 defaults
libglu 9.0.0 hf484d3e_1 defaults
libgomp 11.2.0 h1234567_1 defaults
libopencv 3.4.2 hb342d67_1 defaults
libopus 1.3.1 h7b6447c_0 defaults
libpng 1.6.37 hbc83047_0 defaults
libprotobuf 3.20.1 h4ff587b_0 defaults
libstdcxx-ng 11.2.0 h1234567_1 defaults
libtiff 4.2.0 h2818925_1 defaults
libuuid 1.0.3 h7f8727e_2 defaults
libuv 1.40.0 h7b6447c_0 defaults
libvpx 1.7.0 h439df22_0 defaults
libwebp 1.2.2 h55f646e_0 defaults
libwebp-base 1.2.2 h7f8727e_0 defaults
libxcb 1.15 h7f8727e_0 defaults
libxml2 2.9.14 h74e7548_0 defaults
lz4-c 1.9.3 h295c915_1 defaults
markdown 3.3.4 py37h06a4308_0 defaults
matplotlib 3.5.1 py37h06a4308_1 defaults
matplotlib-base 3.5.1 py37ha18d171_1 defaults
mkl 2021.4.0 h06a4308_640 defaults
mkl-service 2.4.0 py37h7f8727e_0 defaults
mkl_fft 1.3.1 py37hd3c417c_0 defaults
mkl_random 1.2.2 py37h51133e4_0 defaults
multidict 5.2.0 py37h7f8727e_2 defaults
munkres 1.1.4 py_0 defaults
ncurses 6.3 h5eee18b_3 defaults
numpy 1.21.5 py37h6c91a56_3 defaults
numpy-base 1.21.5 py37ha15fc14_3 defaults
oauthlib 3.2.0 pyhd3eb1b0_0 defaults
opencv 3.4.2 py37h6fd60c2_1 defaults
openssl 1.1.1p h5eee18b_0 defaults
packaging 21.3 pyhd3eb1b0_0 defaults
pcre 8.45 h295c915_0 defaults
pillow 9.0.1 py37h22f2fdc_0 defaults
pip 21.2.2 py37h06a4308_0 defaults
pixman 0.40.0 h7f8727e_1 defaults
protobuf 3.20.1 py37h295c915_0 defaults
py-opencv 3.4.2 py37hb342d67_1 defaults
pyasn1 0.4.8 pyhd3eb1b0_0 defaults
pyasn1-modules 0.2.8 py_0 defaults
pycparser 2.21 pyhd3eb1b0_0 defaults
pyjwt 2.1.0 py37h06a4308_0 defaults
pyopenssl 21.0.0 pyhd3eb1b0_1 defaults
pyparsing 3.0.4 pyhd3eb1b0_0 defaults
pyqt 5.9.2 py37h05f1152_2 defaults
pysocks 1.7.1 py37_1 defaults
python 3.7.13 h12debd9_0 defaults
python-dateutil 2.8.2 pyhd3eb1b0_0 defaults
pytorch 1.11.0 py3.7_cuda11.3_cudnn8.2.0_0 pytorch
pytorch-mutex 1.0 cuda pytorch
qt 5.9.7 h5867ecd_1 defaults
readline 8.1.2 h7f8727e_1 defaults
requests 2.28.0 py37h06a4308_0 defaults
requests-oauthlib 1.3.0 py_0 defaults
rsa 4.7.2 pyhd3eb1b0_1 defaults
scipy 1.7.3 py37hc147768_0 defaults
setuptools 61.2.0 py37h06a4308_0 defaults
sip 4.19.8 py37hf484d3e_0 defaults
six 1.16.0 pyhd3eb1b0_1 defaults
sqlite 3.38.5 hc218d9a_0 defaults
tensorboard 2.6.0 py_1 defaults
tensorboard-data-server 0.6.0 py37hca6d32c_0 defaults
tensorboard-plugin-wit 1.6.0 py_0 defaults
tk 8.6.12 h1ccaba5_0 defaults
torchvision 0.2.2 py_3 pytorch
tornado 6.1 py37h27cfd23_0 defaults
typing-extensions 4.1.1 hd3eb1b0_0 defaults
typing_extensions 4.1.1 pyh06a4308_0 defaults
urllib3 1.26.9 py37h06a4308_0 defaults
werkzeug 2.0.3 pyhd3eb1b0_0 defaults
wheel 0.37.1 pyhd3eb1b0_0 defaults
xz 5.2.5 h7f8727e_1 defaults
yarl 1.6.3 py37h27cfd23_0 defaults
zipp 3.8.0 py37h06a4308_0 defaults
zlib 1.2.12 h7f8727e_2 defaults
zstd 1.5.2 ha4553b6_0 defaults

GOPRO training commands

Hi, thanks for your great work! The training commands for the GoPro dataset in the README are likely meant for Vimeo90K, since they are the same as the Vimeo90K commands. Could you please update the commands for the GoPro dataset?

How was LiteFlowNet pretrained?

When pretraining LiteFlowNet on Vimeo90K, do you warp I1 to I2 and then compute the loss against the ground-truth I2? Could you provide the code for this unsupervised pretraining? Thanks.

About optical flow

Thank you for your source code! I want to train your model on X-ray images. LiteFlowNet seems to estimate the optical flow from image0 to image1, but your dataset code expects flow_t0.flo and flow_t1.flo. How do I get the optical flow from the intermediate frame to image0 or image1?
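
For what it's worth, generate_flow.py appears to predict both flows outward from the intermediate frame; the first call below is visible in a traceback quoted in a later issue, while the second is an assumption by symmetry:

flow_t0 = pred_flow(imgt, img0)  # flow from the intermediate frame to frame 0 -> flow_t0.flo
flow_t1 = pred_flow(imgt, img1)  # assumed symmetric call for frame 1 -> flow_t1.flo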

Memory accumulation

Hey, thanks for sharing your work; the model is great so far.

I'm finding that inference tends to accumulate memory on each run; for example, if you run the model in a for loop about 10 times, you run out of memory.

for i in range(20): imgt_pred = model.inference(img0_, img8_, embt)

Traceback (most recent call last):
  File "C:\Users\Pablo\IFRNet\interpolation.py", line 47, in <module>
    inter = model.inference(img0, img1, 3)
  File "C:\Users\Pablo\IFRNet\interpolation.py", line 27, in inference
    imgt_pred = self.model.inference(img0_, img1_, embt)
  File "C:\Users\Pablo\IFRNet\models\IFRNet.py", line 193, in inference
    out1 = self.decoder1(ft_1_, f0_1, f1_1, up_flow0_2, up_flow1_2)
  File "C:\Users\Pablo\IFRNet\models\IFRNet.py", line 149, in forward
    f_out = self.convblock(f_in)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 48.00 MiB (GPU 0; 8.00 GiB total capacity; 7.19 GiB already allocated; 0 bytes free; 7.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I'll see if I can find where it's accumulating.
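
A common cause of this pattern in PyTorch (not confirmed for this specific report) is running inference with autograd enabled, so every iteration keeps its computation graph alive; a hedged sketch of the workaround:

with torch.no_grad():  # stop autograd from retaining each iteration's graph
    for i in range(20):
        imgt_pred = model.inference(img0_, img8_, embt)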

Video frame interpolation usage

Hello!
Thanks for your work!

Can you please suggest the best way to use this model to interpolate video? Just take 2 neighbouring frames of the video, run inference on them, and then stitch the new frames back in?

Should the model be retrained for each video, or can it be used to interpolate any video with good quality?

Thank you in advance for your reply!

Demo with Videos

Can I somehow use videos (mp4) instead of images with demo_2x.py?
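
Neither demo reads video directly, but a hedged per-frame-pair sketch with OpenCV is below; the tensor conversions are illustrative, and the channel order (OpenCV is BGR) may need swapping to match the demo's convention. model is assumed loaded as in demo_2x.py.

import cv2
import torch

def to_tensor(frame):
    # uint8 HxWx3 -> float 1x3xHxW in [0, 1]
    return torch.from_numpy(frame).permute(2, 0, 1).unsqueeze(0).float().cuda() / 255.0

def to_frame(t):
    # inverse conversion back to a uint8 HxWx3 array
    return (t[0].clamp(0, 1) * 255).byte().permute(1, 2, 0).cpu().numpy()

cap = cv2.VideoCapture('input.mp4')
embt = torch.tensor(1/2).view(1, 1, 1, 1).float().cuda()
ok, prev = cap.read()
out_frames = [prev]
while True:
    ok, cur = cap.read()
    if not ok:
        break
    with torch.no_grad():
        mid = model.inference(to_tensor(prev), to_tensor(cur), embt)
    out_frames += [to_frame(mid), cur]  # interleave interpolated frames
    prev = cur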

Feature request: Add ensembling

Ensembling is used in frame interpolation to drastically improve visual quality by combining different predictions into a mean. Here is a paper discussing ensembling.


RIFE uses it as well, with 2 predictions, which can be seen here. It should not be very hard to add. I would trade some speed for more quality. Thanks.
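
For reference, a hedged sketch of the RIFE-style trick on IFRNet's interface: run the model a second time with the frames and time embedding reversed, and average (an illustration of the idea, not a merged feature):

def ensemble_inference(model, img0, img1, embt):
    # average the forward prediction with the time-reversed one
    pred_fwd = model.inference(img0, img1, embt)
    pred_bwd = model.inference(img1, img0, 1.0 - embt)
    return (pred_fwd + pred_bwd) / 2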

Large gap between my running time and the data in the paper

Hi, I tried to evaluate the running time of IFRNet-S. The time measured on my V100 server is 0.131 s with a 1x3x720x1280 input, which is 10x longer than reported in your paper.
The testing code is shown below (modified from demo_2x.py):
import os
import numpy as np
import torch
from models.IFRNet_S import Model
import time

model = Model().cuda().eval()
model.load_state_dict(torch.load('./checkpoints/IFRNet_small/IFRNet_S_Vimeo90K.pth'))

with torch.no_grad():
    inp_size = [1, 3, 720, 1280]
    inps = [torch.Tensor(*inp_size).cuda() for _ in range(2)]
    embt = torch.tensor(1/2).view(1, 1, 1, 1).float().cuda()
    # warm up
    for i in range(5):
        model.inference(inps[0], inps[1], embt)
    torch.cuda.synchronize()
    t1 = time.time()
    for i in range(10):
        model.inference(inps[0], inps[1], embt)
    torch.cuda.synchronize()
    t2 = time.time()
    print("inference time average:", (t2 - t1) / 10)


The output of the torch profiler is shown below:


Name | Self CPU % | Self CPU | CPU total % | CPU total | CPU time avg | Self CUDA | Self CUDA % | CUDA total | CUDA time avg | # of Calls
model_inference | 0.22% | 5.682ms | 100.00% | 2.604s | 2.604s | 0.000us | 0.00% | 23.485ms | 23.485ms | 1
aten::convolution | 0.01% | 280.000us | 0.20% | 5.329ms | 121.114us | 0.000us | 0.00% | 16.184ms | 367.818us | 44
aten::_convolution | 0.02% | 610.000us | 0.19% | 5.049ms | 114.750us | 0.000us | 0.00% | 16.184ms | 367.818us | 44
aten::conv2d | 0.01% | 267.000us | 0.19% | 4.979ms | 124.475us | 0.000us | 0.00% | 12.114ms | 302.850us | 40
aten::cudnn_convolution | 0.07% | 1.771ms | 0.10% | 2.689ms | 67.225us | 10.563ms | 44.98% | 10.563ms | 264.075us | 40
volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148... | 0.00% | 0.000us | 0.00% | 0.000us | 0.000us | 9.754ms | 41.53% | 9.754ms | 304.812us | 32
aten::conv_transpose2d | 0.00% | 32.000us | 0.02% | 649.000us | 162.250us | 0.000us | 0.00% | 4.070ms | 1.018ms | 4
aten::cudnn_convolution_transpose | 0.01% | 184.000us | 0.02% | 421.000us | 105.250us | 3.901ms | 16.61% | 3.901ms | 975.250us | 4
aten::copy_ | 0.01% | 295.000us | 0.47% | 12.158ms | 715.176us | 3.300ms | 14.05% | 3.300ms | 194.118us | 17
aten::to | 0.00% | 68.000us | 0.46% | 11.992ms | 1.499ms | 0.000us | 0.00% | 3.134ms | 391.750us | 8

Self CPU time total: 2.604s
Self CUDA time total: 23.485ms
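
One thing worth checking when the CPU total (2.604 s) dwarfs the CUDA total (23.485 ms) is whether the timed region includes one-off host-side overhead. A hedged variant of the snippet above that times only GPU work with CUDA events (same model, inps and embt as before):

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
with torch.no_grad():
    for i in range(5):  # warm up
        model.inference(inps[0], inps[1], embt)
    torch.cuda.synchronize()
    start.record()
    for i in range(10):
        model.inference(inps[0], inps[1], embt)
    end.record()
    torch.cuda.synchronize()
print("inference time average:", start.elapsed_time(end) / 10, "ms")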

Error running train_vimeo90k.py

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 32, 112, 112]], which is output 0 of AsStridedBackward0, is at version 1; expected version 0 instead.

May I ask what is causing this, and how can I make the necessary modifications?

Error training on the Vimeo90K dataset

I get the error:

No such file or directory: '/frame_int/datasets/vimeo_triplet/flow/00034/0051/flow_t0.flo'

I don't quite understand the structure of the Vimeo90K dataset. There shouldn't be any .flo files in it, should there? Does the model have to create them itself? Then why does the script require this file, and where do I get it if it is not in the original dataset?

Evaluation Results for gopro dataset

Hi,
When I use your provided checkpoint for the GoPro dataset, the evaluation results seem very low.
I set resume_epoch to 599 (epochs=600) and resume_path to the provided checkpoint (checkpoints/IFRNet_GoPro.pth), and commented out all the training parts so that it goes directly to the evaluation step.
The results are shown in the attached screenshot.
Could you give me some possible reasons so that I can debug?

Adobe240 dataset

I would like to know which subset of the Adobe240 dataset is used. I tested all of them and the values are very low. Is it also cropped to 512x512? Thanks.

Other flow teacher network?

May I ask whether you have tried using RAFT or other methods as the teacher network? Would the results be better? Also, how did you pretrain LiteFlowNet on Vimeo90K, and could you provide the code for this pretraining?

Colab notebook request

Hi, I found your project really interesting. I'd suggest creating a Colab notebook for everyone to try out what your project can do. Thanks so much!

The generated image sizes have changed

Hi, when I test on the Middlebury dataset, the sizes of the generated images and the ground truth are different. What is the reason for this, and how should I correct it?

Acknowledgement and a problem

Thanks for your work. I integrated your model into my project (with slight changes); my program is published at https://github.com/lotress/MoePhoto. I'm also working on some improvements, including this one.

The problem with the 3 Vimeo90K models is that they were not trained on any configuration other than 2x slow motion, and as a result they are insensitive to the embt input: they output almost the same predictions no matter what embt is. I can only use the GoPro models for now, but they hallucinate slightly more than the Vimeo90K models; maybe both are undertrained.

Trained model cannot reproduce the reported results

Hello, and thank you for open-sourcing the code; it is clearly written, and the README is easy to follow. However, when I tried to train IFRNet_S, the trained model could not reproduce the results in the paper. I used four RTX 3090s and followed the commands in the README exactly, but only reached a PSNR of 34.45. My train.log is attached below; I would like to ask where the problem might be. Looking forward to your reply.
train.log

Image size for running demo_8x.py

What image sizes can I use to run demo_8x.py?
I am trying to run it with 1920 x 1080 images but I am getting an error.
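
A frequent cause with pyramid networks, offered here only as a guess: the input height must be divisible by the encoder's total downsampling factor, and 1080 is not a multiple of 16 while 1920 is. A padding sketch (the factor 16 is an assumption; crop the output back afterwards):

import torch.nn.functional as F

def pad_to_multiple(img, k=16):
    # zero-pad H and W up to the next multiple of k
    h, w = img.shape[-2:]
    ph, pw = (k - h % k) % k, (k - w % k) % k
    return F.pad(img, (0, pw, 0, ph)), (h, w)

img0_p, (h, w) = pad_to_multiple(img0)
img1_p, _ = pad_to_multiple(img1)
# run inference on the padded frames, then crop: imgt_pred[..., :h, :w]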

Questions on generating flow

Thank you for your source code! I want to train your model using Vimeo90K data. I tried to run generate_flow.py but got the errors below. Looking at run.py of LiteFlowNet, I realized that line 297 might be the problem: I tried to access the URL it downloads from ("http://content.sniklaus.com/github/pytorch-liteflownet") directly, but the page seems to have disappeared; when I attempted the connection, I only got "No such file or directory".

Below is the error printed in my log.txt file:

Downloading: "http://content.sniklaus.com/github/pytorch-liteflownet/network-default.pytorch" to /home/chaeyun/.cache/torch/hub/checkpoints/network-default.pytorch
Traceback (most recent call last):
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/http/client.py", line 1302, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/http/client.py", line 922, in connect
    self.sock = self._create_connection(
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/socket.py", line 787, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "generate_flow.py", line 59, in <module>
    flow_t0 = pred_flow(imgt, img0)
  File "generate_flow.py", line 36, in pred_flow
    flow = estimate(img1, img2)
  File "/data/projects/chaeyun/IFRNet/liteflownet/run.py", line 341, in estimate
    netNetwork = Network().cuda().eval()
  File "/data/projects/chaeyun/IFRNet/liteflownet/run.py", line 297, in __init__
    self.load_state_dict({ strKey.replace('module', 'net'): tenWeight for strKey, tenWeight in torch.hub.load_state_dict_from_url(url='http://content.sniklaus.com/github/pytorch-liteflownet/network-' + arguments_strModel + '.pytorch').items() })
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/site-packages/torch/hub.py", line 727, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/site-packages/torch/hub.py", line 593, in download_url_to_file
    u = urlopen(req)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 1383, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/home/chaeyun/miniconda3/envs/cytorch/lib/python3.8/urllib/request.py", line 1357, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>
srun: error: node01: task 0: Exited with exit code 1

Results seem a little off when trained from scratch

Hi, I trained the model using the given model script and the hyperparameters for IFRNet (base model) on 8x V100s and on 4x V100s. I get a PSNR of around 34.5 with the 8-GPU model and 35 with the 4-GPU model.

Are there hyperparameters I should change to reproduce the results given in the paper? Also, do you have any intuition on why the results vary? I'm using the Vimeo90K dataset for training.

I am using a batch size of 55; my V100s are 32 GB.
