RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Overview

RIFE - Real Time Video Interpolation

arXiv | YouTube | Colab | Tutorial | Demo

Table of Contents

  1. Introduction
  2. Collection
  3. Usage
  4. Evaluation
  5. Training and Reproduction
  6. Citation
  7. Reference
  8. Sponsor

Introduction

This project is the implementation of RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation. If you are a developer, you are welcome to follow Practical-RIFE, which aims to make RIFE more practical for users by adding various features and designing new models.

Currently, our model can run at 30+ FPS for 2X 720p interpolation on a 2080Ti GPU. It supports 2X, 4X, 8X... interpolation, and multi-frame interpolation between a pair of images.

16X interpolation results from two input images:


Software

Squirrel-RIFE (Chinese-language software) | Waifu2x-Extension-GUI | Flowframes | RIFE-ncnn-vulkan | RIFE-App (Paid) | Autodesk Flame | SVP

CLI Usage

Installation

git clone git@github.com:hzwer/arXiv2020-RIFE.git
cd arXiv2020-RIFE
pip3 install -r requirements.txt

Run

Video Frame Interpolation

You can use our demo video or your own video.

python3 inference_video.py --exp=1 --video=video.mp4 

(generate video_2X_xxfps.mp4)

python3 inference_video.py --exp=2 --video=video.mp4

(for 4X interpolation)

python3 inference_video.py --exp=1 --video=video.mp4 --scale=0.5

(If your video has a very high resolution such as 4K, we recommend setting --scale=0.5 (default 1.0). If the output shows disordered patterns, try setting --scale=2.0. This parameter controls the processing resolution of the optical flow model.)

python3 inference_video.py --exp=2 --img=input/

(to read video from pngs, like input/0.png ... input/612.png, ensure that the png names are numbers)
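
If your frames are not already named with plain numbers, a small helper can copy them into that layout first. The following is a minimal sketch and not part of this repository: the function names `natural_key` and `renumber_pngs` are hypothetical, and it assumes that natural (human) sort order of the existing file names is the intended frame order.

```python
import re
import shutil
from pathlib import Path


def natural_key(name):
    """Split a file name into text and integer chunks so 'f10' sorts after 'f2'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def renumber_pngs(src_dir, dst_dir):
    """Copy every PNG in src_dir to dst_dir as 0.png, 1.png, ... in natural order.

    Returns the list of new file names. The original files are left untouched.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    frames = sorted((p for p in src.iterdir() if p.suffix.lower() == ".png"),
                    key=lambda p: natural_key(p.name))
    new_names = []
    for index, frame in enumerate(frames):
        target = dst / f"{index}.png"
        shutil.copyfile(frame, target)  # copy rather than rename, to keep the source intact
        new_names.append(target.name)
    return new_names
```

For example, `renumber_pngs("raw_frames", "input")` would fill `input/` with `0.png`, `1.png`, and so on, which can then be passed via `--img=input/`.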

python3 inference_video.py --exp=2 --video=video.mp4 --fps=60

(adds a slow-motion effect; the audio will be removed)

python3 inference_video.py --video=video.mp4 --montage --png

(if you want to montage the original video, skip static frames, and save the output in PNG format)

The warning 'Warning: Your video has *** static frames, it may change the duration of the generated video.' means that the frame rate of your video was changed by inserting static frames. This is common if a 25 FPS video has been converted to 30 FPS.

Image Interpolation

python3 inference_img.py --img img0.png img1.png --exp=4

(2^4=16X interpolation results) After that, you can use pngs to generate mp4:
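
The relation between `--exp` and the number of generated frames can be sketched as follows. This is an illustration only, assuming that each round inserts one new frame midway between every pair of adjacent frames; the function name `bisection_timesteps` is hypothetical and not taken from the repository.

```python
from fractions import Fraction


def bisection_timesteps(exp):
    """Return the timesteps of the frames created by `exp` rounds of midpoint insertion."""
    times = [Fraction(0), Fraction(1)]  # the two input frames
    for _ in range(exp):
        refined = [times[0]]
        for left, right in zip(times, times[1:]):
            refined.append((left + right) / 2)  # new frame halfway between neighbours
            refined.append(right)
        times = refined
    return times[1:-1]  # drop the two inputs, keep only the new frames
```

Under that assumption, `exp` rounds produce `2**exp - 1` new frames at evenly spaced timesteps, so `--exp=4` corresponds to 15 new frames between the two inputs.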

ffmpeg -r 10 -f image2 -i output/img%d.png -s 448x256 -c:v libx264 -pix_fmt yuv420p output/slomo.mp4 -q:v 0 -q:a 0

You can also use pngs to generate gif:

ffmpeg -r 10 -f image2 -i output/img%d.png -s 448x256 -vf "split[s0][s1];[s0]palettegen=stats_mode=single[p];[s1][p]paletteuse=new=1" output/slomo.gif

Run in docker

Place the pre-trained models in train_log/*.pkl (as above)

Building the container:

docker build -t rife -f docker/Dockerfile .

Running the container:

docker run --rm -it -v $PWD:/host rife:latest inference_video --exp=1 --video=untitled.mp4 --output=untitled_rife.mp4
docker run --rm -it -v $PWD:/host rife:latest inference_img --img img0.png img1.png --exp=4

Using GPU acceleration (requires proper GPU drivers for Docker):

docker run --rm -it --gpus all -v /dev/dri:/dev/dri -v $PWD:/host rife:latest inference_video --exp=1 --video=untitled.mp4 --output=untitled_rife.mp4

Evaluation

Download the RIFE model reported in our paper.

UCF101: Download the UCF101 dataset to ./UCF101/ucf101_interp_ours/

Vimeo90K: Download the Vimeo90K dataset to ./vimeo_interp_test

MiddleBury: Download the MiddleBury OTHER dataset to ./other-data and ./other-gt-interp

HD: Download the HD dataset to ./HD_dataset. We also provide a Google Drive download link.

# RIFE
python3 benchmark/UCF101.py
# "PSNR: 35.282 SSIM: 0.9688"
python3 benchmark/Vimeo90K.py
# "PSNR: 35.615 SSIM: 0.9779"
python3 benchmark/MiddleBury_Other.py
# "IE: 1.956"
python3 benchmark/HD.py
# "PSNR: 32.14"
python3 benchmark/HD_multi.py
# "PSNR: 18.60(544*1280), 29.02(720p), 24.73(1080p)"
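
For readers unfamiliar with the PSNR figures quoted above, the textbook definition can be sketched in a few lines. This is a generic illustration, not the code in the benchmark scripts, which may differ in value range or other details; the function name `psnr` and the flat-list input format are assumptions made for the example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the interpolated frame is closer to the ground-truth frame.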

Training and Reproduction

Download Vimeo90K dataset.

We use 16 CPUs, 4 GPUs and 20G memory for training:

python3 -m torch.distributed.launch --nproc_per_node=4 train.py --world_size=4
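
The command above is written for 4 GPUs. On a machine with a different number of GPUs, launching more processes than there are devices makes some ranks select a GPU index that does not exist. The sketch below builds the same command for an arbitrary GPU count; the helper name `build_launch_command` is hypothetical, and it assumes that `--nproc_per_node` and `--world_size` should both equal the number of GPUs on a single machine.

```python
def build_launch_command(num_gpus, script="train.py"):
    """Build the launcher argv with one worker process per available GPU."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is required for this launch mode")
    return [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",  # one process per GPU
        script,
        f"--world_size={num_gpus}",      # assumed to match the process count
    ]
```

With PyTorch installed, the GPU count can be obtained from `torch.cuda.device_count()` and the resulting list passed to `subprocess.run(..., check=True)`.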

Citation

@article{huang2020rife,
  title={RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation},
  author={Huang, Zhewei and Zhang, Tianyuan and Heng, Wen and Shi, Boxin and Zhou, Shuchang},
  journal={arXiv preprint arXiv:2011.06294},
  year={2020}
}

Reference

Optical Flow: ARFlow pytorch-liteflownet RAFT pytorch-PWCNet

Video Interpolation: DVF TOflow SepConv DAIN CAIN MEMC-Net SoftSplat BMBC EDSC EQVI

Sponsor

Thank you for your support. PayPal sponsor: https://www.paypal.com/paypalme/hzwer

Comments
  • Welcome to try v3.8 model


    Based on an evaluation of dozens of videos, the v3.8 model achieves a speedup of more than 2X while surpassing the RIFEv2.4 model in quality. v3.8 also handles 2D scenes better. We welcome submissions of bad cases to help us improve future models.

    v3.8 model: https://github.com/hzwer/Practical-RIFE#model-list

    opened by hzwer 23
  • 24 to 60 fps?


    Hi there,

    RIFE looks fantastic, but as far as I know I can only enter integer numbers as the scale factor, correct? So when I want to interpolate 24 fps to 60 (by far the most common case, I suppose), I know no other way than interpolating to 120 (factor 5) and then dropping every other frame to get 60.

    But even that doesn't seem to be possible as supported scale factors are only 2x, 4x, 8x (no 5x option).

    So, is RIFE able to make 24 fps movie content run smooth on 60 Hz displays?

    opened by spyro2000 22
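
The arithmetic in this question can be made concrete. Going from 24 fps to 60 fps means each output frame advances 24/60 = 0.4 source frames, so the required timesteps are multiples of 1/5, which are exactly the frames that survive when a 5X result is thinned to every other frame. The sketch below only computes those positions; the name `resample_plan` is hypothetical, and it says nothing about whether a given model can synthesize a frame at an arbitrary timestep.

```python
from fractions import Fraction


def resample_plan(src_fps, dst_fps, num_src_frames):
    """For each output frame, return (left source index, timestep t in [0, 1)).

    t == 0 means the output frame coincides with source frame `left`;
    otherwise it lies between source frames `left` and `left + 1`.
    """
    step = Fraction(src_fps, dst_fps)  # source frames advanced per output frame
    plan = []
    k = 0
    while k * step <= num_src_frames - 1:
        position = k * step
        left = position.numerator // position.denominator  # integer part
        plan.append((left, position - left))               # fractional part is t
        k += 1
    return plan
```

For `resample_plan(24, 60, 3)` the timesteps cycle through 0, 2/5, 4/5, 1/5, 3/5 before landing on a source frame again.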
  • can't train because torch incompatible with python version


    /home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
    and will be removed in future. Use torch.distributed.run.
    Note that --use_env is set by default in torch.distributed.run.
    If your script expects `--local_rank` argument to be set, please
    change it to read from `os.environ['LOCAL_RANK']` instead. See 
    https://pytorch.org/docs/stable/distributed.html#launch-utility for 
    further instructions
    
      warnings.warn(
    WARNING:torch.distributed.run:*****************************************
    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
    *****************************************
    Traceback (most recent call last):
    Traceback (most recent call last):
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
    Traceback (most recent call last):
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch._C._cuda_setDevice(device)
        RuntimeErrortorch._C._cuda_setDevice(device): 
    CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    RuntimeError: CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
        torch._C._cuda_setDevice(device)
    RuntimeError: CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 1 (pid: 651166) of binary: /usr/bin/python3
    Traceback (most recent call last):
      File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 193, in <module>
        main()
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 189, in main
        launch(args)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 174, in launch
        run(args)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 689, in run
        elastic_launch(
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
        raise ChildFailedError(
    torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
    ***************************************
                train.py FAILED            
    =======================================
    Root Cause:
    [0]:
      time: 2021-09-30_17:35:27
      rank: 1 (local_rank: 1)
      exitcode: 1 (pid: 651166)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    =======================================
    Other Failures:
    [1]:
      time: 2021-09-30_17:35:27
      rank: 2 (local_rank: 2)
      exitcode: 1 (pid: 651167)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    [2]:
      time: 2021-09-30_17:35:27
      rank: 3 (local_rank: 3)
      exitcode: 1 (pid: 651168)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    ***************************************
    
    
    opened by arch-user-france1 19
  • Transparent PNG support


    Seeing that recently EXR support was added, is it possible to support transparency (alpha channel) for PNG input and output (using --img --png) for inference_video.py?

    This would enable interpolation of transparent GIFs.

    opened by n00mkrad 19
  • Image sequence and input


    Thanks for adding the PNG output function. Can you make the output names consistent with ffmpeg, i.e. 0000.png 0001.png ... 7821.png? Then we can use ffmpeg to deal with the image sequence.
    Adding image sequence input would also be great.
    
    opened by Michaelwhite34 15
  • Question: how to prepare the data when the dataset has multiple frames

    Hello, first of all, thank you very much for sharing this repo on GitHub!

    After running inference with your released model, I wanted to try training on a custom dataset.

    Each video in my dataset has 24 frames, so the goal is to generate the 22 intermediate frames.

    I looked at the VimeoDataset class in dataset.py and found that, because each video in that dataset has only 3 frames, data preparation returns the first and last frames, and the frame to interpolate is the middle one.

    • If I want to interpolate multiple frames, is that possible?

    I have already started training with a small adjustment: the input is the first and last frames, and the model is asked to predict the middle frame (the twelfth frame). For the model, I chose RIFE.py and IFNet.py, whose resulting model should be the most robust one (42.9MB). Because my dataset is fairly homogeneous, I first loaded your released model to prevent overfitting and then continued training.

    • However, after a few dozen epochs the loss blew up, and the resulting model could not generate the 22 intermediate frames at inference at all (every output is a black image), which is far from the results I got initially with your released model...

    Later I tried training another model (RIFE_HDV3.py and IFNet_HDV3.py) (the resulting model is 12.2MB), but PyTorch keeps raising an error.

    RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable).

    The error originates in update() in RIFE_HDV3.py, at flow, mask, merged = self.flownet(imgs, scale=[4,2,1]). From what I found, this error occurs because some variables in the output of forward() are not used to calculate the loss. I went through forward() in IFNet_HDV3 very carefully, but still found nothing.

    If you have any good suggestions, I would be very grateful!

    opened by chenyuZha 13
  • Not the fastest for multi-frame interpolation


    Hi,

    Thanks for open sourcing the code and contributing to the video frame interpolation community.

    In the paper, it mentioned: "Coupled with the large complexity in the bi-directional flow estimation, none of these methods can achieve real-time speed"

    I believe that might be inappropriate to say, as the recent published paper (https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123720103.pdf) targets efficient multi-frame interpolation.

    It utilizes bi-directional flow estimation as well, but it generates 7 frames in 0.12 seconds, whereas your method requires 0.036 * 7 = 0.252 seconds.

    And the model from that paper is compact, which consists of only ~2M parameters, where your fast model has ~10M parameters.

    opened by mrchizx 13
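  • About reproducing the hdv2 and hdv3 models

    Hello, I reproduced the hdv2 and hdv3 models, but there is always some gap from the results you provide. Hyperparameter configuration: weight_decay=1e-4, learning rate: 3e-4 * mul, VGG loss, plus the data augmentation mentioned in #146, patch size 256, trained for 300 epochs. I see that the model structure is unchanged between the hdv2 and v3 versions you provide, so what are the differences between them? Beyond the configuration above, what else is missing to reach your results?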

    opened by tqyunwuxin 12
  • Model v2 update log


    We show some hard case results for every version model. v2 google drive download link: (https://drive.google.com/file/d/1wsQIhHZ3Eg4_AfCXItFKqqyDMB4NS0Yd/view).

    v1.1 2020.11.16 Link: https://pan.baidu.com/s/1SPRw_u3zjaufn7egMr19Eg Password: orkd

    opened by hzwer 12
  • Training with other datasets

    Has anyone trained RIFE_HDv2 with a training set other than the Vimeo dataset, such as the HD dataset?

    And were they able to get better visual quality for HD content?

    opened by rsjjdesj 11
  • replicating benchmarks


    Thank you for sharing your code! I was trying to replicate the numbers you stated in your paper using this implementation but have unfortunately been unsuccessful so far. Would you be able to share a script that can be used to replicate the Vimeo-90k metrics you quoted? Also, I think the following padding has some issues.

    https://github.com/hzwer/arXiv2020-RIFE/blob/3194107170d6613b2ea924aa35bb57e5913fff44/inference_img.py#L26-L28

    https://github.com/hzwer/arXiv2020-RIFE/blob/3194107170d6613b2ea924aa35bb57e5913fff44/inference_img.py#L45

    The pw - w and [:h, :w] indicate that pw > w (and ph > h). However, pw = 340 // 32 * 32 = 320 for w = 340 which violates this condition. Thanks for looking into this and thanks again for sharing your code!

    opened by sniklaus 11
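
The point about padding can be checked with a short calculation. Floor division rounds down, so `340 // 32 * 32` gives 320, which is smaller than the input width. One common way to round up to the next multiple instead is sketched below; the helper names `padded_size` and `pad_amounts` are hypothetical and this is not necessarily the fix adopted in the repository.

```python
def padded_size(size, multiple=32):
    """Round `size` up to the nearest multiple of `multiple`."""
    return ((size - 1) // multiple + 1) * multiple


def pad_amounts(height, width, multiple=32):
    """Return (padded_height, padded_width, extra_bottom, extra_right)."""
    ph, pw = padded_size(height, multiple), padded_size(width, multiple)
    return ph, pw, ph - height, pw - width
```

With this rounding the padded size is never smaller than the input, so the padding amounts `ph - h` and `pw - w` are never negative and cropping back with `[:h, :w]` is well defined.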
  • Reproducibility results


    Hi,

    I checked if I can reproduce the results similar to those in the paper to make sure I am training the model properly. These are the results that I got on Vimeo triplets:

    interpol_flow

    The model prediction is shown for t=1 (predicting between t=0 and t=2), the second row corresponds to interpolation ("Interpol"), and the last row to flow ("Flow pred"). I see that interpolation results are very good, however, I expected the flow to be a bit more accurate. In section 6.2 of the appendix you mention that "IFNet produces clear motion boundaries.", this is also what can be seen in Figure 10. Therefore I wanted to ask if there were any other training steps that I need to add to get flow prediction more accurate. I can of course share more prediction examples that I got.

    The training was done for 300 epochs using the reconstruction losses and distillation loss (the latter is with coeff. 0.01) as described in the paper, I didn't change anything in the code to train and obtain these results. This is the loss plot: val_loss_300

    I would appreciate if you can confirm that you trained the model the same way and that your flow predictions look similar. I used IFNet for training (self.flownet = IFNet()).

    opened by HamidGadirov 4
  • How to visualize flow that model inferenced ?

    flow, mask, merged = self.flownet(imgs, scale_list) flow must be the flow between two images. Its shape is (bs, 4, H, W). How can I visualize it (like the attached image)? And how do I generate the flow ground truth? Thx!

    opened by zhishao 12
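
A widely used convention for visualizing optical flow maps the direction of each vector to hue and its magnitude to saturation. The sketch below shows that mapping for a single vector using only the standard library. It is a generic illustration, not the repository's method; the function names are hypothetical, and `split_flow_channels` assumes that the 4 channels are two stacked (dx, dy) fields, which the thread does not confirm.

```python
import colorsys
import math


def flow_to_rgb(dx, dy, max_magnitude):
    """Map one flow vector to an (r, g, b) tuple of ints in 0..255.

    Hue encodes direction; saturation encodes magnitude relative to max_magnitude.
    """
    angle = math.atan2(dy, dx)                        # direction in [-pi, pi]
    hue = (angle + math.pi) / (2.0 * math.pi) % 1.0   # wrap into [0, 1)
    magnitude = math.hypot(dx, dy)
    saturation = min(magnitude / max_magnitude, 1.0) if max_magnitude > 0 else 0.0
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return round(r * 255), round(g * 255), round(b * 255)


def split_flow_channels(pixel):
    """Split a 4-channel flow pixel into two (dx, dy) vectors (assumed layout)."""
    return (pixel[0], pixel[1]), (pixel[2], pixel[3])
```

Applying `flow_to_rgb` to every pixel of one 2-channel field, with `max_magnitude` set to the largest magnitude in that field, yields a color image in which static regions are white and fast-moving regions are strongly colored.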
  • RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1


    hi, im trying to interpolate a png sequence. they are properly numbered and such. here is my console:

    conda run -n RIFE py D:\Development\RIFE\inference_video.py --img "D:\Game Assets\Super Outrun Rush\Animations\Pixelated\WadeCharge" --exp=4
    Loaded v3.x HD model.
    
      0%|          | 0/16 [00:00<?, ?it/s]Traceback (most recent call last):
      File "D:\Development\RIFE\inference_video.py", line 259, in <module>
        output = make_inference(I0, I1, 2**args.exp-1) if args.exp else []
      File "D:\Development\RIFE\inference_video.py", line 180, in make_inference
        middle = model.inference(I0, I1, args.scale)
      File "D:\Development\RIFE\train_log\RIFE_HDv3.py", line 58, in inference
        flow, mask, merged = self.flownet(imgs, scale_list)
      File "C:\Users\Jackson\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "D:\Development\RIFE\train_log\IFNet_HDv3.py", line 113, in forward
        merged[i] = merged[i][0] * mask_list[i] + merged[i][1] * (1 - mask_list[i])
    RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1
    
    opened by Apple-Fritter-Money-Entertainment 0
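
A mismatch between 3 and 4 in the channel dimension is consistent with PNG frames that carry an alpha channel (RGBA) meeting a 3-channel tensor, although the thread does not confirm that this is the cause here. Assuming it is, one workaround is to drop the alpha channel before inference. The sketch below shows the idea on plain nested lists; the function names are hypothetical.

```python
def channel_count(pixels):
    """Number of channels per pixel, taken from the first pixel of the first row."""
    return len(pixels[0][0])


def strip_alpha(pixels):
    """Return a copy of an image keeping only the first three channels of each pixel.

    `pixels` is a list of rows, each row a list of (r, g, b) or (r, g, b, a) tuples.
    """
    return [[tuple(pixel[:3]) for pixel in row] for row in pixels]
```

For an image held as a NumPy array of shape (H, W, 4), the equivalent operation is the slice `img[:, :, :3]`.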
Releases(arxiv_v5_code)
Owner
hzwer