This repository contains the official PyTorch implementation of VFI_Adapter: https://arxiv.org/abs/2306.13933/
Project Page
This code has been tested with PyTorch 1.12 and CUDA 11.1, and it should also be compatible with higher versions of PyTorch and CUDA. The essential dependencies are as follows:
- Python >= 3.8 (Anaconda or Miniconda is recommended)
- PyTorch >= 1.12
- torchvision == 0.13.1
- cudatoolkit == 11.3.1
- cupy-cuda11x == 11.6.0
A suitable conda environment named `vfi_adapter` can be created and activated with:
conda env create -f environment.yaml
conda activate vfi_adapter
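After activating the environment, a quick sanity check (purely illustrative, not part of the repository) can confirm that PyTorch and CuPy see your GPU:

```python
# Optional sanity check: confirm PyTorch and CuPy can access the GPU.
import torch
import cupy

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("CuPy:", cupy.__version__, "| GPUs visible:", cupy.cuda.runtime.getDeviceCount())
```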
Following RIFE and VFIT, we evaluate our proposed method on Vimeo90K, DAVIS, and SNU-FILM datasets.
If you want to train and benchmark our method, please download Vimeo90K-Triplet, Vimeo90K-Septuplet, DAVIS, and SNU-FILM. Place the downloaded datasets in the `./Dataset/` folder, where the frame index files are already provided.
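For reference, a minimal script to check that the datasets are in place might look like the following. The subfolder names below are assumptions based on the dataset names, not the repository's required layout, so adjust them to match your local copies:

```python
# Illustrative check that the datasets have been placed under ./Dataset/.
# NOTE: these subfolder names are assumptions for illustration only.
import os

expected = ["vimeo_triplet", "vimeo_septuplet", "DAVIS", "SNU-FILM"]
for name in expected:
    path = os.path.join("Dataset", name)
    print(f"{path}: {'found' if os.path.isdir(path) else 'missing'}")
```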
Our proposed plug-in Adapter is trained on top of three different pre-trained VFI models, so you need to download the pre-trained models and put them into the corresponding directories for initialization. The pre-trained checkpoints can be downloaded from: RIFE, IFRNet, UPRNet. Specifically, for IFRNet and UPRNet, we use IFRNet_large and UPRNet-LARGE as backbones.
With the pre-trained backbones in place, you can freeze their parameters and train our plug-in Adapter.
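In PyTorch, freezing a backbone while keeping adapter parameters trainable typically follows the pattern below. The module names (`backbone`, `adapter`) and the toy model are hypothetical stand-ins for illustration, not this repository's actual module layout:

```python
# Conceptual sketch: freeze the pre-trained backbone, train only the adapter.
import torch
import torch.nn as nn

class ToyVFI(nn.Module):
    """Hypothetical stand-in for a pre-trained VFI backbone plus a plug-in adapter."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(6, 3, 3, padding=1)  # stands in for RIFE/IFRNet/UPRNet
        self.adapter = nn.Conv2d(3, 3, 1)              # stands in for the plug-in Adapter

model = ToyVFI()

# Freeze the backbone; only the adapter receives gradients.
for p in model.backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(model.adapter.parameters(), lr=1e-4)
print(sum(p.requires_grad for p in model.parameters()), "trainable tensors")
```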
For RIFE_adapter, you can train via:
cd RIFE_adapter
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py --world_size=2
For IFRNet_adapter, you can train via:
cd IFRNet_adapter
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py --world_size=2
For UPRNet_adapter, you can train via:
cd UPRNet_adapter
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py --world_size=2
After training the plug-in Adapter based on the pre-trained backbones, you can run benchmark tests in each subdirectory. Here we take IFRNet as an example:
cd IFRNet_adapter
CUDA_VISIBLE_DEVICES=0 python benchmark/Vimeo90K_sep.py
CUDA_VISIBLE_DEVICES=0 python benchmark/DAVIS.py
CUDA_VISIBLE_DEVICES=0 python benchmark/SNU_FILM.py
In each script, there is a hyperparameter `adap_step` that controls the number of test-time adaptation steps of the model. The default value is 10.
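Conceptually, each test sample is adapted for `adap_step` gradient steps before the final interpolation is produced. The sketch below only illustrates this idea; the model interface, loss, and function names are hypothetical and do not match the repository's code:

```python
# Conceptual test-time adaptation loop (illustrative only, not the repository's implementation).
# `model(frame0, frame1)` is a hypothetical interface returning the interpolated middle frame.
import torch
import torch.nn.functional as F

def adapt_and_interpolate(model, frame0, frame1, adap_step=10, lr=1e-5):
    optimizer = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr
    )
    for _ in range(adap_step):
        optimizer.zero_grad()
        # Self-supervised proxy: re-interpolating intermediate frames should
        # recover the predicted middle frame (placeholder cycle-style objective).
        mid = model(frame0, frame1)            # predicted middle frame
        q1 = model(frame0, mid)                # roughly t = 0.25
        q3 = model(mid, frame1)                # roughly t = 0.75
        loss = F.l1_loss(model(q1, q3), mid)   # placeholder adaptation loss
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return model(frame0, frame1)
```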
NOTE: If you want to reproduce the results of end-to-end adaptation, you should load the original pre-trained backbone models and adapt all of their parameters. In addition, since the gradient descent in each adaptation involves a certain degree of randomness, multiple runs may be needed to achieve the desired results.
- Data Preparation
- Model Code
- Training Code
- Benchmark Code
- Release Checkpoints
If you use this code for your research or project, please cite:
@inproceedings{Wu_2023_BMVC,
author = {Haoning Wu and Xiaoyun Zhang and Weidi Xie and Ya Zhang and Yan-Feng Wang},
title = {Boost Video Frame Interpolation via Motion Adaptation},
booktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},
publisher = {BMVA},
year = {2023},
url = {https://papers.bmvc2023.org/0179.pdf}
}
Many thanks to the code bases of RIFE, IFRNet, and UPRNet.
If you have any questions, please feel free to contact haoningwu3639@gmail.com.