MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021)

Last update: Jan 29, 2022

Overview

This repository contains the code for MVFNet (Multi-View Fusion Network). The core implementation of the Multi-View Fusion Module is in codes/models/modules/MVF.py.

[Mar 24, 2021] We have released the code of MVFNet.

[Dec 20, 2020] MVFNet has been accepted by AAAI 2021.

Prerequisites
Data Preparation
Model Zoo
Testing
Training

Prerequisites

All dependencies can be installed using pip:

python -m pip install -r requirements.txt

Our experiments run on Python 3.7 and PyTorch 1.5. Other versions should work but are not tested.
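
To see whether a local environment matches the tested versions, a small check like the one below may help. It is only a sketch: the helper names `major_minor` and `environment_report` are hypothetical and not part of this repository, and an "untested" result does not mean the code will fail.

```python
import re
import sys

# Versions stated in this README as the tested configuration.
TESTED_PYTHON = (3, 7)
TESTED_TORCH = (1, 5)


def major_minor(version):
    """Return (major, minor) from a version string such as '1.5.0+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def environment_report():
    """Describe the local Python and PyTorch versions against the tested ones."""
    report = []
    python_version = tuple(sys.version_info[:2])
    status = "tested" if python_version == TESTED_PYTHON else "untested"
    report.append("Python {}.{}: {}".format(python_version[0], python_version[1], status))
    try:
        import torch  # optional here; the report still works without it
    except ImportError:
        report.append("PyTorch: not installed")
    else:
        torch_version = major_minor(torch.__version__)
        status = "tested" if torch_version == TESTED_TORCH else "untested"
        report.append("PyTorch {}.{}: {}".format(torch_version[0], torch_version[1], status))
    return report


if __name__ == "__main__":
    for line in environment_report():
        print(line)
```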

Download Pretrained Models

Download ImageNet pre-trained models

cd pretrained
sh download_imgnet.sh

Download K400 pre-trained models

Please refer to Model Zoo.

Data Preparation

Please refer to DATASETS.md for data preparation.

Model Zoo

Architecture	Dataset	T x interval	Top-1 Acc.	Pre-trained model	Train log	Test log
MVFNet-ResNet50	Kinetics-400	4x16	74.2%	Download link	Log link	Log link
MVFNet-ResNet50	Kinetics-400	8x8	76.0%	Download link	Missing	Log link
MVFNet-ResNet50	Kinetics-400	16x4	77.0%	Download link	Log link	Log link
MVFNet-ResNet101	Kinetics-400	4x16	76.0%	Download link	Log link	Log link
MVFNet-ResNet101	Kinetics-400	8x8	77.4%	Download link	Log link	Log link
MVFNet-ResNet101	Kinetics-400	16x4	78.4%	Download link	Log link	Log link
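
The "T x interval" column can be read as T frames per clip, sampled every "interval" frames, so all three settings in the table span the same 64 frames. The sketch below illustrates that reading only; `clip_frame_indices` and `temporal_span` are hypothetical names, and how the repository actually chooses the start frame of a clip is not described here.

```python
def clip_frame_indices(start, num_frames, interval):
    """Indices of num_frames frames taken every `interval` frames from `start`."""
    return [start + i * interval for i in range(num_frames)]


def temporal_span(num_frames, interval):
    """Number of raw video frames covered by one clip."""
    return num_frames * interval


if __name__ == "__main__":
    # The three settings listed in the Model Zoo table.
    for num_frames, interval in [(4, 16), (8, 8), (16, 4)]:
        indices = clip_frame_indices(0, num_frames, interval)
        print("{}x{}".format(num_frames, interval), temporal_span(num_frames, interval), indices)
```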

Testing

To test with 3 crops and 10 clips, run:

# Dataset: Kinetics-400
# Architecture: R50_8x8, Top-1 Acc.=76.0%
bash scripts/dist_test_recognizer.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py ckpt_path 8 --fcn_testing
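
With 3 crops and 10 clips, each video is evaluated on 30 views. A common protocol is to average the per-class scores over all views and take the highest-scoring class; the sketch below assumes that protocol and is not taken from the repository's test script. The function names and the toy two-class scores are hypothetical.

```python
NUM_CROPS = 3
NUM_CLIPS = 10
NUM_VIEWS = NUM_CROPS * NUM_CLIPS  # 30 views per video


def average_view_scores(view_scores):
    """Average per-class scores over all views of one video."""
    num_views = len(view_scores)
    num_classes = len(view_scores[0])
    return [
        sum(scores[c] for scores in view_scores) / num_views
        for c in range(num_classes)
    ]


def predict(view_scores):
    """Return the index of the class with the highest averaged score."""
    averaged = average_view_scores(view_scores)
    return max(range(len(averaged)), key=averaged.__getitem__)


if __name__ == "__main__":
    # Toy example: 20 views favor class 1, 10 views favor class 0.
    views = [[0.2, 0.8]] * 20 + [[0.9, 0.1]] * 10
    assert len(views) == NUM_VIEWS
    print(average_view_scores(views), predict(views))
```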

Training

This implementation supports multi-GPU training with DistributedDataParallel, which is faster and simpler.

For example, to train MVFNet-ResNet50 on Kinetics-400 with 8 GPUs, you can run:

bash scripts/dist_train_recognizer.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py 8

We also provide scripts to train MVFNet on Kinetics-400 with multiple machines (e.g., 2 machines and 16 GPUs).

# For first machine, --master_addr is the ip of your first machine
bash scripts/dist_train_multinode_1.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py 8

# For second machine, --master_addr is still the ip of your first machine
bash scripts/dist_train_multinode_2.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py 8
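
In a multi-node run, every process on every machine needs a unique global rank, and all of them connect to the master address on the first machine. The sketch below shows the usual mapping used by PyTorch's distributed launcher (global rank = node rank x processes per node + local rank). The contents of the two scripts are not shown here, so treat this as an illustration of the 2-machine, 16-GPU example; the function names are hypothetical.

```python
def world_size(num_nodes, gpus_per_node):
    """Total number of training processes across all machines."""
    return num_nodes * gpus_per_node


def global_rank(node_rank, local_rank, gpus_per_node):
    """Unique rank of a process, given its machine index and local GPU index."""
    return node_rank * gpus_per_node + local_rank


if __name__ == "__main__":
    num_nodes, gpus_per_node = 2, 8  # the 2-machine, 16-GPU example above
    print("world size:", world_size(num_nodes, gpus_per_node))
    for node_rank in range(num_nodes):
        ranks = [global_rank(node_rank, gpu, gpus_per_node) for gpu in range(gpus_per_node)]
        print("machine", node_rank + 1, "ranks:", ranks)
```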

Acknowledgements

We especially thank the contributors of the mmaction codebase for providing helpful code.

License

This repository is released under the Apache-2.0 license, as found in the LICENSE file.

Citation

If you think our work is useful, please feel free to cite our paper 😆 :

@inproceedings{wu2020MVFNet,
  author    = {Wu, Wenhao and He, Dongliang and Lin, Tianwei and Li, Fu and Gan, Chuang and Ding, Errui},
  title     = {MVFNet: Multi-View Fusion Network for Efficient Video Recognition},
  booktitle = {AAAI},
  year      = {2021}
}

Contact

For any questions, please file an issue or contact:

Wenhao Wu: [email protected]
