Compressed Video Action Recognition

Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl.
In CVPR, 2018. [Project Page]

Overview

This is a reimplementation of CoViAR in PyTorch (the original paper uses MXNet). The code currently supports UCF-101 and HMDB-51; Charades support is coming soon. (This is a work in progress. Any suggestions are appreciated.)

Results

This code produces results comparable to or better than those reported in the original paper:
HMDB-51: 52% (I-frame), 40% (motion vector), 43% (residuals), 59.2% (CoViAR).
UCF-101: 87% (I-frame), 70% (motion vector), 80% (residuals), 90.5% (CoViAR).
(Accuracy averaged over 3 splits, without optical flow.)
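
The CoViAR numbers above combine the three per-stream models. As a rough illustration of how such a combination can be computed, the sketch below fuses per-class scores with a weighted sum and takes the argmax. The function names and the weights `w_mv` and `w_res` are hypothetical and are not taken from this repository; see GETTING_STARTED.md for the actual procedure.

```python
import numpy as np

def fuse_scores(iframe, mv, residual, w_mv=1.0, w_res=1.0):
    """Late fusion sketch: weighted sum of per-class scores, then argmax.

    Each input has shape (num_videos, num_classes).
    w_mv and w_res are illustrative weights (assumed, not from the repo).
    """
    iframe, mv, residual = (
        np.asarray(a, dtype=np.float64) for a in (iframe, mv, residual)
    )
    fused = iframe + w_mv * mv + w_res * residual
    return fused.argmax(axis=1)

def accuracy(pred, labels):
    """Fraction of videos whose predicted class matches the label."""
    return float(np.mean(np.asarray(pred) == np.asarray(labels)))

# Toy scores for 2 videos and 3 classes (made-up numbers).
iframe_scores = [[2.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
mv_scores = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
residual_scores = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

pred = fuse_scores(iframe_scores, mv_scores, residual_scores)
print(pred, accuracy(pred, [0, 2]))
```

In this toy case the motion-vector stream changes the prediction for the second video, which is the kind of complementary signal a fused model can exploit.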

Data loader

We provide a Python data loader that takes a compressed video directly and returns the compressed representation (I-frames, motion vectors, and residuals) as a NumPy array. We can thus train the model without extracting and storing all representations as image files.

In our experiments, the loader is fast enough that it does not delay GPU training. Please see GETTING_STARTED.md for details and instructions.
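
To make the three representations concrete, the following is a minimal NumPy sketch of how a predicted frame relates to them under a simplified codec model: each pixel is copied from a reference frame at a displaced location, then corrected by the residual. This is not the repository's loader; `reconstruct_p_frame`, the array shapes, and the per-pixel motion field are assumptions made for illustration.

```python
import numpy as np

def reconstruct_p_frame(reference, motion_vectors, residual):
    """Rebuild a frame as: reference shifted by motion vectors, plus residual.

    reference:      (H, W, 3) uint8 image (e.g. a decoded I-frame)
    motion_vectors: (H, W, 2) integer displacements (dx, dy) per pixel
    residual:       (H, W, 3) signed integer correction
    """
    h, w, _ = reference.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Source location of each pixel in the reference, clamped to the image.
    src_x = np.clip(xs - motion_vectors[..., 0], 0, w - 1)
    src_y = np.clip(ys - motion_vectors[..., 1], 0, h - 1)
    predicted = reference[src_y, src_x].astype(np.int32)
    return np.clip(predicted + residual, 0, 255).astype(np.uint8)

# Toy 2x3 frame whose pixel value encodes its column index.
reference = np.zeros((2, 3, 3), dtype=np.uint8)
reference[:, 1] = 10
reference[:, 2] = 20

# Every pixel moved one step to the right since the reference frame.
motion_vectors = np.zeros((2, 3, 2), dtype=np.int64)
motion_vectors[..., 0] = 1
residual = np.zeros((2, 3, 3), dtype=np.int32)

frame = reconstruct_p_frame(reference, motion_vectors, residual)
print(frame[0, :, 0])
```

Motion vectors and residuals are much sparser than full RGB frames, which is why a loader can hand them to the network directly instead of decoding and storing every frame as an image.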

Using CoViAR

Please see GETTING_STARTED.md for instructions for training and inference.

Citation

If you find this model useful for your research, please use the following BibTeX entry.

@inproceedings{wu2018coviar,
  title={Compressed Video Action Recognition},
  author={Wu, Chao-Yuan and Zaheer, Manzil and Hu, Hexiang and Manmatha, R and Smola, Alexander J and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={CVPR},
  year={2018}
}

Acknowledgment

This implementation borrows largely from tsn-pytorch by yjxiong. Part of the data loader implementation is adapted from this tutorial and the FFmpeg extract_mv example.
