A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

Last update: Dec 23, 2022

Overview

SlowFast

A PyTorch implementation of SlowFast, based on the ICCV 2019 paper "SlowFast Networks for Video Recognition".

Requirements

conda install pytorch=1.9.1 torchvision cudatoolkit -c pytorch

PyTorchVideo

pip install pytorchvideo

Dataset

The Kinetics-400 dataset is used in this repo; you can download it from the official website. The data directory structure is as follows:

data
├── train
│   ├── abseiling
│   │   ├── _4YTwq0-73Y_000044_000054.mp4
│   │   └── ...
│   ├── archery
│   │   └── same structure as abseiling
│   └── ...
└── test
    └── same structure as train
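To make the layout concrete, here is a minimal sketch of how the class folders could be indexed into (clip path, label) pairs. The function name `index_split` and the alphabetical label assignment are assumptions for illustration only; the repo's own data loading may work differently.

```python
from pathlib import Path


def index_split(data_root, split):
    """Index data_root/<split>/<class_name>/<clip>.mp4 into labelled samples.

    Hypothetical helper: labels are assigned by sorting class folder names
    alphabetically, which is an assumption, not the repo's documented behaviour.
    """
    split_dir = Path(data_root) / split
    # Every sub-directory of the split is treated as one action class.
    class_names = sorted(d.name for d in split_dir.iterdir() if d.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(class_names)}
    # Collect each .mp4 clip together with the index of its class folder.
    samples = [
        (str(clip), class_to_idx[name])
        for name in class_names
        for clip in sorted((split_dir / name).glob("*.mp4"))
    ]
    return class_names, samples
```

With the dataset in place, `index_split("data", "train")` would return the list of class names and one `(path, label)` tuple per clip.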

Usage

Train Model

python train.py --batch_size 16
optional arguments:
--data_root                   Datasets root path [default value is 'data']
--batch_size                  Number of videos in each mini-batch [default value is 8]
--epochs                      Number of epochs to train the model [default value is 10]
--save_root                   Root path for saved results [default value is 'result']
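The options above map naturally onto `argparse`. The following is a sketch of how such a parser might be declared, using the flag names and defaults listed above; the argument types and the function name `build_train_parser` are assumptions, and the actual `train.py` may differ.

```python
import argparse


def build_train_parser():
    """Hypothetical parser mirroring the documented training options."""
    parser = argparse.ArgumentParser(description="Train Model")
    parser.add_argument("--data_root", type=str, default="data",
                        help="Datasets root path")
    parser.add_argument("--batch_size", type=int, default=8,
                        help="Number of videos in each mini-batch")
    parser.add_argument("--epochs", type=int, default=10,
                        help="Number of epochs to train the model")
    parser.add_argument("--save_root", type=str, default="result",
                        help="Root path for saved results")
    return parser


# Equivalent to running: python train.py --batch_size 16
args = build_train_parser().parse_args(["--batch_size", "16"])
print(args.data_root, args.batch_size, args.epochs, args.save_root)
```

Flags that are not passed on the command line keep their defaults, so only `batch_size` changes in this example.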

Test Model

python test.py --video_path data/test/beatboxing/5s_gFWie1Ys_000069_000079.mp4
optional arguments:
--model_path                  Model path [default value is 'result/slow_fast.pth']
--video_path                  Video path [default value is 'data/test/applauding/_V-dzjftmCQ_000023_000033.mp4']
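Because each clip sits in a folder named after its class, the expected label for a test video can be read from its path. The sketch below pairs a parser for the two documented options with a hypothetical helper, `expected_label`, that does this; neither is taken from the repo's `test.py`.

```python
import argparse
from pathlib import Path


def build_test_parser():
    """Hypothetical parser mirroring the documented test options."""
    parser = argparse.ArgumentParser(description="Test Model")
    parser.add_argument("--model_path", type=str,
                        default="result/slow_fast.pth", help="Model path")
    parser.add_argument("--video_path", type=str,
                        default="data/test/applauding/_V-dzjftmCQ_000023_000033.mp4",
                        help="Video path")
    return parser


def expected_label(video_path):
    """Return the class name, i.e. the folder that contains the clip."""
    return Path(video_path).parent.name


# Equivalent to running: python test.py --video_path data/test/beatboxing/...
args = build_test_parser().parse_args(
    ["--video_path", "data/test/beatboxing/5s_gFWie1Ys_000069_000079.mp4"]
)
print(expected_label(args.video_path))  # beatboxing
```

Comparing this folder-derived name with the model's prediction is a quick sanity check when testing single clips.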


Owner

Hao Ren
