PyVideoAI: Action Recognition Framework

Overview

This reposity contains official implementation of:

PyVideoAI: Action Recognition Framework

The only framework that completes your computer vision, action recognition research environment.

** Key features **

  • Supports multi-gpu, multi-node training.
  • STOA models such as I3D, Non-local, TSN, TRN, TSM, MVFNet, ..., and even ImageNet training!
  • Many datasets such as Kinetics-400, EPIC-Kitchens-55, Something-Something-V1/V2, HMDB-51, UCF-101, Diving48, CATER, ...
  • Supports both video decoding (straight from .avi/mp4) and frame extracted (.jpg/png) dataloaders, sparse-sample and dense-sample.
  • Any popular LR scheduling like Cosine Annealing with Warm Restart, Step LR, and Reduce LR on Plateau.
  • Early stopping when training doesn't improve (customise your condition)
  • Easily add custom model, optimiser, scheduler, loss and dataloader!
  • Telegram bot reporting experiment status.
  • TensorBoard reporting stats.
  • Colour logging
  • All of the above come with no extra setup. Trust me and try some examples.

** Papers implemented **

This package is motivated by PySlowFast from Facebook AI. The PySlowFast is a cool framework, but it depends too much on their config system and it was difficult to add new models (other codes) or reuse part of the modules from the framework.
This framework by Kiyoon, is designed to replace all the configuration systems to Python files, which enables easy-addition of custom models/LR scheduling/dataloader etc.
Just modify the function bodies in the config files!

Difference between the two config systems can be found in CONFIG_SYSTEM.md.

Getting Started

Jupyter Notebook examples to run:

  • HMDB-51 data preparation
  • Inference on pre-trained model from the model zoo, and visualise model/dataloader/per-class performance.
  • Training I3D using Kinetics pretrained model
  • Using image model and ImageNet dataset

is provided in the examples!

Structure

All of the executable files are in tools/.
dataset_configs/ directory configures datasets. For example, where is the dataset stored, number of classes, single-label or multi-label training, dataset-specific visualisation settings (confusion matrix has different output sizes)
model_configs/ directory configures model architectures. For example, model definition, input preprocessing mean/std.
exp_configs/ directory configures other training settings like optimiser, scheduling, dataloader, number of frames as input. The config file path has to be in exp_configs/[dataset_name]/[model_name]_[experiment_name].py format.

Usage

Preparing datasets

This package supports many action recognition datasets such as HMDB-51, EPIC-Kitchens-55, Something-Something-V1, CATER, etc.
Refer to DATASET.md.

Training command

CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/run_train.py -D {dataset_config_name} -M {model_config_name} -E {exp_config_name} --local_world_size {num_GPUs} -e {num_epochs}

--local_world_size denotes the number of GPUs per computing node.

Telegram Bot

You can preview experiment results using Telegram bots!
Telegram bot stat report example

If your code raises an exception, it will report you too.
Telegram error report example

You can quickly take a look at example video inputs (as GIF or JPEGs) from the dataloader.
Use tools/visualisations/model_and_dataloader_visualiser.py
Telegram video input report example

[Telegram0]
token=
chat_id=

Model Zoo and Baselines

Refer to MODEL_ZOO.md

Installation

Refer to INSTALL.md.

TL;DR,

conda create -n videoai python=3.8
conda activate videoai
conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=10.2 -c pytorch
### For RTX 30xx GPUs,
#conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.1 -c pytorch -c nvidia
 

git clone --recurse-submodules https://github.com/kiyoon/PyVideoAI.git
cd PyVideoAI
git checkout v0.3
git submodule update --recursive
cd submodules/video_datasets_api
pip install -e .
cd ../experiment_utils
pip install -e .
cd ../..
pip install -e .

Experiment outputs

The experiment results (log, training stats, weights, tensorboard, plots, etc.) are saved to data/experiments by default. This can be huge, so make sure you make a softlink of a directory you really want to use. (recommended)

Otherwise, you can change pyvideoai/config.py's DEFAULT_EXPERIMENT_ROOT value. Or, you can also set --experiment_root argument manually when executing.

Owner
Kiyoon Kim
Computer scientist with computer vision, machine learning and signal processing background.
Kiyoon Kim
LSTM built using Keras Python package to predict time series steps and sequences. Includes sin wave and stock market data

LSTM Neural Network for Time Series Prediction LSTM built using the Keras Python package to predict time series steps and sequences. Includes sine wav

Jakob Aungiers 4.1k Jan 02, 2023
Code for CPM-2 Pre-Train

CPM-2 Pre-Train Pre-train CPM-2 此分支为110亿非 MoE 模型的预训练代码,MoE 模型的预训练代码请切换到 moe 分支 CPM-2技术报告请参考link。 0 模型下载 请在智源资源下载页面进行申请,文件介绍如下: 文件名 描述 参数大小 100000.tar

Tsinghua AI 136 Dec 28, 2022
Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

marge This repository releases the code for Generating Query Focused Summaries from Query-Free Resources. Please cite the following paper [bib] if you

Yumo Xu 28 Nov 10, 2022
[arXiv] What-If Motion Prediction for Autonomous Driving ❓🚗💨

WIMP - What If Motion Predictor Reference PyTorch Implementation for What If Motion Prediction [PDF] [Dynamic Visualizations] Setup Requirements The W

William Qi 96 Dec 29, 2022
A project to build an AI voice assistant using Python . The Voice assistant interacts with the humans to perform basic tasks.

AI_Personal_Voice_Assistant_Using_Python A project to build an AI voice assistant using Python . The Voice assistant interacts with the humans to perf

Chumui Tripura 1 Oct 30, 2021
BraTs-VNet - BraTS(Brain Tumour Segmentation) using V-Net

BraTS(Brain Tumour Segmentation) using V-Net This project is an approach to dete

Rituraj Dutta 7 Nov 27, 2022
tmm_fast is a lightweight package to speed up optical planar multilayer thin-film device computation.

tmm_fast tmm_fast or transfer-matrix-method_fast is a lightweight package to speed up optical planar multilayer thin-film device computation. It is es

26 Dec 11, 2022
Official implementation for "QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation" (CVPR 2022)

QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation (CVPR2022) https://arxiv.org/abs/2203.08483 Unpaired image-to-image (I2I

Xueqi Hu 50 Dec 16, 2022
PFENet: Prior Guided Feature Enrichment Network for Few-shot Segmentation (TPAMI).

PFENet This is the implementation of our paper PFENet: Prior Guided Feature Enrichment Network for Few-shot Segmentation that has been accepted to IEE

DV Lab 230 Dec 31, 2022
Pixel-level Crack Detection From Images Of Levee Systems : A Comparative Study

PIXEL-LEVEL CRACK DETECTION FROM IMAGES OF LEVEE SYSTEMS : A COMPARATIVE STUDY G

Manisha Panta 2 Jul 23, 2022
Python implementation of "Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation"

MIPNet: Multi-Instance Pose Networks This repository is the official pytorch python implementation of "Multi-Instance Pose Networks: Rethinking Top-Do

Rawal Khirodkar 57 Dec 12, 2022
Code accompanying the paper Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs (Chen et al., CVPR 2020, Oral).

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs This repository contains PyTorch implementation of our pa

Shizhe Chen 178 Dec 29, 2022
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Figure: Shape-Accurate 3D-Aware Image Synthesis. A Shading-Guid

Xingang Pan 115 Dec 18, 2022
Open-World Entity Segmentation

Open-World Entity Segmentation Project Website Lu Qi*, Jason Kuen*, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia This projec

DV Lab 410 Jan 03, 2023
Pytorch library for seismic data augmentation

Pytorch library for seismic data augmentation

Artemii Novoselov 27 Nov 22, 2022
Pytorch implementation for Patient Knowledge Distillation for BERT Model Compression

Patient Knowledge Distillation for BERT Model Compression Knowledge distillation for BERT model Installation Run command below to install the environm

Siqi 180 Dec 19, 2022
Catch-all collection of generative art made using processing

Generative art with Processing.py Some art I have created for fun. Dependencies Processing for Python, see how to download/use here Packages contained

2 Mar 12, 2022
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model

SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model Edresson Casanova, Christopher Shulby, Eren Gölge, Nicolas Michael Müller, Frede

Edresson Casanova 92 Dec 09, 2022
PyTorch implementation of Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

Simple PyTorch Implementation of "Grokking" Implementation of Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets Usage Running

Teddy Koker 15 Sep 29, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022