This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021)

Overview

Elaborative Rehearsal for Zero-shot Action Recognition

This is an official implementation of:

Shizhe Chen and Dong Huang, Elaborative Rehearsal for Zero-shot Action Recognition, ICCV, 2021. Arxiv Version

Elaborating a new concept and relating it to known concepts, we reach the dawn of zero-shot action recognition models being comparable to supervised models trained on few samples.

New SOTA results are also achieved on the standard ZSAR benchmarks (Olympics, HMDB51, UCF101) as well as the first large scale ZSAR benchmak (we proposed) on the Kinetics database.
PWC PWC PWC PWC

Installation

git clone https://github.com/DeLightCMU/ElaborativeRehearsal.git
cd ElaborativeRehearsal
export PYTHONPATH=$(pwd):${PYTHONPATH}

pip install -r requirements.txt

# download pretrained models
bash scripts/download_premodels.sh

Zero-shot Action Recognition (ZSAR)

Extract Features in Video

  1. spatial-temporal features
bash scripts/extract_tsm_features.sh '0,1,2'
  1. object features
bash scripts/extract_object_features.sh '0,1,2'

ZSAR Training and Inference

  1. Baselines: DEVISE, ALE, SJE, DEM, ESZSL and GCN.
# mtype: devise, ale, sje, dem, eszsl
mtype=devise
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_baselines.py zeroshot/configs/zsl_baseline_${mtype}_config.yaml ${mtype} --is_train
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_baselines.py zeroshot/configs/zsl_baseline_${mtype}_config.yaml ${mtype} --eval_set tst
# evaluate other splits
ksplit=1
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_baselines_eval_splits.py zeroshot/configs/zsl_baseline_${mtype}_config.yaml ${mtype} ${ksplit}

# gcn
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_kgraphs.py zeroshot/configs/zsl_baseline_kgraph_config.yaml --is_train
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_kgraphs.py zeroshot/configs/zsl_baseline_kgraph_config.yaml --eval_set tst
  1. ER-ZSAR and ablations:
# TSM + ED class representation + AttnPool (2nd row in Table 4(b))
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_vse.py zeroshot/configs/zsl_vse_wordembed_config.yaml --is_train --resume_file datasets/Kinetics/zsl220/word.glove42b.th

# TSM + ED class representation + BERT (last row in Table 4(a) and Table 4(b))
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_vse.py zeroshot/configs/zsl_vse_config.yaml --is_train

# Obj + ED class representation + BERT + ER Loss (last row in Table 4(c))
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_cptembed.py zeroshot/configs/zsl_cpt_config.yaml --is_train

# ER-ZSAR Full Model
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_ervse.py zeroshot/configs/zsl_ervse_config.yaml --is_train

Citation

If you find this repository useful, please cite our paper:

@proceeding{ChenHuang2021ER,
  title={Elaborative Rehearsal for Zero-shot Action Recognition},
  author={Shizhe Chen and Dong Huang},
  booktitle = {ICCV},
  year={2021}
}

Acknowledgement

Owner
DeLightCMU
Research group at CMU
DeLightCMU
SARS-Cov-2 Recombinant Finder for fasta sequences

Sc2rf - SARS-Cov-2 Recombinant Finder Pronounced: Scarf What's this? Sc2rf can search genome sequences of SARS-CoV-2 for potential recombinants - new

Lena Schimmel 41 Oct 03, 2022
Using Python to Play Cyberpunk 2077

CyberPython 2077 Using Python to Play Cyberpunk 2077 This repo will contain code from the Cyberpython 2077 video series on Youtube (youtube.

Harrison 118 Oct 18, 2022
Code for Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations

Implementation for Iso-Points (CVPR 2021) Official code for paper Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations paper |

Yifan Wang 66 Nov 08, 2022
Pre-trained NFNets with 99% of the accuracy of the official paper

NFNet Pytorch Implementation This repo contains pretrained NFNet models F0-F6 with high ImageNet accuracy from the paper High-Performance Large-Scale

Benjamin Schmidt 133 Dec 09, 2022
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
Deep Learning for humans

Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For

Keras 57k Jan 09, 2023
Code accompanying our NeurIPS 2021 traffic4cast challenge

Traffic forecasting on traffic movie snippets This repo contains all code to reproduce our approach to the IARAI Traffic4cast 2021 challenge. In the c

Nina Wiedemann 2 Aug 09, 2022
Code for paper "Which Training Methods for GANs do actually Converge? (ICML 2018)"

GAN stability This repository contains the experiments in the supplementary material for the paper Which Training Methods for GANs do actually Converg

Lars Mescheder 885 Jan 01, 2023
Geometry-Free View Synthesis: Transformers and no 3D Priors

Geometry-Free View Synthesis: Transformers and no 3D Priors Geometry-Free View Synthesis: Transformers and no 3D Priors Robin Rombach*, Patrick Esser*

CompVis Heidelberg 293 Dec 22, 2022
Bot developed in Python that automates races in pegaxy.

español | português About it: This is a fork from pega-racing-bot. This bot, developed in Python, is to automate races in pegaxy. The game developers

4 Apr 08, 2022
Distributed Asynchronous Hyperparameter Optimization better than HyperOpt.

UltraOpt : Distributed Asynchronous Hyperparameter Optimization better than HyperOpt. UltraOpt is a simple and efficient library to minimize expensive

98 Aug 16, 2022
Optimus: the first large-scale pre-trained VAE language model

Optimus: the first pre-trained Big VAE language model This repository contains source code necessary to reproduce the results presented in the EMNLP 2

314 Dec 19, 2022
Cave Generation using metaballs in Blender. Originally created by sdfgeoff, Edited by Myself (Archie Jaskowicz).

Blender-Cave-Generation Cave Generation using metaballs in Blender. Originally created by sdfgeoff, Edited by Myself (Archie Jaskowicz). Installation

2 Dec 28, 2022
Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

Relational Self-Attention: What's Missing in Attention for Video Understanding This repository is the official implementation of "Relational Self-Atte

mandos 43 Dec 07, 2022
Rasterize with the least efforts for researchers.

utils3d Rasterize and do image-based 3D transforms with the least efforts for researchers. Based on numpy and OpenGL. It could be helpful when you wan

Ruicheng Wang 8 Dec 15, 2022
A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run.

Minimal Hand A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run. This project provides the

Yuxiao Zhou 824 Jan 07, 2023
Predicting Price of house by considering ,house age, Distance from public transport

House-Price-Prediction Predicting Price of house by considering ,house age, Distance from public transport, No of convenient stores around house etc..

Musab Jaleel 1 Jan 08, 2022
An implementation of an abstract algebra for music tones (pitches).

nbdev template Use this template to more easily create your nbdev project. If you are using an older version of this template, and want to upgrade to

Open Music Kit 0 Oct 10, 2022
Model Zoo for AI Model Efficiency Toolkit

We provide a collection of popular neural network models and compare their floating point and quantized performance.

Qualcomm Innovation Center 137 Jan 03, 2023
WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose

WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose Yijun Zhou and James Gregson - BMVC2020 Abstract: We present an end-to-end head-pos

368 Dec 26, 2022