A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.

Overview

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

The source codes for ICCV2021 Paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition.
[paper] [supplemental material] [arXiv]

If you find our work or the codebase inspiring and useful to your research, please cite

@inproceedings{yuan2021DIN,
  title={Spatio-Temporal Dynamic Inference Network for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong and Wang, Mang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={7476--7485},
  year={2021}
}

Dependencies

  • Software Environment: Linux (CentOS 7)
  • Hardware Environment: NVIDIA TITAN RTX
  • Python 3.6
  • PyTorch 1.2.0, Torchvision 0.4.0
  • RoIAlign for Pytorch

Prepare Datasets

  1. Download publicly available datasets from following links: Volleyball dataset and Collective Activity dataset.
  2. Unzip the dataset file into data/volleyball or data/collective.
  3. Download the file tracks_normalized.pkl from cvlab-epfl/social-scene-understanding and put it into data/volleyball/videos

Using Docker

  1. Checkout repository and cd PROJECT_PATH

  2. Build the Docker container

docker build -t din_gar https://github.com/JacobYuan7/DIN_GAR.git#main
  1. Run the Docker container
docker run --shm-size=2G -v data/volleyball:/opt/DIN_GAR/data/volleyball -v result:/opt/DIN_GAR/result --rm -it din_gar
  • --shm-size=2G: To prevent ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)., you have to extend the container's shared memory size. Alternatively: --ipc=host
  • -v data/volleyball:/opt/DIN_GAR/data/volleyball: Makes the host's folder data/volleyball available inside the container at /opt/DIN_GAR/data/volleyball
  • -v result:/opt/DIN_GAR/result: Makes the host's folder result available inside the container at /opt/DIN_GAR/result
  • -it & --rm: Starts the container with an interactive session (PROJECT_PATH is /opt/DIN_GAR) and removes the container after closing the session.
  • din_gar the name/tag of the image
  • optional: --gpus='"device=7"' restrict the GPU devices the container can access.

Get Started

  1. Train the Base Model: Fine-tune the base model for the dataset.

    # Volleyball dataset
    cd PROJECT_PATH 
    python scripts/train_volleyball_stage1.py
    
    # Collective Activity dataset
    cd PROJECT_PATH 
    python scripts/train_collective_stage1.py
  2. Train with the reasoning module: Append the reasoning modules onto the base model to get a reasoning model.

    1. Volleyball dataset

      • DIN

        python scripts/train_volleyball_stage2_dynamic.py
        
      • lite DIN
        We can run DIN in lite version by setting cfg.lite_dim = 128 in scripts/train_volleyball_stage2_dynamic.py.

        python scripts/train_volleyball_stage2_dynamic.py
        
      • ST-factorized DIN
        We can run ST-factorized DIN by setting cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.hierarchical_inference = True.

        Note that if you set cfg.hierarchical_inference = False, cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.num_DIN = 2, then multiple interaction fields run in parallel.

        python scripts/train_volleyball_stage2_dynamic.py
        

      Other model re-implemented by us according to their papers or publicly available codes:

      • AT
        python scripts/train_volleyball_stage2_at.py
        
      • PCTDM
        python scripts/train_volleyball_stage2_pctdm.py
        
      • SACRF
        python scripts/train_volleyball_stage2_sacrf_biute.py
        
      • ARG
        python scripts/train_volleyball_stage2_arg.py
        
      • HiGCIN
        python scripts/train_volleyball_stage2_higcin.py
        
    2. Collective Activity dataset

      • DIN
        python scripts/train_collective_stage2_dynamic.py
        
      • DIN lite
        We can run DIN in lite version by setting 'cfg.lite_dim = 128' in 'scripts/train_collective_stage2_dynamic.py'.
        python scripts/train_collective_stage2_dynamic.py
        

Another work done by us, solving GAR from the perspective of incorporating visual context, is also available.

@inproceedings{yuan2021visualcontext,
  title={Learning Visual Context for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={4},
  pages={3261--3269},
  year={2021}
}
Owner
A Ph.D. candidate and a realistic idealist.
Rust bindings for the C++ api of PyTorch.

tch-rs Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorc

Laurent Mazare 2.3k Dec 30, 2022
The Submission for SIMMC 2.0 Challenge 2021

The Submission for SIMMC 2.0 Challenge 2021 challenge website Requirements python 3.8.8 pytorch 1.8.1 transformers 4.8.2 apex for multi-gpu nltk Prepr

5 Jul 26, 2022
Convert game ISO and archives to CD CHD for emulation on Linux.

tochd Convert game ISO and archives to CD CHD for emulation. Author: Tuncay D. Source: https://github.com/thingsiplay/tochd Releases: https://github.c

Tuncay 20 Jan 02, 2023
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".

LOREN Resources for our AAAI 2022 paper (pre-print): "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification". DEMO System Check out o

Jiangjie Chen 37 Dec 27, 2022
Auto White-Balance Correction for Mixed-Illuminant Scenes

Auto White-Balance Correction for Mixed-Illuminant Scenes Mahmoud Afifi, Marcus A. Brubaker, and Michael S. Brown York University Video Reference code

Mahmoud Afifi 47 Nov 26, 2022
LineBoard - Python+React+MySQL-白板即時系統改善人群行為

LineBoard-白板即時系統改善人群行為 即時顯示實驗室的使用狀況,並遠端預約排隊,以此來改善人們的工作效率 程式架構 運作流程 使用者先至該實驗室網站預約

Bo-Jyun Huang 1 Feb 22, 2022
An example of semantic segmentation using tensorflow in eager execution.

Semantic segmentation using Tensorflow eager execution Requirement Python 2.7+ Tensorflow-gpu OpenCv H5py Scikit-learn Numpy Imgaug Train with eager e

Iñigo Alonso Ruiz 25 Sep 29, 2022
Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral)

Greedy Gradient Ensemble for De-biased VQA Code release for "Greedy Gradient Ensemble for Robust Visual Question Answering" (ICCV 2021, Oral). GGE can

21 Jun 29, 2022
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans Introduction We introduce the task of dense captioning in 3D scans from commodity RGB-D sensor

Dave Z. Chen 79 Nov 07, 2022
Pansharpening by convolutional neural networks in the full resolution framework

Z-PNN: Zoom Pansharpening Neural Network Pansharpening by convolutional neural networks in the full resolution framework is a deep learning method for

20 Nov 24, 2022
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers

CvT: Introducing Convolutions to Vision Transformers Pytorch implementation of CvT: Introducing Convolutions to Vision Transformers Usage: img = torch

Rishikesh (ऋषिकेश) 193 Jan 03, 2023
Pixel Consensus Voting for Panoptic Segmentation (CVPR 2020)

Implementation for Pixel Consensus Voting (CVPR 2020). This codebase contains the essential ingredients of PCV, including various spatial discretizati

Haochen 23 Oct 25, 2022
Codes for Causal Semantic Generative model (CSG), the model proposed in "Learning Causal Semantic Representation for Out-of-Distribution Prediction" (NeurIPS-21)

Learning Causal Semantic Representation for Out-of-Distribution Prediction This repository is the official implementation of "Learning Causal Semantic

Chang Liu 54 Dec 01, 2022
Turn based roguelike in python

pyTB Turn based roguelike in python Documentation can be found here: http://mcgillij.github.io/pyTB/index.html Screenshot Dependencies Written in Pyth

Jason McGillivray 4 Sep 29, 2022
Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI)

Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI) Preparation Clone the Synchronized-BatchNorm-P

Fangneng Zhan 12 Aug 10, 2022
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your personal computer!

Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your machine! Motivation Would

Joeri Hermans 15 Sep 11, 2022
(CVPR 2022) A minimalistic mapless end-to-end stack for joint perception, prediction, planning and control for self driving.

LAV Learning from All Vehicles Dian Chen, Philipp Krähenbühl CVPR 2022 (also arXiV 2203.11934) This repo contains code for paper Learning from all veh

Dian Chen 300 Dec 15, 2022
Stochastic Scene-Aware Motion Prediction

Stochastic Scene-Aware Motion Prediction [Project Page] [Paper] Description This repository contains the training code for MotionNet and GoalNet of SA

Mohamed Hassan 31 Dec 09, 2022
Official code for the paper "Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks".

Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks This repository contains the official code for the

Linus Ericsson 11 Dec 16, 2022