Discovering Dynamic Salient Regions with Spatio-Temporal Graph Neural Networks

Overview

Discovering Dynamic Salient Regions with Spatio-Temporal Graph Neural Networks

This is the official code for DyReg model inroduced in Discovering Dynamic Salient Regions with Spatio-Temporal Graph Neural Networks

Citation

Please use the following BibTeX to cite our work.

@incollection{duta2021dynamic_dyreg_gnn_neurips2021,
title = {Discovering Dynamic Salient Regions with Spatio-Temporal Graph
Neural Networks},
author = {Duta, Iulia and Nicolicioiu, Andrei and Leordeanu, Marius},
booktitle = {Advances in Neural Information Processing Systems 34},
year = {2021}
}

@article{duta2020dynamic_dyreg,
title = {Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning},
author = {Duta, Iulia and Nicolicioiu, Andrei and Leordeanu, Marius},
journal = {NeurIPS 2020 Workshop on Object Representations for Learning and Reasoning},
year = {2020},
}

Requirements

The code was developed using:

- python 3.7
- matplotlib
- torch 1.7.1
- script
- pandas
- torchvision
- moviepy
- ffmpeg

Overview:

The repository contains the Pytorch implementation of the DyReg-GNN model. The model is defined and trained in the following files:

  • ops/dyreg.py - code for our DyReg module

  • ops/rstg.py - code for the Spatio-temporal GNN (RSTG) used to process the graph extracted using DyReg

  • create_model.py - two examples how to integrate the DyReg-GNN module inside an existing backbone

  • main_standard.py - code to train a model on Smt-Smt dataset

  • test_models.py - code for multi-clip evaluation

Scripts for preparing the data, training and testing the model:

Prepare dataset

For Something Something dataset:

  • the json files containing meta-data should be stored in ./data/smt-smt-V2/tsm_data
  • the zip files containing the videos should be stored in ./data/smt-smt-V2/

  1. To extract the videos from the zip files run:

cat 20bn-something-something-v2-?? | tar zx

  1. To extract the frames from videos run:

python tools/vid2img_sthv2.py

→ The videos will be stored in $FRAME_ROOT (default './data/smt-smt-V2/tmp_smt-smt-V2-frames')

💡 If you already have the dataset as frames, place them under ./data/smt-smt-V2/smt-smt-V2-frames/, one folder for each video
💡 💡 If you need to change the path for datasets modify $ROOT_DATASET in dataset_config.py

  1. To generate the labels file in the required format please run:

python tools/gen_label_sthv2.py

→ The resulting txt files, for each split, will be stored in $DATA_UTILS_ROOT (default './data/smt-smt-V2/tsm_data/')

How to run the model

DyReg-GNN module can be simply inserted into any space-time model.

import torch
from torch.nn import functional as F
from ops.dyreg import DynamicGraph, dyregParams

class SpaceTimeModel(torch.nn.Module):
    def __init__(self):
        super(SpaceTimeModel, self).__init__()
        dyreg_params = dyregParams()
        dyregParams.offset_lstm_dim = 32
        self.dyreg = DynamicGraph(dyreg_params,
                    backbone_dim=32, node_dim=32, out_num_ch=32,
                    H=16, W=16, 
                    iH=16, iW=16,
                    project_i3d=False,
                    name='lalalal')


        self.fc = torch.nn.Linear(32, 10)

    def forward(self, x):
        dx = self.dyreg(x)
        # you can initialize the dyreg branch as identity function by normalisation, 
        #   as done in DynamicGraphWrapper found in ./ops/dyreg.py 
        x = x + dx
        # average over time and space: T, H, W
        x = x.mean(-1).mean(-1).mean(-2)
        x = self.fc(x)
        return x


B = 8
T = 10
C = 32
H = 16
W = 16
x = torch.ones(B,T,C,H,W)
st_model = SpaceTimeModel()
out = st_model(x)

For another example of how to integrate DyReg (DynamicGraph module) inside your model please look at create_model.py or run:

python create_model.py

Something-Something experiments

Training a model

To train a model on smt-smt v2 dataset please run

./start_main_standard.sh model_name

For default hyperparameters check opts.py. For example, place_graph flag controls how many DyReg-GNN modules to use and where to place them inside the backbone:

# for a model with 3 DyReg-GNN modules placed after layer 2-block 2, layer 3-block 4 and layer 4-block 1 of the backbone
--place_graph=layer2.2_layer3.4_layer4.1 
# for a model with 1 dyreg module placed after layer 3 block 4 of the backbone
--place_graph=layer3.4                   

Single clip evaluation

Train a model with the above script or download a pre-trained DyReg-GNN model from here and put the checkpoint in ./ckeckpoints/

To evaluate a model on smt-smt v2 dataset on a single 224 x 224 central crop, run:

./start_main_standard_test.sh model_name

The flag $RESUME_CKPT indicate the the checkpoint used for evaluation.

Multi clips evaluation

To evaluate a model in the multi-clips setup (3 spatials clips x 2 temporal samplings) on Smt-Smt v2 dataset please run

./evaluate_model.sh model_name

The flag $RESUME_CKPT indicate the the checkpoint used for evaluation.

TSM Baseline

This repository adds DyReg-GNN modules to a TSM backbone based on code from here.

Owner
Bitdefender Machine Learning
Machine Learning Research @ Bitdefender
Bitdefender Machine Learning
Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pytorch Lightning 1.4k Jan 01, 2023
Model Zoo of BDD100K Dataset

Model Zoo of BDD100K Dataset

ETH VIS Group 200 Dec 27, 2022
Official repo for SemanticGAN https://nv-tlabs.github.io/semanticGAN/

SemanticGAN This is the official code for: Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalizat

151 Dec 28, 2022
A Python package for performing pore network modeling of porous media

Overview of OpenPNM OpenPNM is a comprehensive framework for performing pore network simulations of porous materials. More Information For more detail

PMEAL 336 Dec 30, 2022
Qlib is an AI-oriented quantitative investment platform

Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.

Microsoft 10.1k Dec 30, 2022
Airborne magnetic data of the Osborne Mine and Lightning Creek sill complex, Australia

Osborne Mine, Australia - Airborne total-field magnetic anomaly This is a section of a survey acquired in 1990 by the Queensland Government, Australia

Fatiando a Terra Datasets 1 Jan 21, 2022
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ⚖️ 🏆 🧑‍🎓 👩‍⚖️ Dataset Summary Inspired by the recent widespread use of th

95 Dec 08, 2022
Official Implementation (PyTorch) of "Point Cloud Augmentation with Weighted Local Transformations", ICCV 2021

PointWOLF: Point Cloud Augmentation with Weighted Local Transformations This repository is the implementation of PointWOLF(To appear). Sihyeon Kim1*,

MLV Lab (Machine Learning and Vision Lab at Korea University) 16 Nov 03, 2022
Codebase to experiment with a hybrid Transformer that combines conditional sequence generation with regression

Regression Transformer Codebase to experiment with a hybrid Transformer that combines conditional sequence generation with regression . Development se

International Business Machines 27 Jan 05, 2023
Python program that works as a contact list

Lista de Contatos Programa em Python que funciona como uma lista de contatos. Features Adicionar novo contato Remover contato Atualizar contato Pesqui

Victor B. Lino 3 Dec 16, 2021
Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Jiaxi Jiang 282 Jan 02, 2023
Real-Time and Accurate Full-Body Multi-Person Pose Estimation&Tracking System

News! Aug 2020: v0.4.0 version of AlphaPose is released! Stronger tracking! Include whole body(face,hand,foot) keypoints! Colab now available. Dec 201

Machine Vision and Intelligence Group @ SJTU 6.7k Dec 28, 2022
QuanTaichi evaluation suite

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 120 Jan 04, 2023
Official implementation of the paper 'High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network' in CVPR 2021

LPTN Paper | Supplementary Material | Poster High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network Ji

372 Dec 26, 2022
Demonstration of the Model Training as a CI/CD System in Vertex AI

Model Training as a CI/CD System This project demonstrates the machine model training as a CI/CD system in GCP platform. You will see more detailed wo

Chansung Park 19 Dec 28, 2022
MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Multi-Graph Fusion Networks for Urban Region Embedding (IJCAI-22) This is the implementation of Multi-Graph Fusion Networks for Urban Region Embedding

202 Nov 18, 2022
Evaluation and Benchmarking of Speech Super-resolution Methods

Speech Super-resolution Evaluation and Benchmarking What this repo do: A toolbox for the evaluation of speech super-resolution algorithms. Unify the e

Haohe Liu (刘濠赫) 84 Dec 20, 2022
Efficient training of deep recommenders on cloud.

HybridBackend Introduction HybridBackend is a training framework for deep recommenders which bridges the gap between evolving cloud infrastructure and

Alibaba 111 Dec 23, 2022
Repository sharing code and the model for the paper "Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes"

Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes Setup virtualenv -p python3 venv source venv/bin/activate pip instal

Planet AI GmbH 9 May 20, 2022
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.

Softlearning Softlearning is a deep reinforcement learning toolbox for training maximum entropy policies in continuous domains. The implementation is

Robotic AI & Learning Lab Berkeley 997 Dec 30, 2022