Pytorch implementation for the Temporal and Object Quantification Networks (TOQ-Nets).

Overview

TOQ-Nets-PyTorch-Release

Pytorch implementation for the Temporal and Object Quantification Networks (TOQ-Nets).

TOQ-Nets

Temporal and Object Quantification Networks
Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie Pack Kaelbling, and Tomer D. Ullman
In International Joint Conference on Artificial Intelligence (IJCAI) 2021 (Poster)
[Paper] [Project Page] [BibTex]

@inproceedings{Mao2021Temporal,
    title={{Temporal and Object Quantification Networks}},
    author={Mao, Jiayuan and Luo, Zhezheng and Gan, Chuang and Tenenbaum, Joshua B. and Wu, Jiajun and Kaelbling, Leslie Pack and Ullman, Tomer D.},
    booktitle={International Joint Conferences on Artificial Intelligence},
    year={2021}
}

Prerequisites

  • Python 3
  • PyTorch 1.0 or higher, with NVIDIA CUDA Support
  • Other required python packages specified by requirements.txt. See the Installation.

Installation

Install Jacinle: Clone the package, and add the bin path to your global PATH environment variable:

git clone https://github.com/vacancy/Jacinle --recursive
export PATH=<path_to_jacinle>/bin:$PATH

Clone this repository:

git clone https://github.com/vacancy/TOQ-Nets-PyTorch --recursive

Create a conda environment for TOQ-Nets, and install the requirements. This includes the required python packages from both Jacinle TOQ-Nets. Most of the required packages have been included in the built-in anaconda package:

conda create -n nscl anaconda
conda install pytorch torchvision -c pytorch

Dataset preparation

We evaluate our model on four datasets: Soccer Event, RLBench, Toyota Smarthome and Volleyball. To run the experiments, you need to prepare them under NSPCL-Pytorch/data.

Soccer Event

Download link

RLBenck

Download link

Toyota Smarthome

Dataset can be obtained from the website: Toyota Smarthome: Real-World Activities of Daily Living

@InProceedings{Das_2019_ICCV,
    author = {Das, Srijan and Dai, Rui and Koperski, Michal and Minciullo, Luca and Garattoni, Lorenzo and Bremond, Francois and Francesca, Gianpiero},
    title = {Toyota Smarthome: Real-World Activities of Daily Living},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month = {October},
    year = {2019}
}

Volleyball

Dataset can be downloaded from this github repo.

@inproceedings{msibrahiCVPR16deepactivity,
  author    = {Mostafa S. Ibrahim and Srikanth Muralidharan and Zhiwei Deng and Arash Vahdat and Greg Mori},
  title     = {A Hierarchical Deep Temporal Model for Group Activity Recognition.},
  booktitle = {2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2016}
}

Training and evaluation.

Standard 9-way classification task

To train the model on the standard 9-way classification task on the soccer dataset:

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1001 --run_name 9_way_classification -Mmodel-name "'NLTL_SAv3'" -Mdata-name "'LongVideoNvN'" -Mn_epochs 200 -Mbatch_size 128 -Mhp-train-estimate_inequality_parameters "(1,1)" -Mmodel-both_quantify False -Mmodel-depth 0

The hyper parameter estimate_inequality_parameters is to estimate the distribution of input physical features, and is only required when training TOQ-Nets (but not for baselines).

Few-shot actions

To train on regular actions and test on new actions:

jac-crun <gpu_ids> scripts/action_classification_softmax.py  -t 1002 --run_name few_shot -Mdata-name "'TrajectorySingleActionNvN_Wrapper_FewShot_Softmax'" -Mmodel-name "'NLTL_SAv3'" -Mlr 3e-3 -Mn_epochs 200 -Mbatch_size 128 -Mdata-new_actions "[('interfere', (50, 50, 2000)), ('sliding', (50, 50, 2000))]" -Mhp-train-finetune_period "(1,200)" -Mhp-train-estimate_inequality_parameters "(1,1)"

You can set the split of few-shot actions using -Mdata-new_actions, and the tuple (50, 50, 2000) represents the number of samples available in training validation and testing.

Generalization to more of fewer players and temporally warped trajectories.

To test the generalization to more or fewer players, as well as temporal warpped trajectories, first train the model on the standard 6v6 games:

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1003 --run_name generalization -Mmodel-name "'NLTL_SAv3'" -Mdata-name "'LongVideoNvN'" -Mdata-n_players 6 -Mn_epochs 200 -Mbatch_size 128 -Mhp-train-estimate_inequality_parameters "(1,1)" -Mlr 3e-3

Then to generalize to games with 11 players:

jac-crun 3 scripts/action_classification_softmax.py -t 1003 --run_name generalization_more_players --eval 200 -Mdata-name "'LongVideoNvN'" -Mdata-n_train 0.1 -Mdata-temporal "'exact'" -Mdata-n_players 11

The number 200 after --eval should be equal to the number of epochs of training. Note that 11 can be replace by any number of players from [3,4,6,8,11].

Similarly, to generalize to temporally warped trajectoryes:

jac-crun 3 scripts/action_classification_softmax.py -t 1003 --run_name generalization_time_warp --eval 200 -Mdata-name "'LongVideoNvN'" -Mdata-n_train 0.1 -Mdata-temporal "'all'" -Mdata-n_players 6

Baselines

We also provide the example commands for training all baselines:

STGCN

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1004 --run_name stgcn -Mmodel-name "'STGCN_SA'" -Mdata-name "'LongVideoNvN'" -Mdata-n_players 6 -Mmodel-n_agents 13 -Mn_epochs 200 -Mbatch_size 128

STGCN-LSTM

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1005 --run_name stgcn_lstm -Mmodel-name "'STGCN_LSTM_SA'" -Mdata-name "'LongVideoNvN'" -Mdata-n_players 6 -Mmodel-n_agents 13 -Mn_epochs 200 -Mbatch_size 128

Space-Time Region Graph

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1006 --run_name strg -Mmodel-name "'STRG_SA'" -Mdata-name "'LongVideoNvN'" -Mn_epochs 200 -Mbatch_size 128

Non-Local

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1007 --run_name non_local -Mmodel-name "'NONLOCAL_SA'" -Mdata-name "'LongVideoNvN'" -Mn_epochs 200 -Mbatch_size 128
Owner
Zhezheng Luo
Zhezheng Luo
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

selfcontact This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] It includes the main function

Lea Müller 68 Dec 06, 2022
Reliable probability face embeddings

ProbFace, arxiv This is a demo code of training and testing [ProbFace] using Tensorflow. ProbFace is a reliable Probabilistic Face Embeddging (PFE) me

Kaen Chan 34 Dec 31, 2022
Interactive web apps created using geemap and streamlit

geemap-apps Introduction This repo demostrates how to build a multi-page Earth Engine App using streamlit and geemap. You can deploy the app on variou

Qiusheng Wu 27 Dec 23, 2022
"Neural Turing Machine" in Tensorflow

Neural Turing Machine in Tensorflow Tensorflow implementation of Neural Turing Machine. This implementation uses an LSTM controller. NTM models with m

Taehoon Kim 1k Dec 06, 2022
Molecular AutoEncoder in PyTorch

MolEncoder Molecular AutoEncoder in PyTorch Install $ git clone https://github.com/cxhernandez/molencoder.git && cd molencoder $ python setup.py insta

Carlos Hernández 80 Dec 05, 2022
Repository for training material for the 2022 SDSC HPC/CI User Training Course

hpc-training-2022 Repository for training material for the 2022 SDSC HPC/CI Training Series HPC/CI Training Series home https://www.sdsc.edu/event_ite

sdsc-hpc-training-org 21 Jul 27, 2022
Geometry-Free View Synthesis: Transformers and no 3D Priors

Geometry-Free View Synthesis: Transformers and no 3D Priors Geometry-Free View Synthesis: Transformers and no 3D Priors Robin Rombach*, Patrick Esser*

CompVis Heidelberg 293 Dec 22, 2022
Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC)

ppg-vc Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC) This repo implements different kinds of PPG-based VC models. Pretrained models. More m

Liu Songxiang 227 Dec 28, 2022
A customisable game where you have to quickly click on black tiles in order of appearance while avoiding clicking on white squares.

W.I.P-Aim-Memory-Game A customisable game where you have to quickly click on black tiles in order of appearance while avoiding clicking on white squar

dE_soot 1 Dec 08, 2021
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

ROSITA News & Updates (24/08/2021) Release the demo to perform fine-grained semantic alignments using the pretrained ROSITA model. (15/08/2021) Releas

Vision and Language Group@ MIL 48 Dec 23, 2022
Game Agent Framework. Helping you create AIs / Bots that learn to play any game you own!

Serpent.AI - Game Agent Framework (Python) Update: Revival (May 2020) Development work has resumed on the framework with the aim of bringing it into 2

Serpent.AI 6.4k Jan 05, 2023
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022
Text to Image Generation with Semantic-Spatial Aware GAN

text2image This repository includes the implementation for Text to Image Generation with Semantic-Spatial Aware GAN This repo is not completely. Netwo

CVDDL 124 Dec 30, 2022
Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination

Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination (ICCV 2021) Dataset License This work is l

DongYoung Kim 33 Jan 04, 2023
Pytorch tutorials for Neural Style transfert

PyTorch Tutorials This tutorial is no longer maintained. Please use the official version: https://pytorch.org/tutorials/advanced/neural_style_tutorial

Alexis David Jacq 135 Jun 26, 2022
Focal Loss for Dense Rotation Object Detection

Convert ResNets weights from GluonCV to Tensorflow Abstract GluonCV released some new resnet pre-training weights and designed some new resnets (such

17 Nov 24, 2021
PolyTrack: Tracking with Bounding Polygons

PolyTrack: Tracking with Bounding Polygons Abstract In this paper, we present a novel method called PolyTrack for fast multi-object tracking and segme

Gaspar Faure 13 Sep 15, 2022
Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs

Continuous Query Decomposition This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neura

UCL Natural Language Processing 71 Dec 29, 2022
CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors   In order to facilitate the res

yujmo 11 Dec 12, 2022
Semantic Segmentation with SegFormer on Drone Dataset.

SegFormer_Segmentation Semantic Segmentation with SegFormer on Drone Dataset. You can check out the blog on Medium You can also try out the model with

Praneet 8 Oct 20, 2022