ASFormer: Transformer for Action Segmentation

This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segmentation.

Enviroment

Pytorch == 1.1.0, torchvision == 0.3.0, python == 3.6, CUDA=10.1

Reproduce our results

1. Download the dataset data.zip at (https://mega.nz/#!O6wXlSTS!wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8) or (https://zenodo.org/record/3625992#.Xiv9jGhKhPY). 
2. Unzip the data.zip file to the current folder. There are three datasets in the ./data folder, i.e. ./data/breakfast, ./data/50salads, ./data/gtea
3. Download the pre-trained models at (https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). There are pretrained models for three datasets, i.e. ./models/50salads, ./models/breakfast, ./models/gtea
4. Run python main.py --action=predict --dataset=50salads/gtea/breakfast --split=1/2/3/4/5 to generate predicted results for each split.
5. Run python eval.py --dataset=50salads/gtea/breakfast --split=0/1/2/3/4/5 to evaluate the performance. **NOTE**: split=0 will evaulate the average results for all splits, It needs to be done after you complete all split predictions.

Train your own model

Also, you can retrain the model by yourself with following command.

python main.py --action=train --dataset=50salads/gtea/breakfast --split=1/2/3/4/5

The training process is very stable in our experiments. It convergences very fast and is not sensitive to the number of training epochs.

Demo for using ASFormer as your backbone

In our paper, we replace the original TCN-based backbone model MS-TCN in ASRF with our ASFormer. The new model achieves even higher results on the 50salads dataset than the original ASRF. Code is Here.

If you find our repo useful, please give us a star and cite

@inproceedings{chinayi_ASformer,  
	author={Fangqiu Yi and Hongyu Wen and Tingting Jiang}, 
	booktitle={The British Machine Vision Conference (BMVC)},   
	title={ASFormer: Transformer for Action Segmentation},
	year={2021},  
}

Feel free to raise a issue if you got trouble with our code.

Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

Related tags

Overview

ASFormer: Transformer for Action Segmentation

Enviroment

Reproduce our results

Train your own model

Demo for using ASFormer as your backbone

Owner

Can we learn gradients by Hamiltonian Neural Networks?

Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

[ICCV'21] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery

Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

Toolbox of models, callbacks, and datasets for AI/ML researchers.

Training Very Deep Neural Networks Without Skip-Connections

Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation

Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder

Self Governing Neural Networks (SGNN): the Projection Layer

Real time sign language recognition

RL-GAN: Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation

A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.

Unsupervised captioning - Code for Unsupervised Image Captioning

PyTorch implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA

Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

Code for classifying international patents based on the text of their titles/abstracts

SiT: Self-supervised vIsion Transformer

Official code release for 3DV 2021 paper Human Performance Capture from Monocular Video in the Wild.

Cancer Drug Response Prediction via a Hybrid Graph Convolutional Network

Learning with Subset Stacking