This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT).

Overview

Dynamic-Vision-Transformer (Pytorch)

This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT).

Update on 2021/06/01: Release Pre-trained Models and the Inference Code on ImageNet.

Introduction

We develop a Dynamic Vision Transformer (DVT) to automatically configure a proper number of tokens for each individual image, leading to a significant improvement in computational efficiency, both theoretically and empirically.

Results

  • Top-1 accuracy on ImageNet v.s. GFLOPs

  • Top-1 accuracy on CIFAR v.s. GFLOPs

  • Top-1 accuracy on ImageNet v.s. Throughput

  • Visualization

Pre-trained Models

Backbone # of Exits # of Tokens Links
T2T-ViT-12 3 7x7-10x10-14x14 Tsinghua Cloud / Google Drive
  • What are contained in the checkpoints:
**.pth.tar
├── model_state_dict: state dictionaries of the model
├── flops: a list containing the GFLOPs corresponding to exiting at each exit
├── anytime_classification: Top-1 accuracy of each exit
├── dynamic_threshold: the confidence thresholds used in budgeted batch classification
├── budgeted_batch_classification: results of budgeted batch classification (a two-item list, [0] and [1] correspond to the two coordinates of a curve)

Requirements

  • python 3.7.7
  • pytorch 1.3.1
  • torchvision 0.4.2

Evaluate Pre-trained Models

Read the evaluation results saved in pre-trained models

CUDA_VISIBLE_DEVICES=0 python inference.py --batch_size 128 --model DVT_T2t_vit_12 --checkpoint_path PATH_TO_CHECKPOINTS  --eval_mode 0

Read the confidence thresholds saved in pre-trained models and infer the model on the validation set

CUDA_VISIBLE_DEVICES=0 python inference.py --data_url PATH_TO_DATASET --batch_size 128 --model DVT_T2t_vit_12 --checkpoint_path PATH_TO_CHECKPOINTS  --eval_mode 1

Determine confidence thresholds on the training set and infer the model on the validation set

CUDA_VISIBLE_DEVICES=0 python inference.py --data_url PATH_TO_DATASET --batch_size 128 --model DVT_T2t_vit_12 --checkpoint_path PATH_TO_CHECKPOINTS  --eval_mode 2

The dataset is expected to be prepared as follows:

ImageNet
├── train
│   ├── folder 1 (class 1)
│   ├── folder 2 (class 1)
│   ├── ...
├── val
│   ├── folder 1 (class 1)
│   ├── folder 2 (class 1)
│   ├── ...

Contact

If you have any question, please feel free to contact the authors. Yulin Wang: [email protected].

Acknowledgment

Our code of T2T-ViT from here.

To Do

  • Update the code for training.
FaceAPI: AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using TensorFlow/JS

FaceAPI AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using

Vladimir Mandic 395 Dec 29, 2022
这是一个yolox-pytorch的源码,可以用于训练自己的模型。

YOLOX:You Only Look Once目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 实现的内容 Achievement 所需环境 Environment 小技巧的设置 TricksSet 文件下载 Download 训练步骤 How2train 预测步骤

Bubbliiiing 613 Jan 05, 2023
Model Zoo for AI Model Efficiency Toolkit

We provide a collection of popular neural network models and compare their floating point and quantized performance.

Qualcomm Innovation Center 137 Jan 03, 2023
MixRNet(Using mixup as regularization and tuning hyper-parameters for ResNets)

MixRNet(Using mixup as regularization and tuning hyper-parameters for ResNets) Using mixup data augmentation as reguliraztion and tuning the hyper par

Bhanu 2 Jan 16, 2022
Official implementation of AAAI-21 paper "Label Confusion Learning to Enhance Text Classification Models"

Description: This is the official implementation of our AAAI-21 accepted paper Label Confusion Learning to Enhance Text Classification Models. The str

101 Nov 25, 2022
Lightweight library to build and train neural networks in Theano

Lasagne Lasagne is a lightweight library to build and train neural networks in Theano. Its main features are: Supports feed-forward networks such as C

Lasagne 3.8k Dec 29, 2022
Deeplab-resnet-101 in Pytorch with Jaccard loss

Deeplab-resnet-101 Pytorch with Lovász hinge loss Train deeplab-resnet-101 with binary Jaccard loss surrogate, the Lovász hinge, as described in http:

Maxim Berman 95 Apr 15, 2022
This is the face keypoint train code of project face-detection-project

face-key-point-pytorch 1. Data structure The structure of landmarks_jpg is like below: |--landmarks_jpg |----AFW |------AFW_134212_1_0.jpg |------AFW_

I‘m X 3 Nov 27, 2022
Semantic graph parser based on Categorial grammars

Lambekseq "Everyone who failed Greek or Latin hates it." This package is for proving theorems in Categorial grammars (CG) and constructing semantic gr

10 Aug 19, 2022
DeLiGAN - This project is an implementation of the Generative Adversarial Network

This project is an implementation of the Generative Adversarial Network proposed in our CVPR 2017 paper - DeLiGAN : Generative Adversarial Net

Video Analytics Lab -- IISc 110 Sep 13, 2022
DSL for matching Python ASTs

py-ast-rule-engine This library provides a DSL (domain-specific language) to match a pattern inside a Python AST (abstract syntax tree). The library i

1 Dec 18, 2021
The repo contains the code to train and evaluate a system which extracts relations and explanations from dialogue.

The repo contains the code to train and evaluate a system which extracts relations and explanations from dialogue. How do I cite D-REX? For now, cite

Alon Albalak 6 Mar 31, 2022
Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving

GSAN Introduction Code for paper GSAN: Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving, wh

YE Luyao 6 Oct 27, 2022
PushForKiCad - AISLER Push for KiCad EDA

AISLER Push for KiCad Push your layout to AISLER with just one click for instant

AISLER 31 Dec 29, 2022
Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Philipp Erler 329 Jan 06, 2023
COVID-Net Open Source Initiative

The COVID-Net models provided here are intended to be used as reference models that can be built upon and enhanced as new data becomes available

Linda Wang 1.1k Dec 26, 2022
U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

Dennis Bappert 104 Nov 25, 2022
Unofficial PyTorch Implementation for HifiFace (https://arxiv.org/abs/2106.09965)

HifiFace — Unofficial Pytorch Implementation Image source: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping (figure 1, pg. 1)

MINDs Lab 218 Jan 04, 2023
Personalized Federated Learning using Pytorch (pFedMe)

Personalized Federated Learning with Moreau Envelopes (NeurIPS 2020) This repository implements all experiments in the paper Personalized Federated Le

Charlie Dinh 226 Dec 30, 2022
EPSANet:An Efficient Pyramid Split Attention Block on Convolutional Neural Network

EPSANet:An Efficient Pyramid Split Attention Block on Convolutional Neural Network This repo contains the official Pytorch implementaion code and conf

Hu Zhang 175 Jan 07, 2023