Dynamic Token Normalization Improves Vision Transformers

Last update: Oct 09, 2022

This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transformers. Code and models will be available soon.

Dynamic Token Normalization

We design a novel normalization method, termed Dynamic Token Normalization (DTN), which inherits the advantages of both LayerNorm (LN) and InstanceNorm (IN). DTN can be plugged into various transformer models and consistently improves their performance.
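
To make the relationship to LayerNorm and InstanceNorm concrete, the sketch below contrasts the two on a tiny token matrix in plain Python: LN normalizes each token across its channels, while IN normalizes each channel across tokens. It only illustrates these two standard normalizers and is not the DTN implementation. The helper names `layer_norm_tokens` and `instance_norm_tokens` and the `eps` value are assumptions, and the learnable scale and shift parameters are omitted.

```python
import math

def _normalize(values, eps=1e-5):
    """Zero-mean, unit-variance normalization of a list of floats."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def layer_norm_tokens(tokens, eps=1e-5):
    """LayerNorm-style: normalize each token (row) over its channels."""
    return [_normalize(row, eps) for row in tokens]

def instance_norm_tokens(tokens, eps=1e-5):
    """InstanceNorm-style: normalize each channel (column) over all tokens."""
    columns = [_normalize(list(col), eps) for col in zip(*tokens)]
    # Transpose back to the original (tokens x channels) layout.
    return [list(row) for row in zip(*columns)]

# Three tokens with two channels each (one image, no batch dimension).
tokens = [[1.0, 3.0], [2.0, 6.0], [3.0, 9.0]]
ln_out = layer_norm_tokens(tokens)
in_out = instance_norm_tokens(tokens)
```

On this toy input, LN maps all three tokens to nearly the same vector, whereas IN keeps the tokens ordered along each channel.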

Top-1 and top-5 accuracy (%) on the ImageNet validation set for ViT models trained with LN and with DTN.

Model	Top-1	Top-5
ViT-T*-LN	72.3	91.4
ViT-T*-DTN	73.2	91.7
ViT-S*-LN	80.6	95.2
ViT-S*-DTN	81.7	95.8
ViT-B*-LN	81.7	95.8
ViT-B*-DTN	82.5	96.1
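
The gain of DTN over LN for each model size can be read off the table with a few lines of Python. This is only a convenience sketch: the numbers are copied from the table, and the `results` dictionary and `gains` helper are names introduced here for illustration.

```python
# Top-1 / top-5 accuracies (%) copied from the table above.
results = {
    "ViT-T*": {"LN": (72.3, 91.4), "DTN": (73.2, 91.7)},
    "ViT-S*": {"LN": (80.6, 95.2), "DTN": (81.7, 95.8)},
    "ViT-B*": {"LN": (81.7, 95.8), "DTN": (82.5, 96.1)},
}

def gains(results):
    """Return {model: (top-1 gain, top-5 gain)} of DTN over LN, in points."""
    out = {}
    for model, r in results.items():
        # Round to one decimal to hide floating-point noise in the subtraction.
        out[model] = tuple(round(d - l, 1) for l, d in zip(r["LN"], r["DTN"]))
    return out

for model, (top1, top5) in gains(results).items():
    print(f"{model}: +{top1} top-1, +{top5} top-5")
```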

Getting Started

Install PyTorch (see Requirements below), then clone the repo:

git clone https://github.com/dtn-anonymous/DTN.git

Requirements

Install CUDA==10.1 with cuDNN 7 following the official installation instructions.
Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.2:

pip install timm==0.3.2

Data Preparation

Download the ImageNet dataset, which should contain train and val directories and a txt file mapping images to labels.
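
Before training, it may help to check that the download has the expected layout. The sketch below is a minimal check written for this README, not part of the repository. The function names are hypothetical, and the label-file format (one `image_path label` pair per line) is an assumption, since the format is not specified here.

```python
from pathlib import Path

def missing_entries(root, required_dirs=("train", "val")):
    """Return the required sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in required_dirs if not (root / name).is_dir()]

def parse_label_line(line):
    """Split one 'image_path label' line (assumed format) into (str, int)."""
    # Split on the last whitespace run so that paths containing spaces survive.
    path, label = line.rsplit(maxsplit=1)
    return path, int(label)
```

For example, `missing_entries("/path/to/imagenet")` returns an empty list when both directories exist, and `parse_label_line` can be applied to each line of the txt file once its real format is confirmed.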

Training a model from scratch

An example training script is provided in DTN/scripts/train.sh. To train ViT-S* with DTN, run:

cd DTN/scripts
sh train.sh layer vit_norm_s_star configs/ViT/vit.yaml

The number of GPUs and the configuration file can be changed in train.sh.

Owner

Wenqi Shao
