Accuracy Aligned. Concise Implementation of Swin Transformer

Last update: Dec 16, 2022

This repository contains an implementation of Swin Transformer and the training code for ImageNet. We have aligned the output of our network with the official one: given the same input and random seed, our output is identical to that of the official implementation.

Our implementation relies heavily on einops, which makes the code more concise and easier to understand. (For a rough comparison, we use only 200 lines of code versus ~600 lines in the official implementation.) Our implementation also keeps the same training speed.
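To give a feel for why rearrangement-style code stays short, here is a minimal sketch of the window partition step that Swin Transformer applies before window attention. It is not taken from this repository: the function names `window_partition` and `window_reverse`, the channels-last layout, and the einops pattern quoted in the docstring are all assumptions for illustration. The sketch uses plain NumPy so that it runs without extra dependencies.

```python
import numpy as np


def window_partition(x, window_size):
    """Split (B, H, W, C) feature maps into non-overlapping square windows.

    Returns an array of shape (B * num_windows, window_size**2, C).
    With einops the same step could be written as one call (assumed pattern,
    not copied from the repository):
        rearrange(x, 'b (h p1) (w p2) c -> (b h w) (p1 p2) c', p1=ws, p2=ws)
    """
    b, h, w, c = x.shape
    ws = window_size
    assert h % ws == 0 and w % ws == 0, "H and W must be divisible by window_size"
    x = x.reshape(b, h // ws, ws, w // ws, ws, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (b, h/ws, w/ws, ws, ws, c)
    return x.reshape(-1, ws * ws, c)


def window_reverse(windows, window_size, h, w):
    """Inverse of window_partition: rebuild the (B, H, W, C) feature map."""
    ws = window_size
    c = windows.shape[-1]
    b = windows.shape[0] // ((h // ws) * (w // ws))
    x = windows.reshape(b, h // ws, w // ws, ws, ws, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (b, h/ws, ws, w/ws, ws, c)
    return x.reshape(b, h, w, c)


# Hypothetical toy input: batch of 2, 8x8 spatial grid, 3 channels.
x = np.arange(2 * 8 * 8 * 3, dtype=np.float32).reshape(2, 8, 8, 3)
windows = window_partition(x, 4)
restored = window_reverse(windows, 4, 8, 8)
```

Each 8x8 map yields four 4x4 windows, so `windows` has eight windows of sixteen tokens each, and `window_reverse` recovers the input exactly.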

Model	Epoch	acc@1 (ours)	acc@5 (ours)	acc@1 (official)	acc@5 (official)	pretrained model
Swin-T	300	81.3	95.5	81.2	95.5	here

Usage

Train on ImageNet:

Train Swin-T

python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --model Swin_T \
--batch-size 128 --drop-path 0.2 --data-path ~/ILSVRC2012/ --output_dir /data/SwinTransformer_exp/SwinT/

Train Swin-S

python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --model Swin_S \
--batch-size 128 --drop-path 0.3 --data-path ~/ILSVRC2012/ --output_dir /data/SwinTransformer_exp/SwinS/

Train Swin-B

python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py --model Swin_B \
--batch-size 128 --drop-path 0.5 --data-path ~/ILSVRC2012/ --output_dir /data/SwinTransformer_exp/SwinB/

Reference

The training process involves many regularization and augmentation tricks, such as stochastic depth, mixup, CutMix, and random erasing. I borrow heavily from DeiT (https://github.com/facebookresearch/deit).
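As an illustration of one of these tricks, the following is a minimal NumPy sketch of stochastic depth (often called drop path). It is an assumption-laden sketch rather than this repository's code: the function name `drop_path`, its signature, and the per-sample masking scheme are illustrative, and it is only presumed that the `--drop-path` flag in the training commands sets a rate of this kind.

```python
import numpy as np


def drop_path(x, drop_prob, training, rng):
    """Stochastic depth: drop the whole residual branch for random samples.

    x is the output of a residual branch with shape (B, ...). During training
    each sample is kept with probability 1 - drop_prob and rescaled so the
    expected value is unchanged; at evaluation time x is returned as-is.
    """
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample, broadcast over all remaining dimensions.
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = rng.random(mask_shape) < keep_prob
    return x * mask / keep_prob


# Hypothetical toy input: 4 samples, each a 3x2 block of ones.
rng = np.random.default_rng(0)
x = np.ones((4, 3, 2), dtype=np.float32)
y_train = drop_path(x, 0.2, True, rng)
y_eval = drop_path(x, 0.2, False, rng)
```

In training mode every sample of `y_train` is either all zeros (branch dropped) or scaled by 1 / 0.8, while `y_eval` is the untouched input.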

Citations

@misc{liu2021swin,
      title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows}, 
      author={Ze Liu and Yutong Lin and Yue Cao and Han Hu and Yixuan Wei and Zheng Zhang and Stephen Lin and Baining Guo},
      year={2021},
      eprint={2103.14030},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Owner

FengWang
