Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows.

Overview

Swin-Transformer

Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows. For more details, please refer to "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"

This repo is an implementation of MegEngine version Swin-Transformer. This is also a showcase for training on GPU with less memory by leveraging MegEngine DTR technique.

There is also an official PyTorch implementation.

Usage

Install

  • Clone this repo:
git clone https://github.com/MegEngine/swin-transformer.git
cd swin-transformer
  • Install megengine==1.6.0
pip3 install megengine==1.6.0 -f https://megengine.org.cn/whl/mge.html

Training

To train a Swin Transformer using random data, run:

python3 -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> train_random.py

To train a Swin Transformer using AMP (Auto Mix Precision), run:

python3 -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --mode mp train_random.py

To train a Swin Transformer using DTR in dynamic graph mode, run:

python3 -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --dtr [--dtr-thd <eviction-threshold-of-dtr>] train_random.py

To train a Swin Transformer using DTR in static graph mode, run:

python3 -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --trace --symbolic --dtr --dtr-thd <eviction-threshold-of-dtr> train_random.py

For example, to train a Swin Transformer with a single GPU using DTR in static graph mode with threshold=8GB and AMP, run:

python3 -n 1 -b 340 -s 10 --trace --symbolic --dtr --dtr-thd 8 --mode mp train_random.py

For more usage, run:

python3 train_random.py -h

Benchmark

  • Testing Devices
    • 2080Ti @ cuda-10.1-cudnn-v7.6.3-TensorRT-5.1.5.0 @ Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
    • Reserve all CUDA memory by setting MGB_CUDA_RESERVE_MEMORY=1, in order to alleviate memory fragmentation problem
Settings Maximum Batch Size Speed(s/step) Throughput(images/s)
None 68 0.490 139
AMP 100 0.494 202
DTR in static graph mode 300 2.592 116
DTR in static graph mode + AMP 340 1.944 175

Acknowledgement

We are inspired by the Swin-Transformer repository, many thanks to microsoft!

Owner
旷视天元 MegEngine
旷视天元 MegEngine
When are Iterative GPs Numerically Accurate?

When are Iterative GPs Numerically Accurate? This is a code repository for the paper "When are Iterative GPs Numerically Accurate?" by Wesley Maddox,

Wesley Maddox 1 Jan 06, 2022
UnsupervisedR&R: Unsupervised Pointcloud Registration via Differentiable Rendering

UnsupervisedR&R: Unsupervised Pointcloud Registration via Differentiable Rendering This repository holds all the code and data for our recent work on

Mohamed El Banani 118 Dec 06, 2022
Code to reproduce experiments in the paper "Explainability Requires Interactivity".

Explainability Requires Interactivity This repository contains the code to train all custom models used in the paper Explainability Requires Interacti

Digital Health & Machine Learning 5 Apr 07, 2022
Simple cross-platform application for DaVinci surgical video frame annotation

About DaVid is a simple cross-platform GUI for annotating robotic and endoscopic surgical actions for use in deep-learning research. Features Simple a

Cyril Zakka 4 Oct 09, 2021
NeRF visualization library under construction

NeRF visualization library using PlenOctrees, under construction pip install nerfvis Docs will be at: https://nerfvis.readthedocs.org import nerfvis s

Alex Yu 196 Jan 04, 2023
Python library for tracking human heads with FLAME (a 3D morphable head model)

Video Head Tracker 3D tracking library for human heads based on FLAME (a 3D morphable head model). The tracking algorithm is inspired by face2face. It

61 Dec 25, 2022
Unofficial implementation of Pix2SEQ

Unofficial-Pix2seq: A Language Modeling Framework for Object Detection Unofficial implementation of Pix2SEQ. Please use this code with causion. Many i

159 Dec 12, 2022
On Evaluation Metrics for Graph Generative Models

On Evaluation Metrics for Graph Generative Models Authors: Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor This is the offic

13 Jan 07, 2023
Alex Pashevich 62 Dec 24, 2022
Re-implememtation of MAE (Masked Autoencoders Are Scalable Vision Learners) using PyTorch.

mae-repo PyTorch re-implememtation of "masked autoencoders are scalable vision learners". In this repo, it heavily borrows codes from codebase https:/

Peng Qiao 1 Dec 14, 2021
UpChecker is a simple opensource project to host it fast on your server and check is server up, view statistic, get messages if it is down. UpChecker - just run file and use project easy

UpChecker UpChecker is a simple opensource project to host it fast on your server and check is server up, view statistic, get messages if it is down.

Yan 4 Apr 07, 2022
This repository is an unoffical PyTorch implementation of Medical segmentation in 3D and 2D.

Pytorch Medical Segmentation Read Chinese Introduction:Here! Recent Updates 2021.1.8 The train and test codes are released. 2021.2.6 A bug in dice was

EasyCV-Ellis 618 Dec 27, 2022
PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds

PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds PCAM: Product of Cross-Attention Matrices for Rigid Registration of P

valeo.ai 24 May 31, 2022
Genshin-assets - 👧 Public documentation & static assets for Genshin Impact data.

genshin-assets This repo provides easy access to the Genshin Impact assets, primarily for use on static sites. Sources Genshin Optimizer - An Artifact

Zerite Development 5 Nov 22, 2022
Machine Learning Toolkit for Kubernetes

Kubeflow the cloud-native platform for machine learning operations - pipelines, training and deployment. Documentation Please refer to the official do

Kubeflow 12.1k Jan 03, 2023
Try out deep learning models online on Google Colab

Try out deep learning models online on Google Colab

Erdene-Ochir Tuguldur 1.5k Dec 27, 2022
Compute FID scores with PyTorch.

FID score for PyTorch This is a port of the official implementation of Fréchet Inception Distance to PyTorch. See https://github.com/bioinf-jku/TTUR f

2.1k Jan 06, 2023
AFL binary instrumentation

E9AFL --- Binary AFL E9AFL inserts American Fuzzy Lop (AFL) instrumentation into x86_64 Linux binaries. This allows binaries to be fuzzed without the

242 Dec 12, 2022
なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモ

FaceDetection-Anti-Spoof-Demo なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモです。 モデルはPINTO_model_zoo/191_anti-spoof-mn3からONNX形式のモデルを使用しています。 Requirement mediapipe

KazuhitoTakahashi 8 Nov 18, 2022
AFLFast (extends AFL with Power Schedules)

AFLFast Power schedules implemented by Marcel Böhme [email protected]

Marcel Böhme 380 Jan 03, 2023