SwinTransformerV2-TensorFlow

A TensorFlow implementation of SwinTransformerV2 by Microsoft Research Asia, based on their official implementation of SwinTransformerV1 and their paper on V2.

Paper on Version 2 (18/11/2021): [arXiv]

Paper on Version 1 (17/08/2021): [arXiv]

Features:

TensorFlow 2 implementation of versions 1 and 2 of the SwinTransformer, a state-of-the-art backbone for many contemporary computer vision tasks. A brief overview of the architectural changes made in version 2:

A residual post-norm configuration replaces the pre-norm configuration of V1, which is meant to improve training stability in larger models.
Scaled cosine attention replaces the dot-product attention of V1, with a learnable scalar temperature.
A log-spaced continuous relative position bias replaces the previous parametric table. It is implemented here as a small MLP applied to log-transformed relative coordinates.
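
The attention and coordinate changes listed above can be illustrated with a minimal sketch. This is not the code from this repository: it is plain NumPy for a single attention head, and the names `scaled_cosine_attention`, `log_spaced_coords`, `tau` and `tau_min` are hypothetical, chosen for illustration only. It assumes the formulation from the V2 paper, where similarity is the cosine of query and key divided by a temperature (kept above 0.01) plus a position bias, and where relative offsets are mapped through sign(d) * log(1 + |d|).

```python
import numpy as np


def scaled_cosine_attention(q, k, v, tau, bias=None, tau_min=0.01, eps=1e-12):
    """Scaled cosine attention for a single head (illustrative sketch).

    q, k, v: arrays of shape (num_tokens, head_dim).
    tau: temperature scalar; learnable in a real model, fixed here.
    bias: optional (num_tokens, num_tokens) relative position bias.
    """
    # L2-normalise queries and keys so their dot product is a cosine similarity.
    q_n = q / (np.linalg.norm(q, axis=-1, keepdims=True) + eps)
    k_n = k / (np.linalg.norm(k, axis=-1, keepdims=True) + eps)
    # Divide by the temperature, clipped from below to avoid huge logits.
    logits = (q_n @ k_n.T) / max(tau, tau_min)
    if bias is not None:
        logits = logits + bias
    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def log_spaced_coords(delta):
    """Map linear relative offsets to log-spaced ones: sign(d) * log(1 + |d|)."""
    delta = np.asarray(delta, dtype=np.float64)
    return np.sign(delta) * np.log1p(np.abs(delta))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
    out = scaled_cosine_attention(q, k, v, tau=0.1)
    print(out.shape)
    print(log_spaced_coords([-7, -1, 0, 1, 7]))
```

Because queries and keys are normalised, rescaling either of them leaves the attention output unchanged, which is the property that keeps the logits bounded in larger models. In the full model the log-spaced coordinates would be fed to the small MLP that produces the bias, which this sketch leaves out.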

Requirements:

numpy==1.21.4
tensorflow==2.7.0
tensorflow_addons==0.15.0

Getting started

Currently writing up.

License

This project is licensed under the MIT license.

Citation

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}

Owner

Phan Nguyen
