SiT: Self-supervised vIsion Transformer

Last update: Dec 28, 2022


This repository contains the official PyTorch code for self-supervised pretraining, finetuning, and evaluation of SiT (Self-supervised vIsion Transformer).

The training strategy is adopted from DeiT.

Usage

Create an environment

conda create -n SiT python=3.8

Activate the environment and install the necessary packages

conda activate SiT

conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch

pip install -r requirements.txt

Self-supervised pre-training

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 72 --epochs 501 --min-lr 5e-6 --lr 1e-3 --training-mode 'SSL' --data-set 'STL10' --output 'checkpoints/SSL/STL10' --validate-every 10
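As a rough sanity check on the command above, the following sketch computes the global batch size seen by the optimizer. It assumes that `--batch-size` is a per-process value (as in DeiT) and that one process runs per GPU; neither is stated in this README. The helper name `global_batch_size` is hypothetical and not part of the repository.

```python
def global_batch_size(per_process_batch: int, nproc_per_node: int, nnodes: int = 1) -> int:
    """Total samples per optimizer step across all processes.

    Assumption: --batch-size is applied per process, not globally.
    """
    if per_process_batch <= 0 or nproc_per_node <= 0 or nnodes <= 0:
        raise ValueError("all arguments must be positive")
    return per_process_batch * nproc_per_node * nnodes


if __name__ == "__main__":
    # Values taken from the pre-training command: --batch-size 72, --nproc_per_node=4
    print(global_batch_size(72, 4))
    # Values taken from the finetuning commands: --batch-size 120, --nproc_per_node=4
    print(global_batch_size(120, 4))
```

If fewer GPUs are available, this kind of calculation may help decide how to adjust `--batch-size` so the global batch stays comparable.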

Finetuning

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 120 --epochs 501 --min-lr 5e-6 --training-mode 'finetune' --data-set 'STL10' --finetune 'checkpoints/SSL/STL10/checkpoint.pth' --output 'checkpoints/finetune/STL10' --validate-every 10

Linear Evaluation

Linear projection Head

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 120 --epochs 501 --lr 1e-3 --weight-decay 5e-4 --min-lr 5e-6 --training-mode 'finetune' --data-set 'STL10' --finetune 'checkpoints/SSL/STL10/checkpoint.pth' --output 'checkpoints/finetune/STL10_LE' --validate-every 10 --SiT_LinearEvaluation 1

2-layer MLP projection Head

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 120 --epochs 501 --lr 1e-3 --weight-decay 5e-4 --min-lr 5e-6 --training-mode 'finetune' --data-set 'STL10' --finetune 'checkpoints/SSL/STL10/checkpoint.pth' --output 'checkpoints/finetune/STL10_LE_hidden' --validate-every 10 --SiT_LinearEvaluation 1 --representation-size 1024
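To give a feel for how the two evaluation heads differ in size, the sketch below counts trainable parameters for a single linear layer versus a 2-layer MLP with a hidden size of 1024 (the `--representation-size` value above). The embedding dimension of 384 is an assumed, illustrative value, and the sketch assumes plain fully connected layers with biases; the actual head in the repository may be built differently. STL10 has 10 classes.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def linear_head_params(embed_dim: int, num_classes: int) -> int:
    """Single linear projection from the backbone embedding to the class logits."""
    return linear_params(embed_dim, num_classes)


def mlp_head_params(embed_dim: int, hidden_dim: int, num_classes: int) -> int:
    """Two stacked linear layers: embed_dim -> hidden_dim -> num_classes."""
    return linear_params(embed_dim, hidden_dim) + linear_params(hidden_dim, num_classes)


if __name__ == "__main__":
    EMBED_DIM = 384    # assumption: illustrative backbone width, not taken from the README
    NUM_CLASSES = 10   # STL10
    HIDDEN_DIM = 1024  # --representation-size 1024

    print("linear head:", linear_head_params(EMBED_DIM, NUM_CLASSES))
    print("2-layer MLP head:", mlp_head_params(EMBED_DIM, HIDDEN_DIM, NUM_CLASSES))
```

Under these assumptions the MLP head has roughly a hundred times more trainable parameters than the linear head, which is worth keeping in mind when comparing the two evaluation results.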

Note: set the --dataset_location parameter to the directory that contains the downloaded dataset.
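For illustration, the sketch below assembles a launch command that includes `--dataset_location`, using only flags that appear in this README. The helper `build_command` and the path `/path/to/datasets` are hypothetical placeholders; the script only prints the command and does not start training.

```python
import shlex
from typing import List, Sequence


def build_command(training_mode: str, data_set: str, dataset_location: str,
                  output: str, nproc_per_node: int = 4,
                  extra: Sequence[str] = ()) -> List[str]:
    """Return the argument list for a distributed main.py launch."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}", "--use_env", "main.py",
        "--training-mode", training_mode,
        "--data-set", data_set,
        "--dataset_location", dataset_location,
        "--output", output,
        *extra,
    ]


if __name__ == "__main__":
    cmd = build_command(
        training_mode="SSL",
        data_set="STL10",
        dataset_location="/path/to/datasets",  # placeholder: replace with your dataset directory
        output="checkpoints/SSL/STL10",
        extra=["--batch-size", "72", "--epochs", "501"],
    )
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(cmd))
```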

If you use this code for a paper, please cite:

@article{atito2021sit,
  title={SiT: Self-supervised vIsion Transformer},
  author={Atito, Sara and Awais, Muhammad and Kittler, Josef},
  journal={arXiv preprint arXiv:2104.03602},
  year={2021}
}

License

This repository is released under the GNU General Public License.

Owner

Sara Ahmed
