Main Results on ImageNet with Pretrained Models

Last update: Dec 14, 2022

Overview

This repository contains PyTorch evaluation code, training code, and pretrained models for the following projects:

SPACH (A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP)
sMLP (Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?)
ShiftViT (When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism)

Main Results on ImageNet with Pretrained Models

name	acc@1	#params	FLOPs	url
SPACH-Conv-MS-S	81.6	44M	7.2G	github
SPACH-Trans-MS-S	82.9	40M	7.6G	github
SPACH-MLP-MS-S	82.1	46M	8.2G	github
SPACH-Hybrid-MS-S	83.7	63M	11.2G	github
SPACH-Hybrid-MS-S+	83.9	63M	12.3G	github
sMLPNet-T	81.9	24M	5.0G
sMLPNet-S	83.1	49M	10.3G	github
sMLPNet-B	83.4	66M	14.0G	github
Shift-T / light	79.4	20M	3.0G	github
Shift-T	81.7	29M	4.5G	github
Shift-S / light	81.6	34M	5.7G	github
Shift-S	82.8	50M	8.8G	github

Usage

Install

First, clone the repo and install requirements:

git clone https://github.com/microsoft/Spach
cd Spach
pip install -r requirements.txt

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout for torchvision's datasets.ImageFolder, with the training and validation data expected in the train/ and val/ folders, respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
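
Before launching a long run, it can help to confirm that the dataset folder really matches the layout above. The following is a minimal sketch, not part of this repository: the helper names `summarize_split` and `check_layout` and the list of accepted file extensions are assumptions made for illustration, and the script uses only the Python standard library.

```python
from pathlib import Path

# Assumption for this sketch: images use one of these extensions.
IMAGE_EXTS = {".jpeg", ".jpg", ".png"}


def summarize_split(root, split):
    """Return {class_name: image_count} for root/<split>/<class>/<image>."""
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split directory: {split_dir}")
    counts = {}
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTS
        )
    return counts


def check_layout(root):
    """Return a list of human-readable problems; an empty list means the layout looks fine."""
    train = summarize_split(root, "train")
    val = summarize_split(root, "val")
    problems = []
    # Both splits should contain exactly the same class folders.
    mismatched = sorted(set(train) ^ set(val))
    if mismatched:
        problems.append(f"class folders present in only one split: {mismatched}")
    # Every class folder should contain at least one image.
    for split, counts in (("train", train), ("val", val)):
        empty = sorted(name for name, n in counts.items() if n == 0)
        if empty:
            problems.append(f"{split}/ has class folders without images: {empty}")
    return problems
```

Calling `check_layout("/path/to/imagenet")` returns an empty list when both splits exist, share the same class folders, and every class folder holds at least one image. A misplaced folder such as `class/2` instead of `class2/` would show up as a class without images.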

Evaluation

To evaluate a pretrained model on the ImageNet validation set with a single GPU, run:

python main.py --eval --resume <checkpoint> --model <model-name> --data-path <imagenet-path>

For example, to evaluate the SPACH-Hybrid-MS-S model, run:

python main.py --eval --resume spach_ms_hybrid_s.pth --model spach_ms_s_patch4_224_hybrid --data-path <imagenet-path>

giving

* Acc@1 83.658 Acc@5 96.762 loss 0.688

You can find all supported models in models/registry.py.
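
When evaluating several checkpoints in a row, it may be convenient to collect the final summary line from each log automatically. The sketch below assumes the summary line has the form `* Acc@1 <value> Acc@5 <value> loss <value>`, as in the example output above. The function name `parse_summary` is hypothetical and not part of this repository.

```python
import re

# Assumed format of the final summary line: "* Acc@1 83.658 Acc@5 96.762 loss 0.688"
SUMMARY_RE = re.compile(
    r"\*\s*Acc@1\s+(?P<acc1>[\d.]+)\s+Acc@5\s+(?P<acc5>[\d.]+)\s+loss\s+(?P<loss>[\d.]+)"
)


def parse_summary(log_text):
    """Return the last summary line found in log_text as a dict of floats, or None."""
    matches = list(SUMMARY_RE.finditer(log_text))
    if not matches:
        return None
    # Use the last match so that earlier intermediate lines are ignored.
    return {key: float(value) for key, value in matches[-1].groupdict().items()}
```

For instance, `parse_summary(captured_stdout)` on the output of the evaluation command would yield a dictionary with the keys `acc1`, `acc5`, and `loss`, or `None` if no summary line was printed.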

Training

Run the following command to start training. Distributed training is recommended even on a single-GPU node.

python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --use_env main.py \
--model <model-name> \
--data-path <imagenet-path> \
--output_dir <output-path> \
--dist-eval
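
If training is started from a Python script, for example to sweep over several models, the same command can be assembled as an argument list. This is only a sketch: the helper `build_train_command` is hypothetical, and it uses exactly the flags shown in the command above and nothing else.

```python
import shlex
import sys


def build_train_command(model, data_path, output_dir, num_gpus=1, dist_eval=True):
    """Assemble the launch command shown above as a list of arguments."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        "--nproc_per_node", str(num_gpus),
        "--use_env", "main.py",
        "--model", model,
        "--data-path", str(data_path),
        "--output_dir", str(output_dir),
    ]
    if dist_eval:
        cmd.append("--dist-eval")
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with real locations before running.
    command = build_train_command(
        "spach_ms_s_patch4_224_hybrid", "/path/to/imagenet", "/path/to/output", num_gpus=8
    )
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` avoids shell quoting problems when paths contain spaces.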

Citation

@article{zhao2021battle,
  title={A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP},
  author={Zhao, Yucheng and Wang, Guangting and Tang, Chuanxin and Luo, Chong and Zeng, Wenjun and Zha, Zheng-Jun},
  journal={arXiv preprint arXiv:2108.13002},
  year={2021}
}

@article{tang2021sparse,
  title={Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?},
  author={Tang, Chuanxin and Zhao, Yucheng and Wang, Guangting and Luo, Chong and Xie, Wenxuan and Zeng, Wenjun},
  journal={arXiv preprint arXiv:2109.05422},
  year={2021}
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Acknowledgement

Our code is built on top of DeiT. We test throughput following Swin Transformer.

Comments

Shift features implementation

Hi, very interesting research. I wonder why you implemented shift_feature as a memory copy https://github.com/microsoft/SPACH/blob/497c1d86fffd9d48e26c0484fb845ff04c328cca/models/shiftvit.py#L107 instead of using the Tensor.roll operation? It would make your block much faster. Another benefit would be that pixels from one side would leak to the other, giving the network a way to pass information from one boundary to another, which seems a better option than duplication of the last row during each shift.

opened by bonlime 3
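
The difference this comment describes can be illustrated on a single row of values. The sketch below is not the repository's code and does not confirm how its shift treats the boundary; it only contrasts a copy-based shift that leaves the boundary value duplicated, as the comment describes, with a circular shift in the style of Tensor.roll. Both function names are hypothetical, and plain Python lists stand in for tensors.

```python
def shift_by_copy(row, step=1):
    """Shift values right by `step`; the first `step` slots keep their old values."""
    if not 0 <= step <= len(row):
        raise ValueError("step must be between 0 and len(row)")
    out = list(row)
    out[step:] = row[: len(row) - step]
    return out


def shift_by_roll(row, step=1):
    """Circular shift right by `step`; values pushed past the end wrap to the front."""
    if not row:
        return []
    step %= len(row)
    return row[-step:] + row[:-step] if step else list(row)


row = [1, 2, 3, 4]
print(shift_by_copy(row))  # [1, 1, 2, 3]: the boundary value is duplicated
print(shift_by_roll(row))  # [4, 1, 2, 3]: the last value wraps around
```

With the copy variant no information crosses the boundary, while the circular variant carries the value from one edge to the opposite edge, which is the trade-off raised in the comment.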
Add: unofficial implementation

Hey folks,

It would be great if this repository could also hold links to other unofficial implementations. I am proposing a Keras tutorial on ShiftViT.

opened by ariG23498 0
The configuration of the architecture variants is inconsistent with the papers and weights files.

@tangchuanxin

https://github.com/microsoft/SPACH/blob/497c1d86fffd9d48e26c0484fb845ff04c328cca/models/registry.py#L224

The code is inconsistent with the content of the paper (the figure of architecture variants included in the original comment is not reproduced here) and with the weight file. The content of this pth file is the same as the architecture variant -S in that figure, i.e., depths=(6, 8, 18, 6).

https://github.com/microsoft/SPACH/releases/download/v1.0/shiftvit_tiny_r2.pth

opened by lartpang 1

Releases(v1.0)

v1.0(Nov 19, 2021)

Owner

Microsoft
