Transformers based fully on MLPs

Overview

Awesome MLP-based Transformers papersAwesome

An up-to-date list of Transformers based fully on MLPs without attention!

Why this repo?

After transformers and fully-based attention mechanism models took over most of the deep learning world since 2019, it appears that the power does not come from attention, and indeed replacing the feed-forward network in a transformer by attention performs horrible (~30% top-1 on ImageNet). It appears that Attention is not all we need. After all, we don't need inductive-biased models such as CNNs anymore, and we can lean back on MLPs since (1) we have enough data, (2) We have powerful optimization, regularization and data augmentation techniques. As we saw a big hipe on transformers awesome vision transformer and BERT-related papers, we expect to see a big hipe in fully MLP-based networks without attention, and the research focus is now shited to finding efficient ways of mixing tokens without involving attention mechanisms. This repository aims at gathering and collecting all these kind of papers.

Contributing

Please help in contributing to this list by submitting an issue or a pull request

- Paper Name [[pdf]](link) [[code]](link)

Papers

  • MLP-Mixer: An all-MLP Architecture for Vision [pdf] [official code] [code] [code] [code] [Yannic Kilcher Video]
  • Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet [pdf] [code]
  • ResMLP: Feedforward networks for image classification with data-efficient training [pdf] [code] [code] [code]
  • Pay Attention to MLPs [pdf] [code] [code] [code]
  • FNet: Mixing Tokens with Fourier Transforms [pdf] [code] [Yannic Kilcher Video]
  • Can Attention Enable MLPs To Catch Up With CNNs? [pdf]
  • MixerGAN: An MLP-Based Architecture for Unpaired Image-to-Image Translation [pdf]
  • On the Bias Against Inductive Biases [pdf]
  • S2 MLP: Spatial-Shift MLP Architecture for Vision [pdf]
  • Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition [pdf] [code]
  • Rethinking Token-Mixing MLP for MLP-based Vision Backbone [pdf]
  • Global Filter Networks for Image Classification [pdf] [code]
  • What Makes for Hierarchical Vision Transformer? [pdf]
  • As-MLP: An Axial Shifted MLP architecture for Vision [pdf][code]
  • CycleMLP: A MLP-like Architecture for Dense Prediction [pdf][code]
  • S2 MLPv2: Improved Spatial-Shift MLP Architecture for Vision [pdf]
  • RaftMLP: Do MLP-based Models Dream of Winning Over Computer Vision? [pdf] [code]
  • Hire-MLP: Vision MLP via Hierarchical Rearrangement [pdf]
  • Sparse-MLP: A Fully-MLP Architecture with Conditional Computation [pdf]
  • Sparse MLP for Image Recognition: Is Self-Attention Really Necessary? [pdf]
  • Patches Are All You Need? [pdf] [code]
  • Exploring the Limits of Large Scale Pre-training [pdf]
  • Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs [pdf] [code]
  • Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation [pdf] [code]
  • Are We Ready for a New Paradigm Shift? A Survey on Visual Deep MLP [pdf]
  • MetaFormer is Actually What You Need for Vision [pdf] [code]
  • An Image Patch is a Wave: Phase-Aware Vision MLP [pdf]
  • MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video [pdf]
  • SWAT: Spatial Structure Within and Among Tokens [pdf]
  • MLP Architectures for Vision-and-Language Modeling: An Empirical Study [pdf] [code]
  • RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [pdf] [code]
Owner
Fawaz Sammani
The human brain is a miracle every human has, and mathematically modelling that brain is an overwhelming matter! I like teaching machines vision-language
Fawaz Sammani
HandFoldingNet ✌️ : A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton

HandFoldingNet ✌️ : A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton Wencan Cheng, Jae Hyun Park, Jong

cwc1260 23 Oct 21, 2022
Official repository for ABC-GAN

ABC-GAN The work represented in this repository is the result of a 14 week semesterthesis on photo-realistic image generation using generative adversa

IgorSusmelj 10 Jun 23, 2022
Implementation of the GVP-Transformer, which was used in the paper "Learning inverse folding from millions of predicted structures" for de novo protein design alongside Alphafold2

GVP Transformer (wip) Implementation of the GVP-Transformer, which was used in the paper Learning inverse folding from millions of predicted structure

Phil Wang 19 May 06, 2022
Pytorch Lightning 1.2k Jan 06, 2023
A Human-in-the-Loop workflow for creating HD images from text

A Human-in-the-Loop? workflow for creating HD images from text DALL·E Flow is an interactive workflow for generating high-definition images from text

Jina AI 2.5k Jan 02, 2023
Code for the paper "A Study of Face Obfuscation in ImageNet"

A Study of Face Obfuscation in ImageNet Code for the paper: A Study of Face Obfuscation in ImageNet Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng,

35 Oct 04, 2022
Multiple custom object count and detection using YOLOv3-Tiny method

Electronic-Component-YOLOv3 Introduce This project created to detect, count, and recognize multiple custom object using YOLOv3-Tiny method. The target

Derwin Mahardika 2 Nov 14, 2022
Analysis of Antarctica sequencing samples contaminated with SARS-CoV-2

Analysis of SARS-CoV-2 reads in sequencing of 2018-2019 Antarctica samples in PRJNA692319 The samples analyzed here are described in this preprint, wh

Jesse Bloom 4 Feb 09, 2022
Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.

Lens by Credo AI - Responsible AI Assessment Framework Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data a

Credo AI 27 Dec 14, 2022
Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations This repo contains official code for the NeurIPS 2021 paper Imi

Jiayao Zhang 2 Oct 18, 2021
Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

AdaConv — Simple TensorFlow Implementation [Paper] : Adaptive Convolutions for Structure-Aware Style Transfer (CVPR 2021) Note This repository does no

Junho Kim 26 Nov 18, 2022
ICCV2021 Expert-Goal Trajectory Prediction

ICCV 2021: Where are you heading? Dynamic Trajectory Prediction with Expert Goal Examples This repository contains the code for the paper Where are yo

hz 21 Dec 12, 2022
Fair Recommendation in Two-Sided Platforms

Fair Recommendation in Two-Sided Platforms

gourabgggg 1 Nov 10, 2021
An implementation of Fastformer: Additive Attention Can Be All You Need in TensorFlow

Fast Transformer This repo implements Fastformer: Additive Attention Can Be All You Need by Wu et al. in TensorFlow. Fast Transformer is a Transformer

Rishit Dagli 139 Dec 28, 2022
Ros2-voiceroid2 - ROS2 wrapper package of VOICEROID2

ros2_voiceroid2 ROS2 wrapper package of VOICEROID2 Windows Only Installation Ins

Nkyoku 1 Jan 23, 2022
A PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detection.

R-YOLOv4 This is a PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detect

94 Dec 03, 2022
Re-TACRED: Addressing Shortcomings of the TACRED Dataset

Re-TACRED Re-TACRED: Addressing Shortcomings of the TACRED Dataset

George Stoica 40 Dec 10, 2022
Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

Tengfei Wang 110 Dec 20, 2022
Code for ICML 2021 paper: How could Neural Networks understand Programs?

OSCAR This repository contains the source code of our ICML 2021 paper How could Neural Networks understand Programs?. Environment Run following comman

Dinglan Peng 115 Dec 17, 2022
NDE: Climate Modeling with Neural Diffusion Equation, ICDM'21

Climate Modeling with Neural Diffusion Equation Introduction This is the repository of our accepted ICDM 2021 paper "Climate Modeling with Neural Diff

Jeehyun Hwang 5 Dec 18, 2022