A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".

Last update: Jan 07, 2023

Overview

CoAtNet

Overview

This is a PyTorch implementation of CoAtNet specified in "CoAtNet: Marrying Convolution and Attention for All Data Sizes", arXiv 2021.

👉 Check out MobileViT if you are interested in other Convolution + Transformer models.

Usage

import torch
from coatnet import coatnet_0

img = torch.randn(1, 3, 224, 224)
net = coatnet_0()
out = net(img)

Try out other block combinations mentioned in the paper:

from coatnet import CoAtNet

num_blocks = [2, 2, 3, 5, 2]            # L
channels = [64, 96, 192, 384, 768]      # D
block_types=['C', 'T', 'T', 'T']        # 'C' for MBConv, 'T' for Transformer

net = CoAtNet((224, 224), 3, num_blocks, channels, block_types=block_types)
out = net(img)

Citation

@article{dai2021coatnet,
  title={CoAtNet: Marrying Convolution and Attention for All Data Sizes},
  author={Dai, Zihang and Liu, Hanxiao and Le, Quoc V and Tan, Mingxing},
  journal={arXiv preprint arXiv:2106.04803},
  year={2021}
}

Credits

Code adapted from MobileNetV2 and ViT.

Owner

Justin Wu

GitHub Repository https://arxiv.org/abs/2106.04803

Supplemental learning materials for "Fourier Feature Networks and Neural Volume Rendering"

Fourier Feature Networks and Neural Volume Rendering This repository is a companion to a lecture given at the University of Cambridge Engineering Depa

133 Dec 26, 2022

Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

8.7k Jan 05, 2023

Code for the paper "Balancing Training for Multilingual Neural Machine Translation, ACL 2020"

Balancing Training for Multilingual Neural Machine Translation Implementation of the paper Balancing Training for Multilingual Neural Machine Translat

21 May 18, 2022

3D position tracking for soccer players with multi-camera videos

This repo contains a full pipeline to support 3D position tracking of soccer players, with multi-view calibrated moving/fixed video sequences as inputs.

72 Dec 27, 2022

Pytorch implementation of paper Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

31 Nov 20, 2022

Explaining Hyperparameter Optimization via PDPs

Explaining Hyperparameter Optimization via PDPs This repository gives access to an implementation of the methods presented in the paper submission “Ex

2 Nov 16, 2022

Users can free try their models on SIDD dataset based on this code

SIDD benchmark 1 Train python train.py If you want to train your network, just modify the yaml in the options folder. 2 Validation python validation.p

2 May 20, 2022

Reference implementation for Structured Prediction with Deep Value Networks

Deep Value Network (DVN) This code is a python reference implementation of DVNs introduced in Deep Value Networks Learn to Evaluate and Iteratively Re

55 Feb 02, 2022

Language-Driven Semantic Segmentation

Language-driven Semantic Segmentation (LSeg) The repo contains official PyTorch Implementation of paper Language-driven Semantic Segmentation. Authors

416 Jan 03, 2023

Simple codebase for flexible neural net training

neural-modular Simple codebase for flexible neural net training. Allows for seamless exchange of models, dataset, and optimizers. Uses hydra for confi

7 Apr 05, 2022

Keras Realtime Multi-Person Pose Estimation - Keras version of Realtime Multi-Person Pose Estimation project

This repository has become incompatible with the latest and recommended version of Tensorflow 2.0 Instead of refactoring this code painfully, I create

769 Dec 08, 2022

HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events globally on daily to subseasonal timescales.

HeatNet HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events glob

6 Jul 07, 2022

Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper

Continual Learning With Filter Atom Swapping Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper If find t

11 Aug 29, 2022

BRNet - code for Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss function

BRNet code for "Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss func

2 Mar 09, 2022

[CVPR 2021] Unsupervised 3D Shape Completion through GAN Inversion

ShapeInversion Paper Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy "Unsupervised 3D

100 Dec 22, 2022

Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images

Keras-ICNet [paper] Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images. Training in progress! Requisites Python 3.6.3 K

87 Dec 16, 2022

Vector.ai assignment

fabio-tests-nisargatman Low Level Approach: ###Tables: continents: id*, name, population, area, createdAt, updatedAt countries: id*, name, population,

1 Nov 09, 2021

This is the official implementation of our proposed SwinMR

SwinMR This is the official implementation of our proposed SwinMR: Swin Transformer for Fast MRI Please cite: @article{huang2022swin, title={Swi

27 Nov 17, 2022

Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats-for example, from coco to yolo.

PyLabel pip install pylabel PyLabel is a Python package to help you prepare image datasets for computer vision models including PyTorch and YOLOv5. I

176 Jan 01, 2023

Tensorflow Implementation of ECCV'18 paper: Multimodal Human Motion Synthesis

MT-VAE for Multimodal Human Motion Synthesis This is the code for ECCV 2018 paper MT-VAE: Learning Motion Transformations to Generate Multimodal Human

36 Oct 02, 2022

A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".

Related tags

Overview

CoAtNet

Overview

Usage

Citation

Credits

Owner

Justin Wu

Supplemental learning materials for "Fourier Feature Networks and Neural Volume Rendering"

Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Code for the paper "Balancing Training for Multilingual Neural Machine Translation, ACL 2020"

3D position tracking for soccer players with multi-camera videos

Pytorch implementation of paper Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

Explaining Hyperparameter Optimization via PDPs

Users can free try their models on SIDD dataset based on this code

Reference implementation for Structured Prediction with Deep Value Networks

Language-Driven Semantic Segmentation

Simple codebase for flexible neural net training

Keras Realtime Multi-Person Pose Estimation - Keras version of Realtime Multi-Person Pose Estimation project

HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events globally on daily to subseasonal timescales.

Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper

BRNet - code for Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss function

[CVPR 2021] Unsupervised 3D Shape Completion through GAN Inversion

Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images

Vector.ai assignment

This is the official implementation of our proposed SwinMR

Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats-for example, from coco to yolo.

Tensorflow Implementation of ECCV'18 paper: Multimodal Human Motion Synthesis