Unofficial PyTorch Implementation of "Augmenting Convolutional networks with attention-based aggregation"

Last update: Sep 09, 2022

Overview

This is an unofficial PyTorch implementation of "Augmenting Convolutional networks with attention-based aggregation".

Reference: https://arxiv.org/pdf/2112.13692.pdf
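
The central idea of the paper is to replace the final average pooling of a convolutional network with an attention-based aggregation layer, in which a learned query token attends over the spatial positions of the feature map. The following is a minimal sketch of that idea, not the code in this repository and not the official implementation. The class name `AttentionPooling`, the use of `nn.MultiheadAttention`, the placement of the LayerNorm, and the default of a single head are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Hypothetical sketch: pool a (B, C, H, W) feature map into a (B, C) vector.

    A single learned query token attends over all H*W spatial positions,
    taking the place of global average pooling.
    """

    def __init__(self, dim: int, num_heads: int = 1):
        super().__init__()
        # Learned query token (plays the role of a CLS token).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        nn.init.trunc_normal_(self.cls_token, std=0.02)
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor):
        b, c, h, w = x.shape
        # Flatten the spatial grid into a sequence of patch features: (B, H*W, C).
        patches = self.norm(x.flatten(2).transpose(1, 2))
        # One query per image, shared learned parameters: (B, 1, C).
        query = self.cls_token.expand(b, -1, -1)
        # Cross-attention: the query reads from the patches (keys and values).
        pooled, weights = self.attn(query, patches, patches)
        # Return the pooled vector and the attention map over spatial positions.
        return pooled.squeeze(1), weights.reshape(b, h, w)
```

With `pool = AttentionPooling(dim=32)`, calling `pool(torch.randn(2, 32, 4, 4))` returns a pooled tensor of shape `(2, 32)` and an attention map of shape `(2, 4, 4)`. The attention map can be visualized to see which positions contributed to the prediction.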

Prerequisites

PyTorch
PyTorch Lightning
timm
torchmetrics
torchvision
python3
CUDA

Comments

Due to limited compute, the CIFAR-100 dataset was used instead of ImageNet, which the original paper uses.
Since the official code has not been released yet, the architecture and hyperparameters here may differ from the original.
- Most of the hidden dimensions were chosen by guesswork.
The MADGRAD optimizer was used instead of LAMB.
(I expected LAMB to be inefficient with the small batch sizes that fit on my local machine.)
LayerScale will be added soon.
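
For context on the last item, LayerScale multiplies the output of a residual branch by a learnable per-channel vector that starts at a small value, so each block starts close to the identity. Below is a minimal sketch of how it could be added, assuming channels-last inputs of shape (batch, tokens, channels). The names `LayerScale` and `ScaledResidual` and the initial value of 1e-5 are illustrative assumptions, not part of this repository.

```python
import torch
import torch.nn as nn


class LayerScale(nn.Module):
    """Learnable per-channel scaling, initialized to a small constant."""

    def __init__(self, dim: int, init_value: float = 1e-5):
        super().__init__()
        # One scale per channel; init_value is an assumed default.
        self.gamma = nn.Parameter(init_value * torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcasts over the last (channel) dimension.
        return x * self.gamma


class ScaledResidual(nn.Module):
    """Hypothetical wrapper: x + LayerScale(branch(x))."""

    def __init__(self, dim: int, branch: nn.Module, init_value: float = 1e-5):
        super().__init__()
        self.branch = branch
        self.scale = LayerScale(dim, init_value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.scale(self.branch(x))
```

For example, `ScaledResidual(16, nn.Linear(16, 16))` applied to a tensor of shape `(2, 5, 16)` returns a tensor of the same shape that is nearly equal to its input at initialization.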

Citations

@misc{touvron2021augmenting,
      title={Augmenting Convolutional networks with attention-based aggregation}, 
      author={Hugo Touvron and Matthieu Cord and Alaaeldin El-Nouby and Piotr Bojanowski and Armand Joulin and Gabriel Synnaeve and Hervé Jégou},
      year={2021},
      eprint={2112.13692},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
