PyTorch implementation collection of attention modules and other plug-and-play modules used in computer vision

Overview

Awesome-Attention-Mechanism-in-cv

Introduction

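This repository provides PyTorch implementations of a range of attention mechanisms used in network design for computer vision, along with a collection of plug-and-play modules. Because of limited ability and time, many modules may not be included yet. If you have suggestions or improvements, please open an issue or submit a PR.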

Attention Mechanism

| Paper | Publish | Link | Main Idea | Blog |
| --- | --- | --- | --- | --- |
| Global Second-order Pooling Convolutional Networks | CVPR19 | GSoPNet | Combines higher-order statistics with attention in the middle layers of the network | |
| Neural Architecture Search for Lightweight Non-Local Networks | CVPR20 | AutoNL | NAS + LightNL | |
| Squeeze and Excitation Network | CVPR18 | SENet | The classic channel attention | zhihu |
| Selective Kernel Network | CVPR19 | SKNet | SE + dynamic selection | zhihu |
| Convolutional Block Attention Module | ECCV18 | CBAM | Spatial + channel attention in series | zhihu |
| BottleNeck Attention Module | BMVC18 | BAM | Spatial + channel attention in parallel | zhihu |
| Concurrent Spatial and Channel 'Squeeze & Excitation' in Fully Convolutional Networks | MICCAI18 | scSE | Spatial + channel attention in parallel | zhihu |
| Non-local Neural Networks | CVPR18 | Non-Local (NL) | Self-attention | zhihu |
| GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | ICCVW19 | GCNet | An improvement on NL | zhihu |
| CCNet: Criss-Cross Attention for Semantic Segmentation | ICCV19 | CCNet | An improvement on NL | |
| SA-Net: Shuffle Attention for Deep Convolutional Neural Networks | ICASSP21 | SANet | SGE + channel shuffle | zhihu |
| ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | CVPR20 | ECANet | An improvement on SE | |
| Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks | CoRR19 | SGENet | Group + spatial + channel | |
| FcaNet: Frequency Channel Attention Networks | CoRR20 | FcaNet | SE in the frequency domain | |
| $A^2\text{-}Nets$: Double Attention Networks | NeurIPS18 | DANet | Applies the NL idea to both the spatial and channel dimensions | |
| Asymmetric Non-local Neural Networks for Semantic Segmentation | ICCV19 | APNB | SPP + NL | |
| Efficient Attention: Attention with Linear Complexities | CoRR18 | EfficientAttention | Reduces the computational cost of NL | |
| Image Restoration via Residual Non-local Attention Networks | ICLR19 | RNAN | | |
| Exploring Self-attention for Image Recognition | CVPR20 | SAN | Strong on theory, simple to implement | |
| An Empirical Study of Spatial Attention Mechanisms in Deep Networks | ICCV19 | None | MSRA's survey of self-attention | |
| Object-Contextual Representations for Semantic Segmentation | ECCV20 | OCRNet | A complex interaction mechanism that works well in practice | |
| IAUnet: Global Context-Aware Feature Learning for Person Re-Identification | TNNLS20 | IAUNet | Introduces temporal information | |
| ResNeSt: Split-Attention Networks | CoRR20 | ResNeSt | SK + ResNeXt | |
| Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks | NeurIPS18 | GENet | Follow-up to SE | |
| Improving Convolutional Networks with Self-calibrated Convolutions | CVPR20 | SCNet | Self-calibrated convolution | |
| Rotate to Attend: Convolutional Triplet Attention Module | WACV21 | TripletAttention | Pairwise fusion across the C, H, and W dimensions | |
| Dual Attention Network for Scene Segmentation | CVPR19 | DANet | Self-attention | |
| Relation-Aware Global Attention for Person Re-identification | CVPR20 | RGA | For person re-identification | |
| Attentional Feature Fusion | WACV21 | AFF | Attention-based feature fusion | |
| An Attentive Survey of Attention Models | CoRR19 | None | Covers attention mechanisms in NLP, CV, recommender systems, and other areas | |
| Stand-Alone Self-Attention in Vision Models | NeurIPS19 | FullAttention | Replaces all convolutions with self-attention | |
| BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation | ECCV18 | BiSeNet | An FPN-like feature fusion method | zhihu |
| DCANet: Learning Connected Attentions for Convolutional Neural Networks | CoRR20 | DCANet | Strengthens information flow between attention modules | |
| An Empirical Study of Spatial Attention Mechanisms in Deep Networks | ICCV19 | None | A targeted analysis of spatial attention | |
| Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition | CVPR17 Oral | RA-CNN | Fine-grained recognition | |
| Guided Attention Network for Object Detection and Counting on Drones | ACM MM20 | GANet | Addresses object detection | |
| Attention Augmented Convolutional Networks | ICCV19 | AANet | Multi-head attention + additional feature maps | |
| Global Self-Attention Networks for Image Recognition | ICLR21 | GSA | A new global attention module | |
| Attention-Guided Hierarchical Structure Aggregation for Image Matting | CVPR20 | HAttMatting | An application to image matting: channel attention at the high levels, then spatial attention to guide the low levels | |
| Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks | ECCV20 | None | A weight excitation mechanism complementary to SE | |
| Expectation-Maximization Attention Networks for Semantic Segmentation | ICCV19 Oral | EMANet | EM + Attention | |
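As a concrete reference point for the channel-attention entries above, here is a minimal sketch of a Squeeze-and-Excitation style block in PyTorch. It is not taken from this repository: the class name `SEBlock` and the default `reduction=16` are assumptions made for illustration, and the code in the repository may differ in detail.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Illustrative SE-style channel attention (hypothetical name, not the repo's code)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)  # bottleneck width
        # Squeeze: global average pooling reduces each channel to one scalar.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a two-layer bottleneck produces one weight in (0, 1) per channel.
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        # Rescale every channel of the input by its learned weight.
        return x * weights


if __name__ == "__main__":
    x = torch.randn(2, 64, 8, 8)
    print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 8, 8])
```

Because the output has the same shape as the input, a block like this can be dropped in after any convolutional stage, which is what makes it plug-and-play.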

Plug and Play Module

  • ACBlock
  • Swish, Mish Activation
  • ASPP Block
  • DepthWise Convolution
  • Fused Conv & BN
  • MixedDepthwise Convolution
  • PSP Module
  • RFBModule
  • SematicEmbbedBlock
  • SSH Context Module
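  • Some other useful tools, such as concatenating feature maps and flattening feature maps
  • WeightedFeatureFusion: the fusion method used in the FPN of EfficientDet
  • StripPooling: the core code of Strip Pooling (CVPR 2020)
  • GhostModule: the core module of GhostNet (CVPR 2020)
  • SlimConv: SlimConv3x3
  • Context Gating: video classification
  • EffNetBlock: EffNet
  • BorderDet (ECCV 2020): Border alignment module
  • DANet (CVPR 2019): Dual Attention
  • Object Contextual Representation for semantic segmentation: OCRModule
  • FPT: includes Self Transform, Grounding Transform, and Rendering Transform
  • DOConv: Depthwise Over-parameterized Convolution, proposed by Alibaba
  • PyConv: pyramidal convolution, proposed by the Inception Institute of Artificial Intelligence (IIAI)
  • ULSAM: an ultra-lightweight subspace attention module for compact CNNs
  • DGC (ECCV 2020): dynamic group convolution for accelerating convolutional neural networks
  • DCANet (ECCV 2020): learning connected attentions for convolutional neural networks
  • PSConv (ECCV 2020): squeezes the feature pyramid into a single compact poly-scale convolutional layer
  • Dynamic Convolution (CVPR 2020): dynamic filter convolution (unofficial implementation)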
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference
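To illustrate one item from the list above, the following is a minimal sketch of the "Fused Conv & BN" idea: at inference time a BatchNorm layer is an affine map per channel, so it can be folded into the weights and bias of the preceding convolution. The function name `fuse_conv_bn` is hypothetical, and the sketch assumes a `BatchNorm2d` with affine parameters and running statistics; the repository's own module may be organized differently.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return one conv equivalent to bn(conv(x)) in eval mode (illustrative sketch)."""
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, dilation=conv.dilation,
        groups=conv.groups, bias=True,
    )
    # Per-output-channel scale applied by BN: gamma / sqrt(running_var + eps).
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.copy_(conv.weight * scale.view(-1, 1, 1, 1))
    # Treat a missing conv bias as zero, then apply BN's shift.
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)
    return fused


if __name__ == "__main__":
    conv, bn = nn.Conv2d(3, 8, 3, padding=1, bias=False), nn.BatchNorm2d(8)
    conv.eval(), bn.eval()  # fusion is only valid with frozen BN statistics
    x = torch.randn(1, 3, 16, 16)
    print(torch.allclose(bn(conv(x)), fuse_conv_bn(conv, bn)(x), atol=1e-5))
```

The fused layer removes one memory pass per block at inference time; it is not valid during training, when BN still uses batch statistics.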

Evaluation

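Modules are given a preliminary evaluation on CIFAR10 using a ResNet backbone plus the module under test. The evaluation code comes from another repository: https://github.com/kuangliu/pytorch-cifar/. No pretrained weights are used in the experiments; all models are randomly initialized.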

| Model | Top-1 acc | Time | Params (count) |
| --- | --- | --- | --- |
| SENet18 | 95.28% | 1:27:50 | 11,260,354 |
| ResNet18 | 95.16% | 1:13:03 | 11,173,962 |
| ResNet50 | 95.50% | 4:24:38 | 23,520,842 |
| ShuffleNetV2 | 91.90% | 1:02:50 | 1,263,854 |
| GoogLeNet | 91.90% | 1:02:50 | 6,166,250 |
| MobileNetV2 | 92.66% | 2:04:57 | 2,296,922 |
| SA-ResNet50 | 89.83% | 2:10:07 | 23,528,758 |
| SA-ResNet18 | 95.07% | 1:39:38 | 11,171,394 |
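To make the cost of each attention variant easier to read off the table, the small sketch below computes the parameter difference between a model and its plain ResNet baseline. The numbers are copied from the table; the helper name `overhead` and the choice of baselines are assumptions made for illustration.

```python
# Parameter counts copied from the evaluation table above.
params = {
    "ResNet18": 11_173_962,
    "SENet18": 11_260_354,
    "SA-ResNet18": 11_171_394,
    "ResNet50": 23_520_842,
    "SA-ResNet50": 23_528_758,
}


def overhead(model: str, baseline: str) -> tuple[int, float]:
    """Return (extra parameters, extra parameters as a percentage of the baseline)."""
    extra = params[model] - params[baseline]
    return extra, 100.0 * extra / params[baseline]


if __name__ == "__main__":
    for model, baseline in [
        ("SENet18", "ResNet18"),
        ("SA-ResNet18", "ResNet18"),
        ("SA-ResNet50", "ResNet50"),
    ]:
        extra, pct = overhead(model, baseline)
        print(f"{model} vs {baseline}: {extra:+,} params ({pct:+.3f}%)")
```

By this count, SENet18 adds 86,392 parameters to ResNet18, well under 1% of the baseline, so the accuracy differences in the table are not explained by model size alone.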

Paper List

SENet. Paper: https://arxiv.org/abs/1709.01507 Walkthrough: https://zhuanlan.zhihu.com/p/102035721

Contribute

Suggestions for additional papers, with links to the corresponding code, are welcome; please raise them in an issue.
