计算机视觉中用到的注意力模块和其他即插即用模块PyTorch Implementation Collection of Attention Module and Plug&Play Module

Overview

Awesome-Attention-Mechanism-in-cv

Table of Contents

Introduction

PyTorch实现多种计算机视觉中网络设计中用到的Attention机制,还收集了一些即插即用模块。由于能力有限精力有限,可能很多模块并没有包括进来,有任何的建议或者改进,可以提交issue或者进行PR。

Attention Mechanism

Paper Publish Link Main Idea Blog
Global Second-order Pooling Convolutional Networks CVPR19 GSoPNet 将高阶和注意力机制在网络中部地方结合起来
Neural Architecture Search for Lightweight Non-Local Networks CVPR20 AutoNL NAS+LightNL
Squeeze and Excitation Network CVPR18 SENet 最经典的通道注意力 zhihu
Selective Kernel Network CVPR19 SKNet SE+动态选择 zhihu
Convolutional Block Attention Module ECCV18 CBAM 串联空间+通道注意力 zhihu
BottleNeck Attention Module BMVC18 BAM 并联空间+通道注意力 zhihu
Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks MICCAI18 scSE 并联空间+通道注意力 zhihu
Non-local Neural Networks CVPR19 Non-Local(NL) self-attention zhihu
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond ICCVW19 GCNet 对NL进行改进 zhihu
CCNet: Criss-Cross Attention for Semantic Segmentation ICCV19 CCNet 对NL改进
SA-Net:shuffle attention for deep convolutional neural networks ICASSP 21 SANet SGE+channel shuffle zhihu
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks CVPR20 ECANet SE的改进
Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks CoRR19 SGENet Group+spatial+channel
FcaNet: Frequency Channel Attention Networks CoRR20 FcaNet 频域上的SE操作
$A^2\text{-}Nets$: Double Attention Networks NeurIPS18 DANet NL的思想应用到空间和通道
Asymmetric Non-local Neural Networks for Semantic Segmentation ICCV19 APNB spp+NL
Efficient Attention: Attention with Linear Complexities CoRR18 EfficientAttention NL降低计算量
Image Restoration via Residual Non-local Attention Networks ICLR19 RNAN
Exploring Self-attention for Image Recognition CVPR20 SAN 理论性很强,实现起来很简单
An Empirical Study of Spatial Attention Mechanisms in Deep Networks ICCV19 None MSRA综述self-attention
Object-Contextual Representations for Semantic Segmentation ECCV20 OCRNet 复杂的交互机制,效果确实好
IAUnet: Global Context-Aware Feature Learning for Person Re-Identification TTNNLS20 IAUNet 引入时序信息
ResNeSt: Split-Attention Networks CoRR20 ResNeSt SK+ResNeXt
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks NeurIPS18 GENet SE续作
Improving Convolutional Networks with Self-calibrated Convolutions CVPR20 SCNet 自校正卷积
Rotate to Attend: Convolutional Triplet Attention Module WACV21 TripletAttention CHW两两互相融合
Dual Attention Network for Scene Segmentation CVPR19 DANet self-attention
Relation-Aware Global Attention for Person Re-identification CVPR20 RGA 用于reid
Attentional Feature Fusion WACV21 AFF 特征融合的attention方法
An Attentive Survey of Attention Models CoRR19 None 包括NLP/CV/推荐系统等方面的注意力机制
Stand-Alone Self-Attention in Vision Models NeurIPS19 FullAttention 全部的卷积都替换为self-attention
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation ECCV18 BiSeNet 类似FPN的特征融合方法 zhihu
DCANet: Learning Connected Attentions for Convolutional Neural Networks CoRR20 DCANet 增强attention之间信息流动
An Empirical Study of Spatial Attention Mechanisms in Deep Networks ICCV19 None 对空间注意力进行针对性分析
Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition CVPR17 Oral RA-CNN 细粒度识别
Guided Attention Network for Object Detection and Counting on Drones ACM MM20 GANet 处理目标检测问题
Attention Augmented Convolutional Networks ICCV19 AANet 多头+引入额外特征映射
GLOBAL SELF-ATTENTION NETWORKS FOR IMAGE RECOGNITION ICLR21 GSA 新的全局注意力模块
Attention-Guided Hierarchical Structure Aggregation for Image Matting CVPR20 HAttMatting 抠图方面的应用,高层使用通道注意力机制,然后再使用空间注意力机制指导低层。
Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks ECCV20 None 与SE互补的权值激活机制
Expectation-Maximization Attention Networks for Semantic Segmentation ICCV19 Oral EMANet EM+Attention

Plug and Play Module

  • ACBlock
  • Swish、wish Activation
  • ASPP Block
  • DepthWise Convolution
  • Fused Conv & BN
  • MixedDepthwise Convolution
  • PSP Module
  • RFBModule
  • SematicEmbbedBlock
  • SSH Context Module
  • Some other usefull tools such as concate feature map、flatten feature map
  • WeightedFeatureFusion:EfficientDet中的FPN用到的fuse方式
  • StripPooling:CVPR2020中核心代码StripPooling
  • GhostModule: CVPR2020GhostNet的核心模块
  • SlimConv: SlimConv3x3
  • Context Gating: video classification
  • EffNetBlock: EffNet
  • ECCV2020 BorderDet: Border aligment module
  • CVPR2019 DANet: Dual Attention
  • Object Contextual Representation for sematic segmentation: OCRModule
  • FPT: 包含Self Transform、Grounding Transform、Rendering Transform
  • DOConv: 阿里提出的Depthwise Over-parameterized Convolution
  • PyConv: 起源人工智能研究院提出的金字塔卷积
  • ULSAM:用于紧凑型CNN的超轻量级子空间注意力模块
  • DGC: ECCV 2020用于加速卷积神经网络的动态分组卷积
  • DCANet: ECCV 2020 学习卷积神经网络的连接注意力
  • PSConv: ECCV 2020 将特征金字塔压缩到紧凑的多尺度卷积层中
  • Dynamic Convolution: CVPR2020 动态滤波器卷积(非官方)
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference

Evaluation

基于CIFAR10+ResNet+待测评模块,对模块进行初步测评。测评代码来自于另外一个库:https://github.com/kuangliu/pytorch-cifar/ 实验过程中,不使用预训练权重,进行随机初始化。

模型 top1 acc time params(MB)
SENet18 95.28% 1:27:50 11,260,354
ResNet18 95.16% 1:13:03 11,173,962
ResNet50 95.50% 4:24:38 23,520,842
ShuffleNetV2 91.90% 1:02:50 1,263,854
GoogLeNet 91.90% 1:02:50 6,166,250
MobileNetV2 92.66% 2:04:57 2,296,922
SA-ResNet50 89.83% 2:10:07 23,528,758
SA-ResNet18 95.07% 1:39:38 11,171,394

Paper List

SENet 论文: https://arxiv.org/abs/1709.01507 解读:https://zhuanlan.zhihu.com/p/102035721

Contribute

欢迎在issue中提出补充的文章paper和对应code链接。

Owner
PJDong
Computer vision learner, deep learner
PJDong
Iris prediction model is used to classify iris species created julia's DecisionTree, DataFrames, JLD2, PlotlyJS and Statistics packages.

Iris Species Predictor Iris prediction is used to classify iris species using their sepal length, sepal width, petal length and petal width created us

Siva Prakash 2 Jan 06, 2022
Fast Differentiable Matrix Sqrt Root

Fast Differentiable Matrix Sqrt Root Geometric Interpretation of Matrix Square Root and Inverse Square Root This repository constains the official Pyt

YueSong 42 Dec 30, 2022
Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!

Rubicon Purpose Rubicon is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a r

Capital One 97 Jan 03, 2023
an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch

This work has now been superseded by: https://github.com/sniklaus/revisiting-sepconv sepconv-slomo This is a reference implementation of Video Frame I

Simon Niklaus 985 Jan 08, 2023
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
Official Repository of NeurIPS2021 paper: PTR

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning Figure 1. Dataset Overview. Introduction A critical aspect of human vis

Yining Hong 32 Jun 02, 2022
Pytorch Implementation for Dilated Continuous Random Field

DilatedCRF Pytorch implementation for fully-learnable DilatedCRF. If you find my work helpful, please consider our paper: @article{Mo2022dilatedcrf,

DunnoCoding_Plus 3 Nov 13, 2022
The code for paper "Learning Implicit Fields for Generative Shape Modeling".

implicit-decoder The tensorflow code for paper "Learning Implicit Fields for Generative Shape Modeling", Zhiqin Chen, Hao (Richard) Zhang. Project pag

Zhiqin Chen 353 Dec 30, 2022
This project uses Template Matching technique for object detecting by detection of template image over base image.

Object Detection Project Using OpenCV This project uses Template Matching technique for object detecting by detection the template image over base ima

Pratham Bhatnagar 7 May 29, 2022
Predict and time series avocado hass

RECOMMENDER SYSTEM MARKETING TỔNG QUAN VỀ HỆ THỐNG DỮ LIỆU 1. Giới thiệu - Tiki là một hệ sinh thái thương mại "all in one", trong đó có tiki.vn, là

hieulmsc 3 Jan 10, 2022
The self-supervised goal reaching benchmark introduced in Discovering and Achieving Goals via World Models

Lexa-Benchmark Codebase for the self-supervised goal reaching benchmark introduced in 'Discovering and Achieving Goals via World Models'. Setup Create

1 Oct 14, 2021
This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing.

Feedback Prize - Evaluating Student Writing This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing. The

Udbhav Bamba 41 Dec 14, 2022
Small little script to scrape, parse and check for active tor nodes. Can be used as proxies.

TorScrape TorScrape is a small but useful script made in python that scrapes a website for active tor nodes, parse the html and then save the nodes in

5 Dec 04, 2022
This repository contains the code used to quantitatively evaluate counterfactual examples in the associated paper.

On Quantitative Evaluations of Counterfactuals Install To install required packages with conda, run the following command: conda env create -f requi

Frederik Hvilshøj 1 Jan 16, 2022
Detection of drones using their thermal signatures from thermal camera through YOLO-V3 based CNN with modifications to encapsulate drone motion

Drone Detection using Thermal Signature This repository highlights the work for night-time drone detection using a using an Optris PI Lightweight ther

Chong Yu Quan 6 Dec 31, 2022
Privacy-Preserving Machine Learning (PPML) Tutorial Presented at PyConDE 2022

PPML: Machine Learning on Data you cannot see Repository for the tutorial on Privacy-Preserving Machine Learning (PPML) presented at PyConDE 2022 Abst

Valerio Maggio 10 Aug 16, 2022
HomoInterpGAN - Homomorphic Latent Space Interpolation for Unpaired Image-to-image Translation

HomoInterpGAN Homomorphic Latent Space Interpolation for Unpaired Image-to-image Translation (CVPR 2019, oral) Installation The implementation is base

Ying-Cong Chen 99 Nov 15, 2022
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers

Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat

DreamSoul 3 Sep 12, 2022
《LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification》(AAAI 2021) GitHub:

LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification

76 Dec 05, 2022
Semantic Edge Detection with Diverse Deep Supervision

Semantic Edge Detection with Diverse Deep Supervision This repository contains the code for our IJCV paper: "Semantic Edge Detection with Diverse Deep

Yun Liu 12 Dec 31, 2022