EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network

Overview

This repo contains the official PyTorch implementation code and configuration files for EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network, created by Hu Zhang.

Installation

Requirements

  • Python 3.6+
  • PyTorch 1.0+

Our environments

  • OS: Ubuntu 18.04
  • CUDA: 10.0
  • Toolkit: PyTorch 1.0
  • GPU: Titan RTX

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder, with the training and validation data in the train/ and val/ folders, respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
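
For reference, the layout above is exactly what torchvision's datasets.ImageFolder expects. A minimal loading sketch follows; the path and transforms are illustrative only, and the actual settings used by main.py may differ:

import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Standard ImageNet-style preprocessing; illustrative only.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ImageFolder infers class labels from the sub-directory names (class1/, class2/, ...).
train_set = datasets.ImageFolder("/path/to/imagenet/train", transform=train_transform)
train_loader = DataLoader(train_set, batch_size=256, shuffle=True, num_workers=8)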

Usage

First, clone the repository locally:

git clone https://github.com/murufeng/EPSANet.git
cd EPSANet

Then create a conda virtual environment, activate it, and install PyTorch with torchvision:

conda create -n epsanet python=3.6
conda activate epsanet
conda install -c pytorch pytorch torchvision
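
After installation, a quick sanity check (not part of the original instructions) confirms that PyTorch and the GPUs are visible from the new environment:

import torch

# Print the installed PyTorch version and the GPUs visible to it.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())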

Training

To train models on ImageNet with 8 GPUs, run:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python main.py -a epsanet50 --data /path/to/imagenet 
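
After training, a saved checkpoint can be evaluated on a single image roughly as sketched below. This is a sketch only: the epsanet50 constructor, the models.epsanet module path, the checkpoint.pth.tar filename, and the "state_dict" key are assumptions about how the repo saves models, so check main.py before relying on them.

import torch
from PIL import Image
import torchvision.transforms as transforms

from models.epsanet import epsanet50  # assumed module path; adjust to the repo layout

# Rebuild the network and load an assumed checkpoint produced by main.py.
model = epsanet50()
checkpoint = torch.load("checkpoint.pth.tar", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])  # key name is an assumption
model.eval()

# Standard ImageNet evaluation preprocessing.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = model(img).softmax(dim=1)
top5 = probs.topk(5)
print(top5.indices, top5.values)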

Model Zoo

Models are trained with 8 GPUs on both the ImageNet and MS-COCO 2017 datasets.

Image Classification on ImageNet

Model Params(M) FLOPs(G) Top-1 (%) Top-5 (%)
EPSANet-50(Small) 22.56 3.62 77.49 93.54
EPSANet-50(Large) 27.90 4.72 78.64 94.18
EPSANet-101(Small) 38.90 6.82 78.43 94.11
EPSANet-101(Large) 49.59 8.97 79.38 94.58

Object Detection on MS-COCO 2017

Faster R-CNN

Model Style Lr schd Params(M) FLOPs(G) box AP AP_50 AP_75
EPSANet-50(Small) pytorch 1x 38.56 197.07 39.2 60.3 42.3
EPSANet-50(Large) pytorch 1x 43.85 219.64 40.9 62.1 44.6

Mask R-CNN

Model Style Lr schd Params(M) FLOPs(G) box AP AP_50 AP_75
EPSANet-50(Small) pytorch 1x 41.20 248.53 40.0 60.9 43.3
EPSANet-50(Large) pytorch 1x 46.50 271.10 41.4 62.3 45.3

RetinaNet

Model Style Lr schd Params(M) FLOPs(G) box AP AP_50 AP_75
EPSANet-50(Small) pytorch 1x 34.78 229.32 38.2 58.1 40.6
EPSANet-50(Large) pytorch 1x 40.07 251.89 39.6 59.4 42.3

Instance segmentation with Mask R-CNN on MS-COCO 2017

Model Params(M) FLOPs(G) AP AP_50 AP_75
EPSANet-50(Small) 41.20 248.53 35.9 57.7 38.1
EPSANet-50(Large) 46.50 271.10 37.1 59.0 39.5

Citing EPSANet

You can cite the paper as:

@article{hu2021epsanet,
  title={EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network},
  author={Hu Zhang and Keke Zu and Jian Lu and Yuru Zou and Deyu Meng},
  journal={arXiv preprint arXiv:2105.14447},
  year={2021}
}