Official Repository for "Activate or Not: Learning Customized Activation." [CVPR 2021]

Last update: Dec 27, 2022

Overview

CVPR 2021 | Activate or Not: Learning Customized Activation.

This repository contains the official PyTorch implementation of the paper Activate or Not: Learning Customized Activation, CVPR 2021.

ACON

We propose a novel activation function, termed ACON, that explicitly learns whether or not to activate each neuron. Below we show the ACON activation function and its first derivative. β controls how fast the first derivative approaches its upper and lower bounds, which are determined by p1 and p2.
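To make the roles of p1, p2 and β concrete, here is a minimal scalar sketch in plain Python. It assumes the ACON-C form given in the paper, f(x) = (p1 - p2)·x·σ(β(p1 - p2)x) + p2·x. The function names are illustrative and this is not the repository's PyTorch module, where p1, p2 and β are learnable per-channel parameters.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """ACON-C (assumed form): (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x

def acon_c_grad(x, p1=1.0, p2=0.0, beta=1.0):
    """Analytic first derivative of acon_c with respect to x."""
    k = p1 - p2
    s = sigmoid(beta * k * x)
    return k * s + beta * k * k * x * s * (1.0 - s) + p2

if __name__ == "__main__":
    # With p1=1, p2=0 the function reduces to Swish: x * sigmoid(beta * x).
    print(acon_c(1.0))
    # A large beta approaches max(p1 * x, p2 * x), i.e. ReLU when p1=1, p2=0.
    print(acon_c(2.0, beta=50.0), acon_c(-2.0, beta=50.0))
    # Far from zero the derivative tends to p1 (right side) and p2 (left side).
    print(acon_c_grad(20.0, p1=1.2, p2=-0.3), acon_c_grad(-20.0, p1=1.2, p2=-0.3))
```

With β = 0 the sigmoid is constant at 0.5 and the function becomes linear, (p1 + p2)/2 · x, so β is what switches between the linear (not activated) and non-linear (activated) regimes.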

Training curves

We show the training curves of different activations here.

TFNet

To show the effectiveness of the proposed ACON family, we also provide an extremely simple toy funnel network (TFNet) built only from pointwise convolutions and ACON-FReLU operators.

Main results

The following results show the relative improvements in ImageNet top-1 accuracy over the ReLU baselines. The relative improvement from Meta-ACON is about twice that from SENet.

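Comparison between ReLU, Swish and ACON-C. ACON-C improves accuracy without adding FLOPs or parameters. Top-1 errors are in percent; values in parentheses are the reduction in top-1 error relative to the ReLU baseline: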

Model	FLOPs	#Params.	top-1 err. (ReLU)	top-1 err. (Swish)	top-1 err. (ACON)
ShuffleNetV2 0.5x	41M	1.4M	39.4	38.3 (+1.1)	37.0 (+2.4)
ShuffleNetV2 1.5x	299M	3.5M	27.4	26.8 (+0.6)	26.5 (+0.9)
ResNet 50	3.9G	25.5M	24.0	23.5 (+0.5)	23.2 (+0.8)
ResNet 101	7.6G	44.4M	22.8	22.7 (+0.1)	21.8 (+1.0)
ResNet 152	11.3G	60.0M	22.3	22.2 (+0.1)	21.2 (+1.1)
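The parenthesized gains can be checked directly from the reported errors: each one is the ReLU top-1 error minus the top-1 error of the same model with the other activation. The short sketch below recomputes them; the dictionary layout and the function name are only illustrative.

```python
# Top-1 errors (%) copied from the table above, as (ReLU, Swish, ACON-C).
TOP1_ERR = {
    "ShuffleNetV2 0.5x": (39.4, 38.3, 37.0),
    "ShuffleNetV2 1.5x": (27.4, 26.8, 26.5),
    "ResNet 50": (24.0, 23.5, 23.2),
    "ResNet 101": (22.8, 22.7, 21.8),
    "ResNet 152": (22.3, 22.2, 21.2),
}

def gain(baseline_err, err):
    """Reduction in top-1 error (percentage points) relative to the baseline."""
    return round(baseline_err - err, 1)

if __name__ == "__main__":
    for model, (relu, swish, acon) in TOP1_ERR.items():
        print(f"{model}: Swish +{gain(relu, swish)}, ACON-C +{gain(relu, acon)}")
```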

Next, at the cost of a negligible increase in FLOPs and parameters, meta-ACON yields significant improvements (gains in parentheses are relative to the ReLU baselines above):

Model	FLOPs	#Params.	top-1 err.
ShuffleNetV2 0.5x (meta-acon)	41M	1.7M	34.8 (+4.6)
ShuffleNetV2 1.5x (meta-acon)	299M	3.9M	24.7 (+2.7)
ResNet 50 (meta-acon)	3.9G	25.7M	22.0 (+2.0)
ResNet 101 (meta-acon)	7.6G	44.8M	21.0 (+1.8)
ResNet 152 (meta-acon)	11.3G	60.5M	20.5 (+1.8)

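The simple TFNet, without SE modules, outperforms state-of-the-art lightweight networks that also omit SE modules. Values in parentheses are the reduction in top-1 error relative to the ShuffleNetV2 model of comparable FLOPs: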

Model	FLOPs	#Params.	top-1 err.
MobileNetV2 0.17	42M	1.4M	52.6
ShuffleNetV2 0.5x	41M	1.4M	39.4
TFNet 0.5	43M	1.3M	36.6 (+2.8)
MobileNetV2 0.6	141M	2.2M	33.3
ShuffleNetV2 1.0x	146M	2.3M	30.6
TFNet 1.0	135M	1.9M	29.7 (+0.9)
MobileNetV2 1.0	300M	3.4M	28.0
ShuffleNetV2 1.5x	299M	3.5M	27.4
TFNet 1.5	279M	2.7M	26.0 (+1.4)
MobileNetV2 1.4	585M	5.5M	25.3
ShuffleNetV2 2.0x	591M	7.4M	25.0
TFNet 2.0	474M	3.8M	24.3 (+0.7)

Trained Models

OneDrive download: Link
BaiduYun download: Link (extract code: 13fu)

Usage

Requirements

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
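For readers who prefer to do this step in Python, the following is a minimal sketch of the same idea: moving each validation image into a subfolder named after its class. The function name and the `labels` mapping (image file name to class folder name) are hypothetical; you would need to build that mapping yourself from the validation ground truth, and the file names in the usage example are placeholders.

```python
import shutil
from pathlib import Path

def sort_val_images(val_dir, labels):
    """Move each validation image into a subfolder named after its class.

    `labels` maps image file name -> class folder name (hypothetical input,
    not something shipped with this repository).
    Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for filename, class_name in labels.items():
        src = val_dir / filename
        if not src.is_file():
            continue  # skip images that are missing or were already moved
        dst_dir = val_dir / class_name
        dst_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dst_dir / filename))
        moved += 1
    return moved

if __name__ == "__main__":
    # Placeholder file and class names, for illustration only.
    example_labels = {"img_0001.JPEG": "class_a", "img_0002.JPEG": "class_b"}
    print(sort_val_images("YOUR_VALDATASET_PATH", example_labels))
```

Running it a second time is harmless, since files that are no longer in the top-level folder are skipped.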

Train:

python train.py --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Eval:

python train.py --eval --eval-resume YOUR_WEIGHT_PATH --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Citation

If you use these models in your research, please cite:

@inproceedings{ma2021activate,
  title={Activate or Not: Learning Customized Activation},
  author={Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  year={2021}
}
