Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Last update: Dec 30, 2022

Abstract

For practical deep neural network design on mobile devices, it is essential to consider the constraints imposed by computational resources and inference latency in various applications. Among deep network acceleration approaches, pruning is a widely adopted practice to balance computational resource consumption and accuracy: unimportant connections are removed either channel-wise or randomly, with minimal impact on model accuracy. Channel pruning immediately yields a significant latency reduction, while random weight pruning offers more flexibility in balancing latency and accuracy. In this paper, we present a unified framework with Joint Channel pruning and Weight pruning (JCW), which achieves a better Pareto frontier between latency and accuracy than previous model compression approaches. To fully optimize the trade-off between latency and accuracy, we develop a tailored multi-objective evolutionary algorithm in the JCW framework, which enables a single search to obtain the optimal candidate architectures for various deployment requirements. Extensive experiments demonstrate that JCW achieves a better trade-off between latency and accuracy than various state-of-the-art pruning methods on the ImageNet classification dataset.
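
To make the two pruning granularities concrete, here is a minimal NumPy sketch that applies channel pruning and then unstructured weight pruning to one convolution weight tensor. The magnitude-based criteria and the names `prune_channels` and `prune_weights` are illustrative assumptions, not the JCW method itself, which searches the layer-wise channel numbers and weight sparsity.

```python
import numpy as np

def prune_channels(weight, keep_channels):
    """Structured pruning: keep the output channels with the largest L1 norm.

    Assumed criterion for illustration only; weight has shape
    (out_channels, in_channels, kH, kW).
    """
    norms = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    keep = np.sort(np.argsort(norms)[-keep_channels:])
    return weight[keep]

def prune_weights(weight, sparsity):
    """Unstructured pruning: zero the smallest-magnitude fraction of weights."""
    k = int(round(sparsity * weight.size))
    if k == 0:
        return weight.copy()
    # k-th smallest magnitude; everything at or below it is zeroed
    threshold = np.partition(np.abs(weight).ravel(), k - 1)[k - 1]
    return weight * (np.abs(weight) > threshold)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3))
# Joint pruning: drop 2 of 8 channels, then zero half of the remaining weights
w_joint = prune_weights(prune_channels(w, keep_channels=6), sparsity=0.5)
print(w_joint.shape, float((w_joint == 0).mean()))
```

Channel pruning shrinks the tensor itself, which is why it reduces latency directly, while weight pruning keeps the shape and only adds zeros inside it.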

Framework

Evaluation

ResNet18

Method	Latency/ms	Accuracy/%
Uniform 1x	537	69.8
DMCP	341	69.7
APS	363	70.3
JCW	160	69.2
JCW	194	69.7
JCW	196	69.9
JCW	224	70.2
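
As a rough check of the latency/accuracy trade-off, the sketch below extracts the Pareto-optimal entries from the ResNet18 numbers in this table. The helper `pareto_front` is a hypothetical name introduced here; it is not part of the repository.

```python
def pareto_front(points):
    """Return entries not dominated by any other entry.

    An entry dominates another if it is no slower and no less accurate,
    and strictly better in at least one of the two.
    """
    front = []
    for name, lat, acc in points:
        dominated = any(
            (l2 <= lat and a2 >= acc) and (l2 < lat or a2 > acc)
            for _, l2, a2 in points
        )
        if not dominated:
            front.append((name, lat, acc))
    return sorted(front, key=lambda p: p[1])

# (method, latency in ms, accuracy) copied from the ResNet18 table
resnet18 = [
    ("Uniform 1x", 537, 69.8),
    ("DMCP", 341, 69.7),
    ("APS", 363, 70.3),
    ("JCW", 160, 69.2),
    ("JCW", 194, 69.7),
    ("JCW", 196, 69.9),
    ("JCW", 224, 70.2),
]

for name, lat, acc in pareto_front(resnet18):
    print(f"{name}\t{lat}\t{acc}")
```

On these numbers, all four JCW models are on the front, Uniform 1x and DMCP are dominated by a JCW model, and APS stays on the front only as the highest-accuracy, highest-latency point.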

MobileNetV1

Method	Latency/ms	Accuracy/%
Uniform 1x	167	70.9
Uniform 0.75x	102	68.4
Uniform 0.5x	53	64.4
AMC	94	70.7
Fast	61	68.4
AutoSlim	99	71.5
AutoSlim	55	67.9
USNet	102	69.5
USNet	53	64.2
JCW	31	69.1
JCW	39	69.9
JCW	43	69.8
JCW	54	70.3
JCW	69	71.4

MobileNetV2

Method	Latency/ms	Accuracy/%
Uniform 1x	114	71.8
Uniform 0.75x	71	69.8
Uniform 0.5x	41	65.4
APS	110	72.8
APS	64	69.0
DMCP	83	72.4
DMCP	45	67.0
DMCP	43	66.1
Fast	89	72.0
Fast	62	70.2
JCW	30	69.1
JCW	40	69.9
JCW	44	70.8
JCW	59	72.2
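
The latency numbers can also be read as speedups over the unpruned baseline. The short sketch below does that arithmetic for the MobileNetV2 JCW rows against Uniform 1x (114 ms); `speedups` is a hypothetical helper used only for this illustration.

```python
def speedups(baseline_ms, models):
    """Return (latency, accuracy, baseline/latency) for each model."""
    return [(lat, acc, round(baseline_ms / lat, 2)) for lat, acc in models]

baseline_ms = 114  # MobileNetV2 Uniform 1x latency from the table
jcw_mobilenetv2 = [(30, 69.1), (40, 69.9), (44, 70.8), (59, 72.2)]

for lat, acc, ratio in speedups(baseline_ms, jcw_mobilenetv2):
    print(f"{lat} ms\t{acc}\t{ratio}x")
```

For example, the 59 ms JCW model is about 1.93x faster than Uniform 1x while its reported accuracy (72.2) is higher than the baseline's 71.8.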

Requirements

torch
torchvision
numpy
scipy
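
Before running the search, it can help to confirm that the four packages are importable. This is a small standard-library sketch; `missing_packages` is a hypothetical helper, and no specific package versions are assumed because the list above does not pin any.

```python
import importlib.util

REQUIRED = ["torch", "torchvision", "numpy", "scipy"]

def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages()
print("missing:", ", ".join(missing) if missing else "none")
```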

Usage

JCW works in two steps: the search step and the training step. The search step searches for the layer-wise channel numbers and weight sparsity of Pareto-optimal models. The training step trains the searched models with ADMM. We give a simple example for resnet18.
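
For readers unfamiliar with ADMM-based pruning, the following is a generic sketch on a toy least-squares problem, not the repository's training code. It assumes a simple sparsity constraint (at most `num_nonzero` nonzero weights); the function names, `rho`, `lr`, and the loop counts are all illustrative choices.

```python
import numpy as np

def project_sparse(v, num_nonzero):
    """Euclidean projection onto vectors with at most num_nonzero nonzeros."""
    z = np.zeros_like(v)
    if num_nonzero <= 0:
        return z
    keep = np.argsort(np.abs(v))[-num_nonzero:]
    z[keep] = v[keep]
    return z

def admm_prune(X, y, num_nonzero, rho=1.0, lr=0.1, outer=50, inner=20):
    """Minimize 0.5 * mean((X w - y)^2) subject to w being sparse."""
    w = np.zeros(X.shape[1])  # dense weights being trained
    z = np.zeros_like(w)      # sparse auxiliary copy
    u = np.zeros_like(w)      # scaled dual variable
    for _ in range(outer):
        # w-update: gradient steps on loss + (rho / 2) * ||w - z + u||^2
        for _ in range(inner):
            grad = X.T @ (X @ w - y) / len(y) + rho * (w - z + u)
            w -= lr * grad
        # z-update: project onto the sparsity constraint
        z = project_sparse(w + u, num_nonzero)
        # dual update: accumulate the remaining disagreement
        u += w - z
    return z

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
w_true = np.zeros(10)
w_true[[1, 4, 7]] = [2.0, -3.0, 1.5]
y = X @ w_true

w_sparse = admm_prune(X, y, num_nonzero=3)
print(np.flatnonzero(w_sparse))
```

The dense weights are pulled toward a sparse copy during training instead of being hard-thresholded once, which is the general appeal of ADMM for pruning.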

The search step

Modify the configuration file

First, open the file experiments/res18-search.yaml:
```
vim experiments/res18-search.yaml
```
Go to line 44 and find the following lines:
```
DATASET:
  data: ImageNet
  root: /path/to/imagenet
  ...
```
Then set the root property of DATASET to the path of the ImageNet dataset on your machine.

Apply the search

After modifying the configuration file, you can simply start the search by:
```
python emo_search.py --config experiments/res18-search.yaml | tee experiments/res18-search.log
```
When the search finishes, the results are saved in experiments/search.pth.

The training step

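After searching, train the searched models in three steps: modify the base configuration file, generate the training configuration files, and run the training.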

Modify the base configuration file

Open the file experiments/res18-train.yaml:
```
vim experiments/res18-train.yaml
```
Go to line 5 and find the following line:
```
root: &root /path/to/imagenet
```
Then set the root property to the path of the ImageNet dataset on your machine.

Generate configuration files for training

After modifying the base configuration file, we are ready to generate the configuration files for training. To do that, simply run the following command:
```
python scripts/generate_training_configs.py --base-config experiments/res18-train.yaml --search-result experiments/search.pth --output ./train-configs 
```
After running the above command, the training configuration files will be written into ./train-configs/model-{id}/train.yaml.

Apply the training

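After generating the configuration files, run the following command to train a specific model, replacing the placeholder paths with the path of one generated train.yaml and the log file you want to write: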
```
python train.py --config xxxx/xxx/train.yaml | tee xxx/xxx/train.log
```
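
To train every searched model, the per-model commands can be generated from the output directory. The sketch below assumes the ./train-configs/model-{id}/train.yaml layout described above; writing train.log next to each configuration file is an assumption made here, and `training_commands` is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path

def training_commands(config_root="./train-configs"):
    """Build one train.py command per generated configuration file."""
    commands = []
    for cfg in sorted(Path(config_root).glob("model-*/train.yaml")):
        # Assumed convention: keep the log beside its configuration file
        log = cfg.with_name("train.log")
        commands.append(
            f"python train.py --config {cfg.as_posix()} | tee {log.as_posix()}"
        )
    return commands

for cmd in training_commands():
    print(cmd)
```

The printed commands can be run one at a time or handed to a job scheduler. Nothing is printed if the directory does not exist yet.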
