LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

Last update: Dec 02, 2022

Introduction

This project contains the code for the paper "LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation" by Yu Wang. Note: the code was tested with python=3.6, cuda=9.0, and PyTorch 0.4.1; later PyTorch versions are also supported.

The extensive computational burden limits the usage of CNNs in mobile devices for dense estimation tasks such as semantic segmentation. In this paper, we present a lightweight network to address this problem, namely **LEDNet**, which employs an asymmetric encoder-decoder architecture for the task of real-time semantic segmentation. More specifically, the encoder adopts a ResNet as its backbone network, where two new operations, channel split and shuffle, are used in each residual block to greatly reduce computation cost while maintaining high segmentation accuracy. On the other hand, an attention pyramid network (APN) is employed in the decoder to further reduce the complexity of the whole network. Our model has less than 1M parameters and is able to run at over 71 FPS on a single GTX 1080Ti GPU card. Comprehensive experiments demonstrate that our approach achieves state-of-the-art results in terms of the speed and accuracy trade-off on the Cityscapes dataset, making it an effective method for real-time semantic segmentation tasks.
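The channel split and shuffle operations mentioned above can be illustrated on plain channel indices. This is a minimal sketch, not the repository's implementation: the function names are made up for illustration, and the per-branch convolutions that a real residual block would presumably apply between the split and the shuffle are omitted.

```python
from typing import List, Sequence, Tuple


def channel_split(channels: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split a channel list into two equal halves (assumes an even count)."""
    if len(channels) % 2 != 0:
        raise ValueError("channel count must be even")
    half = len(channels) // 2
    return list(channels[:half]), list(channels[half:])


def channel_shuffle(channels: Sequence[int], groups: int) -> List[int]:
    """Interleave channels across groups.

    Equivalent to viewing the list as a (groups, n) grid, transposing it
    to (n, groups), and flattening it again.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


if __name__ == "__main__":
    left, right = channel_split(list(range(8)))
    print(left, right)                              # [0, 1, 2, 3] [4, 5, 6, 7]
    # Hypothetical: each half would be processed by its own branch here.
    print(channel_shuffle(left + right, groups=2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

After the shuffle, neighbouring positions hold channels from different halves, so information from both branches is mixed before the next block.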

Project-Structure

├── datasets  # contains all datasets for the project
|  └── cityscapes # Cityscapes dataset
|  |  └── gtCoarse # coarse Cityscapes annotations
|  |  └── gtFine # fine Cityscapes annotations
|  |  └── leftImg8bit # Cityscapes training images
|  └── cityscapesscripts # Cityscapes label conversion scripts
├── utils
|  └── dataset.py # dataloader for the Cityscapes dataset
|  └── iouEval.py # computes mean IoU and per-class IoU
|  └── transform.py # data preprocessing
|  └── visualize.py # visualization with Visdom
|  └── loss.py # loss function
├── checkpoint
|  └── xxx.pth # encoder models pretrained on ImageNet
├── save
|  └── xxx.pth # models trained from scratch
├── imagenet-pretrain
|  └── lednet_imagenet.py
|  └── main.py
├── train
|  └── lednet.py  # model definition for semantic segmentation
|  └── main.py # training script
├── test
|  └── dataset.py
|  └── lednet.py # model definition
|  └── lednet_no_bn.py # model definition with the BN layers removed
|  └── eval_cityscapes_color.py # generates RGB result images
|  └── eval_cityscapes_server.py # generates results for upload to the official server
|  └── eval_forward_time.py # measures model inference time
|  └── eval_iou.py
|  └── iouEval.py
|  └── transform.py
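The tree lists test/eval_forward_time.py for measuring inference time. That script is not reproduced here, so the following is only a generic sketch of how frames per second could be measured. The `measure_fps` helper and the dummy workload are assumptions for illustration; a real measurement would call the model's forward pass instead.

```python
import time
from typing import Callable


def measure_fps(forward: Callable[[], object], warmup: int = 5, runs: int = 50) -> float:
    """Return the average number of `forward` calls per second after a warm-up."""
    for _ in range(warmup):
        forward()  # warm-up calls are not timed
    # With a GPU model, one would typically synchronize the device
    # (e.g. torch.cuda.synchronize()) before each clock reading.
    start = time.perf_counter()
    for _ in range(runs):
        forward()
    elapsed = time.perf_counter() - start
    return runs / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    def dummy_forward() -> int:
        # Stand-in workload; replace with a real forward pass.
        return sum(i * i for i in range(10_000))

    print(f"{measure_fps(dummy_forward):.1f} FPS")
```

Excluding the warm-up iterations keeps one-off start-up costs out of the average.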

Installation

Python 3.6.x is required; using Anaconda3 is recommended.
Set up the Python environment:

pip3 install -r requirements.txt

Environment: PyTorch 0.4.1, CUDA 9.0, cuDNN 7.1, Python 3.6.
Clone this repository.

git clone https://github.com/xiaoyufenfei/LEDNet.git
cd LEDNet

Install Visdom.
Install torchsummary.
Download the dataset by following the Datasets section below.
Note: For training, we currently support Cityscapes and aim to add the CamVid, VOC, and ADE20K datasets.

Datasets

You can download Cityscapes from the official Cityscapes website. Note: please download leftImg8bit_trainvaltest.zip (11GB), gtFine_trainvaltest (241MB), and gtCoarse (1.3GB).
You can download CityscapesScripts and use it to convert the dataset to 19 categories. The dataset should have this basic structure:

├── leftImg8bit
│   ├── train
│   ├── val
│   └── test
├── gtFine
│   ├── train
│   ├── val
│   └── test
├── gtCoarse
│   ├── train
│   ├── train_extra
│   └── val
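Before training, it can help to confirm that the downloaded data matches the layout above. The sketch below is an illustrative helper, not part of the repository: `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, and the check only looks at directory names, not at the files inside them.

```python
from pathlib import Path
from typing import Dict, List

# Expected sub-directories, copied from the layout listed above.
EXPECTED_DIRS: Dict[str, List[str]] = {
    "leftImg8bit": ["train", "val", "test"],
    "gtFine": ["train", "val", "test"],
    "gtCoarse": ["train", "train_extra", "val"],
}


def missing_dirs(root: str) -> List[str]:
    """Return the expected directories that do not exist under `root`."""
    base = Path(root)
    missing = []
    for top, splits in EXPECTED_DIRS.items():
        for split in splits:
            path = base / top / split
            if not path.is_dir():
                missing.append(path.relative_to(base).as_posix())
    return missing


if __name__ == "__main__":
    import sys

    dataset_root = sys.argv[1] if len(sys.argv) > 1 else "."
    problems = missing_dirs(dataset_root)
    if problems:
        print("Missing directories:", ", ".join(problems))
    else:
        print("Dataset layout looks complete.")
```

Run it with the dataset root as the only argument, for example the directory that contains leftImg8bit, gtFine, and gtCoarse.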

Training-LEDNet

For help on the optional arguments, run: python main.py -h
By default, we assume you have downloaded the Cityscapes dataset to the ./data/cityscapes directory.
To train LEDNet with the train/main.py script, pass the parameters listed in main.py as flags or change them manually in the script.

python main.py --savedir logs --model lednet --datadir path/root_directory/  --num-epochs xx --batch-size xx ...
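For readers unfamiliar with flag-based configuration, the sketch below shows how flags like those in the command above could be declared with argparse. It is not the repository's main.py: only the flag names come from the command, while the help texts, the default values, and the choice to make --savedir required are placeholders assumed for illustration.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags shown in the training command; defaults are placeholders."""
    parser = argparse.ArgumentParser(description="Hypothetical training CLI sketch")
    parser.add_argument("--savedir", required=True, help="name of the output directory")
    parser.add_argument("--model", default="lednet", help="model definition to train")
    parser.add_argument("--datadir", default="./data/cityscapes", help="dataset root")
    parser.add_argument("--num-epochs", type=int, default=100, help="placeholder default")
    parser.add_argument("--batch-size", type=int, default=4, help="placeholder default")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse `argv`, or the process arguments when `argv` is None."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["--savedir", "logs", "--datadir", "path/root_directory/",
                       "--num-epochs", "2", "--batch-size", "8"])
    # argparse turns dashes in flag names into underscores on the namespace.
    print(args.savedir, args.model, args.datadir, args.num_epochs, args.batch_size)
```

Running python main.py -h against the real script remains the authoritative way to see which flags it accepts.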

Resuming-training-if-decoder-part-broken

For help on the optional arguments, run: python main.py -h

python main.py --savedir logs --name lednet --datadir path/root_directory/  --num-epochs xx --batch-size xx --decoder --state "../save/logs/model_best_enc.pth.tar"...

Testing

The models produced during training can be found here. They may not be the best ones; you can train a model from scratch yourself, or fine-tune the decoder with an encoder pre-trained on ImageNet.

For more details, refer to ./test/README.md.

Results

Please refer to our article for more details.

Method	Dataset	Fine	Coarse	IoU_cla	IoU_cat	FPS
LEDNet	cityscapes	yes	yes	70.6%	87.1%	70+
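The IoU columns above are presumably the class-level and category-level intersection-over-union scores used by the Cityscapes benchmark. As a reminder of how such a score is computed, here is a minimal sketch that derives per-class and mean IoU from a confusion matrix. It is independent of the repository's iouEval.py, and the function names are chosen for illustration.

```python
from typing import List, Sequence


def per_class_iou(confusion: Sequence[Sequence[int]]) -> List[float]:
    """IoU per class from a square confusion matrix.

    Rows are ground-truth classes, columns are predicted classes.
    IoU = TP / (TP + FP + FN); classes with an empty union yield NaN.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp   # pixels wrongly labeled c
        union = tp + fp + fn
        ious.append(tp / union if union else float("nan"))
    return ious


def mean_iou(confusion: Sequence[Sequence[int]]) -> float:
    """Average the per-class IoU, skipping classes whose IoU is NaN."""
    valid = [v for v in per_class_iou(confusion) if v == v]  # NaN != NaN
    return sum(valid) / len(valid) if valid else float("nan")


if __name__ == "__main__":
    matrix = [[3, 1],
              [2, 4]]
    print(per_class_iou(matrix))  # class 0: 3/6, class 1: 4/7
    print(mean_iou(matrix))
```

Skipping NaN entries avoids penalizing classes that appear in neither the labels nor the predictions.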

qualitative segmentation result examples:

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@article{wang2019lednet,
  title={LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation},
  author={Wang, Yu and Zhou, Quan and Liu, Jia and Xiong, Jian and Gao, Guangwei and Wu, Xiaofu and Latecki, Longin Jan},
  journal={arXiv preprint arXiv:1905.02423},
  year={2019}
}

Tips

Because of limited GPU resources, the results of this project still need further improvement.
It is recommended to pre-train the encoder on ImageNet and then fine-tune the decoder; this gives better results.
