A PyTorch-based real-time segmentation model for autonomous driving

Last update: Dec 22, 2022

Overview

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation

This project contains the PyTorch implementation of the proposed CFPNet (paper: arXiv:2103.12212).

Real-time semantic segmentation plays an increasingly important role in computer vision because of the growing demand from mobile devices and autonomous driving. It is therefore important to achieve a good trade-off among performance, model size and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features. Experiments on the Cityscapes and CamVid datasets show that the proposed CFPNet achieves an effective combination of those factors. On the Cityscapes test set, CFPNet achieves 70.1% class-wise mIoU with only 0.55 million parameters and 2.5 MB of memory. The inference speed can reach 30 FPS on a single RTX 2080Ti GPU (GPU usage 60%) with a 1024×2048-pixel image.

Installation

Environment: Python 3.6; PyTorch 1.0; CUDA 9.0; cuDNN v7
Install the required packages:

pip install opencv-python pillow numpy matplotlib

Clone this repository

git clone https://github.com/AngeLouCN/CFPNet

One GPU with 11 GB of memory is needed.
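
As a quick sanity check after installation, the following minimal sketch reports which of the required packages can be found by Python. The import names (`cv2` for opencv-python, `PIL` for pillow) are the standard ones; the helper name `check_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Import names for the packages listed above (opencv-python -> cv2, pillow -> PIL).
REQUIRED = ["torch", "cv2", "PIL", "numpy", "matplotlib"]


def check_packages(names=REQUIRED):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

This only checks that the modules are discoverable; it does not verify versions or that CUDA is usable.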

Dataset

Download the two datasets, CamVid and Cityscapes, and place the files in the dataset folder with the following structure.

├── camvid
|    ├── train
|    ├── test
|    ├── val 
|    ├── trainannot
|    ├── testannot
|    ├── valannot
|    ├── camvid_trainval_list.txt
|    ├── camvid_train_list.txt
|    ├── camvid_test_list.txt
|    └── camvid_val_list.txt
├── cityscapes
|    ├── gtCoarse
|    ├── gtFine
|    ├── leftImg8bit
|    ├── cityscapes_trainval_list.txt
|    ├── cityscapes_train_list.txt
|    ├── cityscapes_test_list.txt
|    └── cityscapes_val_list.txt
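
To confirm that the layout above is in place before training, a small check along these lines may help. It is a sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the root folder name `dataset` is an assumption about where the files were placed.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
EXPECTED = {
    "camvid": [
        "train", "test", "val",
        "trainannot", "testannot", "valannot",
        "camvid_trainval_list.txt", "camvid_train_list.txt",
        "camvid_test_list.txt", "camvid_val_list.txt",
    ],
    "cityscapes": [
        "gtCoarse", "gtFine", "leftImg8bit",
        "cityscapes_trainval_list.txt", "cityscapes_train_list.txt",
        "cityscapes_test_list.txt", "cityscapes_val_list.txt",
    ],
}


def missing_entries(root, datasets=("camvid", "cityscapes")):
    """Return the expected paths (relative to root) that do not exist."""
    root = Path(root)
    missing = []
    for name in datasets:
        for entry in EXPECTED[name]:
            if not (root / name / entry).exists():
                missing.append(f"{name}/{entry}")
    return missing


if __name__ == "__main__":
    # "dataset" is an assumed folder name relative to the repository root.
    for path in missing_entries("dataset"):
        print("missing:", path)
```

Pass `datasets=("camvid",)` to check only one of the two datasets.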

Training

Run python train.py -h to see the details of the optional arguments. In train.py, you can set the dataset, train type, epochs, batch size, etc.
Training on the Cityscapes train set:

python train.py --dataset cityscapes

Training on the CamVid train and val sets:

python train.py --dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16

During training, the mean IoU on the train and validation sets and the training loss are recorded every 50 epochs and plotted, so you can check whether training is proceeding normally.

Plots: Val mIoU vs Epochs; Train loss vs Epochs
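
For reference, mean IoU is commonly computed from a confusion matrix: for each class, IoU = TP / (TP + FP + FN), and mIoU is the average over classes. The sketch below illustrates that standard definition in plain Python. It is not taken from this repository's training code, and the function name and the toy 2-class matrix are illustrative.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: ground truth, cols: prediction).

    Classes with no ground-truth and no predicted pixels are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class-c pixels predicted as another class
        fp = sum(confusion[r][c] for r in range(n)) - tp  # other-class pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2-class example: IoU(class 0) = 3/5, IoU(class 1) = 4/6.
conf = [[3, 1],
        [1, 4]]
print(round(mean_iou(conf), 4))  # prints 0.6333
```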

Testing

After training, the checkpoint is saved in the checkpoint folder. You can use test.py to predict the results.

python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Evaluation

For datasets that do not provide labels for the test set (e.g. Cityscapes), you can use predict.py to save all the output images, then submit them to the official webpage for evaluation.

python predict.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Inference Speed

Run eval_fps.py to test the model's inference speed, passing the input image size, e.g. 1024,2048.

python eval_fps.py 1024,2048
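
The general idea behind such a measurement can be sketched as follows: run a few warm-up iterations, then time a fixed number of calls and divide. This is a generic illustration and not the contents of eval_fps.py; the function name `measure_fps` and its parameters are hypothetical. On a GPU, the `sync` argument would typically be `torch.cuda.synchronize`, so that queued kernels finish before the clock is read.

```python
import time


def measure_fps(run_once, warmup=10, iters=100, sync=None):
    """Average calls per second of run_once, excluding warm-up iterations.

    sync, if given, is called before each clock reading so that
    asynchronous work (e.g. GPU kernels) is finished before timing.
    """
    for _ in range(warmup):
        run_once()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return iters / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in workload; replace with a model forward pass on a 1024x2048 input.
    fps = measure_fps(lambda: sum(range(10_000)))
    print(f"{fps:.1f} FPS")
```

Warm-up iterations are excluded because the first few calls often include one-time costs such as memory allocation.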

Results

Results for CFPNet-V1, CFPNet-V2 and CFPNet-V3:

Dataset	Model	mIoU
Cityscapes	CFPNet-V1	60.4%
Cityscapes	CFPNet-V2	66.5%
Cityscapes	CFPNet-V3	70.1%

Sample results (from top to bottom: original, CFPNet-V1, CFPNet-V2 and CFPNet-V3):

Plots: Category_acc vs size; Class_acc vs size; Class_acc vs parameter; Class_acc vs speed

Comparison

Results of Cityscapes

Results of CamVid

Citation

If you find our work helpful, please consider citing:

@article{lou2021cfpnet,
  title={CFPNet: Channel-wise Feature Pyramid for Real-Time Semantic Segmentation},
  author={Lou, Ange and Loew, Murray},
  journal={arXiv preprint arXiv:2103.12212},
  year={2021}
}
