A higher-performance PyTorch implementation of DeepLab V3 Plus (DeepLab v3+)

Last update: Nov 22, 2022

Introduction

This repo is a PyTorch (re-)implementation of "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation" (DeepLab v3+), trained and evaluated on the PASCAL VOC dataset. It reaches an mIoU of 79.19%, which is higher than the 78.85% reported in the paper.

Requirements

Python (3.6) and PyTorch (0.4.1) are required before running the scripts. To install the required Python packages (except PyTorch), run

pip install -r requirements.txt

Datasets

To train and validate the network, this repo uses the augmented PASCAL VOC 2012 dataset, which contains 10582 images for training and 1449 images for validation. To use the dataset, download the PASCAL VOC training/validation data (2GB tar file) and the SegmentationClassAug annotations from Dropbox or Baidu Netdisk.
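After unpacking the data, it can be worth confirming that the split lists match the image counts quoted above. The following is only a minimal sketch: it assumes each split is described by a plain-text list with one image ID per line, and the file paths in the usage comment are hypothetical, since the layout this repo expects is not stated here.

```python
def read_ids(list_file):
    """Return the image IDs in list_file, one per non-empty line.

    If a line holds several whitespace-separated fields, only the
    first one is treated as the ID.
    """
    with open(list_file) as f:
        return [line.split()[0] for line in f if line.strip()]


def check_split(list_file, expected):
    """Return (matches, found): whether the list holds `expected` IDs."""
    found = len(read_ids(list_file))
    return found == expected, found


# Usage (hypothetical paths, adjust to where the lists actually live):
#   check_split("VOC2012/ImageSets/Segmentation/train_aug.txt", 10582)
#   check_split("VOC2012/ImageSets/Segmentation/val.txt", 1449)
```

A mismatch usually means that the augmented annotations were not merged into the dataset folder or that a different split list is being read.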

Training

Before training, you should clone this repo:

git clone git@github.com:hualin95/Deeplab-v3plus.git

You can begin training by running train.py.

#training
# git clone creates Deeplab-v3plus/ (a ZIP download would unpack to Deeplab-v3plus-master/ instead)
cd Deeplab-v3plus/tools/
python train.py

You should reach a pixel accuracy (PA) of 94.77%, a mean pixel accuracy (MPA) of 88.48%, a mean IoU (MIoU) of 79.19%, and a frequency-weighted IoU (FWIoU) of 90.53% on the validation set.
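For readers unfamiliar with these four scores, the sketch below shows how they are commonly derived from a pixel-level confusion matrix. It is an illustration in plain Python, not this repo's evaluation code: the function names are hypothetical, and treating label 255 as "ignore" is an assumption based on the usual PASCAL VOC void label.

```python
def confusion_matrix(gt_labels, pred_labels, num_classes, ignore_index=255):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt_labels, pred_labels):
        if g == ignore_index:  # assumed void label, skipped in scoring
            continue
        cm[g][p] += 1
    return cm


def segmentation_scores(cm):
    """Compute PA, MPA, MIoU and FWIoU from a non-empty confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[c][c] for c in range(n)]
    gt_total = [sum(cm[c]) for c in range(n)]
    pred_total = [sum(cm[r][c] for r in range(n)) for c in range(n)]

    acc, iou, fwiou = [], [], 0.0
    for c in range(n):
        union = gt_total[c] + pred_total[c] - tp[c]
        if gt_total[c] > 0:  # per-class accuracy needs ground-truth pixels
            acc.append(tp[c] / gt_total[c])
        if union > 0:  # classes absent from both maps are left out
            class_iou = tp[c] / union
            iou.append(class_iou)
            fwiou += gt_total[c] / total * class_iou
    return {
        "PA": sum(tp) / total,
        "MPA": sum(acc) / len(acc),
        "MIoU": sum(iou) / len(iou),
        "FWIoU": fwiou,
    }


# Toy example: five pixels, two classes, the last pixel is void.
cm = confusion_matrix([0, 0, 1, 1, 255], [0, 1, 1, 1, 0], num_classes=2)
print(segmentation_scores(cm))
```

In practice the confusion matrix is accumulated over every validation image before the scores are computed, so that large and small images contribute in proportion to their pixel counts.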

#Monitoring
tensorboard --logdir=runs/ --port=80

Performance

VOC2012: after 30k iterations with a batch size of 16.

Backbone	train OS	eval OS	MS	mIoU paper	mIoU repo
Resnet101	16	16	No	78.85%	79.19%
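To put the training budget in perspective, the iteration count can be converted into passes over the training set. This is just arithmetic on the figures quoted in this README (30k iterations, batch size 16, 10582 training images); the helper name is hypothetical.

```python
def epochs_seen(iterations, batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return iterations * batch_size / num_train_images


# 30k iterations at batch size 16 on the 10582-image augmented train split
print(round(epochs_seen(30_000, 16, 10_582), 1))  # 45.4
```

So the reported result corresponds to roughly 45 epochs over the augmented training split.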

TODO

Resnet as Network Backbone
Implement depthwise separable convolutions
Multi-GPU support
Model pretrained on MS-COCO
Xception as Network Backbone

Owner

linhua
