TensorFlow-based implementation of "Pyramid Scene Parsing Network".

Last update: Dec 20, 2022

Overview

PSPNet_tensorflow

Important

The code works well for inference. The training code, however, is provided for reference only and is suitable at most for fine-tuning. To train from scratch, you first need to implement a synchronized batch normalization (Sync BN) layer so that you can train with a large batch size, as described in the paper. Another repo appears to have reproduced this, and it is worth a look.
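To illustrate what a synchronized BN layer has to compute, here is a minimal pure-Python sketch of merging per-device batch statistics into global ones. It is not code from this repo; the function names `local_stats` and `combine_batch_stats` are hypothetical, and a real layer would do this with tensors and a cross-GPU reduction.

```python
def local_stats(values):
    """Return (count, mean, biased variance) for the activations on one device."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n
    return n, mean, var


def combine_batch_stats(per_device):
    """Merge (count, mean, variance) tuples from several devices.

    Hypothetical helper: the result equals the statistics of the full batch,
    which is what synchronized BN normalizes with.
    """
    total = sum(n for n, _, _ in per_device)
    mean = sum(n * m for n, m, _ in per_device) / total
    # E[x^2] on each device is var + mean^2; average it over all samples,
    # then subtract the squared global mean to get the global variance.
    second_moment = sum(n * (v + m * m) for n, m, v in per_device) / total
    return total, mean, second_moment - mean * mean


# Two devices holding different slices of the same batch.
gpu0 = [1.0, 2.0, 3.0]
gpu1 = [4.0, 5.0, 6.0, 7.0]
synced = combine_batch_stats([local_stats(gpu0), local_stats(gpu1)])
reference = local_stats(gpu0 + gpu1)
```

Here `synced` matches `reference`, whereas each device's own mean and variance differ from the full-batch values. That mismatch is why plain per-GPU BN behaves like small-batch training.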

Introduction

This is a TensorFlow implementation of PSPNet for semantic segmentation on the Cityscapes dataset. The weights were first converted from the original code using the caffe-tensorflow framework.

Update:

News (2018.11.08 updated):

Now you can try PSPNet on your own image online using ModelDepot live demo!

2018/01/24:

Support evaluation code for ade20k dataset

2018/01/19:

Support the inference phase for the ade20k dataset using the pspnet50 model (weights converted from the original author's model).
Use tf.matmul to decode labels, which speeds up inference.
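The idea behind decoding labels with a matrix product can be sketched as follows: one-hot encode each pixel's class index and multiply by a palette matrix, so every pixel picks out its color row. This is a pure-Python illustration, not the repo's code. The helper names and the three-color palette are assumptions, and in TensorFlow the same steps would typically use `tf.one_hot` followed by `tf.matmul`.

```python
def one_hot(label, num_classes):
    """Return a one-hot row vector for a class index."""
    return [1 if c == label else 0 for c in range(num_classes)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def decode_labels(label_map, palette):
    """Map a 2D grid of class indices to RGB colors via one-hot x palette."""
    num_classes = len(palette)
    width = len(label_map[0])
    flat = [label for row in label_map for label in row]
    onehot = [one_hot(label, num_classes) for label in flat]  # (pixels, classes)
    colors = matmul(onehot, palette)                          # (pixels, 3)
    # Restore the original 2D layout.
    return [colors[i:i + width] for i in range(0, len(colors), width)]


# Illustrative palette (assumption): one RGB row per class.
palette = [[128, 64, 128], [244, 35, 232], [70, 70, 70]]
label_map = [[0, 1], [2, 0]]
decoded = decode_labels(label_map, palette)
```

The whole image is decoded in one product, with no per-pixel Python loop over classes once the work is expressed as tensor operations.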

2017/11/06:

Support different input sizes: an input image smaller than (720, 720) is padded to that size, and the result is cropped back to the original size at the end.
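A minimal sketch of this pad-then-crop flow, using nested lists in place of image tensors. Padding on the bottom and right with zeros is an assumption made for illustration, and the function names are hypothetical.

```python
def pad_to_min_size(image, min_h=720, min_w=720, fill=0):
    """Pad a 2D grid on the bottom/right so it is at least (min_h, min_w).

    Returns the padded grid and the original (height, width) for later cropping.
    """
    h, w = len(image), len(image[0])
    out_h, out_w = max(h, min_h), max(w, min_w)
    padded = [row + [fill] * (out_w - w) for row in image]
    padded += [[fill] * out_w for _ in range(out_h - h)]
    return padded, (h, w)


def crop_to_original(result, original_size):
    """Cut the network output back to the original image size."""
    h, w = original_size
    return [row[:w] for row in result[:h]]


# Small demo with a 2x3 "image" and a 4x4 minimum size.
image = [[1, 2, 3], [4, 5, 6]]
padded, size = pad_to_min_size(image, min_h=4, min_w=4)
restored = crop_to_original(padded, size)
```

In the real pipeline the network would run on `padded`, and `crop_to_original` would be applied to its prediction rather than to the padded input.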

2017/10/27:

Change the BN layer from tf.nn.batch_normalization to tf.layers.batch_normalization in order to support the training phase. The initial model in Google Drive has also been updated.

Install

Download the checkpoint to restore from Google Drive and put it in the model directory. Note: select the checkpoint that corresponds to the dataset.

Inference

To get results on your own images, use the following command:

python inference.py --img-path=./input/test.png --dataset cityscapes

Inference time: ~0.6s

Options:

--dataset cityscapes or ade20k
--flipped-eval 
--checkpoints /PATH/TO/CHECKPOINT_DIR
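Flipped evaluation is commonly done by running the model on both the image and its horizontal mirror, flipping the second prediction back, and combining the two. The sketch below averages them. It assumes a hypothetical `predict` callable that returns one score per pixel, and the details of how this repo combines the two outputs may differ.

```python
def flip_lr(grid):
    """Mirror a 2D grid horizontally."""
    return [row[::-1] for row in grid]


def predict_with_flip(predict, image):
    """Average the prediction on the image and on its mirrored copy."""
    direct = predict(image)
    # Predict on the mirrored image, then mirror the result back so the
    # two score maps are aligned pixel for pixel.
    mirrored = flip_lr(predict(flip_lr(image)))
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(direct, mirrored)]


def toy_predict(image):
    """Stand-in model whose output depends on column position."""
    return [[value * (col + 1) for col, value in enumerate(row)] for row in image]


scores = predict_with_flip(toy_predict, [[1, 2, 3]])
```

Because the toy model is not symmetric, the direct and mirrored predictions differ, and the average smooths out that position bias.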

Evaluation

Cityscapes

Evaluated with a single-scale model on the Cityscapes validation dataset.

Method	Accuracy
Without flip	76.99%
Flip	77.23%

ade20k

Method	Accuracy
Without flip	40.00%
Flip	40.67%
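The tables above label the metric "Accuracy". Segmentation benchmarks such as Cityscapes and ADE20K are usually scored with mean intersection-over-union (mIoU), so, assuming that is the metric meant here, the following pure-Python sketch shows how it is computed from flattened predictions and ground truth. The function name and the `ignore_label` parameter are illustrative.

```python
def mean_iou(pred, truth, num_classes, ignore_label=None):
    """Mean IoU over classes that appear in the prediction or ground truth."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue  # skip pixels without a valid ground-truth class
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A wrong pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1]
truth = [0, 1, 1, 1]
score = mean_iou(pred, truth, num_classes=2)
```

In this example class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their mean, 7/12.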

To reproduce the evaluation results, follow these steps:

Download the Cityscapes dataset or the ADE20K dataset first.
Change data_dir to your dataset path in evaluate.py:

'data_dir': '/Path/to/dataset'

Run the following command:

python evaluate.py --dataset cityscapes

List of Args:

--dataset - ade20k or cityscapes
--flipped-eval  - Using flipped evaluation method
--measure-time  - Calculate inference time
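For readers who want to time inference themselves, a generic measurement loop looks roughly like this. It is a sketch and not the implementation behind --measure-time: `run_once` is a hypothetical callable that performs one forward pass, and the warm-up runs exclude one-off startup costs such as graph construction.

```python
import time


def average_inference_time(run_once, warmup=1, repeats=5):
    """Return the mean wall-clock seconds per call of run_once."""
    for _ in range(warmup):
        run_once()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    return (time.perf_counter() - start) / repeats


# Demo with a trivial stand-in for a forward pass.
calls = []
seconds = average_inference_time(lambda: calls.append(1), warmup=2, repeats=3)
```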

Image Result

cityscapes

Input image	Output image

ade20k

Input image	Output image

real world

Input image	Output image

Citation

@article{zhao2017pspnet,
  author = {Hengshuang Zhao and
            Jianping Shi and
            Xiaojuan Qi and
            Xiaogang Wang and
            Jiaya Jia},
  title = {Pyramid Scene Parsing Network},
  booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2017}
}

Scene Parsing through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. Computer Vision and Pattern Recognition (CVPR), 2017. (http://people.csail.mit.edu/bzhou/publication/scene-parse-camera-ready.pdf)

@inproceedings{zhou2017scene,
    title={Scene Parsing through ADE20K Dataset},
    author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    year={2017}
}

Semantic Understanding of Scenes through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. arXiv:1608.05442. (https://arxiv.org/pdf/1608.05442.pdf)

@article{zhou2016semantic,
  title={Semantic understanding of scenes through the ade20k dataset},
  author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
  journal={arXiv preprint arXiv:1608.05442},
  year={2016}
}

Owner

HsuanKung Yang
