Pixel-wise segmentation on the VOC2012 dataset using PyTorch.

Last update: Dec 30, 2022

Overview

PiWiSe

Pixel-wise segmentation on the VOC2012 dataset using PyTorch.

For a more complete implementation of segmentation networks, check out semseg.

Note:

FCN differs from the original implementation; see this issue.
SegNet does not match the performance reported in the original paper; see here.
PSPNet is missing atrous convolution (the conv layers of ResNet101 should be amended to preserve image size).

Keeping this in mind, feel free to open a PR. Thank you!

Setup

See dataset examples here.

Download

Download the image archive, extract it, and run:

mkdir data
mv VOCdevkit/VOC2012/JPEGImages data/images
mv VOCdevkit/VOC2012/SegmentationClass data/classes
rm -rf VOCdevkit
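In VOC2012 only a subset of the images in JPEGImages has a segmentation mask, so after the move above data/images holds more files than data/classes. The following is a minimal sketch for listing the usable image/mask pairs. It assumes the data/images and data/classes layout created above, with .jpg images and .png masks; the function name paired_samples is hypothetical and not part of this repository.

```python
from pathlib import Path


def paired_samples(root):
    """Return the sorted file stems that have both an image and a class mask.

    Assumes <root>/images/*.jpg and <root>/classes/*.png, matching the
    directory layout from the download step. Hypothetical helper.
    """
    root = Path(root)
    images = {p.stem for p in (root / "images").glob("*.jpg")}
    masks = {p.stem for p in (root / "classes").glob("*.png")}
    # Only stems present in both directories can be used for training.
    return sorted(images & masks)
```

Calling paired_samples("data") from the repository root gives a quick sanity check of the setup before training.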

Install

We recommend using pyenv:

pyenv virtualenv 3.6.0 piwise
pyenv activate piwise

Then install the requirements with pip install -r requirements.txt.

Usage

For the latest documentation, run:

python main.py --help

Supported model parameters are fcn8, fcn16, fcn32, unet, segnet1, segnet2, pspnet.

Training

If you want visualization, open an extra terminal tab and start the visdom server:

python -m visdom.server -port 5000

Train the SegNet model for 30 epochs with CUDA support, visualization, and a checkpoint every 100 steps:

python main.py --cuda --model segnet2 train --datadir data \
    --num-epochs 30 --num-workers 4 --batch-size 4 \
    --steps-plot 50 --steps-save 100

Evaluation

To run semantic segmentation on foo.jpg with the trained model:

python main.py --model segnet2 --state segnet2-30-0 eval foo.jpg foo.png

The segmented class image can now be found at foo.png.
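Conceptually, pixel-wise segmentation assigns each pixel the class with the highest score. The sketch below illustrates that step in plain Python on nested lists so it runs without PyTorch. It is an illustration only, not the code in main.py, and the name argmax_labels is hypothetical.

```python
def argmax_labels(scores):
    """Turn per-class score maps into a label map.

    scores[c][y][x] is the score of class c at pixel (y, x).
    Returns labels[y][x], the index of the highest-scoring class.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [
            max(range(num_classes), key=lambda c: scores[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]
```

With tensors, the same operation is an argmax over the class dimension.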

Results

These are some results from SegNet after 40 epochs. Set

loss_weights[0] = 1 / 1

to deal gracefully with the class imbalance problem.
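To see why a per-class loss weight matters, here is a minimal sketch of a weighted negative log-likelihood in plain Python. It assumes class 0 is the dominant background class and uses illustrative weights; weighted_nll is a hypothetical name, and this is not the loss code used by the repository. Dividing by the summed weights follows the same convention as PyTorch's NLLLoss with mean reduction.

```python
import math


def weighted_nll(log_probs, targets, weights):
    """Weighted mean negative log-likelihood over pixels.

    log_probs[i][c]: log-probability of class c at pixel i.
    targets[i]: true class index of pixel i.
    weights[c]: loss weight of class c (illustrative values).
    """
    total = sum(-weights[t] * lp[t] for lp, t in zip(log_probs, targets))
    norm = sum(weights[t] for t in targets)
    return total / norm


# Four pixels; the model always predicts background (class 0) with p = 0.9.
log_probs = [[math.log(0.9), math.log(0.1)]] * 4
targets = [0, 0, 0, 1]  # three background pixels, one object pixel

uniform = weighted_nll(log_probs, targets, [1.0, 1.0])
downweighted = weighted_nll(log_probs, targets, [0.1, 1.0])
```

Lowering the background weight makes the single misclassified object pixel dominate the loss, so downweighted is larger than uniform.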

Input	Output	Ground Truth


Owner

Bodo Kaiser
