FCN-semantic-segmentation

Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], removes the fully connected layers, and adds transposed convolution layers with skip connections from lower layers. Initialises upsampling convolutions with bilinear interpolation filters and zeros the final (classification) layer.
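As a rough illustration of the bilinear initialisation mentioned above, the sketch below builds a bilinear interpolation filter with NumPy. The function names `bilinear_kernel` and `bilinear_weights` are hypothetical, and placing the filter only on the channel diagonal (so each channel is upsampled independently) is an assumption rather than something this repo is confirmed to do.

```python
import numpy as np


def bilinear_kernel(kernel_size: int) -> np.ndarray:
    """Return a (kernel_size, kernel_size) bilinear interpolation filter."""
    factor = (kernel_size + 1) // 2
    # Centre of the filter in pixel coordinates; differs for odd and even sizes.
    centre = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
    coords = np.arange(kernel_size)
    # 1D triangular profile; the 2D filter is its outer product with itself.
    profile = 1 - np.abs(coords - centre) / factor
    return np.outer(profile, profile)


def bilinear_weights(channels: int, kernel_size: int) -> np.ndarray:
    """Weights of shape (channels, channels, k, k) for a transposed convolution.

    Assumption: only the diagonal (channel i -> channel i) carries the filter,
    so every channel is upsampled on its own; all other entries stay zero.
    """
    weights = np.zeros((channels, channels, kernel_size, kernel_size))
    diag = np.arange(channels)
    weights[diag, diag] = bilinear_kernel(kernel_size)
    return weights


if __name__ == "__main__":
    # A 4x4 filter is the usual choice for 2x upsampling (stride 2).
    print(bilinear_kernel(4))
```

With a kernel size of 4, the 1D profile is [0.25, 0.75, 0.75, 0.25], which is the standard filter for 2x bilinear upsampling. Such an array could then be copied into the weights of a transposed convolution layer before training.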

Uses an independent cross-entropy loss per class. Trained with SGD with momentum, plus weight decay only on convolutional weights. Calculates and plots class-wise and mean intersection-over-union. Checkpoints the network every epoch.
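The loss and the metric described above can be sketched in NumPy as follows. This is a minimal sketch, not the repo's own code: the function names, the (num_classes, H, W) array layout, and the use of 255 as an ignored label value are all assumptions made for illustration.

```python
import numpy as np


def independent_cross_entropy(logits: np.ndarray, one_hot: np.ndarray) -> float:
    """Mean binary cross-entropy, treating every class as its own yes/no problem.

    logits, one_hot: arrays of shape (num_classes, H, W).
    """
    # Numerically stable form of -[t*log(sigmoid(x)) + (1-t)*log(1-sigmoid(x))].
    loss = (np.maximum(logits, 0) - logits * one_hot
            + np.log1p(np.exp(-np.abs(logits))))
    return float(loss.mean())


def class_iou(pred: np.ndarray, target: np.ndarray, num_classes: int,
              ignore_index: int = 255):
    """Return (per-class IoU array, mean IoU) for integer label maps.

    Pixels whose target equals ignore_index (an assumed convention) are skipped.
    """
    valid = target != ignore_index
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    idx = target[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    conf = np.bincount(idx, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    true_pos = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - true_pos
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = true_pos / union  # NaN for classes absent from both maps
    return iou, float(np.nanmean(iou))


if __name__ == "__main__":
    pred = np.array([[0, 1], [1, 1]])
    target = np.array([[0, 1], [0, 255]])
    print(class_iou(pred, target, num_classes=2))
```

Classes that appear in neither the prediction nor the ground truth yield NaN and are left out of the mean, which is one common convention; other implementations count them as zero.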

Note: This code does not achieve great results: it reaches ~40 IoU fairly quickly but then plateaus. Contributions to fix this are welcome! The goal of this repo is to provide strong, simple and efficient baselines for semantic segmentation using the FCN method, so it need not be restricted to ResNet-34.

Requirements

Instructions

Install all of the required software. Training is only practical with CUDA. The crop size and batch size can be tailored to your GPU memory (the defaults use ~10GB of GPU RAM).
Register on the Cityscapes website to access the dataset.
Download and extract the training/validation RGB data (leftImg8bit_trainvaltest) and ground truth data (gtFine_trainvaltest).
Run python main.py <options>.

First a Dataset object is set up, returning the RGB inputs, one-hot targets (for independent classification) and label targets. During training, the images are randomly cropped and horizontally flipped. Testing calculates IoU scores and produces a subset of coloured predictions that match the coloured ground truth.
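The data preparation described above might look roughly like the NumPy sketch below. The names `random_crop_flip` and `one_hot`, the square crop, the 0.5 flip probability, and the array layouts are assumptions for illustration and are not taken from the repo's Dataset class.

```python
import numpy as np


def random_crop_flip(image: np.ndarray, label: np.ndarray, crop_size: int,
                     rng: np.random.Generator):
    """Apply the same random crop and horizontal flip to an image and its labels.

    image: (H, W, 3) array; label: (H, W) integer array.
    """
    h, w = label.shape
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    image = image[top:top + crop_size, left:left + crop_size]
    label = label[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:  # assumed flip probability
        # Reverse the width axis of both arrays so they stay aligned.
        image = image[:, ::-1]
        label = label[:, ::-1]
    return image, label


def one_hot(label: np.ndarray, num_classes: int) -> np.ndarray:
    """Convert (H, W) integer labels to (num_classes, H, W) float targets.

    Labels outside [0, num_classes) produce an all-zero column, so such
    pixels contribute only negative targets to every per-class classifier.
    """
    classes = np.arange(num_classes).reshape(-1, 1, 1)
    return (label[None] == classes).astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    label = rng.integers(0, 4, size=(8, 16))
    image = np.stack([label] * 3, axis=-1)
    img, lab = random_crop_flip(image, label, crop_size=4, rng=rng)
    print(img.shape, lab.shape, one_hot(lab, 4).shape)
```

The key point is that the crop offsets and the flip decision are drawn once and applied to both arrays, so each pixel keeps its correct label.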

References

[1] Fully convolutional networks for semantic segmentation
[2] Deep Residual Learning for Image Recognition

Owner

Kai Arulkumaran
