Official PyTorch implementation of Segmenter: Transformer for Semantic Segmentation

Overview

Segmenter: Transformer for Semantic Segmentation

Figure 1 from paper

Segmenter: Transformer for Semantic Segmentation by Robin Strudel*, Ricardo Garcia*, Ivan Laptev and Cordelia Schmid.

*Equal Contribution

Installation

Define os environment variables pointing to your checkpoint and dataset directory, put in your .bashrc:

export DATASET=/path/to/dataset/dir

Install PyTorch 1.9 then pip install . at the root of this repository.

To download ADE20K, use the following command:

python -m segm.scripts.prepare_ade20k $DATASET

Model Zoo

We release models with a Vision Transformer backbone initialized from the improved ViT models.

ADE20K

Segmenter models with ViT backbone:

Name mIoU (SS/MS) # params Resolution FPS Download
Seg-T-Mask/16 38.1 / 38.8 7M 512x512 52.4 model config log
Seg-S-Mask/16 45.3 / 46.9 27M 512x512 34.8 model config log
Seg-B-Mask/16 48.5 / 50.0 106M 512x512 24.1 model config log
Seg-L-Mask/16 51.3 / 53.2 334M 512x512 10.6 model config log
Seg-L-Mask/16 51.8 / 53.6 334M 640x640 - model config log

Segmenter models with DeiT backbone:

Name mIoU (SS/MS) # params Resolution FPS Download
Seg-B/16 47.1 / 48.1 87M 512x512 27.3 model config log
Seg-B-Mask/16 48.7 / 50.1 106M 512x512 24.1 model config log

Pascal Context

Name mIoU (SS/MS) # params Resolution FPS Download
Seg-L-Mask/16 58.1 / 59.0 334M 480x480 - model config log

Inference

Download one checkpoint with its configuration in a common folder, for example seg_tiny_mask.

You can generate segmentation maps from your own data with:

python -m segm.inference --model-path seg_tiny_mask/checkpoint.pth -i images/ -o segmaps/ 

To evaluate on ADE20K, run the command:

# single-scale evaluation:
python -m segm.eval.miou seg_tiny_mask/checkpoint.pth ade20k --singlescale
# multi-scale evaluation:
python -m segm.eval.miou seg_tiny_mask/checkpoint.pth ade20k --multiscale

Train

Train Seg-T-Mask/16 on ADE20K on a single GPU:

python -m segm.train --log-dir seg_tiny_mask --dataset ade20k \
  --backbone vit_tiny_patch16_384 --decoder mask_transformer

To train Seg-B-Mask/16, simply set vit_base_patch16_384 as backbone and launch the above command using a minimum of 4 V100 GPUs (~12 minutes per epoch) and up to 8 V100 GPUs (~7 minutes per epoch). The code uses SLURM environment variables.

Logs

To plot the logs of your experiments, you can use

python -m segm.utils.logs logs.yml

with logs.yml located in utils/ with the path to your experiments logs:

root: /path/to/checkpoints/
logs:
  seg-t: seg_tiny_mask/log.txt
  seg-b: seg_base_mask/log.txt

Video Segmentation

Zero shot video segmentation on DAVIS video dataset with Seg-B-Mask/16 model trained on ADE20K.

BibTex

@article{strudel2021,
  title={Segmenter: Transformer for Semantic Segmentation},
  author={Strudel, Robin and Garcia, Ricardo and Laptev, Ivan and Schmid, Cordelia},
  journal={arXiv preprint arXiv:2105.05633},
  year={2021}
}

Acknowledgements

The Vision Transformer code is based on timm library and the semantic segmentation training and evaluation pipeline is using mmsegmentation.

Owner
PhD student at Ecole Normale Supérieure and INRIA Paris
git《Commonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Commonsense Knowledge Base Completion with Structural and Semantic Context Code for the paper Commonsense Knowledge Base Completion with Structural an

AI2 96 Nov 05, 2022
OverFeat is a Convolutional Network-based image classifier and feature extractor.

OverFeat OverFeat is a Convolutional Network-based image classifier and feature extractor. OverFeat was trained on the ImageNet dataset and participat

593 Dec 08, 2022
A research toolkit for particle swarm optimization in Python

PySwarms is an extensible research toolkit for particle swarm optimization (PSO) in Python. It is intended for swarm intelligence researchers, practit

Lj Miranda 1k Dec 30, 2022
A simple library that implements CLIP guided loss in PyTorch.

pytorch_clip_guided_loss: Pytorch implementation of the CLIP guided loss for Text-To-Image, Image-To-Image, or Image-To-Text generation. A simple libr

Sergei Belousov 74 Dec 26, 2022
This repository contains code accompanying the paper "An End-to-End Chinese Text Normalization Model based on Rule-Guided Flat-Lattice Transformer"

FlatTN This repository contains code accompanying the paper "An End-to-End Chinese Text Normalization Model based on Rule-Guided Flat-Lattice Transfor

THUHCSI 74 Nov 28, 2022
Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

PyMAF This repository contains the code for the following paper: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop Hongwe

Hongwen Zhang 450 Dec 28, 2022
Implements pytorch code for the Accelerated SGD algorithm.

AccSGD This is the code associated with Accelerated SGD algorithm used in the paper On the insufficiency of existing momentum schemes for Stochastic O

205 Jan 02, 2023
Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation

Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation The reference code of Improving Factual Completeness and C

46 Dec 15, 2022
Definition of a business problem according to Wilson Lower Bound Score and Time Based Average Rating

Wilson Lower Bound Score, Time Based Rating Average In this study I tried to calculate the product rating and sorting reviews more accurately. I have

3 Sep 30, 2021
The Pytorch code of "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification", CVPR 2022 (Oral).

DeepBDC for few-shot learning        Introduction In this repo, we provide the implementation of the following paper: "Joint Distribution Matters: Dee

FeiLong 116 Dec 19, 2022
[NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”,

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling Introduction Contrastive learning approaches have achieved great success in

VITA 24 Dec 17, 2022
A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

A CNN implementation using only numpy. Supports multidimensional images, stride, etc. Speed up due to heavy use of slicing and mathematical simplification..

2 Nov 30, 2021
Lolviz - A simple Python data-structure visualization tool for lists of lists, lists, dictionaries; primarily for use in Jupyter notebooks / presentations

lolviz By Terence Parr. See Explained.ai for more stuff. A very nice looking javascript lolviz port with improvements by Adnan M.Sagar. A simple Pytho

Terence Parr 785 Dec 30, 2022
Our solution for SSN Invente 2021's Hackathon

Our solution for SSN Invente 2021's Hackathon. To help maitain godowns in a pristine and safe condition using raspberry pi.

1 Jan 12, 2022
A library for Deep Learning Implementations and utils

deeply A Deep Learning library Table of Contents Features Quick Start Usage License Features Python 2.7+ and Python 3.4+ compatible. Quick Start $ pip

Achilles Rasquinha 1 Dec 12, 2022
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation

Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation Woncheol Shin1, Gyubok Lee1, Jiyoung Lee1, Joonseok Lee2,3, Edward Ch

Woncheol Shin 7 Sep 26, 2022
Pydantic models for pywttr and aiopywttr.

Pydantic models for pywttr and aiopywttr.

Almaz 2 Dec 08, 2022
Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

COTREC Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'. Requirements: Python 3.7, Pytorch 1.6.0 Best Hype

Xin Xia 42 Dec 09, 2022
Fast and robust certifiable relative pose estimation

Fast and Robust Relative Pose Estimation for Calibrated Cameras This repository contains the code for the relative pose estimation between two central

42 Dec 06, 2022
Viperdb - A tiny log-structured key-value database written in pure Python

ViperDB 🐍 ViperDB is a lightweight embedded key-value store written in pure Pyt

17 Oct 17, 2022