Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)

Related tags

Deep Learningda-sac
Overview

Self-supervised Augmentation Consistency
for Adapting Semantic Segmentation

License PyTorch

This repository contains the official implementation of our paper:

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
Nikita Araslanov and Stefan Roth
To appear at CVPR 2021. [arXiv preprint]

drawing

We obtain state-of-the-art accuracy of adapting semantic
segmentation by enforcing consistency across photometric
and similarity transformations. We use neither style transfer
nor adversarial training.

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de


Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4, CUDA >=10.0. At least two Titan X GPUs (12Gb) or equivalent are required for VGG-16; ResNet-101 and VGG-16/FCN need four.

  1. create conda environment:
conda create --name da-sac
source activate da-sac
  1. install PyTorch >=1.4 (see PyTorch instructions). For example,
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  1. install the dependencies:
pip install -r requirements.txt
  1. download data (Cityscapes, GTA5, SYNTHIA) and create symlinks in the ./data folder, as follows:
./data/cityscapes -> <symlink to Cityscapes>
./data/cityscapes/gtFine2/
./data/cityscapes/leftImg8bit/

./data/game -> <symlink to GTA>
./data/game/labels_cs
./data/game/images

./data/synthia  -> <symlink to SYNTHIA>
./data/synthia/labels_cs
./data/synthia/RGB

Note that all ground-truth label IDs (Cityscapes, GTA5 and SYNTHIA) should be converted to Cityscapes train IDs. The label directories in the above example (gtFine2, labels_cs) therefore refer not to the original labels, but to these converted semantic maps.

Training

Training from ImageNet initialisation proceeds in three steps:

  1. Training the baseline (ABN)
  2. Generating the weights for importance sampling
  3. Training with augmentation consistency from the ABN baseline

1. Training the baseline (ABN)

Here the input are ImageNet models available from the official PyTorch repository. We provide the links to those models for convenience.

Backbone Link
ResNet-101 resnet101-5d3b4d8f.pth (171M)
VGG-16 vgg16_bn-6c64b313.pth (528M)

By default, these models should be placed in ./models/pretrained/ (though configurable with MODEL.INIT_MODEL).

To run the training

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn] base

where the first argument specifies the source domain, the second determines the network architecture. The third argument base instructs to run the training of the baseline.

If you would like to skip this step, you can use our pre-trained models:

Source domain: GTA5

Backbone Arch. IoU (val) Link MD5
ResNet-101 DeepLabv2 40.8 baseline_abn_e040.pth (336M) 9fe17[...]c11fc
VGG-16 DeepLabv2 37.1 baseline_abn_e115.pth (226M) d4ffc[...]ef755
VGG-16 FCN 36.7 baseline_abn_e040.pth (1.1G) aa2e9[...]bae53

Source domain: SYNTHIA

Backbone Arch. IoU (val) Link MD5
ResNet-101 DeepLabv2 36.3 baseline_abn_e090.pth (336M) b3431[...]d1a83
VGG-16 DeepLabv2 34.4 baseline_abn_e070.pth (226M) 3af24[...]5b24e
VGG-16 FCN 31.6 baseline_abn_e040.pth (1.1G) 5f457[...]e4b3a

Tip: You can download these files (as well as the final models below) with tools/download_baselines.sh:

cp tools/download_baselines.sh snapshots/cityscapes/baselines/
cd snapshots/cityscapes/baselines/
bash ./download_baselines.sh

2. Generating weights for importance sampling

To generate the weights you need to

  1. generate mask predictions with your baseline (see inference below);
  2. run tools/compute_image_weights.py that reads in those predictions and counts the predictions per each class.

If you would like to skip this step, you can use our weights we computed for the ABN baselines above:

Backbone Arch. Source: GTA5 Source: SYNTHIA
ResNet-101 DeepLabv2 cs_weights_resnet101_gta.data cs_weights_resnet101_synthia.data
VGG-16 DeepLabv2 cs_weights_vgg16_gta.data cs_weights_vgg16_synthia.data
VGG-16 FCN cs_weights_vgg16fcn_gta.data cs_weights_vgg16fcn_synthia.data

Tip: The bash script data/download_weights.sh will download all these importance sampling weights in the current directory.

3. Training with augmentation consistency

To train the model with augmentation consistency, we use the same shell script as in step 1, but without the argument base:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn]

Make sure to specify your baseline snapshot with RESUME bash variable set in the environment (export RESUME=...) or directly in the shell script (commented out by default).

We provide our final models for download.

Source domain: GTA5

Backbone Arch. IoU (val) IoU (test) Link MD5
ResNet-101 DeepLabv2 53.8 55.7 final_e136.pth (504M) 59c16[...]5a32f
VGG-16 DeepLabv2 49.8 51.0 final_e184.pth (339M) 0accb[...]d5881
VGG-16 FCN 49.9 50.4 final_e112.pth (1.6G) e69f8[...]f729b

Source domain: SYNTHIA

Backbone Arch. IoU (val) IoU (test) Link MD5
ResNet-101 DeepLabv2 52.6 52.7 final_e164.pth (504M) a7682[...]db742
VGG-16 DeepLabv2 49.1 48.3 final_e164.pth (339M) c5b31[...]5fdb7
VGG-16 FCN 46.8 45.8 final_e098.pth (1.6G) efb74[...]845cc

Inference and evaluation

Inference

To run single-scale inference from your snapshot, use infer_val.py. The bash script launch/infer_val.sh provides an easy way to run the inference by specifying a few variables:

# validation/training set
FILELIST=[val_cityscapes|train_cityscapes] 
# configuration used for training
CONFIG=configs/[deeplabv2_vgg16|deeplab_resnet101|fcn_vgg16]_train.yaml
# the following 3 variables effectively specify the path to the snapshot
EXP=...
RUN_ID=...
SNAPSHOT=...
# the snapshot path is defined as
# SNAPSHOT_PATH=snapshots/cityscapes/${EXP}/${RUN_ID}/${SNAPSHOT}.pth

Evaluation

Please use the Cityscapes' official evaluation tool evalPixelLevelSemanticLabeling from Cityscapes scripts for evaluating your results.

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DASAC,
  title     = {Self-supervised Augmentation Consistency for Adapting Semantic Segmentation},
  author    = {Araslanov, Nikita and and Roth, Stefan},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}
Owner
Visual Inference Lab @TU Darmstadt
Visual Inference Lab @TU Darmstadt
Angle data is a simple data type.

angledat Angle data is a simple data type. Installing + using Put angledat.py in the main dir of your project. Import it and use. Comments Comments st

1 Jan 05, 2022
Code for the ICCV 2021 paper "Pixel Difference Networks for Efficient Edge Detection" (Oral).

Microsoft365_devicePhish Abusing Microsoft 365 OAuth Authorization Flow for Phishing Attack This is a simple proof-of-concept script that allows an at

Alex 236 Dec 21, 2022
Learning Open-World Object Proposals without Learning to Classify

Learning Open-World Object Proposals without Learning to Classify Pytorch implementation for "Learning Open-World Object Proposals without Learning to

Dahun Kim 149 Dec 22, 2022
Source code of our BMVC 2021 paper: AniFormer: Data-driven 3D Animation with Transformer

AniFormer This is the PyTorch implementation of our BMVC 2021 paper AniFormer: Data-driven 3D Animation with Transformer. Haoyu Chen, Hao Tang, Nicu S

24 Nov 02, 2022
BTC-Generator - BTC Generator With Python

Что такое BTC-Generator? Это генератор чеков всеми любимого @BTC_BANKER_BOT Для

DoomGod 3 Aug 24, 2022
Generic image compressor for machine learning. Pytorch code for our paper "Lossy compression for lossless prediction".

Lossy Compression for Lossless Prediction Using: Training: This repostiory contains our implementation of the paper: Lossy Compression for Lossless Pr

Yann Dubois 84 Jan 02, 2023
Code for our CVPR2021 paper coordinate attention

Coordinate Attention for Efficient Mobile Network Design (preprint) This repository is a PyTorch implementation of our coordinate attention (will appe

Qibin (Andrew) Hou 726 Jan 05, 2023
Code and Data for the paper: Molecular Contrastive Learning with Chemical Element Knowledge Graph [AAAI 2022]

Knowledge-enhanced Contrastive Learning (KCL) Molecular Contrastive Learning with Chemical Element Knowledge Graph [ AAAI 2022 ]. We construct a Chemi

Fangyin 58 Dec 26, 2022
Code for paper Novel View Synthesis via Depth-guided Skip Connections

Novel View Synthesis via Depth-guided Skip Connections Code for paper Novel View Synthesis via Depth-guided Skip Connections @InProceedings{Hou_2021_W

8 Mar 14, 2022
Emulation and Feedback Fuzzing of Firmware with Memory Sanitization

BaseSAFE This repository contains the BaseSAFE Rust APIs, introduced by "BaseSAFE: Baseband SAnitized Fuzzing through Emulation". The example/ directo

Security in Telecommunications 138 Dec 16, 2022
SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summary.

FIW Data Development Kit Table of Contents Introduction Families In the Wild Database Publications Organization To Do License Getting Involved Introdu

Joseph P. Robinson 12 Jun 04, 2022
Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy.

Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy. Now with tensorflow 1.0 support. Evaluation usa

Marcel R. 349 Aug 06, 2022
Learning Features with Parameter-Free Layers (ICLR 2022)

Learning Features with Parameter-Free Layers (ICLR 2022) Dongyoon Han, YoungJoon Yoo, Beomyoung Kim, Byeongho Heo | Paper NAVER AI Lab, NAVER CLOVA Up

NAVER AI 65 Dec 07, 2022
KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

KAPAO (Keypoints and Poses as Objects) KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as

Will McNally 664 Dec 30, 2022
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences

Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences This repository is an official PyTorch implementation of Neighbor

DIVE Lab, Texas A&M University 8 Jun 12, 2022
Interactive Image Segmentation via Backpropagating Refinement Scheme

Won-Dong Jang and Chang-Su Kim, Interactive Image Segmentation via Backpropagating Refinement Scheme, CVPR 2019

Won-Dong Jang 85 Sep 15, 2022
Neural network pruning for finding a sparse computational model for controlling a biological motor task.

MothPruning Scientific Overview Originally inspired by biological nervous systems, deep neural networks (DNNs) are powerful computational tools for mo

Olivia Thomas 0 Dec 14, 2022
This code finds bounding box of a single human mouth.

This code finds bounding box of a single human mouth. In comparison to other face segmentation methods, it is relatively insusceptible to open mouth conditions, e.g., yawning, surgical robots, etc. T

iThermAI 4 Nov 27, 2022
g9.py - Torch interactive graphics

g9.py - Torch interactive graphics A Torch toy in the browser. Demo at https://srush.github.io/g9py/ This is a shameless copy of g9.js, written in Pyt

Sasha Rush 13 Nov 16, 2022
Federated Learning Based on Dynamic Regularization

Federated Learning Based on Dynamic Regularization This is implementation of Federated Learning Based on Dynamic Regularization. Requirements Please i

39 Jan 07, 2023