Implementation of CVPR 2020 Dual Super-Resolution Learning for Semantic Segmentation

Related tags

Deep LearningDSRL
Overview

Dual super-resolution learning for semantic segmentation

2021-01-02 Subpixel Update

Happy new year! The 2020-12-29 update of SISR with subpixel conv performs bad in my experiment so I did some changes to it.

The former subpixel version is depreciated now. Click here to learn more. If you are using the main branch then you can just ignore this message.

2020-12-29 New branch: subpixel

  • In this new branch, SISR path changes to follow the design of Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, CVPR 2016. The main branch still uses Deconv so if you prefer the older version you can simply ignore this update.
  • I haven't run a full test on this new framework yet so I'm still not sure about it's performance on validation set. Please let me know if you find this new framework performs better. Thank you. :)

2020-12-15 Pretrained Weights Uploaded (Only for the main branch)

  • See Google Drive (Please note that you don't have to unzip this file.)
  • Use the pretrained weights by train.py --resume 'path/to/weights'

2020-10-31 Good News! I achieved an mIoU of 0.6787 in the newest experiment(the experiment is still running and the final mIoU may be even higher)!

  • So the FA module should be places after each path's final output.
  • The FTM should be 19 channel -> 3 channel
  • Hyper-Parameter fine-tuning

It's amazing that the final model converges at a extremely fast speed. Now the codes are all set, just clone this repo and run train.py!

And thanks for the reminder of @XinruiYuan, currently this repo also differs from the original paper in the architecture of SISR path. I will be working on it after finishing my homework.

2020-10-22 First commit

I implemented the framework proposed in this paper since the authors' code is still under legal scan and i just can't wait to see the results. This repo is based on Deeplab v3+ and Cityscapes, and i still have problems about the FA module.

  • so the code is runnable? yes. just run train.py directly and you can see DSRL starts training.(of course change the dataset path. See insturctions in the Deeplab v3+ part below.)

  • any difference from the paper's proposed method? Actually yes. It's mainly about the FA module. I tried several mothods such as:

    • 19 channel SSSR output -> feature transform module -> 3 channel output -> calculate FAloss with 3 channel SISR output. Result is like a disaster
    • 19 channel SSSR last_conv(see the code and you'll know what it is) feature -> feature transform module -> calculate FAloss with 19 channel SISR last_conv feature. still disaster.
    • 19 channel SSSR last_conv(see the code and you'll know what it is) feature -> feature transform module -> calculate FAloss with 19 channel SISR last_conv feature, but no more normalization in the FA module. Seems not bad, but still cannot surpass simple original Deeplab v3+
    • Besides, this project use a square input(default 512*512) which is cropped from the original image.
  • so my results? mIoU about 0.6669 when use the original Deeplab v3+. 0.6638 when i add the SISR path but no FA module. and about 0.62 after i added the FA module.

The result doesn't look good, but this may because of the differences of the FA module.(but why the mIoU decreased after i added the SISR path)

Currently the code doesn't use normalization in FA module. If you want to try using them, please cancel the comment of line 16,18,23,25 in 'utils/fa_loss.py'

Please imform me if you have any questions about the code.

below are discriptions about Deeplab v3+(from the original repo).


pytorch-deeplab-xception

Update on 2018/12/06. Provide model trained on VOC and SBD datasets.

Update on 2018/11/24. Release newest version code, which fix some previous issues and also add support for new backbones and multi-gpu training. For previous code, please see in previous branch

TODO

  • Support different backbones
  • Support VOC, SBD, Cityscapes and COCO datasets
  • Multi-GPU training
Backbone train/eval os mIoU in val Pretrained Model
ResNet 16/16 78.43% google drive
MobileNet 16/16 70.81% google drive
DRN 16/16 78.87% google drive

Introduction

This is a PyTorch(0.4.1) implementation of DeepLab-V3-Plus. It can use Modified Aligned Xception and ResNet as backbone. Currently, we train DeepLab V3 Plus using Pascal VOC 2012, SBD and Cityscapes datasets.

Results

Installation

The code was tested with Anaconda and Python 3.6. After installing the Anaconda environment:

  1. Clone the repo:

    git clone https://github.com/jfzhang95/pytorch-deeplab-xception.git
    cd pytorch-deeplab-xception
  2. Install dependencies:

    For PyTorch dependency, see pytorch.org for more details.

    For custom dependencies:

    pip install matplotlib pillow tensorboardX tqdm

Training

Follow steps below to train your model:

  1. Configure your dataset path in mypath.py.

  2. Input arguments: (see full input arguments via python train.py --help):

    usage: train.py [-h] [--backbone {resnet,xception,drn,mobilenet}]
                [--out-stride OUT_STRIDE] [--dataset {pascal,coco,cityscapes}]
                [--use-sbd] [--workers N] [--base-size BASE_SIZE]
                [--crop-size CROP_SIZE] [--sync-bn SYNC_BN]
                [--freeze-bn FREEZE_BN] [--loss-type {ce,focal}] [--epochs N]
                [--start_epoch N] [--batch-size N] [--test-batch-size N]
                [--use-balanced-weights] [--lr LR]
                [--lr-scheduler {poly,step,cos}] [--momentum M]
                [--weight-decay M] [--nesterov] [--no-cuda]
                [--gpu-ids GPU_IDS] [--seed S] [--resume RESUME]
                [--checkname CHECKNAME] [--ft] [--eval-interval EVAL_INTERVAL]
                [--no-val]
    
  3. To train deeplabv3+ using Pascal VOC dataset and ResNet as backbone:

    bash train_voc.sh
  4. To train deeplabv3+ using COCO dataset and ResNet as backbone:

    bash train_coco.sh

Acknowledgement

PyTorch-Encoding

Synchronized-BatchNorm-PyTorch

drn

Owner
Sam
Get yourself a cup of tea. ˊ_>ˋ旦
Sam
ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Zongdai 107 Dec 20, 2022
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)

Baleen Baleen is a state-of-the-art model for multi-hop reasoning, enabling scalable multi-hop search over massive collections for knowledge-intensive

Stanford Future Data Systems 22 Dec 05, 2022
Awesome Transformers in Medical Imaging

This repo supplements our Survey on Transformers in Medical Imaging Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat,

Fahad Shamshad 666 Jan 06, 2023
SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks (Scientific Reports)

SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks Molecular interaction networks are powerful resources for the discovery. While dee

Kexin Huang 49 Oct 15, 2022
OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network

Stock Price Prediction of Apple Inc. Using Recurrent Neural Network OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network Dataset:

Nouroz Rahman 410 Jan 05, 2023
Automated Attendance Project Using Face Recognition

dependencies for project: cmake 3.22.1 dlib 19.22.1 face-recognition 1.3.0 openc

Rohail Taha 1 Jan 09, 2022
Keeper for Ricochet Protocol, implemented with Apache Airflow

Ricochet Keeper This repository contains Apache Airflow DAGs for executing keeper operations for Ricochet Exchange. Usage You will need to run this us

Ricochet Exchange 5 May 24, 2022
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Unsupervised Phone and Word Segmentation using Vector-Quantized Neural Networks Overview Unsupervised phone and word segmentation on speech data is pe

Herman Kamper 13 Dec 11, 2022
CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator This is the official code repository for NeurIPS 2021 paper: CARMS: Categorica

Alek Dimitriev 1 Jul 09, 2022
Simply enable or disable your Nvidia dGPU

EnvyControl (WIP) Simply enable or disable your Nvidia dGPU Usage First clone this repo and install envycontrol with sudo pip install . CLI Turn off y

Victor Bayas 292 Jan 03, 2023
An efficient and easy-to-use deep learning model compression framework

TinyNeuralNetwork 简体中文 TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework, which contains features like neura

Alibaba 441 Dec 25, 2022
Efficient Multi Collection Style Transfer Using GAN

Proposed a new model that can make style transfer from single style image, and allow to transfer into multiple different styles in a single model.

Zhaozheng Shen 2 Jan 15, 2022
End-to-end image segmentation kit based on PaddlePaddle.

English | 简体中文 PaddleSeg PaddleSeg has released the new version including the following features: Our team won the 6.2k Jan 02, 2023

The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

PRIMER The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization. PRIMER is a pre-trained model for mu

AI2 111 Dec 18, 2022
Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GanFormer and TransGan paper

TransGanFormer (wip) Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GansFormer and TransGan paper. I

Phil Wang 146 Dec 06, 2022
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021
Adversarial Graph Augmentation to Improve Graph Contrastive Learning

ADGCL : Adversarial Graph Augmentation to Improve Graph Contrastive Learning Introduction This repo contains the Pytorch [1] implementation of Adversa

susheel suresh 62 Nov 19, 2022
Deep Learning to Create StepMania SM FIles

StepCOVNet Running Audio to SM File Generator Currently only produces .txt files. Use SMDataTools to convert .txt to .sm python stepmania_note_generat

Chimezie Iwuanyanwu 8 Jan 08, 2023
code for Grapadora research paper experimentation

Road feature embedding selection method Code for research paper experimentation Abstract Traffic forecasting models rely on data that needs to be sens

Eric López Manibardo 0 May 26, 2022
Focal Loss for Dense Rotation Object Detection

Convert ResNets weights from GluonCV to Tensorflow Abstract GluonCV released some new resnet pre-training weights and designed some new resnets (such

17 Nov 24, 2021