Implementation of CVPR 2020 Dual Super-Resolution Learning for Semantic Segmentation

Related tags

Deep LearningDSRL
Overview

Dual super-resolution learning for semantic segmentation

2021-01-02 Subpixel Update

Happy new year! The 2020-12-29 update of SISR with subpixel conv performs bad in my experiment so I did some changes to it.

The former subpixel version is depreciated now. Click here to learn more. If you are using the main branch then you can just ignore this message.

2020-12-29 New branch: subpixel

  • In this new branch, SISR path changes to follow the design of Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, CVPR 2016. The main branch still uses Deconv so if you prefer the older version you can simply ignore this update.
  • I haven't run a full test on this new framework yet so I'm still not sure about it's performance on validation set. Please let me know if you find this new framework performs better. Thank you. :)

2020-12-15 Pretrained Weights Uploaded (Only for the main branch)

  • See Google Drive (Please note that you don't have to unzip this file.)
  • Use the pretrained weights by train.py --resume 'path/to/weights'

2020-10-31 Good News! I achieved an mIoU of 0.6787 in the newest experiment(the experiment is still running and the final mIoU may be even higher)!

  • So the FA module should be places after each path's final output.
  • The FTM should be 19 channel -> 3 channel
  • Hyper-Parameter fine-tuning

It's amazing that the final model converges at a extremely fast speed. Now the codes are all set, just clone this repo and run train.py!

And thanks for the reminder of @XinruiYuan, currently this repo also differs from the original paper in the architecture of SISR path. I will be working on it after finishing my homework.

2020-10-22 First commit

I implemented the framework proposed in this paper since the authors' code is still under legal scan and i just can't wait to see the results. This repo is based on Deeplab v3+ and Cityscapes, and i still have problems about the FA module.

  • so the code is runnable? yes. just run train.py directly and you can see DSRL starts training.(of course change the dataset path. See insturctions in the Deeplab v3+ part below.)

  • any difference from the paper's proposed method? Actually yes. It's mainly about the FA module. I tried several mothods such as:

    • 19 channel SSSR output -> feature transform module -> 3 channel output -> calculate FAloss with 3 channel SISR output. Result is like a disaster
    • 19 channel SSSR last_conv(see the code and you'll know what it is) feature -> feature transform module -> calculate FAloss with 19 channel SISR last_conv feature. still disaster.
    • 19 channel SSSR last_conv(see the code and you'll know what it is) feature -> feature transform module -> calculate FAloss with 19 channel SISR last_conv feature, but no more normalization in the FA module. Seems not bad, but still cannot surpass simple original Deeplab v3+
    • Besides, this project use a square input(default 512*512) which is cropped from the original image.
  • so my results? mIoU about 0.6669 when use the original Deeplab v3+. 0.6638 when i add the SISR path but no FA module. and about 0.62 after i added the FA module.

The result doesn't look good, but this may because of the differences of the FA module.(but why the mIoU decreased after i added the SISR path)

Currently the code doesn't use normalization in FA module. If you want to try using them, please cancel the comment of line 16,18,23,25 in 'utils/fa_loss.py'

Please imform me if you have any questions about the code.

below are discriptions about Deeplab v3+(from the original repo).


pytorch-deeplab-xception

Update on 2018/12/06. Provide model trained on VOC and SBD datasets.

Update on 2018/11/24. Release newest version code, which fix some previous issues and also add support for new backbones and multi-gpu training. For previous code, please see in previous branch

TODO

  • Support different backbones
  • Support VOC, SBD, Cityscapes and COCO datasets
  • Multi-GPU training
Backbone train/eval os mIoU in val Pretrained Model
ResNet 16/16 78.43% google drive
MobileNet 16/16 70.81% google drive
DRN 16/16 78.87% google drive

Introduction

This is a PyTorch(0.4.1) implementation of DeepLab-V3-Plus. It can use Modified Aligned Xception and ResNet as backbone. Currently, we train DeepLab V3 Plus using Pascal VOC 2012, SBD and Cityscapes datasets.

Results

Installation

The code was tested with Anaconda and Python 3.6. After installing the Anaconda environment:

  1. Clone the repo:

    git clone https://github.com/jfzhang95/pytorch-deeplab-xception.git
    cd pytorch-deeplab-xception
  2. Install dependencies:

    For PyTorch dependency, see pytorch.org for more details.

    For custom dependencies:

    pip install matplotlib pillow tensorboardX tqdm

Training

Follow steps below to train your model:

  1. Configure your dataset path in mypath.py.

  2. Input arguments: (see full input arguments via python train.py --help):

    usage: train.py [-h] [--backbone {resnet,xception,drn,mobilenet}]
                [--out-stride OUT_STRIDE] [--dataset {pascal,coco,cityscapes}]
                [--use-sbd] [--workers N] [--base-size BASE_SIZE]
                [--crop-size CROP_SIZE] [--sync-bn SYNC_BN]
                [--freeze-bn FREEZE_BN] [--loss-type {ce,focal}] [--epochs N]
                [--start_epoch N] [--batch-size N] [--test-batch-size N]
                [--use-balanced-weights] [--lr LR]
                [--lr-scheduler {poly,step,cos}] [--momentum M]
                [--weight-decay M] [--nesterov] [--no-cuda]
                [--gpu-ids GPU_IDS] [--seed S] [--resume RESUME]
                [--checkname CHECKNAME] [--ft] [--eval-interval EVAL_INTERVAL]
                [--no-val]
    
  3. To train deeplabv3+ using Pascal VOC dataset and ResNet as backbone:

    bash train_voc.sh
  4. To train deeplabv3+ using COCO dataset and ResNet as backbone:

    bash train_coco.sh

Acknowledgement

PyTorch-Encoding

Synchronized-BatchNorm-PyTorch

drn

Owner
Sam
Get yourself a cup of tea. ˊ_>ˋ旦
Sam
Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics This repository is the official PyTorch implementation of "Physics-aware Differ

USC-Melady 46 Nov 20, 2022
Code for the paper "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Jukebox Code for "Jukebox: A Generative Model for Music" Paper Blog Explorer Colab Insta

OpenAI 6k Jan 02, 2023
Official repository for MixFaceNets: Extremely Efficient Face Recognition Networks

MixFaceNets This is the official repository of the paper: MixFaceNets: Extremely Efficient Face Recognition Networks. (Accepted in IJCB2021) https://i

Fadi Boutros 51 Dec 13, 2022
Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data recorded in NumPy array

shindo.py Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data stored in NumPy array Introduction Japa

RR_Inyo 3 Sep 23, 2022
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

Spotify 10.6k Jan 04, 2023
AirLoop: Lifelong Loop Closure Detection

AirLoop This repo contains the source code for paper: Dasong Gao, Chen Wang, Sebastian Scherer. "AirLoop: Lifelong Loop Closure Detection." arXiv prep

Chen Wang 53 Jan 03, 2023
A new play-and-plug method of controlling an existing generative model with conditioning attributes and their compositions.

Viz-It Data Visualizer Web-Application If I ask you where most of the data wrangler looses their time ? It is Data Overview and EDA. Presenting "Viz-I

NVIDIA Research Projects 66 Jan 01, 2023
ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data

ARKitScenes This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D

Apple 371 Jan 05, 2023
4th place solution for the SIGIR 2021 challenge.

SIGIR-2021 (Tinkoff.AI) How to start Download train and test data: https://sigir-ecom.github.io/data-task.html Place it under sigir-2021/data/. Run py

Tinkoff.AI 4 Jul 01, 2022
UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss

UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss This repository contains the TensorFlow implementation of the paper UnF

Simon Meister 270 Nov 06, 2022
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetu

3 Dec 05, 2022
EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering

MADE (Multi-Adapter Dataset Experts) This repository contains the implementation of MADE (Multi-adapter dataset experts), which is described in the pa

Princeton Natural Language Processing 68 Jul 18, 2022
Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch

Lie Transformer - Pytorch (wip) Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch. Only the SE3 version will be present in thi

Phil Wang 78 Oct 26, 2022
A cool little repl-based simulation written in Python

A cool little repl-based simulation written in Python planned to integrate machine-learning into itself to have AI battle to the death before your eye

Em 6 Sep 17, 2022
MAU: A Motion-Aware Unit for Video Prediction and Beyond, NeurIPS2021

MAU (NeurIPS2021) Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xinguang Xiang, Wen GAo. Official PyTorch Code for "MAU: A Motion-Aware

ZhengChang 20 Nov 25, 2022
Graph WaveNet apdapted for brain connectivity analysis.

Graph WaveNet for brain network analysis This is the implementation of the Graph WaveNet model used in our manuscript: S. Wein , A. Schüller, A. M. To

4 Dec 17, 2022
This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video] Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang CVPR 2021 This is re-implem

Ahmet Sarigun 79 Jan 05, 2023
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework

Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework

Google Cloud Platform 792 Dec 28, 2022
Code release for SLIP Self-supervision meets Language-Image Pre-training

SLIP: Self-supervision meets Language-Image Pre-training What you can find in this repo: Pre-trained models (with ViT-Small, Base, Large) and code to

Meta Research 621 Dec 31, 2022