DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Last update: Nov 27, 2022

This project hosts the code implementing the DCT-Mask algorithm for instance segmentation.

[DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation] Xing Shen*, Jirui Yang*, Chunbo Wei, Bing Deng, Jianqiang Huang, Xiansheng Hua, Xiaoliang Cheng, Kewei Liang

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2021)

arXiv preprint (arXiv:2011.09876)

Contributions

We propose a high-quality, low-complexity mask representation for instance segmentation, which encodes a high-resolution binary mask into a compact vector using the discrete cosine transform (DCT).
With slight modifications, DCT-Mask can be integrated into most pixel-based frameworks, and it achieves significant and consistent improvements across datasets, backbones, and training schedules. In particular, the gains are larger with more complex backbones and higher-quality annotations.
DCT-Mask does not require extra pre-processing or pre-training. It achieves high-resolution mask prediction at a speed similar to that of low-resolution prediction.
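
The encoding idea in the first contribution can be illustrated with a small NumPy sketch. This is not the code in projects/DCT_Mask/: the function names (`dct_matrix`, `low_freq_order`, `encode_mask`, `decode_mask`), the 128×128 mask size, the 300-dimensional vector, and the simple anti-diagonal ordering used to pick low-frequency coefficients are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix (rows = frequencies)."""
    k = np.arange(n)[:, None]  # frequency index
    x = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)    # scale the DC row so that c @ c.T == I
    return c

def low_freq_order(n):
    """Row/column indices sorted by anti-diagonal, lowest frequencies first.

    Assumption: a simplified stand-in for a zigzag scan.
    """
    pairs = [(i, j) for i in range(n) for j in range(n)]
    pairs.sort(key=lambda p: (p[0] + p[1], p[0]))
    rows, cols = zip(*pairs)
    return np.array(rows), np.array(cols)

def encode_mask(mask, dim):
    """Encode a square binary mask into its `dim` lowest-frequency DCT coefficients."""
    n = mask.shape[0]
    c = dct_matrix(n)
    coeffs = c @ mask.astype(np.float64) @ c.T  # separable 2D DCT
    rows, cols = low_freq_order(n)
    return coeffs[rows[:dim], cols[:dim]]

def decode_mask(vec, n):
    """Rebuild an (n, n) binary mask from a coefficient vector."""
    c = dct_matrix(n)
    coeffs = np.zeros((n, n))
    rows, cols = low_freq_order(n)
    coeffs[rows[:len(vec)], cols[:len(vec)]] = vec  # unused frequencies stay zero
    recon = c.T @ coeffs @ c                        # inverse 2D DCT
    return recon > 0.5

# Toy example: a filled circle on a 128x128 grid, compressed to 300 numbers.
n = 128
yy, xx = np.mgrid[:n, :n]
mask = (yy - 64) ** 2 + (xx - 64) ** 2 < 40 ** 2
vec = encode_mask(mask, 300)
recon = decode_mask(vec, n)
iou = (mask & recon).sum() / (mask | recon).sum()
print(f"vector length: {vec.shape[0]}, reconstruction IoU: {iou:.3f}")
```

Because most of a mask's energy sits in the low frequencies, a short vector can reconstruct a smooth shape closely. A network head would then only need to regress such a vector rather than a dense high-resolution grid.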

Installation

Requirements

PyTorch ≥ 1.5 and fvcore == 0.1.1.post20200716

This implementation is based on detectron2. Please refer to INSTALL.md for installation and dataset preparation.

Usage

The code for this project is in projects/DCT_Mask/

Train with multiple GPUs

cd ./projects/DCT_Mask/
./train1.sh

Testing

cd ./projects/DCT_Mask/
./test1.sh

Model Zoo

Trained models on COCO

Model	Backbone	Schedule	Multi-scale training	Inference time (s/im)	AP (minival)	Link
DCT-Mask R-CNN	R50	1x	Yes	0.0465	36.5	download(Fetch code: xpdm)
DCT-Mask R-CNN	R101	3x	Yes	0.0595	39.9	download(Fetch code: 7q6x)
DCT-Mask R-CNN	RX101	3x	Yes	0.1049	41.2	download(Fetch code: ufw2)
Cascade DCT-Mask R-CNN	R50	1x	Yes	0.0630	37.5	download(Fetch code: yqxp)
Cascade DCT-Mask R-CNN	R101	3x	Yes	0.0750	40.8	download(Fetch code: r8xv)
Cascade DCT-Mask R-CNN	RX101	3x	Yes	0.1195	42.0	download(Fetch code: pdej)

Trained models on Cityscapes

Model	Data	Backbone	Schedule	Multi-scale training	AP (val)	Link
DCT-Mask R-CNN	Fine-Only	R50	1x	Yes	37.0	download(Fetch code: dn7i)
DCT-Mask R-CNN	COCO-Pretrain + Fine	R50	1x	Yes	39.6	download(Fetch code: ntqf)

Notes

We observe about 0.2 AP of noise on COCO.
We observe high variance on Cityscapes when training on fine annotations only. The paper reports the median AP of 5 runs (35.6), while this repo reports the best result (37.0).
Initializing from COCO pre-training reduces the variance on Cityscapes and also increases mask AP.
Inference time is measured on a single GPU with batch size 1. All GPUs are NVIDIA V100.
LVIS 0.5 is used for evaluation.

Contributing to the project

Any pull requests or issues are welcome.

If there is any problem with this project, please contact Xing Shen.

Citations

Please consider citing our papers in your publications if the project helps your research.

License

MIT License.

Owner

Alibaba Cloud
