QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

Last update: Jan 08, 2023

Overview

TL;DR: QueryInst is a simple and effective query-based instance segmentation method driven by parallel supervision on dynamic mask heads. It outperforms prior art in both accuracy and speed.
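
To make "parallel supervision" more concrete, here is a toy sketch, not the repository's implementation. It assumes a multi-stage model in which every stage emits its own mask prediction, and that each stage's prediction is compared against the same ground-truth mask, with the per-stage losses summed. The use of a Dice loss and the names `dice_loss` and `parallel_mask_loss` are illustrative assumptions.

```python
from typing import List, Sequence


def dice_loss(pred: Sequence[float], target: Sequence[int], eps: float = 1e-6) -> float:
    """Dice loss between a flat soft mask (values in [0, 1]) and a flat binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def parallel_mask_loss(stage_preds: List[Sequence[float]], target: Sequence[int]) -> float:
    """Sum the mask loss over all stages, so every stage gets its own supervision signal."""
    return sum(dice_loss(pred, target) for pred in stage_preds)


if __name__ == "__main__":
    # Hypothetical 4-pixel mask and three stages of progressively better predictions.
    gt = [1, 1, 0, 0]
    preds = [
        [0.6, 0.5, 0.4, 0.3],
        [0.8, 0.7, 0.2, 0.1],
        [1.0, 1.0, 0.0, 0.0],
    ]
    print("total mask loss over stages:", parallel_mask_loss(preds, gt))
```

In this sketch a perfect prediction at a stage contributes a loss of zero, so the total is dominated by the earlier, rougher stages.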

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation,

by Yuxin Fang*, Shusheng Yang*, Xinggang Wang†, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu.

(*) equal contribution, (†) corresponding author.

arXiv technical report (arXiv 2105.01928)

This repo is the official implementation of QueryInst. It is based on mmdetection and built upon Sparse R-CNN and DETR. An implementation based on Detectron2 will be released in the near future.
This project is under active development; we will extend QueryInst to a wide range of instance-level recognition tasks.

Updates

[06/05/2021] 🌟 QueryInst training and inference code has been released!

Getting Started

Our project is developed mainly on the mmdetection toolbox (931d96). Please follow the official mmdetection installation instructions first.
Then install QueryInst:

python setup.py develop

Prepare datasets:

mkdir data && cd data
ln -s /path/to/coco coco
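
Before training, it can help to confirm that the symlink resolves to a usable dataset. The sketch below is a minimal check, assuming the COCO 2017 layout that mmdetection configs commonly expect (`annotations/`, `train2017/`, `val2017/`). This README does not state that layout, so treat the entries in `EXPECTED_COCO_ENTRIES` and the helper name `missing_coco_entries` as assumptions to adapt.

```python
from pathlib import Path
from typing import List

# Assumed layout: the usual mmdetection COCO 2017 convention, not stated in this README.
EXPECTED_COCO_ENTRIES = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
]


def missing_coco_entries(root: str = "data/coco") -> List[str]:
    """Return the expected files or directories that do not exist under `root`."""
    base = Path(root)
    return [entry for entry in EXPECTED_COCO_ENTRIES if not (base / entry).exists()]


if __name__ == "__main__":
    # Run from the repository root, after creating the data/coco symlink.
    missing = missing_coco_entries()
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    else:
        print("data/coco looks complete")
```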

Train QueryInst on a single GPU:

python tools/train.py configs/queryinst/queryinst_r50_fpn_1x_coco.py

Train QueryInst on multiple GPUs (8 in this example):

./tools/dist_train.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py 8

Test QueryInst on the COCO val set with a single GPU:

python tools/test.py configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth --eval bbox segm

Test QueryInst on the COCO val set with multiple GPUs (8 in this example):

./tools/dist_test.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth 8 --eval bbox segm
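
If you script evaluation over several checkpoints, the two test commands above can be assembled programmatically. The following is a small sketch that only reuses the scripts and flags shown in this section; the helper name `build_test_command` and the `CONFIG` constant are hypothetical conveniences, not part of the repository.

```python
from typing import List

# Config path taken from the commands above.
CONFIG = "configs/queryinst/queryinst_r50_fpn_1x_coco.py"


def build_test_command(checkpoint: str, config: str = CONFIG, gpus: int = 1) -> List[str]:
    """Build the argument list for single-GPU or multi-GPU evaluation."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    eval_args = ["--eval", "bbox", "segm"]
    if gpus == 1:
        return ["python", "tools/test.py", config, checkpoint] + eval_args
    # The distributed launcher takes the GPU count as its third positional argument.
    return ["./tools/dist_test.sh", config, checkpoint, str(gpus)] + eval_args


if __name__ == "__main__":
    print(" ".join(build_test_command("PATH/TO/CKPT.pth", gpus=8)))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.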

Main Results on COCO val

Configs	Aug.	Weights	Box AP	Mask AP
QueryInst_R50_3x_300_queries	480 ~ 800, w/ Crop	-	46.9	41.4
QueryInst_R101_3x_300_queries	480 ~ 800, w/ Crop	-	48.0	42.4
QueryInst_X101-DCN_3x_300_queries	480 ~ 800, w/ Crop	-	50.3	44.2

Citation

If you find our paper and code useful in your research, please consider giving the repo a star ⭐ and citing the paper:

@article{QueryInst,
  title={QueryInst: Parallelly Supervised Mask Query for Instance Segmentation},
  author={Fang, Yuxin and Yang, Shusheng and Wang, Xinggang and Li, Yu and Fang, Chen and Shan, Ying and Feng, Bin and Liu, Wenyu},
  journal={arXiv preprint arXiv:2105.01928},
  year={2021}
}

TODO

QueryInst training and inference code.
QueryInst based on Detectron2 toolbox will be released in the near future.
QueryInst configurations for Cityscapes and YouTube-VIS.
QueryInst pretrain weights.

Owner

Hust Visual Learning Team
