[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

Last update: Jan 02, 2023

Overview

K-Net: Towards Unified Image Segmentation

Introduction

This is the official code release for the paper K-Net: Towards Unified Image Segmentation. K-Net will also be integrated into a future release of MMDetection and MMSegmentation.

K-Net: Towards Unified Image Segmentation,
Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [project page] [BibTeX]

Results

The results of K-Net and the corresponding configs for each segmentation task are shown below. We have released the full model zoo for panoptic segmentation. The complete model checkpoints and logs for instance and semantic segmentation will be released soon.

Semantic Segmentation on ADE20K

Backbone	Method	Crop Size	Lr Schd	mIoU	Config	Download
R-50	K-Net + FCN	512x512	80K	43.3	config	model | log
R-50	K-Net + PSPNet	512x512	80K	43.9	config	model | log
R-50	K-Net + DeepLabv3	512x512	80K	44.6	config	model | log
R-50	K-Net + UPerNet	512x512	80K	43.6	config	model | log
Swin-T	K-Net + UPerNet	512x512	80K	45.4	config	model | log
Swin-L	K-Net + UPerNet	512x512	80K	52.0	config	model | log
Swin-L	K-Net + UPerNet	640x640	80K	52.7	config	model | log

Instance Segmentation on COCO

Backbone	Method	Lr Schd	Mask mAP	Config	Download
R-50	K-Net	1x	34.0	config	model | log
R-50	K-Net	ms-3x	37.8	config	model | log
R-101	K-Net	ms-3x	39.2	config	model | log
R-101-DCN	K-Net	ms-3x	40.5	config	model | log

Panoptic Segmentation on COCO

Backbone	Method	Lr Schd	PQ	Config	Download
R-50	K-Net	1x	44.3	config	model | log
R-50	K-Net	ms-3x	47.1	config	model | log
R-101	K-Net	ms-3x	48.4	config	model | log
R-101-DCN	K-Net	ms-3x	49.6	config	model | log
Swin-L (window size 7)	K-Net	ms-3x	54.6	config	model | log
Above on test-dev			55.2

Installation

K-Net requires the following packages (the first four are OpenMMLab packages):

MIM >= 0.1.5
MMCV-full >= v1.3.14
MMDetection >= v2.17.0
MMSegmentation >= v0.18.0
scipy
panopticapi

pip install openmim scipy mmdet mmsegmentation
pip install git+https://github.com/cocodataset/panopticapi.git
mim install mmcv-full
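
After installation, it can be useful to confirm that the installed versions meet the minimums listed above. The following is a minimal sketch, not part of this repository: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, the package names are the PyPI names used in the `pip`/`mim` commands above, and the version parser is deliberately simple (it compares only the leading numeric components).

```python
from importlib import metadata

# Minimum versions copied from the requirements list above; keys are PyPI names.
MINIMUMS = {
    "openmim": "0.1.5",
    "mmcv-full": "1.3.14",
    "mmdet": "2.17.0",
    "mmsegmentation": "0.18.0",
}


def parse_version(text):
    """Turn '1.3.14' or 'v1.3.14' into (1, 3, 14), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.lstrip("v").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings by their leading numeric components."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        ok = installed is not None and meets_minimum(installed, minimum)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "needs attention"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

scipy and panopticapi have no minimum version listed, so the sketch does not check them.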

License

This project is released under the Apache 2.0 license.

Usage

Data preparation

Prepare the data following the MMDetection and MMSegmentation instructions. The data should be organized as follows:

data/
├── ade
│   ├── ADEChallengeData2016
│   │   ├── annotations
│   │   ├── images
├── coco
│   ├── annotations
│   │   ├── panoptic_{train,val}2017.json
│   │   ├── instances_{train,val}2017.json
│   │   ├── panoptic_{train,val}2017/  # panoptic png annotations
│   │   ├── image_info_test-dev2017.json  # for test-dev submissions
│   ├── train2017
│   ├── val2017
│   ├── test2017
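
Before launching a job, the layout above can be checked with a few lines of Python. This is a minimal sketch, not a script shipped with the repository: the function name `missing_dirs` is hypothetical, and it only checks that the directories from the tree exist under an assumed `data/` root, not that their contents are complete.

```python
from pathlib import Path

# Directories taken from the tree above, relative to the data root.
EXPECTED_DIRS = [
    "ade/ADEChallengeData2016/annotations",
    "ade/ADEChallengeData2016/images",
    "coco/annotations",
    "coco/train2017",
    "coco/val2017",
    "coco/test2017",
]


def missing_dirs(data_root="data"):
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data")
    if missing:
        print("Missing directories:")
        for rel in missing:
            print(f"  data/{rel}")
    else:
        print("All expected directories are present.")
```

If you only work on one task, the directories of the other dataset will be reported as missing and can be ignored.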

Training and testing

For training and testing on a slurm cluster, you can use the mim wrapper scripts under tools/:

# train instance/panoptic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmdet $CONFIG $WORK_DIR

# test instance segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm

# test panoptic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval pq

# train semantic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmseg $CONFIG $WORK_DIR

# test semantic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmseg $CONFIG $CHECKPOINT --eval mIoU

To generate a test-dev submission for panoptic segmentation, use the commands below:

# add the panoptic category information to the original test-dev image info file
python -u tools/gen_panoptic_test_info.py
# run test-dev submission
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --format-only --cfg-options data.test.ann_file=data/coco/annotations/panoptic_image_info_test-dev2017.json data.test.img_prefix=data/coco/test2017 --eval-options jsonfile_prefix=$WORK_DIR
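
The first step is handled by the provided script, which is not reproduced here. As a rough illustration of what updating the category information could involve, the sketch below assumes it amounts to copying the `categories` list from the panoptic validation annotations into the test-dev image info file and saving the result under the name used by the test command. The function names and the choice of `panoptic_val2017.json` as the category source are assumptions, not the repository's implementation.

```python
import copy
import json
from pathlib import Path


def add_panoptic_categories(image_info, panoptic_ann):
    """Return a copy of image_info whose 'categories' come from panoptic_ann."""
    merged = copy.deepcopy(image_info)
    merged["categories"] = copy.deepcopy(panoptic_ann["categories"])
    return merged


def write_panoptic_test_info(ann_dir="data/coco/annotations"):
    """Read the two source files in ann_dir and write the merged test-dev info file."""
    ann_dir = Path(ann_dir)
    # Assumed inputs: the test-dev image info and the panoptic val annotations.
    image_info = json.loads((ann_dir / "image_info_test-dev2017.json").read_text())
    panoptic_ann = json.loads((ann_dir / "panoptic_val2017.json").read_text())
    merged = add_panoptic_categories(image_info, panoptic_ann)
    out_path = ann_dir / "panoptic_image_info_test-dev2017.json"
    out_path.write_text(json.dumps(merged))
    return out_path
```

The inputs are left untouched, so the original annotation files can still be used for other tasks.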

You can also run training and testing for instance, semantic, and panoptic segmentation without slurm by calling mim directly, for example:

PYTHONPATH='.':$PYTHONPATH mim train mmdet $CONFIG $WORK_DIR
PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR

PARTITION: the slurm partition you are using
CHECKPOINT: the path of the checkpoint downloaded from our model zoo or trained by yourself
WORK_DIR: the working directory to save configs, logs, and checkpoints
CONFIG: the path of a config file under the directory configs/
JOB_NAME: the name of the job, which is required by slurm

Citation

@inproceedings{zhang2021knet,
    title={{K-Net: Towards} Unified Image Segmentation},
    author={Wenwei Zhang and Jiangmiao Pang and Kai Chen and Chen Change Loy},
    year={2021},
    booktitle={NeurIPS},
}
