Sparse R-CNN: End-to-End Object Detection with Learnable Proposals, CVPR2021

Last update: Dec 27, 2022

Overview

Paper (CVPR 2021)

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Updates

(02/03/2021) Higher performance is reported using the stronger backbone model PVT.
(23/02/2021) Higher performance is reported using the stronger pre-trained model DetCo.
(02/12/2020) Models and logs (R101_100pro_3x and R101_300pro_3x) are available.
(26/11/2020) Models and logs (R50_100pro_3x and R50_300pro_3x) are available.
(26/11/2020) Higher performance for Sparse R-CNN is reported when the dropout rate is set to 0.0.

Models

Method	inf_time	train_time	box AP	download
R50_100pro_3x	23 FPS	19h	42.8	model | log
R50_300pro_3x	22 FPS	24h	45.0	model | log
R101_100pro_3x	19 FPS	25h	44.1	model | log
R101_300pro_3x	18 FPS	29h	46.4	model | log

Models and logs are available on Baidu Drive with the code wt9n.
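To make the trade-off in the table above concrete, the following sketch computes what moving from 100 to 300 proposals costs and gains for each backbone. The numbers are copied from the table; the dictionary layout and the function name `proposal_tradeoff` are illustrative assumptions, not part of the repository.

```python
# Rows copied from the Models table: (backbone, proposals) -> (FPS, train hours, box AP).
RESULTS = {
    ("R50", 100): (23, 19, 42.8),
    ("R50", 300): (22, 24, 45.0),
    ("R101", 100): (19, 25, 44.1),
    ("R101", 300): (18, 29, 46.4),
}


def proposal_tradeoff(backbone):
    """Return (AP gain, FPS drop, extra training hours) for 100 -> 300 proposals."""
    fps_lo, hours_lo, ap_lo = RESULTS[(backbone, 100)]
    fps_hi, hours_hi, ap_hi = RESULTS[(backbone, 300)]
    # Round the AP difference to one decimal to hide floating-point residue.
    return round(ap_hi - ap_lo, 1), fps_lo - fps_hi, hours_hi - hours_lo


for backbone in ("R50", "R101"):
    gain, fps_drop, extra_hours = proposal_tradeoff(backbone)
    print(f"{backbone}: +{gain} AP, -{fps_drop} FPS, +{extra_hours}h training")
```

This prints `R50: +2.2 AP, -1 FPS, +5h training` and `R101: +2.3 AP, -1 FPS, +4h training`. Both gains are well above the roughly 0.3 AP noise reported in the Notes.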

Notes

We observe about 0.3 AP noise.
Training time is measured on 8 GPUs with a batch size of 16. Inference time is measured on a single GPU. All GPUs are NVIDIA V100.
We use models pre-trained on ImageNet with torchvision, and we provide torchvision's ResNet-101.pkl model. More details can be found in the conversion script.

Method	inf_time	train_time	box AP	codebase
R50_300pro_3x	22 FPS	24h	45.0	detectron2
R50_300pro_3x.detco	22 FPS	28h	46.5	detectron2
PVTSmall_300pro_3x	13 FPS	50h	45.7	mmdetection
PVTv2-b2_300pro_3x	11 FPS	76h	50.1	mmdetection

Installation

The codebase is built on top of Detectron2 and DETR.

Requirements

Linux or macOS with Python ≥ 3.6
PyTorch ≥ 1.5 and a torchvision version that matches the PyTorch installation. Installing them together from pytorch.org ensures this.
OpenCV is optional but needed for the demo and visualization.
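The requirements above can be checked before building. The following is a minimal sketch, not a script shipped with the repository: `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helper names. It compares versions numerically (so that 1.10 counts as newer than 1.5) and treats OpenCV as optional. It does not verify that torchvision matches the installed PyTorch.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    core = version.split("+")[0]  # drop local build suffixes like '+cu117'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(version, minimum):
    """Numeric comparison, so '1.10.0' correctly satisfies a minimum of '1.5'."""
    return version_tuple(version) >= version_tuple(minimum)


def check_environment():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python >= 3.6 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, "1.5"):
            problems.append("PyTorch >= 1.5 is required, found " + torch.__version__)
    try:
        import cv2  # noqa: F401
    except ImportError:
        problems.append("OpenCV not found (optional; needed for demo and visualization)")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks fine"]:
        print(line)
```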

Steps

Install and build libs

git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop

Link the COCO dataset path to SparseR-CNN/datasets/coco

mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017
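The same linking step can be done from Python, which makes it easy to fail early when a folder is missing. This is a sketch under the assumption that the COCO root contains exactly the three folders used in the commands above; `link_coco` is a hypothetical helper, not part of the repository.

```python
import os

# The three folders linked by the shell commands above.
COCO_SUBDIRS = ("annotations", "train2017", "val2017")


def link_coco(coco_root, repo_root="."):
    """Symlink the COCO folders into <repo_root>/datasets/coco and return the links."""
    sources = [os.path.abspath(os.path.join(coco_root, name)) for name in COCO_SUBDIRS]
    # Validate everything first so a missing folder does not leave partial links.
    for source in sources:
        if not os.path.isdir(source):
            raise FileNotFoundError("missing COCO folder: " + source)

    target_dir = os.path.join(repo_root, "datasets", "coco")
    os.makedirs(target_dir, exist_ok=True)

    links = []
    for name, source in zip(COCO_SUBDIRS, sources):
        link = os.path.join(target_dir, name)
        if not os.path.lexists(link):  # leave existing links or folders untouched
            os.symlink(source, link)
        links.append(link)
    return links
```

Calling `link_coco("/path_to_coco_dataset")` from the SparseR-CNN directory would mirror the `mkdir` and `ln -s` commands. Running it twice is harmless because existing entries are skipped.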

Train SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml

Evaluate SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth
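The training and evaluation commands differ only in the trailing `--eval-only MODEL.WEIGHTS ...` arguments. When scripting several runs, it can help to build both from one place. The sketch below uses only the flags shown above; `build_command` is a hypothetical helper and is not provided by the repository.

```python
import shlex

TRAIN_SCRIPT = "projects/SparseRCNN/train_net.py"


def build_command(config_file, num_gpus=8, weights=None):
    """Return the argument list for training, or for evaluation if weights is given."""
    command = [
        "python", TRAIN_SCRIPT,
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if weights is not None:
        # Evaluation reuses the training entry point with a checkpoint to load.
        command += ["--eval-only", "MODEL.WEIGHTS", weights]
    return command


config = "projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml"
for cmd in (build_command(config), build_command(config, weights="path/to/model.pth")):
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the SparseR-CNN directory.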

Visualize SparseR-CNN

python demo/demo.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --input path/to/images --output path/to/save_images --confidence-threshold 0.4 \
    --opts MODEL.WEIGHTS path/to/model.pth

Third-party resources

mmdetection implementation: sparse_rcnn. Thanks to Shilong Zhang!
cvpod implementation: sparse_rcnn. Thanks to Benjin Zhu!
paddledetection implementation: sparse_rcnn. Thanks to FL77N!

License

SparseR-CNN is released under the MIT License.

Citing

If you use SparseR-CNN in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@article{peize2020sparse,
  title   =  {{SparseR-CNN}: End-to-End Object Detection with Learnable Proposals},
  author  =  {Peize Sun and Rufeng Zhang and Yi Jiang and Tao Kong and Chenfeng Xu and Wei Zhan and Masayoshi Tomizuka and Lei Li and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2011.12450},
  year    =  {2020}
}

Owner

Peize Sun
