Two-stage CenterNet

Overview

Probabilistic two-stage detection

Two-stage object detectors that use class-agnostic one-stage detectors as the proposal network.

Probabilistic two-stage detection,
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl,
arXiv technical report (arXiv 2103.07461)

Contact: [email protected]. Any questions or discussions are welcomed!

Abstract

We develop a probabilistic interpretation of two-stage object detection. We show that this probabilistic interpretation motivates a number of common empirical training practices. It also suggests changes to two-stage detection pipelines. Specifically, the first stage should infer proper object-vs-background likelihoods, which should then inform the overall score of the detector. A standard region proposal network (RPN) cannot infer this likelihood sufficiently well, but many one-stage detectors can. We show how to build a probabilistic two-stage detector from any state-of-the-art one-stage detector. The resulting detectors are faster and more accurate than both their one- and two-stage precursors. Our detector achieves 56.4 mAP on COCO test-dev with single-scale testing, outperforming all published results. Using a lightweight backbone, our detector achieves 49.2 mAP on COCO at 33 fps on a Titan Xp.

Summary

  • Two-stage CenterNet: First stage estimates object probabilities, second stage conditionally classifies objects.

  • Resulting detector is faster and more accurate than both traditional two-stage detectors (fewer proposals required), and one-stage detectors (lighter first stage head).

  • Our best model achieves 56.4 mAP on COCO test-dev.

  • This repo also includes a detectron2-based CenterNet implementation with better accuracy (42.5 mAP at 70FPS) and a new FPN version of CenterNet (40.2 mAP with Res50_1x).

Main results

All models are trained with multi-scale training, and tested with a single scale. The FPS is tested on a Titan RTX GPU. More models and details can be found in the MODEL_ZOO.

COCO

Model COCO val mAP FPS
CenterNet-S4_DLA_8x 42.5 71
CenterNet2_R50_1x 42.9 24
CenterNet2_X101-DCN_2x 49.9 8
CenterNet2_R2-101-DCN-BiFPN_4x+4x_1560_ST 56.1 5
CenterNet2_DLA-BiFPN-P5_24x_ST 49.2 38

LVIS

Model val mAP box
CenterNet2_R50_1x 26.5
CenterNet2_FedLoss_R50_1x 28.3

Objects365

Model val mAP
CenterNet2_R50_1x 22.6

Installation

Our project is developed on detectron2. Please follow the official detectron2 installation. All our code is under projects/CenterNet2/. In theory, you should be able to copy-paste projects/CenterNet2/ to the latest detectron2 release or your own detectron2 repo to run our project. There might be API changes in future detectron2 releases that make it incompatible.

Demo

We use the default detectron2 demo script. To run inference on an image folder using our pre-trained model, run

python projects/CenterNet2/demo/demo.py --config-file projects/CenterNet2/configs/CenterNet2_R50_1x.yaml --input path/to/image/ --opts MODEL.WEIGHTS models/CenterNet2_R50_1x.pth

Benchmark evaluation and training

Please check detectron2 GETTING_STARTED.md for running evaluation and training. Our config files are under projects/CenterNet2/configs and the pre-trained models are in the MODEL_ZOO.

License

Our code under projects/CenterNet2/ is under Apache 2.0 license. projects/CenterNet2/centernet/modeling/backbone/bifpn_fcos.py are from AdelaiDet, which follows the original non-commercial license. The code from detectron2 follows the original Apache 2.0 license.

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2021probablistic,
  title={Probabilistic two-stage detection},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={arXiv preprint arXiv:2103.07461},
  year={2021}
}
Owner
Xingyi Zhou
CS Ph.D. student at UT Austin.
Xingyi Zhou
Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Ibai Gorordo 42 Oct 07, 2022
MLP-Like Vision Permutator for Visual Recognition (PyTorch)

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition (arxiv) This is a Pytorch implementation of our paper. We present Vision

Qibin (Andrew) Hou 162 Nov 28, 2022
Segmentation vgg16 fcn - cityscapes

VGGSegmentation Segmentation vgg16 fcn - cityscapes Priprema skupa skripta prepare_dataset_downsampled.py Iz slika cityscapesa izrezuje haubu automobi

6 Oct 24, 2020
CTC segmentation python package

CTC segmentation CTC segmentation can be used to find utterances alignments within large audio files. This repository contains the ctc-segmentation py

Ludwig Kürzinger 217 Jan 04, 2023
TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

TransZero++ This repository contains the testing code for the paper "TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning" submitted

Shiming Chen 6 Aug 16, 2022
TLoL (Python Module) - League of Legends Deep Learning AI (Research and Development)

TLoL-py - League of Legends Deep Learning Library TLoL-py is the Python component of the TLoL League of Legends deep learning library. It provides a s

7 Nov 29, 2022
Lite-HRNet: A Lightweight High-Resolution Network

LiteHRNet Benchmark 🔥 🔥 Based on MMsegmentation 🔥 🔥 Cityscapes FCN resize concat config mIoU last mAcc last eval last mIoU best mAcc best eval bes

16 Dec 12, 2022
My take on a practical implementation of Linformer for Pytorch.

Linformer Pytorch Implementation A practical implementation of the Linformer paper. This is attention with only linear complexity in n, allowing for v

Peter 349 Dec 25, 2022
Distributed Arcface Training in Pytorch

Distributed Arcface Training in Pytorch

3 Nov 23, 2021
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022
MatchGAN: A Self-supervised Semi-supervised Conditional Generative Adversarial Network

MatchGAN: A Self-supervised Semi-supervised Conditional Generative Adversarial Network This repository is the official implementation of MatchGAN: A S

Justin Sun 12 Dec 27, 2022
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

TorchRL Disclaimer This library is not officially released yet and is subject to change. The features are available before an official release so that

Meta Research 860 Jan 07, 2023
Pre-trained NFNets with 99% of the accuracy of the official paper

NFNet Pytorch Implementation This repo contains pretrained NFNet models F0-F6 with high ImageNet accuracy from the paper High-Performance Large-Scale

Benjamin Schmidt 133 Dec 09, 2022
Official PyTorch repo for JoJoGAN: One Shot Face Stylization

JoJoGAN: One Shot Face Stylization This is the PyTorch implementation of JoJoGAN: One Shot Face Stylization. Abstract: While there have been recent ad

1.3k Dec 29, 2022
TANL: Structured Prediction as Translation between Augmented Natural Languages

TANL: Structured Prediction as Translation between Augmented Natural Languages Code for the paper "Structured Prediction as Translation between Augmen

98 Dec 15, 2022
HAT: Hierarchical Aggregation Transformers for Person Re-identification

HAT: Hierarchical Aggregation Transformers for Person Re-identification

11 Sep 05, 2022
Self-Supervised Methods for Noise-Removal

SSMNR | Self-Supervised Methods for Noise Removal Image denoising is the task of removing noise from an image, which can be formulated as the task of

1 Jan 16, 2022
Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities

This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to prove that computer

1 Feb 03, 2022
GoodNews Everyone! Context driven entity aware captioning for news images

This is the code for a CVPR 2019 paper, called GoodNews Everyone! Context driven entity aware captioning for news images. Enjoy! Model preview: Huge T

117 Dec 19, 2022
E2EDNA2 - An automated pipeline for simulation of DNA aptamers complexed with small molecules and short peptides

E2EDNA2 - An automated pipeline for simulation of DNA aptamers complexed with small molecules and short peptides

11 Nov 08, 2022