Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Overview

Awesome Few-Shot Object Detection (FSOD)

Maintainers: Gabriel Huang

For an introduction to the few-shot object detection framework, read below, or check out our survey on few-shot and self-supervised object detection and its project page for full explanations, a discussion of the pitfalls of the Pascal, COCO, and LVIS benchmarks used below, the main takeaways, and future research directions.

Contributing

If you want to add your paper or report a mistake, please create a pull request with all supporting information. Thanks!

Pascal VOC and MS COCO FSOD Leaderboard

In this table we distinguish Kang's splits (Meta-YOLO) from TFA's splits (Frustratingly Simple FSOD), because Kang's splits have been shown to have high variance and to overestimate performance for low numbers of shots. To see this for yourself, compare the TFA 1-shot and Kang 1-shot results in the table below.

| Name | Type | VOC TFA 1-shot (mAP50) | VOC TFA 3-shot (mAP50) | VOC TFA 10-shot (mAP50) | VOC Kang 1-shot (mAP50) | VOC Kang 3-shot (mAP50) | VOC Kang 10-shot (mAP50) | MS COCO 10-shot (mAP) | MS COCO 30-shot (mAP) |
|---|---|---|---|---|---|---|---|---|---|
| LSTD | finetuning | - | - | - | 8.2 | 12.4 | 38.5 | - | - |
| RepMet | prototype | - | - | - | 26.1 | 34.4 | 41.3 | - | - |
| Meta-YOLO | modulation | 14.2 | 29.8 | - | 14.8 | 26.7 | 47.2 | 5.6 | 9.1 |
| MetaDet | modulation | - | - | - | 18.9 | 30.2 | 49.6 | 7.1 | 11.3 |
| Meta-RCNN | modulation | - | - | - | 19.9 | 35.0 | 51.5 | 8.7 | 12.4 |
| Faster RCNN+FT | finetuning | 9.9 | 21.6 | 35.6 | 15.2 | 29.0 | 45.5 | 9.2 | 12.5 |
| ACM-MetaRCNN | modulation | - | - | - | 31.9 | 35.9 | 53.1 | 9.4 | 12.8 |
| TFA w/fc | finetuning | 22.9 | 40.4 | 52.0 | 36.8 | 43.6 | 57.0 | 10.0 | 13.4 |
| TFA w/cos | finetuning | 25.3 | 42.1 | 52.8 | 39.8 | 44.7 | 56.0 | 10.0 | 13.7 |
| Retentive RCNN | finetuning | - | - | - | 42.0 | 46.0 | 56.0 | 10.5 | 13.8 |
| MPSR | finetuning | - | - | - | 41.7 | 51.4 | 61.8 | 9.8 | 14.1 |
| Attention-FSOD | modulation | - | - | - | - | - | - | 12.0 | - |
| FsDetView | finetuning | 24.2 | 42.2 | 57.4 | - | - | - | 12.5 | 14.7 |
| CME | finetuning | - | - | - | 41.5 | 50.4 | 60.9 | 15.1 | 16.9 |
| TIP | add-on | 27.7 | 43.3 | 59.6 | - | - | - | 16.3 | 18.3 |
| DAnA | modulation | - | - | - | - | - | - | 18.6 | 21.6 |
| DeFRCN | prototype | - | - | - | 53.6 | 61.5 | 60.8 | 18.5 | 22.6 |
| Meta-DETR | modulation | 20.4 | 46.6 | 57.8 | - | - | - | 17.8 | 22.9 |
| DETReg | finetuning | - | - | - | - | - | - | 18.0 | 30.0 |
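
One common way to reduce the split variance noted above the table is to repeat few-shot finetuning over several randomly sampled support sets and report the mean together with a spread estimate. The sketch below assumes you already have one novel-class AP score per run. The function name `summarize_runs` is hypothetical, and the confidence interval uses a plain normal approximation, which is an assumption rather than the procedure of any specific paper.

```python
import math
import statistics


def summarize_runs(scores):
    """Return (mean, sample std, 95% CI half-width) for per-run AP scores.

    Assumes `scores` holds one novel-class AP value per random support set.
    The interval uses a normal approximation (z = 1.96), an assumption made
    for illustration only.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample standard deviation (n - 1)
    ci95 = 1.96 * std / math.sqrt(len(scores))
    return mean, std, ci95
```

Calling `summarize_runs` on the scores from, say, ten different support-set seeds gives a number that is easier to compare across methods than a single-split result.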

Few-Shot Object Detection Explained

We explain the few-shot object detection framework as defined by the Meta-YOLO paper (Kang's splits - full details here). FSOD partitions objects into two disjoint sets of categories: base or known/source classes, which are object categories for which we have access to a large number of training examples; and novel or unseen/target classes, for which we have only a few training examples (shots) per class. The FSOD task is formalized into the following steps:

  • 1. Base training.¹ Annotations are given only for the base classes, with a large number of training examples per class (bikes in the example). We train the FSOD method on the base classes.
  • 2. Few-shot finetuning. Annotations are given for the support set, a very small number of training examples from both the base and novel classes (one bike and one human in the example). Most methods finetune the FSOD model on the support set, but some methods might only use the support set for conditioning during evaluation (finetuning-free methods).
  • 3. Few-shot evaluation. We evaluate the FSOD model on jointly detecting base and novel classes from the test set (few-shot refers to the size of the support set). Performance metrics are reported separately for base and novel classes. Common evaluation metrics are variants of the mean average precision: mAP50 for Pascal and COCO-style mAP for COCO. They are often denoted bAP50, bAP75, bAP for the base classes and nAP50, nAP75, nAP for the novel classes, where the number is the IoU threshold in percent.
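
To make the metric names in step 3 concrete, here is a minimal sketch of the two ingredients: the IoU between a predicted and a ground-truth box (a detection counts as a match for nAP50 or bAP50 when its IoU is at least 0.5), and the separate averaging of per-class AP over base and novel classes. The function names, the `(x1, y1, x2, y2)` box format, and the per-class AP dictionary are assumptions for illustration; computing the per-class AP itself is left to a standard evaluator.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def split_map(per_class_ap, novel_classes):
    """Average per-class AP separately over base and novel classes.

    `per_class_ap` maps class name -> AP at a fixed IoU threshold (assumed
    to be computed elsewhere). Returns a dict with keys "bAP" and "nAP".
    """
    novel = [ap for c, ap in per_class_ap.items() if c in novel_classes]
    base = [ap for c, ap in per_class_ap.items() if c not in novel_classes]

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return {"bAP": mean(base), "nAP": mean(novel)}
```

For COCO-style mAP, the same averaging would be repeated over IoU thresholds from 0.50 to 0.95 in steps of 0.05 and the results averaged.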

In pure FSOD, methods are usually compared solely on novel class performance, whereas in Generalized FSOD, methods are compared on both base and novel class performance [2]. Note that "training" and "test" set refer to the splits used in traditional object detection. Base and novel classes are typically present in both the training and test sets. However, the novel class annotations are filtered out of the training set during base training. During few-shot finetuning, the support set is typically taken to be a fixed subset of the training set. During few-shot evaluation, the whole test set is used to reduce uncertainty [1].
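
The data handling described above can be sketched as follows. This is a simplified illustration, not the splitting code of any benchmark: annotations are assumed to be dictionaries with a `"category"` key, the helper names are hypothetical, and each annotation is sampled independently, ignoring the fact that real images often contain several objects.

```python
import random
from collections import defaultdict


def base_training_annotations(annotations, novel_classes):
    """Drop novel-class annotations so base training only sees base classes."""
    return [a for a in annotations if a["category"] not in novel_classes]


def sample_support_set(annotations, k, seed=0):
    """Pick up to k annotations per class (base and novel) from the training set.

    A fixed seed keeps the support set identical across runs, mirroring the
    fixed-subset convention described above.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for ann in annotations:
        by_class[ann["category"]].append(ann)
    support = []
    for category in sorted(by_class):  # sorted for a deterministic order
        candidates = by_class[category]
        support.extend(rng.sample(candidates, min(k, len(candidates))))
    return support
```

Evaluation would then run on the full, unfiltered test set, with scores reported separately for base and novel classes.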

For conditioning-based methods with no finetuning, few-shot finetuning and few-shot evaluation are merged into a single step; the novel examples are used as support examples to condition the model, and predictions are made directly on the test set. In practice, the majority of conditioning-based methods reviewed in this survey do benefit from some form of finetuning.
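
As a rough illustration of conditioning without finetuning, the sketch below builds one prototype per class by averaging support embeddings and labels a region by cosine similarity to those prototypes. It is a generic prototype-matching example and does not reproduce any method in the leaderboard. The embeddings are assumed to come from some frozen feature extractor that is not shown, and all function names are hypothetical.

```python
import math


def class_prototypes(support_embeddings):
    """Average the support vectors of each class into a single prototype.

    `support_embeddings` maps class name -> list of equal-length vectors.
    """
    prototypes = {}
    for cls, vectors in support_embeddings.items():
        dim = len(vectors[0])
        prototypes[cls] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def classify_region(region_embedding, prototypes):
    """Return (best class, similarity) for one region embedding."""
    scores = {cls: cosine(region_embedding, p) for cls, p in prototypes.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Because the prototypes are computed directly from the support examples, adding a novel class only requires embedding its few shots, with no gradient updates.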

¹In the context of self-supervised learning, base training may also be referred to as finetuning or training. This should not be confused with base training in the meta-learning framework; rather this is similar to the meta-training phase [3].

Owner
Gabriel Huang
PhD student at MILA