An official implementation of Anchor DETR.

Last update: Dec 28, 2022

Anchor DETR: Query Design for Transformer-Based Detector

Introduction

This repository is the official implementation of Anchor DETR. We encode anchor points as the object queries in DETR. Multiple patterns are attached to each anchor point to handle the "one region, multiple objects" difficulty, where several objects appear near the same position. We also propose an attention variant, Row-Column Decoupled Attention (RCDA), which reduces the memory cost for high-resolution features.
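The query design described above can be illustrated with a minimal sketch. It is not the repository's code: it assumes anchor points on a uniform grid, a plain sine/cosine position encoding, and a query formed by adding a pattern embedding to the anchor's encoding. The function names (`grid_anchor_points`, `sine_encode`, `build_queries`), the grid size, and the pattern values are all hypothetical, and the sketch uses plain Python lists rather than tensors so it runs without any framework.

```python
import math
from itertools import product


def grid_anchor_points(rows, cols):
    """Return (x, y) anchor points at the cell centres of a grid, normalised to (0, 1)."""
    return [((c + 0.5) / cols, (r + 0.5) / rows)
            for r, c in product(range(rows), range(cols))]


def sine_encode(value, dim, temperature=10000.0):
    """Encode one normalised coordinate as `dim` sine/cosine values (dim must be even)."""
    out = []
    for i in range(dim // 2):
        freq = temperature ** (2 * i / dim)
        angle = 2 * math.pi * value / freq
        out.extend((math.sin(angle), math.cos(angle)))
    return out


def build_queries(anchors, pattern_embeddings, dim):
    """Pair every pattern with every anchor point.

    Each query is the pattern embedding plus the anchor's position encoding,
    so one anchor point yields len(pattern_embeddings) queries and can
    therefore predict several objects around the same position.
    """
    queries = []
    for pattern in pattern_embeddings:      # each pattern is shared by all anchors
        for x, y in anchors:
            pos = sine_encode(x, dim // 2) + sine_encode(y, dim // 2)
            queries.append([p + q for p, q in zip(pattern, pos)])
    return queries


# Hypothetical sizes chosen only to keep the example small.
dim = 8
anchors = grid_anchor_points(3, 3)          # 9 anchor points
patterns = [[0.0] * dim, [1.0] * dim]       # 2 stand-in pattern embeddings
queries = build_queries(anchors, patterns, dim)
print(len(queries))  # 18 = 2 patterns x 9 anchor points
```

In this sketch the number of queries is the number of patterns times the number of anchor points, and queries that share an anchor point differ only in their pattern embedding.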

Main Results

method	feature	epochs	AP	GFLOPs	Infer Speed (FPS)
DETR	DC5	500	43.3	187	10 (12)
SMCA	multi-level	50	43.7	152	10
Deformable DETR	multi-level	50	43.8	173	15
Conditional DETR	DC5	50	43.8	195	10
Anchor DETR	DC5	50	44.3	151	16 (19)

Note:

All results use a ResNet-50 backbone.
Inference speeds are measured on an NVIDIA Tesla V100 GPU.
Values in parentheses in the Infer Speed column are the speeds with torchscript optimization.
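For readers who want to reproduce a frames-per-second figure on their own hardware, the following is a minimal timing sketch. It is not the benchmark protocol used for the table above, which this README does not describe. `measure_fps` and `fake_inference` are hypothetical names, and `fake_inference` is only a CPU stand-in for one forward pass on one image.

```python
import time


def measure_fps(run_once, warmup=5, iters=20):
    """Return the average number of `run_once()` calls per second.

    The first `warmup` calls are excluded so that one-off costs
    (caching, lazy initialisation) do not distort the result.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    return iters / elapsed


def fake_inference():
    # Hypothetical stand-in for running the detector on a single image.
    # With a real GPU model, the device must be synchronised before each
    # clock reading, because GPU kernels are launched asynchronously.
    sum(i * i for i in range(10_000))


fps = measure_fps(fake_inference)
print(f"{fps:.1f} FPS, {1000.0 / fps:.2f} ms per image")
```

FPS and per-image latency are reciprocals, so the 16 FPS reported for Anchor DETR corresponds to 62.5 ms per image.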

Model

name	backbone	AP	URL
AnchorDETR-C5	R50	42.1	model / log
AnchorDETR-DC5	R50	44.3	model / log
AnchorDETR-C5	R101	43.5	model / log
AnchorDETR-DC5	R101	45.1	model / log

Note: The models and logs are also available on Baidu Netdisk with code hh13.

Usage

Installation

First, clone the repository locally:

git clone https://github.com/megvii-research/AnchorDETR.git

Then, install dependencies:

pip install -r requirements.txt

Training

To train AnchorDETR on a single node with 8 GPUs:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path /path/to/coco
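With `--use_env`, the launcher passes the local rank of each process through the `LOCAL_RANK` environment variable instead of a `--local_rank` command-line argument, alongside `RANK` and `WORLD_SIZE`. The sketch below shows one way a training script could read these values. It is an illustration only, not the code in `main.py`, and `read_distributed_env` is a hypothetical helper.

```python
import os


def read_distributed_env(env=None):
    """Read the process layout set by the launcher, or fall back to one process.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    if "RANK" not in env or "WORLD_SIZE" not in env:
        # Not started by the distributed launcher: single-process run.
        return {"distributed": False, "rank": 0, "world_size": 1, "local_rank": 0}
    return {
        "distributed": True,
        "rank": int(env["RANK"]),              # global index of this process
        "world_size": int(env["WORLD_SIZE"]),  # total number of processes
        "local_rank": int(env["LOCAL_RANK"]),  # GPU index on this node
    }


# Example: what the fourth of 8 processes on a single node would see.
layout = read_distributed_env({"RANK": "3", "WORLD_SIZE": "8", "LOCAL_RANK": "3"})
print(layout)
```

On a single node the global rank and the local rank coincide, which is why both are 3 in the example.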

Evaluation

To evaluate AnchorDETR on a single node with 8 GPUs:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --eval --coco_path /path/to/coco --resume /path/to/checkpoint.pth

To evaluate AnchorDETR with a single GPU:

python main.py --eval --coco_path /path/to/coco --resume /path/to/checkpoint.pth
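The two evaluation commands differ only in the launcher prefix. When evaluation is scripted, a small helper can assemble either form. This is a sketch under that assumption: `build_eval_command` is a hypothetical helper, not part of the repository, and it uses only the flags shown above.

```python
import shlex


def build_eval_command(coco_path, checkpoint, num_gpus=1, python="python"):
    """Return the evaluation command as an argument list for subprocess.run."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    script_args = ["main.py", "--eval",
                   "--coco_path", coco_path,
                   "--resume", checkpoint]
    if num_gpus == 1:
        # Single GPU: run the script directly.
        return [python] + script_args
    # Multiple GPUs on one node: prefix with the distributed launcher.
    launcher = [python, "-m", "torch.distributed.launch",
                f"--nproc_per_node={num_gpus}", "--use_env"]
    return launcher + script_args


single = build_eval_command("/path/to/coco", "/path/to/checkpoint.pth")
multi = build_eval_command("/path/to/coco", "/path/to/checkpoint.pth", num_gpus=8)
print(shlex.join(single))
print(shlex.join(multi))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues when paths contain spaces.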

Citation

If you find this project useful for your research, please consider citing the paper.

@misc{wang2021anchor,
      title={Anchor DETR: Query Design for Transformer-Based Detector},
      author={Yingming Wang and Xiangyu Zhang and Tong Yang and Jian Sun},
      year={2021},
      eprint={2109.07107},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

If you have any questions, feel free to open an issue or contact us at [email protected].

Owner

MEGVII Research
