FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

arXiv preprint (arXiv:2111.10780).

This implementation is modified from mmdetection. We also referred to the code of ReDet, PIoU, and ProbIoU.

During implementation, we found that a pure-Python implementation incurs a large memory overhead on Nvidia devices. We therefore wrote the label assignment module proposed in the paper as a CUDA extension for PyTorch. The extension does not work properly after migration to CUDA 11; only CUDA 10 is supported. The CUDA extension improves memory utilization and removes many unnecessary calculations. We also tried training FCOSR-M on an RTX 2080 Ti (4 images per device), which nearly fills the card's memory.
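
Because the extension is reported to work only with CUDA 10, it can help to check the local toolkit version before building. The following is a minimal sketch, not part of this repository: the helper names `cuda_major` and `extension_supported` are hypothetical, and the only PyTorch attribute used is `torch.version.cuda`, which holds the CUDA version PyTorch was built with (or `None` for CPU-only builds).

```python
from typing import Optional


def cuda_major(version: Optional[str]) -> Optional[int]:
    """Return the major CUDA version from a string such as '10.2', or None."""
    if not version:
        return None
    return int(version.split(".")[0])


def extension_supported(version: Optional[str]) -> bool:
    """True only for CUDA 10.x, the only toolkit reported above as working."""
    return cuda_major(version) == 10


if __name__ == "__main__":
    try:
        import torch  # optional: only needed to query the local build
        found = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        found = None
    print("CUDA build:", found, "| supported:", extension_supported(found))
```

Running the script before compiling the extension gives an early warning when the environment uses CUDA 11 or a CPU-only build.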

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We added a multiprocess version of DOTA2COCO to the DOTA_devkit package; set USE_MULTI_PROCESS in prepare_dota.py to enable or disable it.

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

Details (Test device: NVIDIA RTX 2080 Ti)

| Methods | backbone | FPS | mAP(%) |
|---|---|---|---|
| ReDet | ReR50 | 8.8 | 76.25 |
| S2ANet | Mobilenet v2 | 18.9 | 67.46 |
| S2ANet | R50 | 14.4 | 74.14 |
| R3Det | R50 | 9.2 | 71.9 |
| Oriented-RCNN | Mobilenet v2 | 21.2 | 72.72 |
| Oriented-RCNN | R50 | 13.8 | 75.87 |
| Oriented-RCNN | R101 | 11.3 | 76.28 |
| RetinaNet-O | Mobilenet v2 | 22.4 | 67.95 |
| RetinaNet-O | R50 | 16.5 | 72.7 |
| RetinaNet-O | R101 | 13.3 | 73.7 |
| Faster-RCNN-O | Mobilenet v2 | 23 | 67.41 |
| Faster-RCNN-O | R50 | 14.4 | 72.29 |
| Faster-RCNN-O | R101 | 11.4 | 72.65 |
| FCOSR-S | Mobilenet v2 | 23.7 | 74.05 |
| FCOSR-M | Rx50 | 14.6 | 77.15 |
| FCOSR-L | Rx101 | 7.9 | 77.39 |

The Baidu Pan password is ABCD.

FCOSR series DOTA 1.0 results, FPS measured on an RTX 2080 Ti. Detail

| Model | backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | download |
|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 74.05 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 76.11 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 77.15 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 79.25 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 77.39 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 78.80 | model/cfg |

FCOSR series DOTA 1.5 results, FPS measured on an RTX 2080 Ti. Detail

| Model | backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | download |
|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 66.37 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 73.14 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 68.74 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 73.79 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 69.96 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 75.41 | model/cfg |

FCOSR series HRSC2016 results, FPS measured on an RTX 2080 Ti.

| Model | backbone | Rot. | Sched. | Param. | Input | GFLOPs | FPS | AP50(07) | AP75(07) | AP50(12) | AP75(12) | download |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | ✓ | 40k iters | 7.29M | 800×800 | 61.57 | 35.3 | 90.08 | 76.75 | 92.67 | 75.73 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 40k iters | 31.37M | 800×800 | 127.87 | 26.9 | 90.15 | 78.58 | 94.84 | 81.38 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 40k iters | 89.61M | 800×800 | 271.75 | 15.1 | 90.14 | 77.98 | 95.74 | 80.94 | model/cfg |

Lightweight FCOSR test result on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

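| Model | backbone | Head channels | Sched. | Param | Size | Input | GFLOPs | FPS | mAP | onnx | TensorRT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FCOSR-lite | Mobilenet v2 | 256 | 3x | 6.9M | 51.63MB | 1024×1024 | 101.25 | 7.64 | 74.30 | onnx | trt |
| FCOSR-tiny | Mobilenet v2 | 128 | 3x | 3.52M | 23.2MB | 1024×1024 | 35.89 | 10.68 | 73.93 | onnx | trt |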

Lightweight FCOSR test results on Jetson AGX Xavier (DOTA 1.0 single-scale).

Part of the DOTA 1.0 dataset (whole-image mode). Code

| name | size | patch size | gap | patches | det objects | det time(s) |
|---|---|---|---|---|---|---|
| P0031.png | 5343×3795 | 1024 | 200 | 35 | 1197 | 2.75 |
| P0051.png | 4672×5430 | 1024 | 200 | 42 | 309 | 2.38 |
| P0112.png | 6989×4516 | 1024 | 200 | 54 | 184 | 3.02 |
| P0137.png | 5276×4308 | 1024 | 200 | 35 | 66 | 1.95 |
| P1004.png | 7001×3907 | 1024 | 200 | 45 | 183 | 2.52 |
| P1125.png | 7582×4333 | 1024 | 200 | 54 | 28 | 2.95 |
| P1129.png | 4093×6529 | 1024 | 200 | 40 | 70 | 2.23 |
| P1146.png | 5231×4616 | 1024 | 200 | 42 | 64 | 2.29 |
| P1157.png | 7278×5286 | 1024 | 200 | 63 | 184 | 3.47 |
| P1378.png | 5445×4561 | 1024 | 200 | 42 | 83 | 2.32 |
| P1379.png | 4426×4182 | 1024 | 200 | 30 | 686 | 1.78 |
| P1393.png | 6072×6540 | 1024 | 200 | 64 | 893 | 3.63 |
| P1400.png | 6471×4479 | 1024 | 200 | 48 | 348 | 2.63 |
| P1402.png | 4112×4793 | 1024 | 200 | 30 | 293 | 1.68 |
| P1406.png | 6531×4182 | 1024 | 200 | 40 | 19 | 2.19 |
| P1415.png | 4894×4898 | 1024 | 200 | 36 | 190 | 1.99 |
| P1436.png | 5136×5156 | 1024 | 200 | 42 | 39 | 2.31 |
| P1448.png | 7242×5678 | 1024 | 200 | 63 | 51 | 3.41 |
| P1457.png | 5193×4658 | 1024 | 200 | 42 | 382 | 2.33 |
| P1461.png | 6661×6308 | 1024 | 200 | 64 | 27 | 3.45 |
| P1494.png | 4782×6677 | 1024 | 200 | 48 | 70 | 2.61 |
| P1500.png | 4769×4386 | 1024 | 200 | 36 | 92 | 1.96 |
| P1772.png | 5963×5553 | 1024 | 200 | 49 | 28 | 2.70 |
| P1774.png | 5352×4281 | 1024 | 200 | 35 | 291 | 1.95 |
| P1796.png | 5870×5822 | 1024 | 200 | 49 | 308 | 2.74 |
| P1870.png | 5942×6059 | 1024 | 200 | 56 | 135 | 3.04 |
| P2043.png | 4165×3438 | 1024 | 200 | 20 | 1479 | 1.49 |
| P2329.png | 7950×4334 | 1024 | 200 | 60 | 83 | 3.26 |
| P2641.png | 7574×5625 | 1024 | 200 | 63 | 269 | 3.41 |
| P2642.png | 7039×5551 | 1024 | 200 | 63 | 451 | 3.50 |
| P2643.png | 7568×5619 | 1024 | 200 | 63 | 249 | 3.40 |
| P2645.png | 4605×3442 | 1024 | 200 | 24 | 357 | 1.42 |
| P2762.png | 8074×4359 | 1024 | 200 | 60 | 127 | 3.23 |
| P2795.png | 4495×3981 | 1024 | 200 | 30 | 65 | 1.64 |
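
The patch counts in the table above can be reproduced with a simple sliding-window calculation. The sketch below is an illustration, not the repository's inference code. It assumes that `gap` is the overlap between neighbouring patches (so the stride is `patch size - gap`) and that a final window is added to cover the image border. The function names are hypothetical.

```python
def patches_per_axis(length: int, patch: int = 1024, gap: int = 200) -> int:
    """Number of sliding windows needed to cover one image axis."""
    if length <= patch:
        return 1
    stride = patch - gap  # assumption: `gap` is the overlap between neighbours
    # Integer ceiling of (length - patch) / stride, plus the first window.
    return -(-(length - patch) // stride) + 1


def count_patches(width: int, height: int, patch: int = 1024, gap: int = 200) -> int:
    """Total number of patches for a width x height image."""
    return patches_per_axis(width, patch, gap) * patches_per_axis(height, patch, gap)


if __name__ == "__main__":
    # Sizes taken from the table above.
    print(count_patches(5343, 3795))  # P0031.png -> 35
    print(count_patches(4165, 3438))  # P2043.png -> 20
    print(count_patches(6072, 6540))  # P1393.png -> 64
```

Under these assumptions a 5343×3795 image needs 7 windows horizontally and 5 vertically, which matches the 35 patches listed for P0031.png.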