FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Overview

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection
arXiv preprint (arXiv:2111.10780).

This implement is modified from mmdetection. We also refer to the codes of ReDet, PIoU, and ProbIoU.

In the process of implementation, we find that only Python code processing will produce huge memory overhead on Nvidia devices. Therefore, we directly write the label assignment module proposed in this paper in the form of CUDA extension of Pytorch. The program could not work effectively when we migrate it to cuda 11 (only support cuda10). By applying CUDA expansion, the memory utilization is improved and a lot of unnecessary calculations are reduced. We also try to train FCOSR-M on 2080ti (4 images per device), which can basically fill memory of graphics card.

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We add a multiprocess version DOTA2COCO into DOTA_devkit package, you could switch USE_MULTI_PROCESS to control the function in prepare_dota.py

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

benchmark

Details (Test device: nvidia RTX 2080ti)

Methods backbone FPS mAP(%)
ReDet ReR50 8.8 76.25
S2ANet Mobilenet v2 18.9 67.46
S2ANet R50 14.4 74.14
R3Det R50 9.2 71.9
Oriented-RCNN Mobilenet v2 21.2 72.72
Oriented-RCNN R50 13.8 75.87
Oriented-RCNN R101 11.3 76.28
RetinaNet-O Mobilenet v2 22.4 67.95
RetinaNet-O R50 16.5 72.7
RetinaNet-O R101 13.3 73.7
Faster-RCNN-O Mobilenet v2 23 67.41
Faster-RCNN-O R50 14.4 72.29
Faster-RCNN-O R101 11.4 72.65
FCOSR-S Mobilenet v2 23.7 74.05
FCOSR-M Rx50 14.6 77.15
FCOSR-L Rx101 7.9 77.39

The password of baiduPan is ABCD

FCOSR serise DOTA 1.0 result.FPS(2080ti) Detail

Model backbone MS Sched. Param. Input GFLOPs FPS mAP download
FCOSR-S Mobilenet v2 - 3x 7.32M 1024×1024 101.42 23.7 74.05 model/cfg
FCOSR-S Mobilenet v2 3x 7.32M 1024×1024 101.42 23.7 76.11 model/cfg
FCOSR-M ResNext50-32x4 - 3x 31.4M 1024×1024 210.01 14.6 77.15 model/cfg
FCOSR-M ResNext50-32x4 3x 31.4M 1024×1024 210.01 14.6 79.25 model/cfg
FCOSR-L ResNext101-64x4 - 3x 89.64M 1024×1024 445.75 7.9 77.39 model/cfg
FCOSR-L ResNext101-64x4 3x 89.64M 1024×1024 445.75 7.9 78.80 model/cfg

FCOSR serise DOTA 1.5 result. FPS(2080ti) Detail

Model backbone MS Sched. Param. Input GFLOPs FPS mAP download
FCOSR-S Mobilenet v2 - 3x 7.32M 1024×1024 101.42 23.7 66.37 model/cfg
FCOSR-S Mobilenet v2 3x 7.32M 1024×1024 101.42 23.7 73.14 model/cfg
FCOSR-M ResNext50-32x4 - 3x 31.4M 1024×1024 210.01 14.6 68.74 model/cfg
FCOSR-M ResNext50-32x4 3x 31.4M 1024×1024 210.01 14.6 73.79 model/cfg
FCOSR-L ResNext101-64x4 - 3x 89.64M 1024×1024 445.75 7.9 69.96 model/cfg
FCOSR-L ResNext101-64x4 3x 89.64M 1024×1024 445.75 7.9 75.41 model/cfg

FCOSR serise HRSC2016 result. FPS(2080ti)

Model backbone Rot. Sched. Param. Input GFLOPs FPS AP50(07) AP75(07) AP50(12) AP75(12) download
FCOSR-S Mobilenet v2 40k iters 7.29M 800×800 61.57 35.3 90.08 76.75 92.67 75.73 model/cfg
FCOSR-M ResNext50-32x4 40k iters 31.37M 800×800 127.87 26.9 90.15 78.58 94.84 81.38 model/cfg
FCOSR-L ResNext101-64x4 40k iters 89.61M 800×800 271.75 15.1 90.14 77.98 95.74 80.94 model/cfg

Lightweight FCOSR test result on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

Model backbone Head channels Sched. Param Size Input GFLOPs FPS mAP onnx TensorRT
FCOSR-lite Mobilenet v2 256 3x 6.9M 51.63MB 1024×1024 101.25 7.64 74.30 onnx trt
FCOSR-tiny Mobilenet v2 128 3x 3.52M 23.2MB 1024×1024 35.89 10.68 73.93 onnx trt

Lightweight FCOSR test result on Jetson AGX Xavier (DOTA 1.0 single-scale).

A part of Dota1.0 dataset (whole image mode) Code

name size patch size gap patches det objects det time(s)
P0031.png 5343×3795 1024 200 35 1197 2.75
P0051.png 4672×5430 1024 200 42 309 2.38
P0112.png 6989×4516 1024 200 54 184 3.02
P0137.png 5276×4308 1024 200 35 66 1.95
P1004.png 7001×3907 1024 200 45 183 2.52
P1125.png 7582×4333 1024 200 54 28 2.95
P1129.png 4093×6529 1024 200 40 70 2.23
P1146.png 5231×4616 1024 200 42 64 2.29
P1157.png 7278×5286 1024 200 63 184 3.47
P1378.png 5445×4561 1024 200 42 83 2.32
P1379.png 4426×4182 1024 200 30 686 1.78
P1393.png 6072×6540 1024 200 64 893 3.63
P1400.png 6471×4479 1024 200 48 348 2.63
P1402.png 4112×4793 1024 200 30 293 1.68
P1406.png 6531×4182 1024 200 40 19 2.19
P1415.png 4894x4898 1024 200 36 190 1.99
P1436.png 5136×5156 1024 200 42 39 2.31
P1448.png 7242×5678 1024 200 63 51 3.41
P1457.png 5193×4658 1024 200 42 382 2.33
P1461.png 6661×6308 1024 200 64 27 3.45
P1494.png 4782×6677 1024 200 48 70 2.61
P1500.png 4769×4386 1024 200 36 92 1.96
P1772.png 5963×5553 1024 200 49 28 2.70
P1774.png 5352×4281 1024 200 35 291 1.95
P1796.png 5870×5822 1024 200 49 308 2.74
P1870.png 5942×6059 1024 200 56 135 3.04
P2043.png 4165×3438 1024 200 20 1479 1.49
P2329.png 7950×4334 1024 200 60 83 3.26
P2641.png 7574×5625 1024 200 63 269 3.41
P2642.png 7039×5551 1024 200 63 451 3.50
P2643.png 7568×5619 1024 200 63 249 3.40
P2645.png 4605×3442 1024 200 24 357 1.42
P2762.png 8074×4359 1024 200 60 127 3.23
P2795.png 4495×3981 1024 200 30 65 1.64
In this tutorial, you will perform inference across 10 well-known pre-trained object detectors and fine-tune on a custom dataset. Design and train your own object detector.

Object Detection Object detection is a computer vision task for locating instances of predefined objects in images or videos. In this tutorial, you wi

Ibrahim Sobh 62 Dec 25, 2022
Malware Bypass Research using Reinforcement Learning

Malware Bypass Research using Reinforcement Learning

Bobby Filar 76 Dec 26, 2022
Deep Learning for Time Series Classification

Deep Learning for Time Series Classification This is the companion repository for our paper titled "Deep learning for time series classification: a re

Hassan ISMAIL FAWAZ 1.2k Jan 02, 2023
Config files for my GitHub profile.

Canalyst Candas Data Science Library Name Canalyst Candas Description Built by a former PM / analyst to give anyone with a little bit of Python knowle

Canalyst Candas 13 Jun 24, 2022
This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-grained Classification".

HA-in-Fine-Grained-Classification This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-g

16 Oct 29, 2022
PyTorch implementation of HDN(Homography Decomposition Networks) for planar object tracking

Homography Decomposition Networks for Planar Object Tracking This project is the offical PyTorch implementation of HDN(Homography Decomposition Networ

CaptainHook 48 Dec 15, 2022
Convert Mission Planner (ArduCopter) Waypoint Missions to Litchi CSV Format to execute on DJI Drones

Mission Planner to Litchi Convert Mission Planner (ArduCopter) Waypoint Surveys to Litchi CSV Format to execute on DJI Drones Litchi doesn't support S

Yaros 24 Dec 09, 2022
QAHOI: Query-Based Anchors for Human-Object Interaction Detection (paper)

QAHOI QAHOI: Query-Based Anchors for Human-Object Interaction Detection (paper) Requirements PyTorch = 1.5.1 torchvision = 0.6.1 pip install -r requ

38 Dec 29, 2022
InsCLR: Improving Instance Retrieval with Self-Supervision

InsCLR: Improving Instance Retrieval with Self-Supervision This is an official PyTorch implementation of the InsCLR paper. Download Dataset Dataset Im

Zelu Deng 25 Aug 30, 2022
TensorFlow implementation of original paper : https://github.com/hszhao/PSPNet

Keras implementation of PSPNet(caffe) Implemented Architecture of Pyramid Scene Parsing Network in Keras. For the best compability please use Python3.

VladKry 386 Dec 29, 2022
Trainable Bilateral Filter Layer (PyTorch)

Trainable Bilateral Filter Layer (PyTorch) This repository contains our GPU-accelerated trainable bilateral filter layer (three spatial and one range

FabianWagner 26 Dec 25, 2022
Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition

Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition How Fast Compare to Other Zero-Shot NAS Proxies on CIFAR-10/100 Pre-trained Model

190 Dec 29, 2022
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)

TorchCAM: class activation explorer Simple way to leverage the class-specific activation of convolutional layers in PyTorch. Quick Tour Setting your C

F-G Fernandez 1.2k Dec 29, 2022
[SDM 2022] Towards Similarity-Aware Time-Series Classification

SimTSC This is the PyTorch implementation of SDM2022 paper Towards Similarity-Aware Time-Series Classification. We propose Similarity-Aware Time-Serie

Daochen Zha 49 Dec 27, 2022
Code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in Video".

Consistent Depth of Moving Objects in Video This repository contains training code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in

Google 203 Jan 05, 2023
Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark (ICCV 2021)

Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark (ICCV 2021) Kun Wang, Zhenyu Zhang, Zhiqiang Yan, X

kunwang 66 Nov 24, 2022
Resources complimenting the Machine Learning Course led in the Faculty of mathematics and informatics part of Sofia University.

Machine Learning and Data Mining, Summer 2021-2022 How to learn data science and machine learning? Programming. Learn Python. Basic Statistics. Take a

Simeon Hristov 8 Oct 04, 2022
Benchmark tools for Compressive LiDAR-to-map registration

Benchmark tools for Compressive LiDAR-to-map registration This repo contains the released version of code and datasets used for our IROS 2021 paper: "

Allie 9 Nov 24, 2022
No-reference Image Quality Assessment(NIQA) Algorithms (BRISQUE, NIQE, PIQE, RankIQA, MetaIQA)

No-Reference Image Quality Assessment Algorithms No-reference Image Quality Assessment(NIQA) is a task of evaluating an image without a reference imag

Dae-Young Song 26 Jan 04, 2023
LUKE -- Language Understanding with Knowledge-based Embeddings

LUKE (Language Understanding with Knowledge-based Embeddings) is a new pre-trained contextualized representation of words and entities based on transf

Studio Ousia 587 Dec 30, 2022