4st place solution for the PBVS 2022 Multi-modal Aerial View Object Classification Challenge - Track 1 (SAR) at PBVS2022

Last update: Nov 09, 2022

Overview

A Two-Stage Shake-Shake Network for Long-tailed Recognition of SAR Aerial View Objects

4st place solution for the PBVS 2022 Multi-modal Aerial View Object Classification Challenge - Track 1 (SAR)

Challenge Site

Overview

Synthetic Aperture Radar (SAR) has received more attention due to its complementary superiority on capturing significant information in the remote sensing area. However, for an Aerial View Object Classification (AVOC) task, SAR images still suffer from the long-tailed distribution of the aerial view objects. This disparity dampens the performance of classification methods, especially for the datasensitive deep learning models. In this paper, we propose a two-stage shake-shake network to tackle the long-tailed learning problem. Specifically, it decouples the learning procedure into the representation learning stage and the classification learning stage. Moreover, we apply the test time augmentation (TTA) and a post-processing approach (CAN) to improve the accuracy. In the PBVS 2022 Multi-modal Aerial View Object Classification Challenge Track 1, our method achieves 21.82% and 27.97% accuracy in the development phase and testing phase respectively, which achieves the top-tier among all the participants.

Requirements

Ubuntu (It's only tested on Ubuntu, so it may not work on Windows.)
Python >= 3.7
PyTorch >= 1.4.0
torchvision
```
pip install -r requirements.txt
```

Usage

The first stage training

python train.py --config ./configs/sar10/shake_shake.yaml

You need to change the value of “dataset_dir”, “dataset_dir_val”, under the “dataset” field and “output_dir” under the “train” field in the file “./configs/sar10/shake_shake.yaml”。

The second stage training

python train.py --config ./configs/sar10/shake_shake_fc.yaml

You need to change the value of “dataset_dir”, “dataset_dir_val” under the “dataset” field and “output_dir”, “checkpoint” under the “train” field in the file “./configs/sar10/shake_shake_fc.yaml”。

Test

python predict_TTA.py

You need to change the value of “dataset_dir”, “checkpoint”, under the “test” field in the file “./configs/sar10/shake_shake.yaml”, then you can find the results in file “.result/results.csv”。
You can download the trained model here.

Acknowledge

The codes borrow heavily from hysts/pytorch_image_classification.

4st place solution for the PBVS 2022 Multi-modal Aerial View Object Classification Challenge - Track 1 (SAR) at PBVS2022

Related tags

Overview

A Two-Stage Shake-Shake Network for Long-tailed Recognition of SAR Aerial View Objects

Overview

Requirements

Usage

The first stage training

The second stage training

Test

Acknowledge

Owner

LinpengPan

PyTorch reimplementation of minimal-hand (CVPR2020)

Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning

S2s2net - Sentinel-2 Super-Resolution Segmentation Network

A novel benchmark dataset for Monocular Layout prediction

transfer attack; adversarial examples; black-box attack; unrestricted Adversarial Attacks on ImageNet; CVPR2021 天池黑盒竞赛

This is the official repository of the paper Stocastic bandits with groups of similar arms (NeurIPS 2021). It contains the code that was used to compute the figures and experiments of the paper.

A certifiable defense against adversarial examples by training neural networks to be provably robust

Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

Bridging Vision and Language Model

A short code in python, Enchpyter, is able to encrypt and decrypt words as you determine, of course

MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

tf2-keras implement yolov5

Change Detection in SAR Images Based on Multiscale Capsule Network

Opinionated code formatter, just like Python's black code formatter but for Beancount

bio_inspired_min_nets_improve_the_performance_and_robustness_of_deep_networks

official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

Diverse graph algorithms implemented using JGraphT library.

METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL)

Code for CVPR2019 Towards Natural and Accurate Future Motion Prediction of Humans and Animals

An offline deep reinforcement learning library