Open-Set Recognition: A Good Closed-Set Classifier is All You Need

Last update: Jan 03, 2023

Code for our paper: "Open-Set Recognition: A Good Closed-Set Classifier is All You Need"

Abstract: The ability to identify whether or not a test sample belongs to one of the semantic classes in a classifier's training set is critical to practical deployment of the model. This task is termed open-set recognition (OSR) and has received significant attention in recent years. In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. We find that this relationship holds across loss objectives and architectures, and further demonstrate the trend both on the standard OSR benchmarks as well as on a large-scale ImageNet evaluation. Second, we use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy, and with this strong baseline achieve a new state-of-the-art on the most challenging OSR benchmark. Similarly, we boost the performance of the existing state-of-the-art method by improving its closed-set accuracy, but this does not surpass the strong baseline on the most challenging dataset. Our third contribution is to reappraise the datasets used for OSR evaluation, and construct new benchmarks which better respect the task of detecting semantic novelty, as opposed to low-level distributional shifts as tackled by neighbouring machine learning fields. In this new setting, we again demonstrate that there is negligible difference between the strong baseline and the existing state-of-the-art.

Running

Dependencies

pip install -r requirements.txt

Datasets

A number of datasets are used in this work. Many of them can be downloaded directly through PyTorch servers:

Standard Benchmarks: MNIST, SVHN, CIFAR-10/100, TinyImageNet
Proposed Benchmarks: ImageNet-21K-P, CUB, FGVC-Aircraft

FGVC Open-set Splits:

For the proposed FGVC open-set benchmarks, the directory data/open_set_splits contains the proposed class splits as .pkl files. The files also include information on which open-set classes are most similar to which closed-set classes.
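The internal layout of these .pkl files is not described here, so a cautious first step is to load one and inspect it. The sketch below assumes only that each file is a standard Python pickle. The helper name `summarize_split` is hypothetical and not part of this repo.

```python
import pickle
from pathlib import Path


def _size(obj):
    """Return len(obj) when the object has a length, otherwise None."""
    return len(obj) if hasattr(obj, "__len__") else None


def summarize_split(path):
    """Load a pickled split file and describe its top-level structure."""
    with Path(path).open("rb") as f:
        split = pickle.load(f)

    # If the pickle holds a dict, report each key with the type and size of its value.
    if isinstance(split, dict):
        return {key: (type(value).__name__, _size(value)) for key, value in split.items()}

    # Otherwise report the type and size of the object itself.
    return {"<root>": (type(split).__name__, _size(split))}
```

Calling `summarize_split` on one of the files under data/open_set_splits prints nothing by itself; it returns a small dictionary that can be printed to see which keys hold the closed-set classes, the open-set classes and the similarity information. As with any pickle, only load files from a source you trust.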

Config

Set paths to datasets and pre-trained models (for fine-grained experiments) in config.py.

Set SAVE_DIR (logfile destination) and PYTHON (path to the Python interpreter) in the scripts under bash_scripts.

Run

To recreate results on TinyImageNet (Table 2), run the script below. Our runs give 82.60% AUROC for both (ARPL + CS)+ and Cross-Entropy+.

bash bash_scripts/osr_train_tinyimagenet.sh
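AUROC here measures how well a per-sample score separates known-class test images from unknown-class ones. The sketch below is a minimal, dependency-free illustration rather than this repo's evaluation code. It assumes the open-set score is the maximum softmax probability, which is one common choice and may differ from the scoring rule used in the scripts. The function names and toy logits are hypothetical.

```python
import math


def max_softmax_score(logits):
    """Confidence score: largest softmax probability over the closed-set classes."""
    m = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def auroc(known_scores, unknown_scores):
    """AUROC as the fraction of (known, unknown) pairs ranked correctly.

    A pair counts as correct when the known sample scores higher;
    ties count as half.
    """
    wins = 0.0
    for k in known_scores:
        for u in unknown_scores:
            if k > u:
                wins += 1.0
            elif k == u:
                wins += 0.5
    return wins / (len(known_scores) * len(unknown_scores))


# Toy logits for a 3-class closed set (made-up numbers, not from any run).
known_logits = [[4.0, 0.5, 0.1], [0.2, 3.0, 0.3], [1.0, 1.2, 0.9]]
unknown_logits = [[1.0, 1.1, 0.9], [2.0, 0.1, 0.2]]

known = [max_softmax_score(z) for z in known_logits]
unknown = [max_softmax_score(z) for z in unknown_logits]
print(f"AUROC: {auroc(known, unknown):.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration. For full test sets, a rank-based implementation from an established metrics library is the more practical choice.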

Optimal Hyper-parameters:

We tuned label smoothing and RandAug hyper-parameters to optimise closed-set accuracy on a single random validation split for each dataset. For other hyper-parameters (image size, batch size, learning rate) we took values from the open-set literature for the standard datasets (specifically, the ARPL paper) and values from the FGVC literature for the proposed FGVC benchmarks.
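To make the label-smoothing hyper-parameter concrete, the following sketch computes a label-smoothed cross-entropy loss for one sample in plain Python. It assumes the common formulation in which the one-hot target is mixed with a uniform distribution over all classes, weighted by the smoothing value. Whether the training code uses exactly this variant is not stated here, and the function name is hypothetical.

```python
import math


def smoothed_cross_entropy(logits, target, smoothing=0.0):
    """Cross-entropy against a one-hot target mixed with a uniform distribution."""
    num_classes = len(logits)

    # Log-softmax, computed stably by subtracting the max logit.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_z for z in logits]

    # Smoothed target: smoothing / K on every class,
    # plus an extra (1 - smoothing) on the true class.
    loss = 0.0
    for k, lp in enumerate(log_probs):
        weight = smoothing / num_classes
        if k == target:
            weight += 1.0 - smoothing
        loss -= weight * lp
    return loss


logits = [2.0, 0.5, -1.0]  # made-up logits for a 3-class problem
print(smoothed_cross_entropy(logits, target=0, smoothing=0.0))  # plain cross-entropy
print(smoothed_cross_entropy(logits, target=0, smoothing=0.9))  # heavy smoothing
```

With smoothing set to 0.0 the function reduces to standard cross-entropy. Larger values penalise over-confident predictions more strongly, which is why a confident, correct prediction receives a higher loss under heavy smoothing.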

Cross-Entropy optimal hyper-parameters:

Dataset	Image Size	Learning Rate	RandAug M	RandAug N	Label Smoothing	Batch Size
MNIST	32	0.1	1	8	0.0	128
SVHN	32	0.1	1	18	0.0	128
CIFAR-10	32	0.1	1	6	0.0	128
CIFAR + N	32	0.1	1	6	0.0	128
TinyImageNet	64	0.01	1	9	0.9	128
CUB	448	0.001	2	30	0.3	32
FGVC-Aircraft	448	0.001	2	15	0.2	32

ARPL + CS optimal hyper-parameters:

(Note the lower learning rate for TinyImageNet)

Dataset	Image Size	Learning Rate	RandAug M	RandAug N	Label Smoothing	Batch Size
MNIST	32	0.1	1	8	0.0	128
SVHN	32	0.1	1	18	0.0	128
CIFAR-10	32	0.1	1	15	0.0	128
CIFAR + N	32	0.1	1	6	0.0	128
TinyImageNet	64	0.001	1	9	0.9	128
CUB	448	0.001	2	30	0.2	32
FGVC-Aircraft	448	0.001	2	18	0.1	32

Other

This repo also contains other useful utilities, including:

utils/logfile_parser.py: To directly parse stdout outputs for Accuracy / AUROC metrics
data/open_set_datasets.py: A framework for splitting existing datasets into controllable open-set splits (train, val, test_known and test_unknown). Note: ImageNet has not yet been integrated here.
utils/schedulers.py: Implementation of Cosine Warm Restarts with linear rampup as a PyTorch learning rate scheduler
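As an illustration of the schedule shape named in the last item, the sketch below computes a learning-rate multiplier that ramps up linearly and then follows cosine annealing with periodic restarts. It is a standalone approximation, not the code in utils/schedulers.py: the function name, the parameter names (`warmup_epochs`, `restart_period`, `min_factor`), their default values and the choice to apply the ramp-up only once at the start of training are all assumptions.

```python
import math


def lr_factor(epoch, warmup_epochs=10, restart_period=100, min_factor=0.0):
    """Multiplier for the base learning rate at a given 0-indexed epoch."""
    # Linear ramp-up: grow from 1 / warmup_epochs to 1 over the first epochs.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs

    # Cosine annealing that restarts every `restart_period` epochs.
    t = (epoch - warmup_epochs) % restart_period
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / restart_period))
    return min_factor + (1.0 - min_factor) * cosine


base_lr = 0.01  # illustrative value only
for epoch in (0, 9, 10, 60, 109, 110):
    print(epoch, base_lr * lr_factor(epoch))
```

Because the function returns a multiplicative factor for an integer epoch, it has the form expected by the `lr_lambda` argument of `torch.optim.lr_scheduler.LambdaLR`, which is one way to try the schedule in a PyTorch training loop.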

Citation

If you use this code in your research, please consider citing our paper:

@article{vaze21openset,
    author  = {Sagar Vaze and Kai Han and Andrea Vedaldi and Andrew Zisserman},
    title   = {Open-Set Recognition: A Good Closed-Set Classifier is All You Need},
    journal = {arXiv preprint},
    year    = {2021},
}

Please also consider citing Adversarial Reciprocal Points Learning for Open Set Recognition, upon whose code we build this repo.
