Analyzing basic network responses to novel classes

Last update: Oct 02, 2022

Overview

novelty-detection

Analyzing how AlexNet responds to novel classes with varying degrees of similarity to pretrained classes from ImageNet.

If you find this work helpful in your research, please cite:

Eshed, N. (2020). Novelty detection and analysis in convolutional neural networks (Accession No. 27994027)[Master's thesis, Cornell University]. ProQuest Dissertations & Theses Global.

@mastersthesis{eshed_novelty_detection,
  author={Noam Eshed},
  title={Novelty detection and analysis in convolutional neural networks},
  school={Cornell University},
  year={2020},
  publisher={ProQuest Dissertations & Theses Global}
}

Data

in_out_class.csv

This is hand-annotated data from iNaturalist. The most up-to-date version can be found here. The data taken directly from iNaturalist includes the biological group and scientific name of each species. Annotators added the common English name(s) for each species, its relation to ImageNet, any relevant notes, and their initials. For details regarding annotation guidelines, see this link.

alexnet_inat_results/

inat_results_top_choice.json

This JSON file contains the results of testing AlexNet (pretrained on ImageNet) on images from iNaturalist. It includes only the top-1 result (i.e. the label chosen by the network) for each image, which makes it the most efficient file to use when looking into the distribution of labels chosen for a certain type of creature.

Biological group files

Each of these folders contains the full results of testing AlexNet (pretrained on ImageNet) on the iNaturalist images in the given biological group. This includes all possible labels, their scores, and their confidence values for each image. Since ImageNet has 1000 classes, each iNaturalist image has three vectors of length 1000 storing the label, score, and confidence value information. Each file within these folders contains the data for a single species within the given biological group.
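As a rough illustration of the three parallel vectors described above, the sketch below pulls the top n entries for one image. The `ImageResult` container, its field names, and the toy values are hypothetical and are not taken from the repository; the actual on-disk layout of the result files may differ. The toy example uses 4 classes instead of 1000.

```python
from typing import List, NamedTuple, Tuple


class ImageResult(NamedTuple):
    """Hypothetical container for one image's three parallel vectors."""
    labels: List[str]          # one entry per ImageNet class
    scores: List[float]        # network score for each label
    confidences: List[float]   # confidence value for each label


def top_n(result: ImageResult, n: int) -> List[Tuple[str, float, float]]:
    """Return the n highest-confidence (label, score, confidence) triples."""
    order = sorted(
        range(len(result.labels)),
        key=lambda i: result.confidences[i],
        reverse=True,
    )
    return [
        (result.labels[i], result.scores[i], result.confidences[i])
        for i in order[:n]
    ]


# Toy values, made up for illustration only.
example = ImageResult(
    labels=["magpie", "goose", "king_penguin", "albatross"],
    scores=[2.0, 1.5, 0.5, 0.1],
    confidences=[0.5, 0.3, 0.15, 0.05],
)
print(top_n(example, 2))
```

Taking only the first entry of `top_n` corresponds to the single label stored per image in inat_results_top_choice.json.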

Code

class_in_or_out.py

This script plots the distribution of the top n CNN labels for all (or part) of the image data. Looking at all species of interest, it averages the frequency of the top n labels. Note that the top n labels are not necessarily the same, or in the same order, for each species, so the averaging is done by rank and the labels themselves are ignored.

Each species falls under one of four annotated ImageNet relationship categories: in ImageNet, not in ImageNet, parent in ImageNet, and relative in ImageNet. These annotations are taken from in_out_class.csv. The plots may be stratified by these relationship categories.

As an example, this code can plot the frequency of the top 10 labels over all bird images, split by each species' relationship to ImageNet. The resulting plot shows the average distribution of label frequencies. The top label frequency, for example, is the fraction of a species' images that received that species' most frequently occurring label, averaged over species, regardless of what that top label actually was.
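The rank-based averaging described above can be sketched as follows. This is a minimal illustration and not the code in class_in_or_out.py: the function names, the input format (a mapping from species name to the list of top labels for its images), and the toy data are all assumptions.

```python
from collections import Counter
from typing import Dict, List


def rank_frequencies(top_labels: List[str], n: int) -> List[float]:
    """Frequencies of the n most common labels for one species.

    The result is ordered from most to least common and padded with
    zeros if the species received fewer than n distinct labels.
    """
    total = len(top_labels)
    if total == 0:
        return [0.0] * n
    counts = [count for _, count in Counter(top_labels).most_common(n)]
    freqs = [count / total for count in counts]
    return freqs + [0.0] * (n - len(freqs))


def mean_rank_frequencies(
    species_to_labels: Dict[str, List[str]], n: int
) -> List[float]:
    """Average the rank-ordered frequencies over species, ignoring label names."""
    per_species = [rank_frequencies(v, n) for v in species_to_labels.values()]
    if not per_species:
        return [0.0] * n
    return [sum(column) / len(per_species) for column in zip(*per_species)]


# Toy data, made up for illustration only.
toy = {
    "species_a": ["goose", "goose", "magpie", "goose"],
    "species_b": ["robin", "wren", "robin", "wren"],
}
print(mean_rank_frequencies(toy, 2))
```

To stratify by relationship category, the same function could be called once per category on the subset of species carrying that annotation in in_out_class.csv.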

This plot shows the frequency of the top 20 labels over all bird species in iNaturalist:

plot_result_distribution.py

This script plots the distribution of CNN labels over each species. It does so by counting the number of occurrences of each label over many images of that species and normalizing the result to get a frequency distribution rather than an occurrence count distribution. There is an option to color and label each point according to the average confidence of the label. This can help us understand what common mistakes the network makes when classifying images of a given species.

In this example plot, we can see the distribution of all labels guessed by the network in the set of African Penguin images. It shows that approximately 19% of the images are classified as magpie, 19% as goose, etc. Interestingly, the king_penguin label is only awarded to 5% of the images and is tied for the 5th most common label.
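The counting and normalization step described above can be sketched as below. This is an illustrative sketch rather than the code in plot_result_distribution.py: the function name, the input format (one `(label, confidence)` pair per image), and the toy values are assumptions, and plotting is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def label_distribution(
    predictions: Iterable[Tuple[str, float]]
) -> Dict[str, Tuple[float, float]]:
    """Map each label to (frequency, mean confidence) for one species.

    `predictions` holds one (top label, confidence) pair per image.
    Frequencies are occurrence counts divided by the number of images,
    so they sum to 1 over all labels.
    """
    counts: Dict[str, int] = defaultdict(int)
    confidence_sums: Dict[str, float] = defaultdict(float)
    total = 0
    for label, confidence in predictions:
        counts[label] += 1
        confidence_sums[label] += confidence
        total += 1
    return {
        label: (counts[label] / total, confidence_sums[label] / counts[label])
        for label in counts
    }


# Toy predictions, made up for illustration only.
toy_predictions = [
    ("magpie", 0.5),
    ("goose", 0.5),
    ("magpie", 0.25),
    ("king_penguin", 0.75),
]
distribution = label_distribution(toy_predictions)
for label, (frequency, mean_conf) in sorted(
    distribution.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{label}: frequency={frequency:.2f}, mean confidence={mean_conf:.2f}")
```

The mean confidence per label is what would drive the optional coloring and labeling of each point.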

alexnet_novelty.py

This script tests AlexNet (pretrained on ImageNet) on all of the data from iNaturalist and saves the results to the alexnet_inat_results/ folder.

Owner

Noam Eshed
