Instance-based label smoothing for improving deep neural networks generalization and calibration

Last update: Aug 13, 2022

Overview

Instance-based Label Smoothing for Neural Networks

PyTorch implementation of the algorithm.
This repository implements a proposed method for instance-based label smoothing in neural networks, in which the target probability mass is not spread uniformly over the incorrect classes. Instead, each incorrect class is assigned a target probability proportional to the output score of that class relative to all the remaining classes, as produced by a network trained with vanilla cross-entropy loss on the hard target labels.
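As a rough illustration of the description above, the sketch below builds a smoothed target vector for one instance and compares it with uniform label smoothing. It is dependency-free for readability and is not taken from the repository code. The function names, the fixed smoothing factor `alpha`, and the choice to give the true class exactly `1 - alpha` are assumptions made for this example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uniform_targets(num_classes, true_class, alpha=0.1):
    """Standard label smoothing: alpha is split evenly over incorrect classes."""
    share = alpha / (num_classes - 1)
    return [1.0 - alpha if k == true_class else share for k in range(num_classes)]

def instance_based_targets(teacher_logits, true_class, alpha=0.1):
    """Hypothetical instance-based smoothing for a single instance.

    teacher_logits: outputs of a network trained with vanilla cross-entropy.
    Each incorrect class receives a share of alpha proportional to its
    teacher probability relative to the other incorrect classes.
    """
    probs = softmax(teacher_logits)
    rest = sum(p for k, p in enumerate(probs) if k != true_class)
    if rest == 0.0:
        # Degenerate case (assumption): fall back to uniform smoothing.
        return uniform_targets(len(probs), true_class, alpha)
    return [
        1.0 - alpha if k == true_class else alpha * p / rest
        for k, p in enumerate(probs)
    ]

# The teacher considers class 1 more similar to the true class 0 than class 2,
# so class 1 receives the larger share of the smoothing mass.
targets = instance_based_targets([2.0, 1.0, -1.0], true_class=0, alpha=0.1)
```

The resulting vector still sums to 1, so it can be used as a soft target for cross-entropy in the same way as uniformly smoothed labels.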

The following figure summarizes the idea of our instance-based label smoothing, which aims to preserve information about the similarity structure between classes while training with label smoothing.

Requirements

Python 3.x
pandas
numpy
pytorch

Usage

Datasets

CIFAR10 / CIFAR100 / FashionMNIST

Files Content

The project has the following structure:

├── Vanilla-cross-entropy.py
├── Label-smoothing.py
├── Instance-based-smoothing.py
├── Models-evaluation.py
├── Network-distillation.py
├── utils
│   ├── data_loader.py
│   ├── utils.py
│   ├── evaluate.py
│   ├── params.json
├── models
│   ├── resnet.py
│   ├── densenet.py
│   ├── inception.py
│   ├── shallownet.py

Vanilla-cross-entropy.py trains the networks using cross-entropy without label smoothing.
Label-smoothing.py trains the networks using cross-entropy with standard label smoothing.
Instance-based-smoothing.py trains the networks using cross-entropy with instance-based label smoothing.
Models-evaluation.py evaluates the trained networks.
Network-distillation.py distills trained networks into a shallow convolutional network of 5 layers.
models/ contains the implementations of the architectures used in our evaluation (ResNet, DenseNet, Inception-V4), as well as the shallow CNN student network used in the distillation experiments.
utils/ contains the utility functions required for training and evaluating the different models.
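For readers unfamiliar with distillation, the sketch below shows one common form of the objective: the cross-entropy between temperature-softened teacher and student distributions. The exact loss used in Network-distillation.py is not stated in this README, so the function names and the `temperature` parameter here are assumptions for illustration only.

```python
import math

def softmax_t(logits, temperature=1.0):
    """Softmax of logits divided by a temperature (higher means softer)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Hypothetical helper: a generic soft-target loss, not the
    repository's confirmed implementation.
    """
    p_teacher = softmax_t(teacher_logits, temperature)
    p_student = softmax_t(student_logits, temperature)
    # Clamp to avoid log(0) if a student probability underflows.
    return -sum(pt * math.log(max(ps, 1e-12))
                for pt, ps in zip(p_teacher, p_student))

# A student that matches the teacher attains the minimum of this loss,
# which equals the entropy of the softened teacher distribution.
matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

This is one reason the similarity structure in the teacher's outputs matters for distillation: the student learns from the relative probabilities of the incorrect classes, not only from the predicted label.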

Example

python Instance-based-smoothing.py --dataset cifar10 --model resnet18 --num_classes 10

List of arguments accepted by the training and evaluation scripts:

--lr type = float, default = 0.1, help = Starting learning rate (a weight decay of $10^{-4}$ is used).
--tr_size type = float, default = 0.8, help = Size of training set split out of the whole training set (0.2 for validation).
--batch_size type = int, default = 512, help = Batch size of mini-batch training process.
--epochs type = int, default = 100, help = Number of training epochs.
--estop type = int, default = 10, help = Number of epochs without loss improvement leading to early stopping.
--ece_bins type = int, default = 10, help = Number of bins for expected calibration error calculation.
--dataset type = str, help = Name of the dataset to be used (cifar10/cifar100/fashionmnist).
--num_classes type = int, default = 10, help = Number of classes in the dataset.
--model type = str, help = Name of the model to be trained, e.g. resnet18 / resnet50 / inceptionv4 / densenet (works for FashionMNIST only).
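The argument list above maps directly onto Python's standard `argparse` module. The sketch below is reconstructed from that list only; the function name `build_parser` is hypothetical, and leaving `--dataset` and `--model` without defaults is an assumption, since the list does not say whether they are required.

```python
import argparse

def build_parser():
    """Parser mirroring the documented arguments (reconstruction, not repo code)."""
    parser = argparse.ArgumentParser(
        description="Training and evaluation options (sketch)")
    parser.add_argument("--lr", type=float, default=0.1,
                        help="Starting learning rate")
    parser.add_argument("--tr_size", type=float, default=0.8,
                        help="Fraction of the training set used for training")
    parser.add_argument("--batch_size", type=int, default=512,
                        help="Mini-batch size")
    parser.add_argument("--epochs", type=int, default=100,
                        help="Number of training epochs")
    parser.add_argument("--estop", type=int, default=10,
                        help="Epochs without loss improvement before early stopping")
    parser.add_argument("--ece_bins", type=int, default=10,
                        help="Number of bins for expected calibration error")
    parser.add_argument("--dataset", type=str,
                        help="Dataset name (cifar10/cifar100/fashionmnist)")
    parser.add_argument("--num_classes", type=int, default=10,
                        help="Number of classes in the dataset")
    parser.add_argument("--model", type=str,
                        help="Name of the model to be trained")
    return parser

# Parsing the flags from the example command; unspecified options keep
# their documented defaults.
args = build_parser().parse_args(
    ["--dataset", "cifar10", "--model", "resnet18", "--num_classes", "10"])
```

Note that `--num_classes` defaults to 10, so it presumably has to be set explicitly when training on CIFAR100.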

Results

The following table reports the comparison of the different methods on 3 datasets using 4 different architectures.
Each experiment was repeated 3 times, and the mean $\pm$ standard deviation of log loss, expected calibration error (ECE), accuracy, distilled student network accuracy, and distilled student log loss are reported.
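For reference, ECE with equal-width confidence bins (the quantity controlled by `--ece_bins`) can be computed as in the sketch below. This is a generic, dependency-free version written for illustration; the function name and the binning convention (half-open bins, with a confidence of 1.0 placed in the last bin) are assumptions and may differ from utils/evaluate.py.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average gap between accuracy and confidence per bin.

    confidences: predicted probability of the predicted class, in [0, 1].
    correct: booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    conf_sum = [0.0] * n_bins
    acc_sum = [0.0] * n_bins
    count = [0] * n_bins
    for c, ok in zip(confidences, correct):
        # Bin b covers [b / n_bins, (b + 1) / n_bins); 1.0 goes to the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        conf_sum[b] += c
        acc_sum[b] += 1.0 if ok else 0.0
        count[b] += 1
    ece = 0.0
    for b in range(n_bins):
        if count[b] > 0:
            gap = abs(acc_sum[b] / count[b] - conf_sum[b] / count[b])
            ece += (count[b] / n) * gap
    return ece

# Two confident, correct predictions and two less confident ones (one wrong).
ece = expected_calibration_error(
    [0.95, 0.95, 0.55, 0.55], [True, True, True, False], n_bins=10)
```

A perfectly calibrated model has an ECE of 0; lower values mean the predicted confidences track the observed accuracy more closely.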

A t-SNE visualization of the logits of 3 different classes in CIFAR-10 is shown below:

Owner

Mohamed Maher
