Instance-based label smoothing for improving deep neural networks generalization and calibration

Overview

Instance-based Label Smoothing for Neural Networks

  • PyTorch implementation of the algorithm.
  • This repository implements a new method for instance-based label smoothing in neural networks, in which the target probability is not distributed uniformly among the incorrect classes. Instead, each incorrect class is assigned a target probability proportional to its output score relative to all the remaining classes, as produced by a network trained with vanilla cross-entropy loss on the hard target labels.
[Figure: Instance-based Label Smoothing idea]
  • The following figure summarizes the idea of our instance-based label smoothing, which aims to preserve information about the class-similarity structure while training with label smoothing.
[Figure: Instance-based Label Smoothing process]
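
The target construction described above can be sketched for a single training example as follows. This is a minimal illustration in plain Python, not the repository's code: the function name `instance_smoothed_target` and the smoothing factor `alpha` (the total mass moved to the incorrect classes) are assumptions introduced here, and the teacher probabilities are made-up values.

```python
def instance_smoothed_target(teacher_probs, true_class, alpha=0.1):
    """Build a soft target for one example (illustrative sketch).

    teacher_probs: softmax output of a network trained with vanilla
                   cross-entropy on the hard labels.
    true_class:    index of the ground-truth label.
    alpha:         total probability mass given to the incorrect classes
                   (hypothetical parameter, not taken from the repository).
    """
    num_classes = len(teacher_probs)
    # Mass the teacher assigns to all incorrect classes combined.
    rest = sum(p for k, p in enumerate(teacher_probs) if k != true_class)
    target = []
    for k, p in enumerate(teacher_probs):
        if k == true_class:
            target.append(1.0 - alpha)
        elif rest > 0.0:
            # Share of the smoothing mass proportional to the teacher's score
            # for this class relative to the other incorrect classes.
            target.append(alpha * p / rest)
        else:
            # Degenerate teacher output: fall back to a uniform split.
            target.append(alpha / (num_classes - 1))
    return target


# Made-up teacher output for a 3-class problem; class 0 is the true label.
target = instance_smoothed_target([0.7, 0.2, 0.1], true_class=0, alpha=0.1)
# target is approximately [0.9, 0.0667, 0.0333]
```

Class 1 receives twice the target mass of class 2 because the teacher scores it twice as high, which is how the class-similarity structure survives the smoothing.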

Requirements

  • Python 3.x
  • pandas
  • numpy
  • pytorch

Usage

Datasets

  • CIFAR10 / CIFAR100 / FashionMNIST

Files Content

The project has the following structure:

├── Vanilla-cross-entropy.py
├── Label-smoothing.py
├── Instance-based-smoothing.py
├── Models-evaluation.py
├── Network-distillation.py
├── utils
│   ├── data_loader.py
│   ├── utils.py
│   ├── evaluate.py
│   ├── params.json
├── models
│   ├── resnet.py
│   ├── densenet.py
│   ├── inception.py
│   ├── shallownet.py

Vanilla-cross-entropy.py trains the networks using cross-entropy without label smoothing.
Label-smoothing.py trains the networks using cross-entropy with standard label smoothing.
Instance-based-smoothing.py trains the networks using cross-entropy with instance-based label smoothing.
Models-evaluation.py evaluates the trained networks.
Network-distillation.py distills trained networks into a shallow convolutional network of 5 layers.
models/ includes the implementations of the architectures used in our evaluation (ResNet, DenseNet, Inception-V4), as well as the shallow CNN student network used in the distillation experiments.
utils/ includes the utility functions required for training and evaluating the different models.
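
To make the difference between the first two training scripts concrete, the sketch below computes cross-entropy against a hard one-hot target and against a uniformly smoothed target. It is written in plain Python for illustration and is not taken from the repository; the helper names, the example logits, and the choice to spread `alpha` evenly over the incorrect classes only are assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def uniform_smoothed_target(num_classes, true_class, alpha=0.1):
    """Standard label smoothing: alpha is split evenly over incorrect classes."""
    off_value = alpha / (num_classes - 1)
    return [1.0 - alpha if k == true_class else off_value
            for k in range(num_classes)]


def soft_cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_probs))


logits = [2.0, 0.5, -1.0]   # made-up network outputs for one example
hard = [1.0, 0.0, 0.0]      # one-hot target, as in vanilla cross-entropy
smooth = uniform_smoothed_target(3, true_class=0, alpha=0.1)

loss_hard = soft_cross_entropy(logits, hard)
loss_smooth = soft_cross_entropy(logits, smooth)
```

The same `soft_cross_entropy` function accepts any target distribution, so swapping the uniform target for an instance-based one changes only how the target vector is built, not the loss itself.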

Example

python Instance-based-smoothing.py --dataset cifar10 --model resnet18 --num_classes 10

Arguments accepted by the training and evaluation scripts:

--lr type = float, default = 0.1, help = Starting learning rate (a weight decay of $10^{-4}$ is used).
--tr_size type = float, default = 0.8, help = Fraction of the whole training set used for training (the rest, 0.2 by default, is held out for validation).
--batch_size type = int, default = 512, help = Batch size for mini-batch training.
--epochs type = int, default = 100, help = Number of training epochs.
--estop type = int, default = 10, help = Number of epochs without loss improvement before early stopping.
--ece_bins type = int, default = 10, help = Number of bins for the expected calibration error calculation.
--dataset type = str, help = Name of the dataset to be used (cifar10/cifar100/fashionmnist).
--num_classes type = int, default = 10, help = Number of classes in the dataset.
--model type = str, help = Name of the model to be trained, e.g. resnet18 / resnet50 / inceptionv4 / densetnet (works for FashionMNIST only).
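
For readers who want to reuse the same command-line interface, the argument list above maps onto `argparse` roughly as follows. This is a sketch assembled from the names, types, and defaults listed here; the actual scripts may declare them differently, and the `build_parser` helper is a name introduced for illustration.

```python
import argparse


def build_parser():
    """Argument parser mirroring the options documented above (sketch)."""
    parser = argparse.ArgumentParser(
        description="Training/evaluation options (illustrative sketch)")
    parser.add_argument("--lr", type=float, default=0.1,
                        help="Starting learning rate")
    parser.add_argument("--tr_size", type=float, default=0.8,
                        help="Fraction of the training set used for training")
    parser.add_argument("--batch_size", type=int, default=512,
                        help="Mini-batch size")
    parser.add_argument("--epochs", type=int, default=100,
                        help="Number of training epochs")
    parser.add_argument("--estop", type=int, default=10,
                        help="Early-stopping patience in epochs")
    parser.add_argument("--ece_bins", type=int, default=10,
                        help="Number of bins for expected calibration error")
    parser.add_argument("--dataset", type=str,
                        help="cifar10 / cifar100 / fashionmnist")
    parser.add_argument("--num_classes", type=int, default=10,
                        help="Number of classes in the dataset")
    parser.add_argument("--model", type=str,
                        help="Name of the model to be trained")
    return parser


# Parse the same options used in the example command above.
args = build_parser().parse_args(
    ["--dataset", "cifar10", "--model", "resnet18", "--num_classes", "10"])
```

Options that are not passed keep their documented defaults, so `args.lr` is 0.1 and `args.batch_size` is 512 in this call.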

Results

  • Results of the comparison of different methods on 3 datasets using 4 different architectures are reported in the following table.
  • The experiments were repeated 3 times, and average $\pm$ stdev of log loss, expected calibration error (ECE), accuracy, distilled student network accuracy and distilled student log loss metrics are reported.
  • A t-SNE visualization of the logits of three different classes in CIFAR-10 is shown below:
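
As background for the calibration metric reported in these results, expected calibration error (ECE) is commonly computed by grouping predictions into equal-width confidence bins and averaging the gap between accuracy and confidence in each bin, weighted by bin size. The sketch below follows that common definition in plain Python; it is not taken from the repository's `evaluate.py`, and the function name and example values are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (illustrative sketch).

    confidences: predicted probability of the predicted class, in [0, 1].
    correct:     booleans, True where the prediction matched the label.
    n_bins:      number of bins (plays the role of --ece_bins).
    """
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Bin i covers [i/n_bins, (i+1)/n_bins); confidence 1.0 goes to the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Made-up predictions: two confident hits, one moderate hit, one moderate miss.
ece = expected_calibration_error([0.95, 0.95, 0.65, 0.65],
                                 [True, True, True, False])
```

A lower value means the predicted confidences track the observed accuracy more closely.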
Owner
Mohamed Maher
Junior Research Fellow