Pytorch implementation of TailCalibX : Feature Generation for Long-tail Classification

Overview

TailCalibX : Feature Generation for Long-tail Classification

by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi

[arXiv] [Code] [pip Package] [Video] TailCalibX methodology

Table of contents

🐣 Easy Usage (Recommended way to use our method)

⚠ Caution: TailCalibX is just TailCalib employed multiple times. Specifically, we generate a set of features once every epoch and use them to train the classifier. In order to mimic that, three things must be done at every epoch in the following order:

  1. Collect all the features from your dataloader.
  2. Use the tailcalib package to make the features balanced by generating samples.
  3. Train the classifier.
  4. Repeat.

πŸ’» Installation

Use the package manager pip to install tailcalib.

pip install tailcalib

πŸ‘¨β€πŸ’» Example Code

Check the instruction here for a much more detailed python package information.

# Import
from tailcalib import tailcalib

# Initialize
a = tailcalib(base_engine="numpy")   # Options: "numpy", "pytorch"

# Imbalanced random fake data
import numpy as np
X = np.random.rand(200,100)
y = np.random.randint(0,10, (200,))

# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)

# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")

πŸ§ͺ Advanced Usage

βœ” Things to do before you run the code from this repo

  • Change the data_root for your dataset in main.py.
  • If you are using wandb logging (Weights & Biases), make sure to change the wandb.init in main.py accordingly.

πŸ“€ How to use?

  • For just the methods proposed in this paper :
    • For CIFAR100-LT: run_TailCalibX_CIFAR100-LT.sh
    • For mini-ImageNet-LT : run_TailCalibX_mini-ImageNet-LT.sh
  • For all the results show in the paper :
    • For CIFAR100-LT: run_all_CIFAR100-LT.sh
    • For mini-ImageNet-LT : run_all_mini-ImageNet-LT.sh

πŸ“š How to create the mini-ImageNet-LT dataset?

Check Notebooks/Create_mini-ImageNet-LT.ipynb for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.

βš™ Arguments

  • --seed : Select seed for fixing it.

    • Default : 1
  • --gpu : Select the GPUs to be used.

    • Default : "0,1,2,3"
  • --experiment: Experiment number (Check 'libs/utils/experiment_maker.py').

    • Default : 0.1
  • --dataset : Dataset number.

    • Choices : 0 - CIFAR100, 1 - mini-imagenet
    • Default : 0
  • --imbalance : Select Imbalance factor.

    • Choices : 0: 1, 1: 100, 2: 50, 3: 10
    • Default : 1
  • --type_of_val : Choose which dataset split to use.

    • Choices: "vt": val_from_test, "vtr": val_from_train, "vit": val_is_test
    • Default : "vit"
  • --cv1 to --cv9 : Custom variable to use in experiments - purpose changes according to the experiment.

    • Default : "1"
  • --train : Run training sequence

    • Default : False
  • --generate : Run generation sequence

    • Default : False
  • --retraining : Run retraining sequence

    • Default : False
  • --resume : Will resume from the 'latest_model_checkpoint.pth' and wandb if applicable.

    • Default : False
  • --save_features : Collect feature representations.

    • Default : False
  • --save_features_phase : Dataset split of representations to collect.

    • Choices : "train", "val", "test"
    • Default : "train"
  • --config : If you have a yaml file with appropriate config, provide the path here. Will override the 'experiment_maker'.

    • Default : None

πŸ‹οΈβ€β™‚οΈ Trained weights

Experiment CIFAR100-LT (ResNet32, seed 1, Imb 100) mini-ImageNet-LT (ResNeXt50)
TailCalib Git-LFS Git-LFS
TailCalibX Git-LFS Git-LFS
CBD + TailCalibX Git-LFS Git-LFS

πŸͺ€ Results on a Toy Dataset

Open In Colab

The higher the Imb ratio, the more imbalanced the dataset is. Imb ratio = maximum_sample_count / minimum_sample_count.

Check this notebook to play with the toy example from which the plot below was generated.

🌴 Directory Tree

TailCalibX
β”œβ”€β”€ libs
β”‚   β”œβ”€β”€ core
β”‚   β”‚   β”œβ”€β”€ ce.py
β”‚   β”‚   β”œβ”€β”€ core_base.py
β”‚   β”‚   β”œβ”€β”€ ecbd.py
β”‚   β”‚   β”œβ”€β”€ modals.py
β”‚   β”‚   β”œβ”€β”€ TailCalib.py
β”‚   β”‚   └── TailCalibX.py
β”‚   β”œβ”€β”€ data
β”‚   β”‚   β”œβ”€β”€ dataloader.py
β”‚   β”‚   β”œβ”€β”€ ImbalanceCIFAR.py
β”‚   β”‚   └── mini-imagenet
β”‚   β”‚       β”œβ”€β”€ 0.01_test.txt
β”‚   β”‚       β”œβ”€β”€ 0.01_train.txt
β”‚   β”‚       └── 0.01_val.txt
β”‚   β”œβ”€β”€ loss
β”‚   β”‚   β”œβ”€β”€ CosineDistill.py
β”‚   β”‚   └── SoftmaxLoss.py
β”‚   β”œβ”€β”€ models
β”‚   β”‚   β”œβ”€β”€ CosineDotProductClassifier.py
β”‚   β”‚   β”œβ”€β”€ DotProductClassifier.py
β”‚   β”‚   β”œβ”€β”€ ecbd_converter.py
β”‚   β”‚   β”œβ”€β”€ ResNet32Feature.py
β”‚   β”‚   β”œβ”€β”€ ResNext50Feature.py
β”‚   β”‚   └── ResNextFeature.py
β”‚   β”œβ”€β”€ samplers
β”‚   β”‚   └── ClassAwareSampler.py
β”‚   └── utils
β”‚       β”œβ”€β”€ Default_config.yaml
β”‚       β”œβ”€β”€ experiments_maker.py
β”‚       β”œβ”€β”€ globals.py
β”‚       β”œβ”€β”€ logger.py
β”‚       └── utils.py
β”œβ”€β”€ LICENSE
β”œβ”€β”€ main.py
β”œβ”€β”€ Notebooks
β”‚   β”œβ”€β”€ Create_mini-ImageNet-LT.ipynb
β”‚   └── toy_example.ipynb
β”œβ”€β”€ readme_assets
β”‚   β”œβ”€β”€ method.svg
β”‚   └── toy_example_output.svg
β”œβ”€β”€ README.md
β”œβ”€β”€ run_all_CIFAR100-LT.sh
β”œβ”€β”€ run_all_mini-ImageNet-LT.sh
β”œβ”€β”€ run_TailCalibX_CIFAR100-LT.sh
└── run_TailCalibX_mini-imagenet-LT.sh

Ignored tailcalib_pip as it is for the tailcalib pip package.

πŸ“ƒ Citation

@inproceedings{rahul2021tailcalibX,
    title   = {{Feature Generation for Long-tail Classification}},
    author  = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}

πŸ‘ Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

❀ About me

Rahul Vigneswaran

✨ Extras

🐝 Long-tail buzz : If you are interested in deep learning research which involves long-tailed / imbalanced dataset, take a look at Long-tail buzz to learn about the recent trending papers in this field.

πŸ“ License

MIT

Owner
Rahul Vigneswaran
Rahul Vigneswaran
Python implementation of Lightning-rod Agent, the Stack4Things board-side probe

Iotronic Lightning-rod Agent Python implementation of Lightning-rod Agent, the Stack4Things board-side probe. Free software: Apache 2.0 license Websit

2 May 19, 2022
Repo for flood prediction using LSTMs and HAND

Abstract Every year, floods cause billions of dollars’ worth of damages to life, crops, and property. With a proper early flood warning system in plac

1 Oct 27, 2021
RepositΓ³rio para arquivos sobre o MΓ³dulo 1 do curso Top Coders da Let's Code + Safra

850-Safra-DS-ModuloI RepositΓ³rio para arquivos sobre o MΓ³dulo 1 do curso Top Coders da Let's Code + Safra Para aprender mais Git https://learngitbranc

Brian Nunes 7 Dec 10, 2022
PConv-Keras - Unofficial implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions". Try at: www.fixmyphoto.ai

Partial Convolutions for Image Inpainting using Keras Keras implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions", https

Mathias Gruber 871 Jan 05, 2023
Adversarial Graph Augmentation to Improve Graph Contrastive Learning

ADGCL : Adversarial Graph Augmentation to Improve Graph Contrastive Learning Introduction This repo contains the Pytorch [1] implementation of Adversa

susheel suresh 62 Nov 19, 2022
CΓ³digo de um painel de auto atendimento feito em Python.

Painel de Auto-Atendimento O intuito desse projeto era fazer em Python um programa que simulasse um painel de auto atendimento, no maior estilo Mac Do

Calebe Alves Evangelista 2 Nov 09, 2022
Predicting path with preference based on user demonstration using Maximum Entropy Deep Inverse Reinforcement Learning in a continuous environment

Preference-Planning-Deep-IRL Introduction Check my portfolio post Dependencies Gym stable-baselines3 PyTorch Usage Take Demonstration python3 record.

Tianyu Li 9 Oct 26, 2022
Norm-based Analysis of Transformer

Norm-based Analysis of Transformer Implementations for 2 papers introducing to analyze Transformers using vector norms: Kobayashi+'20 Attention is Not

Goro Kobayashi 52 Dec 05, 2022
Code for the paper: "On the Bottleneck of Graph Neural Networks and Its Practical Implications"

On the Bottleneck of Graph Neural Networks and its Practical Implications This is the official implementation of the paper: On the Bottleneck of Graph

75 Dec 22, 2022
Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation'

OD-Rec Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation' Paper, saved teacher models and Andro

Xin Xia 11 Nov 22, 2022
Code for classifying international patents based on the text of their titles/abstracts

Patent Classification Goal: To train a machine learning classifier that can automatically classify international patents downloaded from the WIPO webs

Prashanth Rao 1 Nov 08, 2022
The repository offers the official implementation of our paper in PyTorch.

Cloth Interactive Transformer (CIT) Cloth Interactive Transformer for Virtual Try-On Bin Ren1, Hao Tang1, Fanyang Meng2, Runwei Ding3, Ling Shao4, Phi

Bingoren 49 Dec 01, 2022
Replication of Pix2Seq with Pretrained Model

Pretrained-Pix2Seq We provide the pre-trained model of Pix2Seq. This version contains new data augmentation. The model is trained for 300 epochs and c

peng gao 51 Nov 22, 2022
Includes PyTorch -> Keras model porting code for ConvNeXt family of models with fine-tuning and inference notebooks.

ConvNeXt-TF This repository provides TensorFlow / Keras implementations of different ConvNeXt [1] variants. It also provides the TensorFlow / Keras mo

Sayak Paul 87 Dec 06, 2022
QAT(quantize aware training) for classification with MQBench

MQBench Quantization Aware Training with PyTorch I am using MQBench(Model Quantization Benchmark)(http://mqbench.tech/) to quantize the model for depl

Ling Zhang 29 Nov 18, 2022
An OpenAI-Gym Package for Training and Testing Reinforcement Learning algorithms with OpenSim Models

Authors: Utkarsh A. Mishra and Dr. Dimitar Stanev Advisors: Dr. Dimitar Stanev and Prof. Auke Ijspeert, Biorobotics Laboratory (BioRob), EPFL Video Pl

Utkarsh Mishra 16 Dec 13, 2022
[NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering".

SMYRF: Efficient attention using asymmetric clustering Get started: Abstract We propose a novel type of balanced clustering algorithm to approximate a

Giannis Daras 46 Dec 22, 2022
Brain tumor detection using Convolution-Neural Network (CNN)

Detect and Classify Brain Tumor using CNN. A system performing detection and classification by using Deep Learning Algorithms using Convolution-Neural Network (CNN).

assia 1 Feb 07, 2022
PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and Multi-Step Knowledge Distillation

PocketNet This is the official repository of the paper: PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and M

Fadi Boutros 40 Dec 22, 2022
Repository for the "Gotta Go Fast When Generating Data with Score-Based Models" paper

Gotta Go Fast When Generating Data with Score-Based Models This repo contains the official implementation for the paper Gotta Go Fast When Generating

Alexia Jolicoeur-Martineau 89 Nov 09, 2022