Manifold-Mixup implementation for fastai V2

Overview

Manifold Mixup

Unofficial implementation of ManifoldMixup (Proceedings of ICML 19) for fast.ai (V2), based on Shivam Saboo's PyTorch implementation of manifold mixup and fastai's input mixup implementation, plus some improvements/variants that I developed with lessw2020.

This package provides four additional callbacks for the fastai learner:

  • ManifoldMixup which implements ManifoldMixup
  • OutputMixup which implements a variant that applies mixup only to the output of the last layer (this was shown to perform better in a benchmark and an independent blog post)
  • DynamicManifoldMixup which lets you use manifold mixup with a schedule to increase difficulty progressively
  • DynamicOutputMixup which lets you use output mixup with a schedule to increase difficulty progressively

Usage

For a minimal demonstration of the various callbacks and their parameters, see the Demo notebook.

Mixup

To use manifold mixup, you need to import manifold_mixup and pass the corresponding callback to the cbs argument of your learner:

from fastai.vision.all import *
from manifold_mixup import ManifoldMixup

learner = Learner(data, model, cbs=ManifoldMixup())
learner.fit(8)

The ManifoldMixup callback takes three parameters:

  • alpha=0.4 the parameter of the beta law used to sample the interpolation weight
  • use_input_mixup=True whether to also apply mixup to the inputs
  • module_list=None can be used to pass an explicit list of target modules

The OutputMixup variant takes only the alpha parameter.
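
For example, both callbacks can be configured explicitly (a minimal sketch; data and model are assumed to be defined as in the previous snippet):

learner = Learner(data, model, cbs=ManifoldMixup(alpha=0.4, use_input_mixup=True))

# OutputMixup only exposes the alpha parameter
learner = Learner(data, model, cbs=OutputMixup(alpha=0.4))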

Dynamic mixup

Dynamic callbacks, which are available via dynamic_mixup, take three parameters instead of the single alpha parameter:

  • alpha_min=0.0 the initial, minimum, value for the parameter of the beta law used to sample the interpolation weight (we recommend keeping it at 0)
  • alpha_max=0.6 the final, maximum, value for the parameter of the beta law used to sample the interpolation weight
  • scheduler=SchedCos the scheduling function that describes the evolution of alpha from alpha_min to alpha_max

The schedulers available by default are SchedLin, SchedCos, SchedNo, SchedExp and SchedPoly. See the Annealing section of fastai2's documentation for more information on the available schedulers, ways to combine them, and how to provide your own.
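
For example, a cosine schedule from alpha_min to alpha_max could be set up as follows (a minimal sketch; SchedCos comes from fastai.callback.schedule and data and model are assumed to be defined):

from fastai.callback.schedule import SchedCos
from dynamic_mixup import DynamicManifoldMixup

# alpha is annealed from 0.0 to 0.6 along a cosine curve during training
learner = Learner(data, model, cbs=DynamicManifoldMixup(alpha_min=0.0, alpha_max=0.6, scheduler=SchedCos))
learner.fit(8)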

Notes

Which modules will be instrumented by ManifoldMixup?

ManifoldMixup tries to establish a sensible list of modules on which to apply mixup:

  • it uses a user-provided module_list if one is given
  • otherwise, it uses only the modules wrapped with ManifoldMixupModule
  • if none are found, it defaults to modules with Block or Bottleneck in their name (targeting mostly resblocks)
  • finally, if needed, it defaults to all modules that are not included in the non_mixable_module_types list

The non_mixable_module_types list contains mostly recurrent layers, but you can add elements to it in order to define module classes that should not be used for mixup (do not hesitate to create an issue or start a PR to add common modules to the default list).
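
As an illustration, one could wrap target modules explicitly and extend the exclusion list (a hedged sketch: the exact ManifoldMixupModule constructor and the way non_mixable_module_types is exposed may differ from the package's actual API):

import torch.nn as nn
from manifold_mixup import ManifoldMixup, ManifoldMixupModule, non_mixable_module_types

# wrap the modules whose outputs are candidate mixup points (assumed constructor)
model = nn.Sequential(nn.Linear(8, 32), ManifoldMixupModule(nn.ReLU()), nn.Linear(32, 2))

# declare a module class as unsuitable for mixup
non_mixable_module_types.append(nn.GRU)

learner = Learner(data, model, cbs=ManifoldMixup())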

When can I use OutputMixup?

OutputMixup applies the mixup directly to the output of the last layer. This only works if the loss function contains something like a softmax (and not when the raw output is used directly, as it is for regression).

Thus, OutputMixup cannot be used for regression.

A note on skip-connections / residual-blocks

ManifoldMixup (this does not apply to OutputMixup) is greatly degraded when applied inside a residual block, because the mixed-up values become incoherent with the output of the skip connection (which has not been mixed).

While this implementation is equipped to work around the problem for U-Net and ResNet-like architectures, you might run into problems (negligible improvements over the baseline) with other network structures. In that case, the best way to apply manifold mixup is to manually select the modules to be instrumented.
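
Concretely, that means passing module_list by hand, for example (a sketch; the attribute names below are hypothetical and depend on your architecture):

# instrument only modules that sit outside the residual blocks (hypothetical attributes)
learner = Learner(data, model, cbs=ManifoldMixup(module_list=[model.stem, model.head]))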

For more unofficial fastai extensions, see the Fastai Extensions Repository.
