A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization

Last update: Dec 28, 2022

sam.pytorch

A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization (Foret et al., 2020). See the paper (arXiv:2010.01412) and the authors' official implementation.

Requirements

Python>=3.8
PyTorch>=1.7.1

To run the example, you also need:

homura, installed with pip install -U homura-core==2020.12.0
chika, installed with pip install -U chika

Example

python cifar10.py [--optim.name {sam,sgd}] [--model {resnet20,wrn28_2}] [--optim.rho 0.05]

Results: Test Accuracy (CIFAR-10)

Model	SAM	SGD
ResNet-20	93.5	93.2
WRN28-2	95.8	95.4
ResNeXT29	96.4	95.8

SAM needs two forward and backward passes per update, so training with SAM is slower than training with SGD. For ResNet-20, training took 80 minutes with SAM versus 50 minutes with SGD in my environment. The additional options --use_amp and --jit_model may slightly accelerate training.
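The extra cost comes from the shape of the update itself. The following is a minimal, framework-free sketch of one SAM step, assuming the L2-normalized perturbation of radius rho described in the SAM paper. The names sam_step and quad_grad and the toy quadratic are hypothetical and are not part of sam.pytorch; the sketch only illustrates why each update evaluates the gradient twice.

```python
import math


def sam_step(params, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on a list of floats (illustrative sketch only).

    grad_fn(params) must return the gradient as a list of floats.
    """
    # First pass: gradient at the current point.
    g = grad_fn(params)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        # Stationary point: no ascent direction is defined, so leave params unchanged.
        return list(params)

    # Step to the approximate worst-case point inside an L2 ball of radius rho.
    eps = [rho * x / norm for x in g]
    perturbed = [p + e for p, e in zip(params, eps)]

    # Second pass: gradient at the perturbed point.
    g_adv = grad_fn(perturbed)

    # Descend from the ORIGINAL point using the perturbed-point gradient.
    return [p - lr * ga for p, ga in zip(params, g_adv)]


# Toy objective f(w) = w0^2 + 10 * w1^2, with a counter for gradient evaluations.
calls = {"n": 0}


def quad_grad(w):
    calls["n"] += 1
    return [2.0 * w[0], 20.0 * w[1]]


w_next = sam_step([1.0, 1.0], quad_grad)
print(calls["n"])  # 2: one gradient evaluation for the ascent step, one for the descent step
```

A plain SGD step would call quad_grad once per update, which is consistent with the slower SAM training times reported above.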

Usage

SAMSGD can be used as a drop-in replacement for PyTorch optimizers when they are called with a closure. It is also compatible with lr_scheduler and provides state_dict and load_state_dict.

from sam import SAMSGD

optimizer = SAMSGD(model.parameters(), lr=1e-1, rho=0.05)

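for input, target in dataset:
    # SAM evaluates the loss more than once per update, so the forward and
    # backward computation is passed to step() as a closure.
    def closure():
        optimizer.zero_grad()
        output = model(input)
        loss = loss_f(output, target)
        loss.backward()
        return loss

    loss = optimizer.step(closure)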
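The scheduler and checkpoint compatibility mentioned above can be sketched as follows. This example uses torch.optim.SGD as a stand-in, because it accepts the same closure argument and runs without sam.pytorch installed; swapping in SAMSGD as shown in the comment is an assumption based on the drop-in claim above, not something verified here. The linear model, the synthetic batches, and the StepLR settings are hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(4, 2)
loss_f = nn.CrossEntropyLoss()

# Stand-in optimizer. With sam.pytorch available, this line would presumably be:
#   optimizer = SAMSGD(model.parameters(), lr=1e-1, rho=0.05)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)

# Halve the learning rate every 2 scheduler steps (hypothetical schedule).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

# Tiny synthetic dataset: 4 batches of 8 samples with 4 features and 2 classes.
dataset = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(4)]

for input, target in dataset:
    def closure():
        optimizer.zero_grad()
        loss = loss_f(model(input), target)
        loss.backward()
        return loss

    loss = optimizer.step(closure)
    scheduler.step()

# Checkpoint round trip through state_dict / load_state_dict.
checkpoint = {
    "optimizer": optimizer.state_dict(),
    "scheduler": scheduler.state_dict(),
}
restored = torch.optim.SGD(model.parameters(), lr=1e-1)
restored.load_state_dict(checkpoint["optimizer"])
```

After loading, the restored optimizer carries the decayed learning rate from the checkpoint rather than the value passed to its constructor.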

Citation

@ARTICLE{2020arXiv201001412F,
    author = {{Foret}, Pierre and {Kleiner}, Ariel and {Mobahi}, Hossein and {Neyshabur}, Behnam},
    title = "{Sharpness-Aware Minimization for Efficiently Improving Generalization}",
    year = 2020,
    eid = {arXiv:2010.01412},
    eprint = {2010.01412},
}

@software{sampytorch,
    author = {Ryuichiro Hataya},
    title  = {sam.pytorch},
    url    = {https://github.com/moskomule/sam.pytorch},
    year   = {2020}
}

Owner

Ryuichiro Hataya
