A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

Overview

MADGRAD Optimization Algorithm For Tensorflow

This package implements the MadGrad Algorithm proposed in Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization (Aaron Defazio and Samy Jelassi, 2021).

MIT License version-shield release-shield python-shield code-style

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Contact
  7. Citations

About The Project

The MadGrad algorithm of optimization uses Dual averaging of gradients along with momentum based adaptivity to attain results that match or outperform Adam or SGD + momentum based algorithms. This project offers a Tensorflow implementation of the algorithm along with a few usage examples and tests.



Prerequisites

Prerequisites can be installed separately through the requirements.txt file as below

pip install -r requirements.txt

Installation

This project is built with Python 3 and can be pip installed directly

pip install tf-madgrad

Usage

Open In Colab

To use the optimizer in any tf.keras model, you just need to import and instantiate the MadGrad optimizer from the tf_madgrad package.

from madgrad import MadGrad

# Create the architecture
inp = tf.keras.layers.Input(shape=shape)
...
op = tf.keras.layers.Dense(classes, activation=activation)

# Instantiate the model
model = tf.keras.models.Model(inp, op)

# Pass the MadGrad optimizer to the compile function
model.compile(optimizer=MadGrad(lr=0.01), loss=loss)

# Fit the keras model as normal
model.fit(...)

This implementation is also supported for distributed training using tf.strategy

See a MNIST example here

Contributing

Any and all contributions are welcome. Please raise an issue if the optimizer gives incorrect results or crashes unexpectedly during training.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Feel free to reach out for any issues or requests related to this implementation

Darshan Deshpande - Email | LinkedIn

Citations

@misc{defazio2021adaptivity,
      title={Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization}, 
      author={Aaron Defazio and Samy Jelassi},
      year={2021},
      eprint={2101.11075},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Owner
Helping Machines Learn Better πŸ’»πŸ˜ƒ
Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022
Unofficial PyTorch implementation of MobileViT.

MobileViT Overview This is a PyTorch implementation of MobileViT specified in "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Tr

Chin-Hsuan Wu 348 Dec 23, 2022
an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch

This work has now been superseded by: https://github.com/sniklaus/revisiting-sepconv sepconv-slomo This is a reference implementation of Video Frame I

Simon Niklaus 985 Jan 08, 2023
Permute Me Softly: Learning Soft Permutations for Graph Representations

Permute Me Softly: Learning Soft Permutations for Graph Representations

Giannis Nikolentzos 7 Jul 10, 2022
Official repo for our 3DV 2021 paper "Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements".

Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements Yu Rong, Jingbo Wang, Ziwei Liu, Chen Change Loy Paper. Pr

Yu Rong 41 Dec 13, 2022
Sequence to Sequence (seq2seq) Recurrent Neural Network (RNN) for Time Series Forecasting

Sequence to Sequence (seq2seq) Recurrent Neural Network (RNN) for Time Series Forecasting Note: You can find here the accompanying seq2seq RNN forecas

Guillaume Chevalier 1k Dec 25, 2022
Single-Shot Motion Completion with Transformer

Single-Shot Motion Completion with Transformer πŸ‘‰ [Preprint] πŸ‘ˆ Abstract Motion completion is a challenging and long-discussed problem, which is of gr

FuxiCV 78 Dec 29, 2022
Get started learning C# with C# notebooks powered by .NET Interactive and VS Code.

.NET Interactive Notebooks for C# Welcome to the home of .NET interactive notebooks for C#! How to Install Download the .NET Coding Pack for VS Code f

.NET Platform 425 Dec 25, 2022
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

29 Oct 01, 2022
A Simple LSTM-Based Solution for "Heartbeat Signal Classification and Prediction" in Tianchi

LSTM-Time-Series-Prediction A Simple LSTM-Based Solution for "Heartbeat Signal Classification and Prediction" in Tianchi Contest. The Link of the Cont

KevinCHEN 1 Jun 13, 2022
Official PyTorch implementation for FastDPM, a fast sampling algorithm for diffusion probabilistic models

Official PyTorch implementation for "On Fast Sampling of Diffusion Probabilistic Models". FastDPM generation on CIFAR-10, CelebA, and LSUN datasets. S

Zhifeng Kong 68 Dec 26, 2022
[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

EPro-PnP EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation In CVPR 2022 (Oral). [paper] Hanshen

εŒζ΅Žε€§ε­¦ζ™Ίθƒ½ζ±½θ½¦η ”η©Άζ‰€η»Όεˆζ„ŸηŸ₯η ”η©Άη»„ ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University) 842 Jan 04, 2023
Official implementation of Long-Short Transformer in PyTorch.

Long-Short Transformer (Transformer-LS) This repository hosts the code and models for the paper: Long-Short Transformer: Efficient Transformers for La

NVIDIA Corporation 198 Dec 29, 2022
Viperdb - A tiny log-structured key-value database written in pure Python

ViperDB 🐍 ViperDB is a lightweight embedded key-value store written in pure Pyt

17 Oct 17, 2022
A PyTorch implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize

Caiyong Wang 14 Sep 20, 2022
Utilizes Pose Estimation to offer sprinters cues based on an image of their running form.

Running-Form-Correction Utilizes Pose Estimation to offer sprinters cues based on an image of their running form. How to Run Dependencies You will nee

3 Nov 08, 2022
FOSS Digital Asset Distribution Platform built on Frappe.

Digistore FOSS Digital Assets Marketplace. Distribute digital assets, like a pro. Video Demo Here Features Create, attach and list digital assets (PDF

Mohammad Hussain Nagaria 30 Dec 08, 2022
Code for approximate graph reduction techniques for cardinality-based DSFM, from paper

SparseCard Code for approximate graph reduction techniques for cardinality-based DSFM, from paper "Approximate Decomposable Submodular Function Minimi

Nate Veldt 1 Nov 25, 2022
Repository for RNNs using TensorFlow and Keras - LSTM and GRU Implementation from Scratch - Simple Classification and Regression Problem using RNNs

RNN 01- RNN_Classification Simple RNN training for classification task of 3 signal: Sine, Square, Triangle. 02- RNN_Regression Simple RNN training for

Nahid Ebrahimian 13 Dec 13, 2022
A Pose Estimator for Dense Reconstruction with the Structured Light Illumination Sensor

Phase-SLAM A Pose Estimator for Dense Reconstruction with the Structured Light Illumination Sensor This open source is written by MATLAB Run Mode Open

Xi Zheng 14 Dec 19, 2022