Adam with minor modifications which give significant improvement

Related tags

MiscellaneousBAdam
Overview

BAdam

Modification of Adam [1] optimizer with increased stability and better performance. Tricks used:

  1. Decoupled weight decay as in AdamW [2]. Such decoupling allows easier tuning of weight decay and learning rate. This implementation follows PyTorch and multiplies weight decay by learning rate, allowing simultaneous scheduling. Due to this typical values passed to optimizer should be much higher that with SGD, if you use 1e-4 with SGD, good start would be to use 1e-2 with BAdam.
  2. Epsilon is inside sqrt to avoid NaN in mixed precision. Default value is much larger than in Adam to reduce 'adaptivity' it leads to better and wider optimums [3]. Large epsilon also works better than amsgrad version of Adam [5]
  3. exp_avg_sq inits with large value, rather than with zeros. This removes the need for lr warmup and does the same thing as all the tricks from RAdam [4], while being much simpler.
  4. Removed bias correction. It's not needed if exp_avg_sq is correctly initialized

Practical Tips

Default values for this optimizer were tuned on Imagenet and work as good baseline for other computer vision tasks. Try them as is, before further tuning.

Installation

pip install git+https://github.com/bonlime/[email protected]

Reference:

[1] Adam: A Method for Stochastic Optimization
[2] Decoupled Weight Decay Regularization
[3] On the Convergence of Adam and Beyond
[4] On the Variance of the Adaptive Learning Rate and Beyond [5] Adaptive Methods for Non-convex Optimization

Owner
Emil Zakirov. MIPT & Skoltech. Computer Vision Engineer.
The repository is about 100+ python programming exercise problem discussed, explained, and solved in different ways

Break The Ice With Python A journey of 100+ simple yet interesting problems which are explained, solved, discussed in different pythonic ways Introduc

Abdullah Al Masud Tushar 2.2k Jan 04, 2023
Blender addon, import and update mixamo animation

This is a blender addon for import and update mixamo animations.

ywaby 7 Apr 19, 2022
Goal: Enable awesome tooling for Bazel users of the C language family.

Hedron's Compile Commands Extractor for Bazel — User Interface What is this project trying to do for me? First, provide Bazel users cross-platform aut

Hedron Vision 290 Dec 26, 2022
A python script for osu!lazer rulesets auto update.

osu-lazer-rulesets-autoupdater A python script for osu!lazer rulesets auto update. How to use: 如何使用: You can refer to the python script. The begining

3 Jul 26, 2022
Karte der Allgemeinverfügungen zu Schulschließungen oder eingeschränktem Regelbetrieb in Sachsen

SNSZ Karte Datenquelle: Allgemeinverfügungen zu Schulschließungen oder eingeschränktem Regelbetrieb in Sachsen Sächsisches Staatsministerium für Kultu

Jannis Leidel 3 Sep 26, 2022
Kellogg bad | Union good | Support strike funds

KelloggBot Credit to SeanDaBlack for the basis of the script. req.py is selenium python bot. sc.js is a the base of the ios shortcut [COMING SOON] Set

407 Nov 17, 2022
pybind11 — Seamless operability between C++11 and Python

pybind11 — Seamless operability between C++11 and Python Setuptools example • Scikit-build example • CMake example pybind11 is a lightweight header-on

pybind 12.1k Jan 08, 2023
Participants of Bertelsmann Technology Scholarship created an awesome list of resources and they want to share it with the world, if you find illegal resources please report to us and we will remove.

Participants of Bertelsmann Technology Scholarship created an awesome list of resources and they want to share it with the world, if you find illegal

Wissem Marzouki 29 Nov 28, 2022
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python

Scalene: a high-performance CPU, GPU and memory profiler for Python by Emery Berger, Sam Stern, and Juan Altmayer Pizzorno. Scalene community Slack Ab

PLASMA @ UMass 7k Dec 30, 2022
A simple and efficient computing package for Genshin Impact gacha analysis

GGanalysisLite计算包 这个版本的计算包追求计算速度,而GGanalysis包有着更多计算功能。 GGanalysisLite包通过卷积计算分布列,通过FFT和快速幂加速卷积计算。 测试玩家得到的排名值rank的数学意义是:与抽了同样数量五星的其他玩家相比,测试玩家花费的抽数大于等于比例

一棵平衡树 34 Nov 26, 2022
Blender addons - A collection of Blender tools I've written for myself over the years.

gret A collection of Blender tools I've written for myself over the years. I use these daily so they should be bug-free, mostly. Feel free to take and

217 Jan 08, 2023
Easy Alias's for bash

easy-alias Easy Alias's for bash Setup Your system needs to have 'echo' which every 21st century computer has You dont need any python requirments but

Hashm 2 Jan 18, 2022
To lazy to read your homework ? Get it done with LOL

LOL To lazy to read your homework ? Get it done with LOL Needs python 3.x L:::::::::L OO:::::::::OO L:::::::::L L:::::::

KorryKatti 4 Dec 08, 2022
ThnoolBox - A thneed is a multi-use versatile object

ThnoolBox Have you ever wanted a collection of bodged desktop apps that are Lorax themed ? No ? Sucks to suck I guess Apps & their downsides CalculaTh

pocoyo 1 Jan 21, 2022
WildHack 2021 solution by Nuclear Foxes team (public version).

WildHack 2021 Nuclear Foxes Team This repo contains our project for the Wildberries Hackathon 2021. Task 2: Searching tags Implement an algorithm of r

Sergey Zakharov 1 Apr 18, 2022
Tensorboard plugin 3d with python

tensorboard-plugin-3d Overview In this example, we render a run selector dropdown component. When the user selects a run, it shows a preview of all sc

KitwareMedical 26 Nov 14, 2022
An example using debezium and mysql with docker-compose

debezium-mysql An example using debezium and mysql with docker-compose The docker compose starts the Zookeeper, Kafka, Mysql and Debezium Connect. Aft

Horácio Dias Baptista Neto 4 May 21, 2022
this is a basic python project that I made using python

this is a basic python project that I made using python. This project is only for practice because my python skills are still newbie.

Elvira Firmansyah 2 Dec 14, 2022
A simple chatbot that I made for school project

Chatbot: Python A simple chatbot that I made for school Project. Tho this chatbot is dumb sometimes, but it's not too bad lol. Check it Out! FAQ How t

Prashant 2 Nov 13, 2021
Results of Robot Framework 5.0 survey

Robot Framework 5.0 survey results We had a survey asking what features Robot Framework community members would like to see in the forthcoming Robot F

Pekka Klärck 2 Oct 16, 2021