iftopt

An Implicit Function Theorem (IFT) optimizer for bi-level optimizations.

Requirements

Python 3.7+
PyTorch 1.x

Installation

$ pip install git+https://github.com/money-shredder/iftopt.git

Usage

Consider a bi-level optimization of the form:

y* = argmin_{y} val_loss(x*, y), where x* = argmin_{x} train_loss(x, y).
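
For reference, the hyper-gradient that an IFT-based method targets can be written out explicitly. This is the standard implicit-differentiation identity, not something stated in this README. It assumes train_loss is twice differentiable and that its Hessian with respect to x is invertible at x*. The symbols L_train and L_val are shorthand for train_loss and val_loss and are not iftopt names.

```latex
% Stationarity of the inner problem at x^*(y):
\frac{\partial L_{\mathrm{train}}}{\partial x}\bigl(x^*(y),\, y\bigr) = 0
% Differentiating with respect to y gives the response of x^* to y:
\frac{d x^*}{d y}
  = -\left[\frac{\partial^2 L_{\mathrm{train}}}{\partial x\, \partial x^\top}\right]^{-1}
     \frac{\partial^2 L_{\mathrm{train}}}{\partial x\, \partial y^\top}
% Chain rule on the outer objective:
\frac{d L_{\mathrm{val}}}{d y}
  = \frac{\partial L_{\mathrm{val}}}{\partial y}
  - \frac{\partial L_{\mathrm{val}}}{\partial x}
    \left[\frac{\partial^2 L_{\mathrm{train}}}{\partial x\, \partial x^\top}\right]^{-1}
    \frac{\partial^2 L_{\mathrm{train}}}{\partial x\, \partial y^\top}
```

All partial derivatives are evaluated at (x*(y), y). The second term is what lets y receive a gradient even when val_loss does not depend on y directly.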

To solve for the optimal x* and y* in the optimization problem, we can implement the following with iftopt:

import torch
from iftopt import HyperOptimizer
train_lr = val_lr = 0.1
# parameter to minimize the training loss
x = torch.nn.Parameter(...)
# hyper-parameter to minimize the validation loss
y = torch.nn.Parameter(...)
# training loss optimizer
opt = torch.optim.SGD([x], lr=train_lr)
# validation loss optimizer
hopt = HyperOptimizer(
    [y], torch.optim.SGD([y], lr=val_lr), vih_lr=0.1, vih_iterations=5)
# outer optimization loop for y
for _ in range(...):
    # inner optimization loop for x
    for _ in range(...):
        z = train_loss(x, y)
        # inner optimization step for x
        opt.zero_grad()
        z.backward()
        opt.step()
    # outer optimization step for y
    hopt.set_train_parameters([x])
    z = train_loss(x, y)
    hopt.train_step(z)
    v = val_loss(x, y)
    hopt.val_step(v)
    hopt.grad()
    hopt.step()
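
The README does not explain the vih_lr and vih_iterations arguments. One common way to avoid inverting the training Hessian in IFT-based methods is a truncated Neumann series, which needs exactly a step size and an iteration count. The sketch below illustrates that technique in plain Python. Whether iftopt works this way internally is an assumption, and neumann_inverse_hvp is a hypothetical helper, not part of the iftopt API.

```python
def neumann_inverse_hvp(hvp, v, lr, iterations):
    """Approximate H^{-1} v with a truncated Neumann series.

    Uses H^{-1} = lr * sum_{k >= 0} (I - lr * H)^k, which converges when
    the eigenvalues of lr * H lie in (0, 2). `hvp` is a callable that
    returns the Hessian-vector product H @ p for a given p.
    (Hypothetical helper for illustration; not an iftopt function.)
    """
    p = v      # current term (I - lr * H)^k v, starting at k = 0
    acc = v    # running sum of the terms
    for _ in range(iterations):
        p = p - lr * hvp(p)
        acc = acc + p
    return lr * acc


# Scalar illustration: H = 2, so the exact answer for v = 1 is 0.5.
def scalar_hvp(p):
    return 2.0 * p


coarse = neumann_inverse_hvp(scalar_hvp, 1.0, lr=0.1, iterations=5)
fine = neumann_inverse_hvp(scalar_hvp, 1.0, lr=0.1, iterations=200)
```

With few iterations the estimate is biased towards zero, so the iteration count trades accuracy for cost. More iterations move it closer to the exact inverse-Hessian-vector product.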

For a concrete simple example, please check out and run demo.py, where

train_loss = lambda x, y: (x + y) ** 2
val_loss = lambda x, y: x ** 2

with x = y = 1.0 initially. It generates a video, demo.mp4, showing the optimization trajectory. Note that although the validation loss has no direct gradient w.r.t. the hyper-parameter y, iftopt can still minimize the validation loss by computing the hyper-gradient via the implicit function theorem.
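
To see why this works on the demo losses, the hyper-gradient can be worked out by hand and plugged into the same inner/outer loop structure. The sketch below is plain Python with hand-derived derivatives and does not use iftopt or PyTorch. The function names, step counts, and the exact-Hessian inverse are illustrative assumptions, not a description of what demo.py does.

```python
def train_grad_x(x, y):
    # d/dx of train_loss = (x + y) ** 2
    return 2.0 * (x + y)


def hypergradient(x, y):
    # IFT: dval/dy = dval_dy - dval_dx * (d2train_dxdx)^-1 * d2train_dxdy
    dval_dx = 2.0 * x     # d/dx of val_loss = x ** 2
    dval_dy = 0.0         # val_loss has no direct dependence on y
    d2train_dxdx = 2.0    # second derivative of train_loss in x
    d2train_dxdy = 2.0    # mixed second derivative of train_loss
    return dval_dy - dval_dx / d2train_dxdx * d2train_dxdy


def solve(x=1.0, y=1.0, train_lr=0.1, val_lr=0.1,
          outer_steps=100, inner_steps=20):
    for _ in range(outer_steps):
        # inner loop: gradient descent on train_loss w.r.t. x
        for _ in range(inner_steps):
            x -= train_lr * train_grad_x(x, y)
        # outer step: gradient descent on val_loss w.r.t. y
        y -= val_lr * hypergradient(x, y)
    return x, y


x, y = solve()
```

Here the inner optimum is x* = -y, so val_loss(x*(y)) = y ** 2 and the hyper-gradient reduces to -2x, which equals 2y at the inner optimum. Both x and y are driven towards 0, where the validation loss is minimal.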

Owner

The Money Shredder Lab
