Code base for the paper "Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation"

Overview

This repository contains code for the paper Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation.

Installation

Our dependencies are fully specified in Pipfile, which can be supplied to pipenv to install the environment. One failsafe approach is to install pipenv in a fresh virtual environment, then run pipenv install in this directory. Note that the Pipfile specifies our Python 3.9 development environment; most experiments were run in an otherwise identical environment under Python 3.7.

Difficulties with CUDA versions meant we had to install PyTorch and Torchvision manually rather than through pipenv; the corresponding lines in Pipfile may need adjustment for your use case. Alternatively, use the list of dependencies as a guide to what to install yourself with pip, or use the full dump of our development environment in final_requirements.txt.
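As a concrete sketch of the failsafe route above (the venv commands are an assumption; substitute your preferred environment manager):

$ python3 -m venv env && source env/bin/activate
$ pip install pipenv
$ pipenv install

If you instead use final_requirements.txt, and it follows the usual pip freeze format, a plain pip install should suffice:

$ pip install -r final_requirements.txt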

Datasets are not bundled with the repository, but are expected to be found at the locations specified in datasets.py, preprocessed into single PyTorch tensors of all the input and output data (generally data/<dataset>/data.pt and data/<dataset>/targets.pt).
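For illustration only (this script is not part of the repository; the torchvision source, paths and dtypes are assumptions), Fashion-MNIST could be brought into this format roughly as follows:

import os
import torch
from torchvision import datasets

# Hypothetical preprocessing: collapse a torchvision dataset into the
# single-tensor layout data/<dataset>/data.pt and data/<dataset>/targets.pt.
root = "data/fashion_mnist"
os.makedirs(root, exist_ok=True)

train = datasets.FashionMNIST("/tmp/fmnist", train=True, download=True)
torch.save(train.data, os.path.join(root, "data.pt"))        # uint8, (60000, 28, 28)
torch.save(train.targets, os.path.join(root, "targets.pt"))  # int64, (60000,)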

Configuration

Training code is controlled with YAML configuration files, as per the examples in configs/. Generally one file is required to specify the dataset, and a second to specify the algorithm, using the obvious naming convention. Brief help text is available on the command line, but the meanings of each option should be reasonably self-explanatory.

For Ours (WD+LR), use the file Ours_LR.yaml; for Ours (WD+LR+M), use the file Ours_LR_Momentum.yaml; for Ours (WD+HDLR+M), use the file Ours_HDLR_Momentum.yaml. For Long/Medium/Full Diff-through-Opt, we provide separate configuration files for the UCI cases and the Fashion-MNIST cases.

We provide two additional helper configurations. Random_Validation.yaml copies Random.yaml, but uses the entire validation set to compute the validation loss at each logging step. This allows for stricter analysis of the best-performing run at particular time steps, for instance while constructing Random (3-batched). Random_Validation_BayesOpt.yaml forces the use of the entire validation set only for the very last validation loss computation, so that Bayesian Optimisation runs can access reliable performance metrics without adversely affecting runtime.

The configurations provided match those necessary to replicate the main experiments in our paper (in Section 4: Experiments). Other trials, such as those in the Appendix, will require these configurations to be modified as we describe in the paper. Note especially that our three short-horizon bias studies all require different modifications to the LongDiffThroughOpt_*.yaml configurations.

Running

Individual runs are commenced by executing train.py and passing the desired configuration files with the -c flag. For example, to run the default Fashion-MNIST experiments using Diff-through-Opt, use:

$ python train.py -c ./configs/fashion_mnist.yaml ./configs/DiffThroughOpt.yaml

Bayesian Optimisation runs are started in a similar way, but with a call to bayesopt.py rather than train.py.
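For example, by analogy with the train.py invocation above (the pairing of configuration files here is illustrative, using the helper configuration described in the Configuration section):

$ python bayesopt.py -c ./configs/fashion_mnist.yaml ./configs/Random_Validation_BayesOpt.yaml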

For executing multiple runs in parallel, parallel_exec.py may be useful: modify the main function call at the bottom of the file as required, then call this file instead of train.py at the command line. Any configurations passed at the command line are used as a base, to which modifications may be added by override_generator (a sketch of the two generator styles follows this list). The relevant settings are:

- num_workers specifies the number of parallel workers.
- override_generator should be either a function which generates one override dictionary per call (in which case num_repetitions sets the number of overrides to generate), or a function which returns a generator over configurations (in which case set num_repetitions = None).
- algorithms lists the algorithms to run: each configuration override is run once for each algorithm, whose configurations are read automatically from the corresponding files and should not be passed explicitly at the command line.
- main_function switches between parallel calls to train.py and bayesopt.py as required.
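As a sketch of the two accepted override_generator styles (the configuration key is a placeholder, not an option from our YAML files):

import random

def single_override():
    # One override dictionary per call; pair with num_repetitions = <count>.
    return {"seed": random.randrange(2**31)}

def override_stream():
    # Returns a generator over overrides; pair with num_repetitions = None.
    return ({"seed": seed} for seed in range(10))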

For blank-slate replications, the most useful override generators will be natural_sgd_generator, which generates a full SGD initialisation in the ranges we use, and iteration_id, which should be used with Bayesian Optimisation runs to name each parallel run using a counter. Other generators may be useful if you wish to supplement existing results with additional algorithms etc.

The PennTreebank and CIFAR-10 experiments were executed on clusters running SLURM; the corresponding subfolders contain configuration scripts for these experiments, and submit.sh handles the actual job submission.
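Depending on how submit.sh wraps the scheduler, submission may be as simple as running it from the relevant subfolder (the folder name below is illustrative):

$ cd PennTreebank && ./submit.sh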

Analysis

By default, runs are logged in Tensorboard format to the ./runs directory, where Tensorboard may be used to inspect the results. If desired, a descriptive name can be appended to a particular execution using the -n switch on the command line. Runs can optionally be written to a dedicated subfolder specified with the -g switch, and the base folder for logging can be changed with the -l switch.
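For example, the following names a run my_run, groups it under a my_group subfolder, and logs under ./logs instead of ./runs (the run and group names are placeholders; the switches are as described above):

$ python train.py -c ./configs/fashion_mnist.yaml ./configs/Ours_LR.yaml -n my_run -g my_group -l ./logs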

If more precise analysis is desired, pass the directory containing the desired results to util.get_tags(), which will return a dictionary of the evolution of each logged scalar in the results. Note that this function uses Tensorboard calls which predate its --load_fast option, so may take tens of minutes to return.
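In code, reading back a run's scalars looks something like this (the results directory is a placeholder):

from util import get_tags

# Returns a dictionary mapping each logged scalar tag to its evolution
# over the run; may take tens of minutes on large logs.
tags = get_tags("./runs/my_run")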

This data dictionary can be passed to one of the more involved plotting routines in figures.py to produce specific plots. The script paper_plots.py generates all the plots we use in our paper, and may be inspected for details of any particular plot.

Code for "Multi-Time Attention Networks for Irregularly Sampled Time Series", ICLR 2021.

Multi-Time Attention Networks (mTANs) This repository contains the PyTorch implementation for the paper Multi-Time Attention Networks for Irregularly

The Laboratory for Robust and Efficient Machine Learning 68 Dec 17, 2022
ThunderGBM: Fast GBDTs and Random Forests on GPUs

Documentations | Installation | Parameters | Python (scikit-learn) interface What's new? ThunderGBM won 2019 Best Paper Award from IEEE Transactions o

Xtra Computing Group 647 Jan 04, 2023
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

News December 27: v1.1.0 New loss functions: CentroidTripletLoss and VICRegLoss Mean reciprocal rank + per-class accuracies See the release notes Than

Kevin Musgrave 5k Jan 05, 2023
This repository contains the code used to quantitatively evaluate counterfactual examples in the associated paper.

On Quantitative Evaluations of Counterfactuals Install To install required packages with conda, run the following command: conda env create -f requi

Frederik Hvilshøj 1 Jan 16, 2022
This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices.

GBW This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices. Ap

Andi Han 0 Oct 22, 2021
571 Dec 25, 2022
NOD: Taking a Closer Look at Detection under Extreme Low-Light Conditions with Night Object Detection Dataset

NOD (Night Object Detection) Dataset NOD: Taking a Closer Look at Detection under Extreme Low-Light Conditions with Night Object Detection Dataset, BM

Igor Morawski 17 Nov 05, 2022
Leveraging Two Types of Global Graph for Sequential Fashion Recommendation, ICMR 2021

This is the repo for the paper: Leveraging Two Types of Global Graph for Sequential Fashion Recommendation Requirements OS: Ubuntu 16.04 or higher ver

Yujuan Ding 10 Oct 10, 2022
Deep Learning Datasets Maker is a QGIS plugin to make datasets creation easier for raster and vector data.

Deep Learning Dataset Maker Deep Learning Datasets Maker is a QGIS plugin to make datasets creation easier for raster and vector data. How to use Down

deepbands 25 Dec 15, 2022
PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

Lip to Speech Synthesis with Visual Context Attentional GAN This repository contains the PyTorch implementation of the following paper: Lip to Speech

6 Nov 02, 2022
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021
MAg: a simple learning-based patient-level aggregation method for detecting microsatellite instability from whole-slide images

MAg Paper Abstract File structure Dataset prepare Data description How to use MAg? Why not try the MAg_lib! Trained models Experiment and results Some

Calvin Pang 3 Apr 08, 2022
Python scripts form performing stereo depth estimation using the HITNET model in Tensorflow Lite.

TFLite-HITNET-Stereo-depth-estimation Python scripts form performing stereo depth estimation using the HITNET model in Tensorflow Lite. Stereo depth e

Ibai Gorordo 22 Oct 20, 2022
Code for "LASR: Learning Articulated Shape Reconstruction from a Monocular Video". CVPR 2021.

LASR Installation Build with conda conda env create -f lasr.yml conda activate lasr # install softras cd third_party/softras; python setup.py install;

Google 157 Dec 26, 2022
Here is the implementation of our paper S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations.

S2VC Here is the implementation of our paper S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations. In thi

81 Dec 15, 2022
Python package for Bayesian Machine Learning with scikit-learn API

Python package for Bayesian Machine Learning with scikit-learn API Installing & Upgrading package pip install https://github.com/AmazaspShumik/sklearn

Amazasp Shaumyan 482 Jan 04, 2023
Research on Tabular Deep Learning (Python package & papers)

Research on Tabular Deep Learning For paper implementations, see the section "Papers and projects". rtdl is a PyTorch-based package providing a user-f

Yura Gorishniy 510 Dec 30, 2022
Council-GAN - Implementation for our paper Breaking the Cycle - Colleagues are all you need (CVPR 2020)

Council-GAN Implementation of our paper Breaking the Cycle - Colleagues are all you need (CVPR 2020) Paper Ori Nizan , Ayellet Tal, Breaking the Cycle

ori nizan 260 Nov 16, 2022
Prevent `CUDA error: out of memory` in just 1 line of code.

🐨 Koila Koila solves CUDA error: out of memory error painlessly. Fix it with just one line of code, and forget it. 🚀 Features 🙅 Prevents CUDA error

RenChu Wang 1.7k Jan 02, 2023
The UI as a mobile display for OP25

OP25 Mobile Control Head A 'remote' control head that interfaces with an OP25 instance. We take advantage of some data end-points left exposed for the

Sarah Rose Giddings 13 Dec 28, 2022