ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Last update: Dec 08, 2022

Overview

[ 👷 🏗 👷 🏗 Coming soon! Official release with improved docs. Stay tuned. 👷 🏗 👷 🏗 ]

ViViT is a collection of numerical tricks to efficiently access curvature from the generalized Gauss-Newton (GGN) matrix based on its low-rank structure. Provided functionality includes computing

GGN eigenvalues
GGN eigenpairs (eigenvalues + eigenvectors)
1ˢᵗ- and 2ⁿᵈ-order directional derivatives along GGN eigenvectors
Newton steps

To reduce cost, these operations can further approximate the GGN via sub-sampling, Monte-Carlo approximation, and block-diagonal approximation.

How does it work? ViViT uses and extends BackPACK for PyTorch. The described functionality is realized through a combination of existing and new BackPACK extensions and hooks into its backpropagation.
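
The low-rank idea behind this functionality can be illustrated without ViViT itself. The sketch below assumes the GGN factorizes as G = V Vᵀ with a tall matrix V. The names `V`, `D`, and `M` and the random data are stand-ins chosen for illustration, not part of ViViT's API. It uses NumPy to check that the small Gram matrix Vᵀ V has the same nonzero eigenvalues as the large GGN.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 50, 5  # assumed sizes: D parameters, M columns in the low-rank factor

# Stand-in for the GGN factor V (D x M); random data for illustration only.
V = rng.standard_normal((D, M))

ggn = V @ V.T   # D x D matrix with rank at most M
gram = V.T @ V  # M x M Gram matrix, much cheaper to decompose

# eigvalsh returns eigenvalues in ascending order.
gram_evals = np.linalg.eigvalsh(gram)
ggn_top_evals = np.linalg.eigvalsh(ggn)[-M:]  # the M largest GGN eigenvalues

# The nonzero spectrum of the GGN matches the spectrum of the Gram matrix.
assert np.allclose(gram_evals, ggn_top_evals)
```

Working with the M x M Gram matrix instead of the D x D GGN is what makes the eigenvalue computation affordable when D is the number of network parameters.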

Installation

👷 🏗 👷 🏗 The PyPI release is coming soon. 👷 🏗 👷 🏗

For now, you need to install from GitHub via

pip install vivit-for-pytorch@git+https://github.com/f-dangel/vivit.git#egg=vivit-for-pytorch

Examples

👷 🏗 👷 🏗 Coming soon! 👷 🏗 👷 🏗

How to cite

If you are using ViViT, consider citing the paper

@misc{dangel2022vivit,
      title={{ViViT}: Curvature access through the generalized Gauss-Newton's low-rank structure},
      author={Felix Dangel and Lukas Tatzel and Philipp Hennig},
      year={2022},
      eprint={2106.02624},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Comments

[ADD] Warn about instabilities if eigenvalues are small

The directional gradient computation and transformation of the Newton step from Gram space into parameter space require division by the square root of the direction's eigenvalue. This is unstable if the eigenvalue is close to zero.
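
A small NumPy sketch of the problem, under stated assumptions: the matrix `V` is a random stand-in for the GGN factor, and `keep_stable_directions` and its `rel_tol` threshold are hypothetical, not part of ViViT. A nearly duplicated column makes one Gram eigenvalue numerically zero, so the factor 1/sqrt(λ) would blow up unless that direction is filtered out first.

```python
import numpy as np


def keep_stable_directions(evals, evecs, rel_tol=1e-6):
    """Drop directions whose eigenvalue is tiny relative to the largest one.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    keep = evals > rel_tol * evals.max()
    return evals[keep], evecs[:, keep]


rng = np.random.default_rng(0)
V = rng.standard_normal((30, 4))
# Make the last column nearly identical to the third one, so that the
# Gram matrix has one eigenvalue that is zero up to round-off.
V[:, 3] = V[:, 2] + 1e-9 * rng.standard_normal(30)

evals, evecs = np.linalg.eigh(V.T @ V)
stable_evals, stable_evecs = keep_stable_directions(evals, evecs)

# Dividing by sqrt(eigenvalue) is only safe for the retained directions.
scale = 1.0 / np.sqrt(stable_evals)
assert np.all(np.isfinite(scale))
```

Whether to warn, filter, or damp such directions is a design choice; the sketch only shows filtering.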

opened by f-dangel

[ADD] Clean `DirectionalDampedNewtonComputation`

Adds a directionally damped Newton step computation with a cleaned-up API.

Fixes a bug in the eigenvalue criterion used in the tests: it always picked one more eigenvalue than specified.

opened by f-dangel

[DOC] Add NTK example

Adds an example inspired by the functorch tutorial on NTKs. It demonstrates how to use vivit to compute empirical NTK matrices and makes a comparison with the functorch implementation.

opened by f-dangel

[ADD] Simplify `DirectionalDerivatives` API

Exotic features, such as using different GGNs to compute directions and directional curvatures or fully controlling which intermediate buffers are kept, have been deprecated in favor of a simpler API.

Remove the Newton step computation for now, as it relied internally on `DirectionalDerivatives`

Remove many utilities and associated tests that belonged to the exotic features

Forbid duplicate indices in subsampling

Always delete intermediate buffers other than the target quantities

opened by f-dangel

[DOC] Set up `sphinx` and RTD

This PR adds a scaffold for the doc at https://vivit.readthedocs.io/en/latest/. Code examples are integrated via sphinx-gallery (I added a preliminary logo). Pull requests are built by the CI.

To build the docs, run `make docs`. You need to install the dependencies first, for example using `pip install -e .[docs]`.

opened by f-dangel

Calculate Parameter Space Values of GGN Eigenvectors

The docs show how to calculate the Gram matrix eigenvectors, and the paper states that translating from 'Gram space' to parameter space only requires multiplying by the 'V' matrix.

What's the easiest way of implementing this?
question

opened by lk-wq
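
One way to picture the transformation asked about above, as a sketch rather than ViViT's actual API: assuming the GGN is G = V Vᵀ and (λ, ẽ) is an eigenpair of the Gram matrix Vᵀ V, the vector V ẽ / sqrt(λ) is a normalized eigenvector of G. The matrix `V` here is random stand-in data; how to obtain the real `V` from the library is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 40, 4  # assumed sizes for illustration

# Stand-in for the GGN factor V, so that GGN = V @ V.T.
V = rng.standard_normal((D, M))

# Eigenpairs in Gram space (columns of gram_evecs are the eigenvectors).
evals, gram_evecs = np.linalg.eigh(V.T @ V)

# Map each Gram eigenvector to parameter space and normalize it.
# Broadcasting divides column k by sqrt(evals[k]).
param_evecs = (V @ gram_evecs) / np.sqrt(evals)

# Check: the columns are orthonormal eigenvectors of the GGN.
ggn = V @ V.T
assert np.allclose(ggn @ param_evecs, param_evecs * evals)
assert np.allclose(param_evecs.T @ param_evecs, np.eye(M))
```

The division by sqrt(λ) is only well behaved for eigenvalues clearly above zero.
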
Detect loss function's `reduction`, error if unsupported

For now, the library only supports `reduction='mean'`. We rely on the user to choose this reduction and point this out in the documentation. It would be better for the library to detect the reduction automatically and raise an error if it is unsupported.

This can be done via a hook into BackPACK.

[ ] Implement hook that determines the loss function reduction during backpropagation

[ ] Integrate the above hook into the `*Computation` classes and raise an exception if the reduction is not supported

[ ] Remove the comments about supported reductions in the documentation
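
A simplified sketch of the check itself, assuming the loss object exposes a `reduction` attribute (as PyTorch loss modules such as `torch.nn.MSELoss` do). It inspects the loss function directly rather than hooking into BackPACK's backpropagation as proposed above. `check_reduction`, `SUPPORTED_REDUCTIONS`, and `DummyLoss` are hypothetical names used only for illustration.

```python
SUPPORTED_REDUCTIONS = ("mean",)


def check_reduction(loss_func):
    """Return the loss function's reduction, or raise if it is unsupported."""
    reduction = getattr(loss_func, "reduction", None)
    if reduction not in SUPPORTED_REDUCTIONS:
        raise NotImplementedError(
            f"Unsupported reduction {reduction!r}; supported: {SUPPORTED_REDUCTIONS}"
        )
    return reduction


class DummyLoss:
    """Stand-in for a loss module that carries a `reduction` attribute."""

    def __init__(self, reduction="mean"):
        self.reduction = reduction


check_reduction(DummyLoss("mean"))  # supported, so no exception is raised
```

Calling `check_reduction(DummyLoss("sum"))` raises `NotImplementedError`, which is the failure mode the checklist asks for.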

enhancement
opened by f-dangel

Releases (1.0.0)

1.0.0 (Jun 22, 2022)

First public release. Details about future releases will be documented in the changelog.

Owner

Felix Dangel

Machine Learning PhD student at the University of Tübingen and the Max Planck Institute for Intelligent Systems.

GitHub Repository https://github.com/f-dangel/vivit (paper: https://arxiv.org/abs/2106.02624)
