ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Related tags

Deep Learningvivit
Overview

[ ๐Ÿ‘ท ๐Ÿ— ๐Ÿ‘ท ๐Ÿ— Coming soon! Official release with improved docs. Stay tuned. ๐Ÿ‘ท ๐Ÿ— ๐Ÿ‘ท ๐Ÿ— ]

ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Python 3.7+ [tests]

ViViT is a collection of numerical tricks to efficiently access curvature from the generalized Gauss-Newton (GGN) matrix based on its low-rank structure. Provided functionality includes computing

  • GGN eigenvalues
  • GGN eigenpairs (eigenvalues + eigenvector)
  • 1หขแต—- and 2โฟแตˆ-order directional derivatives along GGN eigenvectors
  • Newton steps

These operations can also further approximate the GGN to reduce cost via sub-sampling, Monte-Carlo approximation, and block-diagonal approximation.

How does it work? ViViT uses and extends BackPACK for PyTorch. The described functionality is realized through a combination of existing and new BackPACK extensions and hooks into its backpropagation.

Installation

๐Ÿ‘ท ๐Ÿ— ๐Ÿ‘ท ๐Ÿ— The PyPI release is coming soon. ๐Ÿ‘ท ๐Ÿ— ๐Ÿ‘ท ๐Ÿ—

For now, you need to install from GitHub via

pip install vivit-for-pytorch@git+https://github.com/f-dangel/vivit.git#egg=vivit-for-pytorch

Examples

๐Ÿ‘ท ๐Ÿ— ๐Ÿ‘ท ๐Ÿ— Coming soon! ๐Ÿ‘ท ๐Ÿ— ๐Ÿ‘ท ๐Ÿ—

How to cite

If you are using ViViT, consider citing the paper

@misc{dangel2022vivit,
      title={{ViViT}: Curvature access through the generalized Gauss-Newton's low-rank structure},
      author={Felix Dangel and Lukas Tatzel and Philipp Hennig},
      year={2022},
      eprint={2106.02624},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Comments
  • [ADD] Warn about instabilities if eigenvalues are small

    [ADD] Warn about instabilities if eigenvalues are small

    The directional gradient computation and transformation of the Newton step from Gram space into parameter space require division by the square root of the direction's eigenvalue. This is unstable if the eigenvalue is close to zero.

    opened by f-dangel 1
  • [ADD] Clean `DirectionalDampedNewtonComputation`

    [ADD] Clean `DirectionalDampedNewtonComputation`

    Adds directionally damped Newton step computation with cleaned up API.

    • Fixes a bug in the eigenvalue criterion in the tests. It always picked one more eigenvalue than specified.
    opened by f-dangel 1
  • [DOC] Add NTK example

    [DOC] Add NTK example

    Adds an example inspired by the functorch tutorial on NTKs. It demonstrates how to use vivit to compute empirical NTK matrices and makes a comparison with the functorch implementation.

    opened by f-dangel 1
  • [ADD] Simplify `DirectionalDerivatives` API

    [ADD] Simplify `DirectionalDerivatives` API

    Exotic features, like using different GGNs to compute directions and directional curvatures, as well as full control of which intermediate buffers to keep, have been deprecated in favor of a simpler API.

    • Remove Newton step computation for now as it was internally relying on DirectionalDerivatives
    • Remove many utilities and associated tests from the exotic features
    • Forbid duplicate indices in subsampling
    • Always delete intermediate buffers other than the target quantities
    opened by f-dangel 1
  • [DOC] Set up `sphinx` and RTD

    [DOC] Set up `sphinx` and RTD

    This PR adds a scaffold for the doc at https://vivit.readthedocs.io/en/latest/. Code examples are integrated via sphinx-gallery (I added a preliminary logo). Pull requests are built by the CI.

    To build the docs, run make docs. You need to install the dependencies first, for example using pip install -e .[docs].

    opened by f-dangel 1
  • Calculate Parameter Space Values of GGN Eigenvectors

    Calculate Parameter Space Values of GGN Eigenvectors

    The docs show how to calculate the gram matrix eigenvectors and the paper articulates that to translate from 'gram space' to parameter space we just need to multiply by the 'V' matrix.

    What's the easiest way of implementing this?

    question 
    opened by lk-wq 1
  • Detect loss function's `reduction`, error if unsupported

    Detect loss function's `reduction`, error if unsupported

    For now, the library only supports reduction='mean'. We rely on the user to use this reduction and raise awareness about this point in the documentation. It would be better to automatically have the library detect the reduction and error if it is unsupported.

    This can be done via a hook into BackPACK.

    • [ ] Implement hook that determines the loss function reduction during backpropagation
    • [ ] Integrate the above hook into the *Computation and raise an exception if the reduction is not supported
    • [ ] Remove the comments about supported reductions in the documentation
    enhancement 
    opened by f-dangel 0
Releases(1.0.0)
Owner
Felix Dangel
Machine Learning PhD student at the University of Tรผbingen and the Max Planck Institute for Intelligent Systems.
Felix Dangel
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloลก Stanojeviฤ‡ 11 Oct 14, 2022
Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

fcn - Fully Convolutional Networks Chainer implementation of Fully Convolutional Networks. Installation pip install fcn Inference Inference is done as

Kentaro Wada 218 Oct 27, 2022
[CVPR2021] De-rendering the World's Revolutionary Artefacts

De-rendering the World's Revolutionary Artefacts Project Page | Video | Paper In CVPR 2021 Shangzhe Wu1,4, Ameesh Makadia4, Jiajun Wu2, Noah Snavely4,

49 Nov 06, 2022
Code for the paper "Multi-task problems are not multi-objective"

Multi-Task problems are not multi-objective This is the code for the paper "Multi-Task problems are not multi-objective" in which we show that the com

Michael Ruchte 5 Aug 19, 2022
Pytorch version of SfmLearner from Tinghui Zhou et al.

SfMLearner Pytorch version This codebase implements the system described in the paper: Unsupervised Learning of Depth and Ego-Motion from Video Tinghu

Clรฉment Pinard 909 Dec 22, 2022
Optimizing synthesizer parameters using gradient approximation

Optimizing synthesizer parameters using gradient approximation NASH 2021 Hackathon! These are some experiments I conducted during NASH 2021, the Neura

Jordie Shier 10 Feb 10, 2022
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

H3DS Dataset This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction Access

Crisalix 72 Dec 10, 2022
Finite difference solution of 2D Poisson equation. Can handle Dirichlet, Neumann and mixed boundary conditions.

Poisson-solver-2D Finite difference solution of 2D Poisson equation Current version can handle Dirichlet, Neumann, and mixed (combination of Dirichlet

Mohammad Asif Zaman 34 Dec 23, 2022
Probabilistic Gradient Boosting Machines

PGBM Probabilistic Gradient Boosting Machines (PGBM) is a probabilistic gradient boosting framework in Python based on PyTorch/Numba, developed by Air

Olivier Sprangers 112 Dec 28, 2022
Official PyTorch implementation and pretrained models of the paper Self-Supervised Classification Network

Self-Classifier: Self-Supervised Classification Network Official PyTorch implementation and pretrained models of the paper Self-Supervised Classificat

Elad Amrani 24 Dec 21, 2022
Meshed-Memory Transformer for Image Captioning. CVPR 2020

Mยฒ: Meshed-Memory Transformer This repository contains the reference code for the paper Meshed-Memory Transformer for Image Captioning (CVPR 2020). Pl

AImageLab 422 Dec 28, 2022
This repository contains the code for: RerrFact model for SciVer shared task

RerrFact This repository contains the code for: RerrFact model for SciVer shared task. Setup for Inference 1. Download SciFact database Download the S

Ashish Rana 1 May 22, 2022
Code for Efficient Visual Pretraining with Contrastive Detection

Code for DetCon This repository contains code for the ICCV 2021 paper "Efficient Visual Pretraining with Contrastive Detection" by Olivier J. Hรฉnaff,

DeepMind 56 Nov 13, 2022
This repository contains the implementation of the paper Contrastive Instance Association for 4D Panoptic Segmentation using Sequences of 3D LiDAR Scans

Contrastive Instance Association for 4D Panoptic Segmentation using Sequences of 3D LiDAR Scans This repository contains the implementation of the pap

Photogrammetry & Robotics Bonn 40 Dec 01, 2022
Scalable machine learning based time series forecasting

mlforecast Scalable machine learning based time series forecasting. Install PyPI pip install mlforecast Optional dependencies If you want more functio

Nixtla 145 Dec 24, 2022
Bringing sanity to world of messed-up data

Sanitize sanitize is a Python module for making sure various things (e.g. HTML) are safe to use. It was originally written by Mark Pilgrim and is dist

Alireza Savand 63 Oct 26, 2021
A PyTorch Implementation of "Neural Arithmetic Logic Units"

Neural Arithmetic Logic Units [WIP] This is a PyTorch implementation of Neural Arithmetic Logic Units by Andrew Trask, Felix Hill, Scott Reed, Jack Ra

Kevin Zakka 181 Nov 18, 2022
Implementation of the Chamfer Distance as a module for pyTorch

Chamfer Distance for pyTorch This is an implementation of the Chamfer Distance as a module for pyTorch. It is written as a custom C++/CUDA extension.

Christian Diller 205 Jan 05, 2023