What can linearized neural networks actually say about generalization?

Overview

What can linearized neural networks actually say about generalization?

This is the source code to reproduce the experiments of the NeurIPS 2021 paper "What can linearized neural networks actually say about generalization?" by Guillermo Ortiz-Jimenez, Seyed-Mohsen Moosavi-Dezfooli and Pascal Frossard.

Dependencies

To run the code, please install all its dependencies by running:

$ pip install -r requirements.txt

This assumes that you have access to a Linux machine with an NVIDIA GPU with CUDA>=11.1. Otherwise, please check the instructions to install JAX with your setup in the corresponding repository.

In general, all scripts are parameterized using hydra and their configuration files can be found in the config/ folder.

Experiments

The repository contains code to reproduce the following experiments:

Spectral decomposition of NTK

To generate our new benchmark, consisting on the eigenfunctions of the NTK at initialization, please run the python script compute_ntk.py selecting a desired model (e.g., mlp, lenet or resnet18) and supporting dataset (e.g., cifar10 or mnist). This can be done by running

$ python compute_ntk.py model=lenet data.dataset=cifar10

This script will save the eigenvalues, eigenfunctions and weights of the model under artifacts/eigenfunctions/{data.dataset}/{model}/.

For other configuration options, please consult the configuration file config/compute-ntk/config.yaml.

Warning

Take into account that, for large models, this computation can take very long. For example, it took us two days to compute the full eigenvalue decomposition of the NTK of one randomly initialized ResNet18 using 4 NVIDIA V100 GPUs. The estimation of eigenvectors for the MLP or the LeNet, on the other hand, can be done in a matter of minutes, depending on the number of GPUs available and the selected batch_size

Training on binary eigenfunctions

Once you have estimated the eigenfunctions of the NTK, you should be able to train on any of them. To that end, select the desired label_idx (i.e. eigenfunction index), model and dataset, and run

$ python train_ntk.py label_idx=100 model=lenet data.dataset=cifar10 linearize=False

You can choose to train with the original non-linear network, or its linear approximation by specifying your choice with the flag linearize. For the non-linear models, this script also computes the final alignment of the end NTK with the target function, which it stores under artifacts/eigenfunctions/{data.dataset}/{model}/alignment_plots/

To see the different supported training options, please consult the configuration file config/train-ntk/config.yaml.

Estimation of NADs

We also provide code to compute the NADs of a CNN architecture (e.g., lenet or resnet18) using the alignment with the NTK at initialization. To do so, please run

$ python compute_nads.py model=lenet

This script will save the eigenvalues, NADs and weights of the model under artifacts/nads/{model}/.

For other configuration options, please consult the configuration file config/compute-nads/config.yaml.

Training on linearly separable datasets

Once you have estimated the NADs of a network, you should be able to train on linearly separable datasets with a single NAD as discriminative feature. To that end, select the desired label_idx (i.e. NAD index) and model, and run

$ python train_nads.py label_idx=100 model=lenet linearize=False

You can choose to train with the original non-linear network, or its linear approximation by specifying your choice with the flag linearize.

To see the different supported training options, please consult the configuration file config/train-nads/config.yaml.

Comparison of training dynamics with pretrained NTK

We also provide code to compare the training dynamics of the linearize network at initialization, and after non-linear pretraining, to estimate a particular eigenfunction of the NTK at initialization. To do this, please run

$ python pretrained_ntk_comparison.py label_idx=100 model=lenet data.dataset=cifar10

To see the different supported training options, please consult the configuration file config/pretrained_ntk_comparison/config.yaml.

Training on CIFAR2

Finally, you can train a neural network and its linearize approximation on the binary version of CIFAR10, i.e., CIFAR2. To do this, please run

$ python train_cifar.py model=lenet linearize=False

To see the different supported training options, please consult the configuration file config/binary-cifar/config.yaml.

Reference

If you use this code, please cite the following paper:

@InCollection{Ortiz-JimenezNeurIPS2021,
  title = {What can linearized neural networks actually say about generalization?},
  author = {{Ortiz-Jimenez}, Guillermo and {Moosavi-Dezfooli}, Seyed-Mohsen and Frossard, Pascal},
  booktitle = {Advances in Neural Information Processing Systems 35},
  month = Dec,
  year = {2021}
}
Owner
gortizji
PhD student at EPFL
gortizji
Fast EMD for Python: a wrapper for Pele and Werman's C++ implementation of the Earth Mover's Distance metric

PyEMD: Fast EMD for Python PyEMD is a Python wrapper for Ofir Pele and Michael Werman's implementation of the Earth Mover's Distance that allows it to

William Mayner 433 Dec 31, 2022
Refactoring dalle-pytorch and taming-transformers for TPU VM

Text-to-Image Translation (DALL-E) for TPU in Pytorch Refactoring Taming Transformers and DALLE-pytorch for TPU VM with Pytorch Lightning Requirements

Kim, Taehoon 61 Nov 07, 2022
.NET bindings for the Pytorch engine

TorchSharp TorchSharp is a .NET library that provides access to the library that powers PyTorch. It is a work in progress, but already provides a .NET

Matteo Interlandi 17 Aug 30, 2021
A Simple Example for Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env

Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env This repository implements a simple algorithm for imitation learning: DAGGER. In thi

Hao 66 Nov 23, 2022
A sketch extractor for anime/illustration.

Anime2Sketch Anime2Sketch: A sketch extractor for illustration, anime art, manga By Xiaoyu Xiang Updates 2021.5.2: Upload more example results of anim

Xiaoyu Xiang 1.6k Jan 01, 2023
Rule Based Classification Project

Kural Tabanlı Sınıflandırma ile Potansiyel Müşteri Getirisi Hesaplama İş Problemi: Bir oyun şirketi müşterilerinin bazı özelliklerini kullanaraknseviy

Şafak 1 Jan 12, 2022
The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019) News [2020/07/05] A very nice blog from Towards Data Science introd

Leo Xiao 3.9k Jan 05, 2023
Sequence modeling benchmarks and temporal convolutional networks

Sequence Modeling Benchmarks and Temporal Convolutional Networks (TCN) This repository contains the experiments done in the work An Empirical Evaluati

CMU Locus Lab 3.5k Jan 01, 2023
PyTorch implementation of Deformable Convolution

Deformable Convolutional Networks in PyTorch This repo is an implementation of Deformable Convolution. Ported from author's MXNet implementation. Buil

411 Dec 16, 2022
Official Pytorch implementation of Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

Scene Representation Networks This is the official implementation of the NeurIPS submission "Scene Representation Networks: Continuous 3D-Structure-Aw

Vincent Sitzmann 365 Jan 06, 2023
Code for GNMR in ICDE 2021

GNMR Code for GNMR in ICDE 2021 Please unzip data files in Datasets/MultiInt-ML10M first. Run labcode_preSamp.py (with graph sampling) for ECommerce-c

7 Oct 27, 2022
Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[TensorFlow] Protein Interface Prediction using Graph Convolutional Networks Unofficial TensorFlow implementation of Protein Interface Prediction usin

YeongHyeon Park 9 Oct 25, 2022
A Traffic Sign Recognition Project which can help the driver recognise the signs via text as well as audio. Can be used at Night also.

Traffic-Sign-Recognition In this report, we propose a Convolutional Neural Network(CNN) for traffic sign classification that achieves outstanding perf

Mini Project 64 Nov 19, 2022
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Junbin Xiao 50 Nov 24, 2022
Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

Box_Discretization_Network This repository is built on the pytorch [maskrcnn_benchmark]. The method is the foundation of our ReCTs-competition method

Yuliang Liu 266 Nov 24, 2022
The codes and related files to reproduce the results for Image Similarity Challenge Track 1.

ISC-Track1-Submission The codes and related files to reproduce the results for Image Similarity Challenge Track 1. Required dependencies To begin with

Wenhao Wang 115 Jan 02, 2023
This is a simple backtesting framework to help you test your crypto currency trading. It includes a way to download and store historical crypto data and to execute a trading strategy.

You can use this simple crypto backtesting script to ensure your trading strategy is successful Minimal setup required and works well with static TP a

Andrei 154 Sep 12, 2022
Unofficial PyTorch implementation of MobileViT.

MobileViT Overview This is a PyTorch implementation of MobileViT specified in "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Tr

Chin-Hsuan Wu 348 Dec 23, 2022
VLGrammar: Grounded Grammar Induction of Vision and Language

VLGrammar: Grounded Grammar Induction of Vision and Language

Yining Hong 27 Dec 23, 2022
Multi-Anchor Active Domain Adaptation for Semantic Segmentation (ICCV 2021 Oral)

Multi-Anchor Active Domain Adaptation for Semantic Segmentation Munan Ning*, Donghuan Lu*, Dong Wei†, Cheng Bian, Chenglang Yuan, Shuang Yu, Kai Ma, Y

Munan Ning 36 Dec 07, 2022