[NeurIPS'20] Multiscale Deep Equilibrium Models

Related tags

Deep Learningmdeq
Overview

Multiscale Deep Equilibrium Models

💥 💥 💥 💥

This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simpler & more efficient) implementation of MDEQ with the same set of tasks as here is now available in the DEQ repo.

We STRONGLY recommend using with the MDEQ-Vision code in the DEQ repo (which also supports Jacobian-related analysis).

💥 💥 💥 💥


This repository contains the code for the multiscale deep equilibrium (MDEQ) model proposed in the paper Multiscale Deep Equilibrium Models by Shaojie Bai, Vladlen Koltun and J. Zico Kolter.

Is implicit deep learning relevant for general, large-scale pattern recognition tasks? We propose the multiscale deep equilibrium (MDEQ) model, which expands upon the DEQ formulation substantially to introduce simultaneous equilibrium modeling of multiple signal resolutions. Specifically, MDEQ solves for and backpropagates through synchronized equilibria of multiple feature representation streams. Such structure rectifies one of the major drawbacks of DEQ, and provide natural hierarchical interfaces for auxiliary losses and compound training procedures (e.g., pretraining and finetuning). Our experiment demonstrate for the first time that "shallow" implicit models can scale to and achieve near-SOTA results on practical computer vision tasks (e.g., megapixel images on Cityscapes segmentation).

We provide in this repo the implementation and the links to the pretrained classification & segmentation MDEQ models.

If you find thie repository useful for your research, please consider citing our work:

@inproceedings{bai2020multiscale,
    author    = {Shaojie Bai and Vladlen Koltun and J. Zico Kolter},
    title     = {Multiscale Deep Equilibrium Models},
    booktitle   = {Advances in Neural Information Processing Systems (NeurIPS)},
    year      = {2020},
}

Overview

The structure of a multiscale deep equilibrium model (MDEQ) is shown below. All components of the model are shown in this figure (in practice, we use n=4).

Examples

Some examples of MDEQ segmentation results on the Cityscapes dataset.

Requirements

PyTorch >=1.4.0, torchvision >= 0.4.0

Datasets

  • CIFAR-10: We download the CIFAR-10 dataset using PyTorch's torchvision package (included in this repo).
  • ImageNet We follow the implementation from the PyTorch ImageNet Training repo.
  • Cityscapes: We download the Cityscapes dataset from its official website and process it according to this repo. Cityscapes dataset additionally require a list folder that aligns each original image with its corresponding labeled segmented image. This list folder can be downloaded here.

All datasets should be downloaded, processed and put in the respective data/[DATASET_NAME] directory. The data/ directory should look like the following:

data/
  cityscapes/
  imagenet/
  ...          (other datasets)
  list/        (see above)

Usage

All experiment settings are provided in the .yaml files under the experiments/ folder.

To train an MDEQ classification model on ImageNet/CIFAR-10, do

python tools/cls_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

To train an MDEQ segmentation model on Cityscapes, do

python -m torch.distributed.launch --nproc_per_node=4 tools/seg_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

where you should provide the pretrained ImageNet model path in the corresponding configuration (.yaml) file. We provide a sample pretrained model extractor in pretrained_models/, but you can also write your own script.

Similarly, to test the model and generate segmentation results on Cityscapes, do

python tools/seg_test.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

You can (and probably should) initiate the Cityscapes training with an ImageNet-pretrained MDEQ. You need to extract the state dict from the ImageNet checkpointed model, and set the MODEL.PRETRAINED entry in Cityscapes yaml file to this state dict on disk.

The model implementation and MDEQ's algorithmic components (e.g., L-Broyden's method) can be found in lib/.

Pre-trained Models

We provide some reasonably good pre-trained weights here so that one can quickly play with DEQs without training from scratch.

Description Task Dataset Model
MDEQ-XL ImageNet Classification ImageNet download (.pkl)
MDEQ-XL Cityscapes(val) Segmentation Cityscapes download (.pkl)
MDEQ-Small ImageNet Classification ImageNet download (.pkl)
MDEQ-Small Cityscapes(val) Segmentation Cityscapes download (.pkl)

I. Example of how to evaluate the pretrained ImageNet model:

  1. Download the pretrained ImageNet .pkl file. (I recommend using the gdown command!)
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. Run the MDEQ classification validation command:
python tools/cls_valid.py --testModel pretrained_models/[FILENAME] --cfg experiments/imagenet/cls_mdeq_[SIZE].yaml

For example, for MDEQ-Small, you should get >75% top-1 accuracy.

II. Example of how to use the pretrained ImageNet model to train on Cityscapes:

  1. Download the pretrained ImageNet .pkl file.
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. In the corresponding experiments/cityscapes/seg_MDEQ_[SIZE].yaml (where SIZE is typically SMALL, LARGE or XL), set MODEL.PRETRAINED to "pretrained_models/[FILENAME]".
  4. Run the MDEQ segmentation training command (see the "Usage" section above):
python -m torch.distributed.launch --nproc_per_node=[N_GPUS] tools/seg_train.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml

III. Example of how to use the pretrained Cityscapes model for inference:

  1. Download the pretrained Cityscapes .pkl file
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. In the corresponding experiments/cityscapes/seg_MDEQ_[SIZE].yaml (where SIZE is typically SMALL, LARGE or XL), set TEST.MODEL_FILE to "pretrained_models/[FILENAME]".
  4. Run the MDEQ segmentation testing command (see the "Usage" section above):
python tools/seg_test.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml

Tips:

  • To load the Cityscapes pretrained model, download the .pkl file and specify the path in config.[TRAIN/TEST].MODEL_FILE (which is '' by default) in the .yaml files. This is different from setting MODEL.PRETRAINED, see the point below.
  • The difference between [TRAIN/TEST].MODEL_FILE and MODEL.PRETRAINED arguments in the yaml files: the former is used to load all of the model parameters; the latter is for compound training (e.g., when transferring from ImageNet to Cityscapes, we want to discard the final classifier FC layers).
  • The repo supports checkpointing of models at each epoch. One can resume from a previously saved checkpoint by turning on the TRAIN.RESUME argument in the yaml files.
  • Just like DEQs, the MDEQ models can be slower than explicit deep networks, and even more so as the image size increases (because larger images typically require more Broyden iterations to converge well; see Figure 5 in the paper). But one can play with the forward and backward thresholds to adjust the runtime.

Acknowledgement

Some utilization code (e.g., model summary and yaml processing) of this repo were modified from the HRNet repo and the DEQ repo.

Owner
CMU Locus Lab
Zico Kolter's Research Group
CMU Locus Lab
Subgraph Based Learning of Contextual Embedding

SLiCE Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks Dataset details: We use four public benchmark da

Pacific Northwest National Laboratory 27 Dec 01, 2022
End-to-end image segmentation kit based on PaddlePaddle.

English | 简体中文 PaddleSeg PaddleSeg has released the new version including the following features: Our team won the 6.2k Jan 02, 2023

Membership Inference Attack against Graph Neural Networks

MIA GNN Project Starter If you meet the version mismatch error for Lasagne library, please use following command to upgrade Lasagne library. pip insta

6 Nov 09, 2022
A DeepStack custom model for detecting common objects in dark/night images and videos.

DeepStack_ExDark This repository provides a custom DeepStack model that has been trained and can be used for creating a new object detection API for d

MOSES OLAFENWA 98 Dec 24, 2022
MediaPipe is a an open-source framework from Google for building multimodal

MediaPipe is a an open-source framework from Google for building multimodal (eg. video, audio, any time series data), cross platform (i.e Android, iOS, web, edge devices) applied ML pipelines. It is

Bhavishya Pandit 3 Sep 30, 2022
Unbiased Learning To Rank Algorithms (ULTRA)

This is an Unbiased Learning To Rank Algorithms (ULTRA) toolbox, which provides a codebase for experiments and research on learning to rank with human annotated or noisy labels.

71 Dec 01, 2022
SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning

SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning This repository is the official implementation of "SHRIMP: Sparser Random Featur

Bobby Shi 0 Dec 16, 2021
Wordle Env: A Daily Word Environment for Reinforcement Learning

Wordle Env: A Daily Word Environment for Reinforcement Learning Setup Steps: git pull [email&#

2 Mar 28, 2022
Liver segmentation using MONAI and pytorch

Machine Learning use case in the field of Healthcare. In this project MONAI and pytorch frameworks are used for 3D Liver segmentation.

Abhishek Gajbhiye 2 May 30, 2022
Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection

LMFD-PAD Note This is the official repository of the paper: LMFD-PAD: Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechani

28 Dec 02, 2022
Pytorch and Torch testing code of CartoonGAN

CartoonGAN-Test-Pytorch-Torch Pytorch and Torch testing code of CartoonGAN [Chen et al., CVPR18]. With the released pretrained models by the authors,

Yijun Li 642 Dec 27, 2022
Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

R2RNet Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network." Jiang Hai, Zhu Xuan, Ren Yang, Yutong Hao, Fengzhu

77 Dec 24, 2022
Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

We challenge a common assumption underlying most supervised deep learning: that a model makes a prediction depending only on its parameters and the features of a single input. To this end, we introdu

OATML 360 Dec 28, 2022
retweet 4 satoshi ⚡️

rt4sat retweet 4 satoshi This bot is the codebase for https://twitter.com/rt4sat please feel free to create an issue if you saw any bugs basically thi

6 Sep 30, 2022
This program presents convolutional kernel density estimation, a method used to detect intercritical epilpetic spikes (IEDs)

Description This program presents convolutional kernel density estimation, a method used to detect intercritical epilpetic spikes (IEDs) in [Gardy et

Ludovic Gardy 0 Feb 09, 2022
Install alphafold on the local machine, get out of docker.

AlphaFold This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP

Kui Xu 73 Dec 13, 2022
Tech Resources for Academic Communities

Free tech resources for faculty, students, researchers, life-long learners, and academic community builders for use in tech based courses, workshops, and hackathons.

Microsoft 2.5k Jan 04, 2023
The Agriculture Domain of ERPNext comes with features to record crops and land

Agriculture The Agriculture Domain of ERPNext comes with features to record crops and land, track plant, soil, water, weather analytics, and even trac

Frappe 21 Jan 02, 2023
LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Deep-Leafsnap Convolutional Neural Networks have become largely popular in image tasks such as image classification recently largely due to to Krizhev

Sujith Vishwajith 48 Nov 27, 2022
SeisComP/SeisBench interface to enable deep-learning (re)picking in SeisComP

scdlpicker SeisComP/SeisBench interface to enable deep-learning (re)picking in SeisComP Objective This is a simple deep learning (DL) repicker module

Joachim Saul 6 May 13, 2022