Official PyTorch implementation of Spatial Dependency Networks.

Overview

Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling



Example of SDN-VAE generated images.

Method Description

The spatial dependency network (SDN) is a novel neural architecture. It is built from spatial dependency layers, which are designed for stacking into deep neural networks that produce images, e.g. generative models such as VAEs and GANs, or segmentation, super-resolution, and image-to-image-translation networks. SDNs improve upon celebrated CNNs by explicitly modeling spatial dependencies between feature vectors at each level of a deep neural network pipeline. Spatial dependency layers (i) explicitly introduce the inductive bias of spatial coherence, and (ii) offer improved modeling of long-range dependencies thanks to their unbounded receptive field. We applied SDN to two variants of VAE: one used to model image density (SDN-VAE) and one used to learn better-disentangled representations. More generally, spatial dependency layers can be used as a drop-in replacement for convolutional layers in any image-generation-related task.

Graphical model of an SDN layer.
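To make the idea concrete, below is a toy, heavily simplified sketch of a single-direction (top-to-bottom) spatial dependency sweep in PyTorch. It only illustrates the recurrence across spatial positions that gives the unbounded receptive field; the actual layers, including multi-direction sweeps, live in 'lib/nn.py', and none of the names below come from the repo.

import torch
import torch.nn as nn

class ToyVerticalSweep(nn.Module):
    """Toy single-direction spatial dependency sweep: each row's output
    depends on the current input row and the state carried down from the
    row above, so the receptive field grows without bound along the sweep."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions act as per-position linear projections.
        self.project_in = nn.Conv2d(channels, channels, kernel_size=1)
        self.project_state = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        state = torch.zeros(b, c, 1, w, device=x.device, dtype=x.dtype)
        rows = []
        for i in range(h):
            row = x[:, :, i:i + 1, :]  # current input row
            update = torch.tanh(self.project_in(row) + self.project_state(state))
            g = torch.sigmoid(self.gate(row))      # per-position gate
            state = g * update + (1 - g) * state   # carry state downward
            rows.append(state)
        return torch.cat(rows, dim=2)              # (b, c, h, w)

# out = ToyVerticalSweep(64)(torch.randn(2, 64, 8, 8))  # shape-preserving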

Code Structure

.
├── checkpoints/               # where the model checkpoints will be stored
├── data/
│   ├── ImageNet32/            # where ImageNet32 data is stored
│   ├── CelebAHQ256/           # where Celeb data is stored
│   ├── 3DShapes/              # where 3DShapes data is stored
│   ├── lmdb_datasets.py       # LMDB data loading borrowed from https://github.com/NVlabs/NVAE
│   └── get_dataset.py         # auxiliary script for fetching data sets
├── figs/                      # figures from the paper
├── lib/
│   ├── DensityVAE             # SDN-VAE which we used for density estimation
│   ├── DisentanglementVAE     # VAE which we used for disentanglement
│   ├── nn.py                  # the script which contains SDN and other neural net modules
│   ├── probability.py         # probability models
│   └── utils.py               # utility functions
├── train.py                   # generic training script
├── evaluate.py                # the script for evaluation of trained models
├── train_cifar.sh             # for reproducing CIFAR10 experiments
├── train_celeb.sh             # for reproducing CelebAHQ256 experiments
├── train_imagenet.sh          # for reproducing ImageNet32 experiments
├── train_3dshapes.sh          # for reproducing 3DShapes experiments
├── requirements.txt
├── LICENSE
└── README.md

Applying SDN layers to your neural network

To apply SDN layers in your framework, it is sufficient to integrate the 'lib/nn.py' file into your code. You can then import and use SDNLayer or ResSDNLayer (the residual variant) the same way a convolutional layer is used. Apart from PyTorch, no additional packages are required.
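A minimal usage sketch follows. The module names SDNLayer and ResSDNLayer come from the repo, but the constructor arguments shown here are placeholders; check 'lib/nn.py' for the actual signatures.

import torch
from lib.nn import ResSDNLayer  # lib/nn.py is the only file you need to copy over

# Placeholder arguments: consult lib/nn.py for the real constructor signature.
layer = ResSDNLayer(64, 64)              # e.g. input/output channel counts
features = torch.randn(8, 64, 32, 32)    # (batch, channels, height, width)
out = layer(features)                    # called exactly like a convolutional layer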

Tips & Tricks

If you would like to integrate SDN into your neural network, we recommend the following:

  • First design and debug your framework using vanilla CNN layers.
  • Replace CNN layers one by one, starting at the lowest scale, e.g. 4x4 or 8x8, to speed up debugging.
  • Start with 1 or 2 directions, and only later try 4 directions.
  • A larger number of features per SDN layer yields a more expressive model, which is more powerful but also more prone to overfitting.
  • It is a good idea to use fewer SDN features at smaller scales and more at larger scales (see the sketch after this list).
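As a concrete reading of the last two tips, a feature schedule might look as follows. All numbers are illustrative assumptions, not values from the paper.

# Assumed values only: fewer SDN features at small scales, more at large ones.
sdn_features_per_scale = {4: 32, 8: 64, 16: 96, 32: 128}  # scale -> feature count
num_directions = 2  # start with 1-2 directions, switch to 4 once training is stable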

Reproducing the experiments from the paper

Common to all experiments, you will need to install PyTorch and PyTorch Lightning. The default logging system is based on Wandb, but this can be changed in 'train.py'. If you decide to use Wandb, you will need to install it and then log into your account by following the very simple procedure described here.

To reproduce the density estimation experiments, you will need 8 Tesla V100 GPUs with 32 GB of memory each. One way to alleviate the high memory requirements is to accumulate gradient batches; however, training will take much longer in that case. By default, you will need hardware that supports automatic mixed precision. If your hardware does not support it, you will need to reduce the batch size; note that the results will slightly deteriorate, and that you may also need to reduce the learning rate to avoid NaN values. For the disentanglement experiments, you will need a single GPU with >10 GB of memory. To install all the requirements use:

pip install -r requirements.txt

A note of caution: make sure the right version of PyTorch Lightning is used. We found multiple issues in the newer versions.
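The memory-related options mentioned above map onto standard PyTorch Lightning Trainer flags. Below is a sketch assuming the Lightning 1.x API; the exact version pinned in 'requirements.txt' may differ.

import pytorch_lightning as pl

# Assumed PyTorch Lightning 1.x flags; names may differ in other versions.
trainer = pl.Trainer(
    gpus=8,                     # the density-estimation setup from the paper
    precision=16,               # automatic mixed precision, if supported
    accumulate_grad_batches=2,  # trades extra training time for lower memory
)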

CIFAR10

The data will be downloaded automatically through PyTorch. To run the baselines that reproduce the results from the paper, use:

bash train_cifar.sh

ImageNet32

To obtain the dataset, go into the 'data/ImageNet32' folder and run:

bash get_imagenet_data.sh

To reproduce the experiments run:

bash train_imagenet.sh

CelebAHQ256

To obtain the dataset, go into the 'data/CelebAHQ256' folder and run:

bash get_celeb_data.sh

The script is adapted from the NVAE repo and is based on the GLOW dataset. To reproduce the experiments, run:

bash train_celeb.sh

3DShapes

To obtain the dataset, follow the instructions in this GitHub repo and place it in the 'data/3DShapes' directory. To reproduce the experiments, run:

bash train_3dshapes.sh

Evaluation of trained models

To perform post hoc evaluation of your trained models, use the 'evaluate.py' script and select the flags corresponding to the evaluation task and the model you want to use. Evaluation can be performed on a single GPU of any type, though note that the batch size needs to be adjusted to the available GPU memory. For the CelebAHQ256 dataset, you can download a checkpoint containing one of the pre-trained models used in the paper from this link. For example, you can evaluate the ELBO and generate random samples by running:

python3 evaluate.py --model CelebAHQ256 --elbo --sampling

Citation

Please cite our paper if you use our code or if you re-implement our method:

@conference{miladinovic21sdn,
  title = {Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling},
  author = {Miladinović, {\DJ}or{\dj}e and Stanić, Aleksandar and Bauer, Stefan and Schmidhuber, J{\"u}rgen and Buhmann, Joachim M.},
  booktitle = {9th International Conference on Learning Representations (ICLR 2021)},
  month = may,
  year = {2021},
  month_numeric = {5}
}

Note that you might need to include the following line in your latex file:

\usepackage[T1]{fontenc}