Official PyTorch implementation of Spatial Dependency Networks.

Related tags

Deep Learningsdn
Overview

Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling



Example of SDN-VAE generated images.

Method Description

Spatial dependency network (SDN) is a novel neural architecture. It is based on spatial dependency layers which are designed for stacking deep neural networks that produce images e.g. generative models such as VAEs or GANs or segmentation, super-resolution and image-to-image-translation neural networks. SDNs improve upon celebrated CNNs by explicitly modeling spatial dependencies between feature vectors at each level of a deep neural network pipeline. Spatial dependency layers (i) explicitly introduce the inductive bias of spatial coherence; and (ii) offer improved modeling of long-range dependencies due to the unbounded receptive field. We applied SDN to two variants of VAE, one which we used to model image density (SDN-VAE) and one which we used to learn better disentangled representations. More generally, spatial dependency layers can be used as a drop-in replacement for convolutional layers in any image-generation-related tasks.

Graphical model of SDN layer.

Code Structure

.
├── checkpoints/               # where the model checkpoints will be stored
├── data/
     ├── ImageNet32/           # where ImageNet32 data is stored
     ├── CelebAHQ256/          # where Celeb data is stored
     ├── 3DShapes/             # where 3DShapes data is stored
     ├── lmdb_datasets.py      # LMDB data loading borrowed from https://github.com/NVlabs/NVAE
     ├── get_dataset.py        # auxiliary script for fetching data sets
├── figs/                      # figures from the paper
├── lib/
     ├── DensityVAE            # SDN-VAE which we used for density estimation
     ├── DisentanglementVAE    # VAE which we used for disentanglement
     ├── nn.py                 # The script which contains SDN and other neural net modules
     ├── probability.py        # probability models
     ├── utils.py              # utility functions
 ├── train.py                  # generic training script
 ├── evaluate.py               # the script for evaluation of trained models
 ├── train_cifar.sh            # for reproducing CIFAR10 experiments
 ├── train_celeb.sh            # for reproducing CelebAHQ256 experiments
 ├── train_imagenet.sh         # for reproducing ImageNet32 experiments
 ├── train_3dshapes.sh         # for reproducing 3DShapes experiments
 ├── requirements.txt
 ├── LICENSE
 └── README.md

Applying SDN layers to your neural network

To apply SDN layers to your framework it is sufficient that you integrate the 'lib/nn.py' file into your code. You can then import and utilize SDNLayer or ResSDNLayer (the residual variant) in the same way convolutional layer is utilized. Apart from PyTorch, no additional packages are required.

Tips & Tricks

If you would like to integrate SDN into your neural network, we recommend the following:

  • first design and debug your framework using vanilla CNN layers.
  • replace CNN layers one-by-one. Start with the lowest scale e.g. 4x4 or 8x8 to speed up debugging.
  • start with 1 or 2 directions, and then later on try using 4 directions.
  • larger number of features per SDN layers implies more expressive model which is more powerful but prone to overfitting.
  • a good idea is to use smaller number of SDN features on smaller scales and larger on larger scales.

Reproducing the experiments from the paper

Common to all experiments, you will need to install PyTorch and PyTorchLightning. The default logging system is based on Wandb but this can be changed in 'train.py'. In case you decide to use Wandb, you will need to install it and then login into your account: Follow a very simple procedure described here. To reproduce density estimation experiments you will need 8 TeslaV100 GPUs with 32Gb of memory. One way to alleviate high memory requirements is to accumulate gradient batches, however, the training will take much longer in that case. By default, you will need hardware that supports automatic mixed precision. In case your hardware does not support this, you will need to reduce the batch size, however note that the results will slightly deteriorate and that you will possibly need to reduce the learning rate too to avoid NaN values. For the disentanglement experiments, you will need a single GPU with >10Gb of memory. To install all the requirements use:

pip install -r requirements.txt

Note of caution: Ensure the right version of PyTorchLightning is used. We found multiple issues in the newer versions.

CIFAR10

The data will be automatically downloaded through PyTorch. To run the baselines that reproduce the results from the paper use:

bash train_cifar.sh
ImageNet32

To obtain the dataset go into the folder 'data/ImageNet32' and then run

bash get_imagenet_data.sh

To reproduce the experiments run:

bash train_imagenet.sh
CelebAHQ256

To obtain the dataset go into the folder 'data/CelebAHQ256' and then run

bash get_celeb_data.sh

The script is adapted from NVAE repo and is based on GLOW dataset. To reproduce the experiments run:

bash train_celeb.sh
3DShapes

To obtain the dataset follow the instructions on this GitHub repo. Place it into the 'data/3DShapes' directory. To reproduce the experiments run:

bash train_3dshapes.sh

Evaluation of trained models

To perform post hoc evaluation of your trained models, use 'evaluate.py' script and select flags corresponding to the evaluation task and the model you want to use. The evaluation can be performed on a single GPU of any type, though note that the batch size needs to be modified dependent on the available GPU memory. For the CelebAHQ256 dataset, you can download the checkpoint which contains one of the pre-trained models that we used in the paper from this link. For example, you can evaluate elbo and generate random samples by running:

python3 evaluate.py --model CelebAHQ256 --elbo --sampling

Citation

Please cite our paper if you use our code or if you re-implement our method:

@conference{miladinovic21sdn,
  title = {Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling},
  author = {Miladinović, {\DJ}or{\dj}e and Stanić, Aleksandar and Bauer, Stefan and Schmidhuber, J{\"u}rgen and Buhmann, Joachim M.},
  booktitle = {9th International Conference on Learning Representations (ICLR 2021)},
  month = may,
  year = {2021},
  month_numeric = {5}
}

Note that you might need to include the following line in your latex file:

\usepackage[T1]{fontenc}
Owner
Djordje Miladinovic
Machine learning R&D.
Djordje Miladinovic
Neural style in TensorFlow! 🎨

neural-style An implementation of neural style in TensorFlow. This implementation is a lot simpler than a lot of the other ones out there, thanks to T

Anish Athalye 5.5k Dec 29, 2022
BridgeGAN - Tensorflow implementation of Bridging the Gap between Label- and Reference-based Synthesis in Multi-attribute Image-to-Image Translation.

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

Ju He 307 Jan 03, 2023
Library for fast text representation and classification.

fastText fastText is a library for efficient learning of word representations and sentence classification. Table of contents Resources Models Suppleme

Facebook Research 24.1k Jan 01, 2023
Unimodal Face Classification with Multimodal Training

Unimodal Face Classification with Multimodal Training This is a PyTorch implementation of the following paper: Unimodal Face Classification with Multi

Wenbin Teng 3 Jul 06, 2022
[ECCV 2020] Gradient-Induced Co-Saliency Detection

Gradient-Induced Co-Saliency Detection Zhao Zhang*, Wenda Jin*, Jun Xu, Ming-Ming Cheng ⭐ Project Home » The official repo of the ECCV 2020 paper Grad

Zhao Zhang 35 Nov 25, 2022
Sinkformers: Transformers with Doubly Stochastic Attention

Code for the paper : "Sinkformers: Transformers with Doubly Stochastic Attention" Paper You will find our paper here. Compat This package has been dev

Michael E. Sander 31 Dec 29, 2022
Code for PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning

PackNet: https://arxiv.org/abs/1711.05769 Pretrained models are available here: https://uofi.box.com/s/zap2p03tnst9dfisad4u0sfupc0y1fxt Datasets in Py

Arun Mallya 216 Jan 05, 2023
This is an example of object detection on Micro bacterium tuberculosis using Mask-RCNN

Mask-RCNN on Mycobacterium tuberculosis This is an example of object detection on Mycobacterium Tuberculosis using Mask RCNN. Implement of Mask R-CNN

Jun-En Ding 1 Sep 16, 2021
Multispectral Object Detection with Yolov5

Multispectral-Object-Detection Intro Official Code for Cross-Modality Fusion Transformer for Multispectral Object Detection. Multispectral Object Dete

Richard Fang 121 Jan 01, 2023
Human Action Controller - A human action controller running on different platforms.

Human Action Controller (HAC) Goal A human action controller running on different platforms. Fun Easy-to-use Accurate Anywhere Fun Examples Mouse Cont

27 Jul 20, 2022
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr

Janghoon Han 83 Dec 20, 2022
Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

Maths from examples - Learning advanced mathematical computations from examples This is the source code and data sets relevant to the paper Learning a

Facebook Research 171 Nov 23, 2022
A library for Deep Learning Implementations and utils

deeply A Deep Learning library Table of Contents Features Quick Start Usage License Features Python 2.7+ and Python 3.4+ compatible. Quick Start $ pip

Achilles Rasquinha 1 Dec 12, 2022
The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t

PIC4SeRCentre 20 Jan 03, 2023
The BCNet related data and inference model.

BCNet This repository includes the some source code and related dataset of paper BCNet: Learning Body and Cloth Shape from A Single Image, ECCV 2020,

81 Dec 12, 2022
PyTorch-Multi-Style-Transfer - Neural Style and MSG-Net

PyTorch-Style-Transfer This repo provides PyTorch Implementation of MSG-Net (ours) and Neural Style (Gatys et al. CVPR 2016), which has been included

Hang Zhang 906 Jan 04, 2023
The object detection pipeline is based on Ultralytics YOLOv5

AYOLOv2 The main goal of this repository is to rewrite the object detection pipeline with a better code structure for better portability and adaptabil

153 Dec 22, 2022
Crossover Learning for Fast Online Video Instance Segmentation (ICCV 2021)

TL;DR: CrossVIS (Crossover Learning for Fast Online Video Instance Segmentation) proposes a novel crossover learning paradigm to fully leverage rich c

Hust Visual Learning Team 79 Nov 25, 2022
PyTorch Implementation of SSTNs for hyperspectral image classifications from the IEEE T-GRS paper "Spectral-Spatial Transformer Network for Hyperspectral Image Classification: A FAS Framework."

PyTorch Implementation of SSTN for Hyperspectral Image Classification Paper links: SSTN published on IEEE T-GRS. Also, you can directly find the imple

Zilong Zhong 54 Dec 19, 2022