PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Description

This is a PyTorch inference sample based on the original Theano/Lasagne code.

I recreated the network as described in the paper by Karras et al. Because some of the required layers were not available in PyTorch, I implemented them as well. The network and the layers can be found in model.py.

For the demo, I used the 100-celeb-hq-1024x1024-ours snapshot, which the authors made publicly available. Since I couldn't find a model converter between Theano/Lasagne and PyTorch, I wrote a quick-and-dirty script (transfer_weights.py) to transfer the weights between the models.

This repo does not provide the code for training the networks.

Simple inference

To run the demo, simply execute predict.py. You can specify other weights with the --weights flag.

Example image:

Latent space interpolation

To try the latent space interpolation, use latent_interp.py. All output images will be saved in ./interp.

Use the --type argument to choose between the "gaussian interpolation" introduced in the original paper and the "slerp interpolation" introduced by Tom White in his paper "Sampling Generative Networks".

Use --filter to change the Gaussian filter size for the gaussian interpolation, and --interp to set the number of interpolation steps for the slerp interpolation.
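The slerp option can be illustrated with a minimal NumPy sketch. This is not the code from latent_interp.py: the function names slerp and slerp_path are hypothetical, the latent size of 512 in the usage note is an assumption, and wrapping from the last latent back to the first is an assumption made so that the path contains nb_latents * interp frames.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation: t=0 returns v0, t=1 returns v1."""
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Angle between the two vectors, computed from their unit directions.
    cos_omega = np.dot(v0 / np.linalg.norm(v0), v1 / np.linalg.norm(v1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    return (np.sin((1.0 - t) * omega) * v0 + np.sin(t * omega) * v1) / sin_omega


def slerp_path(latents, interp):
    """Take `interp` steps from each latent to the next (assumed to wrap around)."""
    frames = []
    n = len(latents)
    for i in range(n):
        start, end = latents[i], latents[(i + 1) % n]
        for step in range(interp):
            frames.append(slerp(step / interp, start, end))
    return np.stack(frames)
```

For example, slerp_path(np.random.randn(4, 512), 8) yields an array of 32 latent vectors, each of which could then be passed through the generator.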

The following arguments are defined:

  • --weights - path to pretrained PyTorch state dict
  • --output - directory for storing interpolated images
  • --batch_size - batch size for DataLoader
  • --num_workers - number of workers for DataLoader
  • --type {gauss, slerp} - interpolation type
  • --nb_latents - number of latent vectors to generate
  • --filter - gaussian filter length for interpolating latent space (gauss interpolation)
  • --interp - interpolation length between each latent vector (slerp interpolation)
  • --seed - random seed for numpy and PyTorch
  • --cuda - use GPU

The total number of generated frames depends on the interpolation technique used.

For the gaussian interpolation, the number of generated frames equals nb_latents, while the slerp interpolation generates nb_latents * interp frames.
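The frame counts above, and the idea behind the gaussian interpolation, can be sketched as follows. This is a rough NumPy illustration, not the repository's implementation: num_frames, gaussian_kernel and gauss_path are hypothetical names, the wrap-around smoothing and the rescaling to unit RMS are assumptions, and how --filter maps onto the sigma parameter used here is also an assumption.

```python
import numpy as np


def num_frames(interp_type, nb_latents, interp=1):
    """Frame count for each interpolation type, as stated in the text."""
    if interp_type == "gauss":
        return nb_latents
    if interp_type == "slerp":
        return nb_latents * interp
    raise ValueError("unknown interpolation type: %r" % interp_type)


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / float(sigma)) ** 2)
    return kernel / kernel.sum()


def gauss_path(nb_latents, dim, sigma, seed=0):
    """Draw one random latent per frame, then smooth along the frame axis."""
    rng = np.random.RandomState(seed)
    latents = rng.randn(nb_latents, dim)
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    smoothed = np.zeros_like(latents)
    # Circular convolution over the frame axis, so the sequence loops.
    for offset, weight in zip(range(-radius, radius + 1), kernel):
        smoothed += weight * np.roll(latents, offset, axis=0)
    # Smoothing shrinks the vectors; rescale so the result has unit RMS.
    smoothed /= np.sqrt(np.mean(smoothed ** 2))
    return smoothed
```

In this sketch every frame has its own latent vector, which is why the gaussian variant produces exactly nb_latents frames; a larger sigma makes neighbouring frames more similar.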

Example interpolation:

Live latent space interpolation

A live demo of the latent space interpolation using PyGame is available in pygame_interp_demo.py.

Use the --size argument to change the output window size.

The following arguments are defined:

  • --weights - path to pretrained PyTorch state dict
  • --num_workers - number of workers for DataLoader
  • --type {gauss, slerp} - interpolation type
  • --nb_latents - number of latent vectors to generate
  • --filter - gaussian filter length for interpolating latent space (gauss interpolation)
  • --interp - interpolation length between each latent vector (slerp interpolation)
  • --size - PyGame window size
  • --seed - random seed for numpy and PyTorch
  • --cuda - use GPU

Transferring weights

The pretrained Lasagne weights can be transferred to a PyTorch state dict using transfer_weights.py.

To transfer snapshots from the paper other than CelebA, you have to modify the model architecture accordingly and use the corresponding weights.

Environment

The code was tested on Ubuntu 16.04 with an NVIDIA GTX 1080 using PyTorch v0.2.0_4.

  • transfer_weights.py needs Theano and Lasagne to load the pretrained weights.
  • pygame_interp_demo.py needs PyGame to visualize the output.

A single forward pass took approx. 0.031 seconds.
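How this figure was measured is not stated. One way to obtain a comparable number is to average the wall-clock time over several calls, as in the sketch below. The helper mean_runtime is hypothetical and uses only the standard library; fn stands in for whatever callable runs one forward pass.

```python
import time


def mean_runtime(fn, repeats=100, warmup=10):
    """Return the average wall-clock time in seconds of one call to `fn`."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats
```

When timing on a GPU, fn should call torch.cuda.synchronize() after the forward pass, because CUDA kernels are launched asynchronously and the clock would otherwise be read before the work has finished.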

License

This code is a modified form of the original code under the CC BY-NC license with the following copyright notice:

# Copyright (c) 2017, NVIDIA CORPORATION. All rights reserved.
#
# This work is licensed under the Creative Commons Attribution-NonCommercial
# 4.0 International License. To view a copy of this license, visit
# http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to
# Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

According to Section 3, I hereby identify Tero Karras et al. and NVIDIA as the original authors of the material.
