Generative Adversarial Text-to-Image Synthesis

Last update: Dec 31, 2022

###Generative Adversarial Text-to-Image Synthesis
Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee

This is the code for our ICML 2016 paper on text-to-image synthesis using conditional GANs. You can use it to train and sample from text-to-image models. The code is adapted from the excellent dcgan.torch.

####Setup Instructions

You will need to install Torch, cuDNN, and the display package.
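
Before training, it can help to confirm that the required tools are on your `PATH`. The following is a minimal sketch, not part of this repository: `require_cmd` is a hypothetical helper name, and it only checks that a command can be found, not that cuDNN or the display package is set up correctly.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Report whether a command is available on PATH.
# Returns 0 if the command is found, 1 otherwise.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
        return 0
    fi
    echo "missing: $1" >&2
    return 1
}
```

For example, `require_cmd th` checks for Torch's `th` launcher.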

####How to train a text-to-image model:

1. Download the birds, flowers, and COCO caption data in Torch format.
2. Download the birds, flowers, and COCO image data.
3. Download the text encoders for the birds, flowers, and COCO descriptions.
4. Modify the CONFIG file to point to your data and text encoder paths.
5. Run one of the training scripts, e.g. `./scripts/train_cub.sh`.
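
A training run that starts with a wrong path tends to fail late, so a pre-flight check on the downloaded data may save time. The sketch below is an illustration under stated assumptions: `check_paths` is a hypothetical helper, and the variable names in the usage comment are placeholders, not the actual keys used in the CONFIG file.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Check that every path given as an argument exists.
# Prints each missing path to stderr; returns 1 if any are missing.
check_paths() {
    status=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing path: $p" >&2
            status=1
        fi
    done
    return "$status"
}

# Example usage (placeholder variable names, NOT taken from CONFIG):
# check_paths "$CAPTION_DIR" "$IMAGE_DIR" "$ENCODER_FILE" && ./scripts/train_cub.sh
```

Chaining with `&&` means the training script only starts once all of the listed paths exist.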

####How to generate samples:

- For flowers: `./scripts/demo_flowers.sh`. Add text descriptions to `scripts/flowers_queries.txt`.
- For birds: `./scripts/demo_cub.sh`.
- For COCO (more general images): `./scripts/demo_coco.sh`.

An HTML file will be generated with the results.
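
Adding a description to the flowers query file can be scripted. This is a minimal sketch that assumes the query file holds one description per line and ends with a newline; that layout is an assumption, so check the existing file first. `add_query` is a hypothetical helper, not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Append a description to a query file unless that exact line is already present.
# Assumes one description per line (an assumption about the file layout).
add_query() {
    file=$1
    text=$2
    touch "$file"
    # -x matches whole lines, -F treats the text as a fixed string, -q is quiet.
    if ! grep -qxF -- "$text" "$file"; then
        printf '%s\n' "$text" >> "$file"
    fi
}
```

For example, run `add_query scripts/flowers_queries.txt "this flower has white petals and a yellow center"` and then `./scripts/demo_flowers.sh`.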

####Pretrained models:

####How to train a text encoder from scratch:

You may want to do this if you have your own new dataset of text descriptions.

- For flowers and birds: follow the instructions here.
- For MS-COCO: `./scripts/train_coco_txt.sh`.

####Citation

If you find this useful, please cite our work as follows:

@inproceedings{reed2016generative,
  title={Generative Adversarial Text-to-Image Synthesis},
  author={Scott Reed and Zeynep Akata and Xinchen Yan and Lajanugen Logeswaran and Bernt Schiele and Honglak Lee},
  booktitle={Proceedings of The 33rd International Conference on Machine Learning},
  year={2016}
}
