Generative Adversarial Text to Image Synthesis

Overview

Text To Image Synthesis

This is a TensorFlow implementation of text-to-image synthesis. Images are generated with the GAN-CLS algorithm from the paper Generative Adversarial Text-to-Image Synthesis. The implementation is built on top of the excellent DCGAN in TensorFlow.
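To make the GAN-CLS objective more concrete, here is a minimal NumPy sketch of the loss terms as described in the paper: the discriminator scores three kinds of pairs (real image with matching text, real image with mismatched text, generated image with matching text). The function name `gan_cls_losses`, the `eps` term, and the use of plain probabilities as inputs are assumptions made for illustration; this is not the code used in train_txt2im.py.

```python
import numpy as np

def gan_cls_losses(s_real, s_wrong, s_fake, eps=1e-8):
    """Return (d_loss, g_loss) from discriminator probabilities in (0, 1).

    s_real  -- scores for {real image, matching text}
    s_wrong -- scores for {real image, mismatched text}
    s_fake  -- scores for {generated image, matching text}
    """
    s_real, s_wrong, s_fake = (
        np.asarray(s, dtype=np.float64) for s in (s_real, s_wrong, s_fake)
    )
    # Discriminator: treat real+matching pairs as real, and average the two
    # "fake" cases (mismatched text, generated image).
    d_loss = -np.mean(
        np.log(s_real + eps)
        + 0.5 * (np.log(1.0 - s_wrong + eps) + np.log(1.0 - s_fake + eps))
    )
    # Generator: push generated+matching pairs toward being scored as real.
    g_loss = -np.mean(np.log(s_fake + eps))
    return d_loss, g_loss
```

Both values are written as quantities to minimize. The mismatched-text term is what distinguishes GAN-CLS from a plain conditional GAN: the discriminator is penalized for accepting a realistic image paired with the wrong caption.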

Please star https://github.com/tensorlayer/tensorlayer

Model architecture

Image source: Generative Adversarial Text-to-Image Synthesis paper

Requirements

Datasets

  • The model is currently trained on the flowers dataset. Download the images from here and save them in 102flowers/102flowers/*.jpg. Also download the captions from this link. Extract the archive, copy the text_c10 folder and paste it in 102flowers/text_c10/class_*.

N.B. You can download all the required data files manually, or simply run downloads.py and then put the files in the correct directories.

python downloads.py
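Before moving on to the other scripts, it can help to confirm that the files ended up where the instructions above expect them. The following is a small sketch of such a check; the helper name `missing_dataset_parts` is hypothetical and not part of this repository, and it only looks at the two locations named in the Datasets section.

```python
import glob
import os

def missing_dataset_parts(root="102flowers"):
    """Return a list of problems found in the dataset layout under root."""
    problems = []
    # Images are expected at <root>/102flowers/*.jpg
    image_dir = os.path.join(root, "102flowers")
    if not glob.glob(os.path.join(image_dir, "*.jpg")):
        problems.append("no .jpg files in " + image_dir)
    # Captions are expected at <root>/text_c10/class_*
    caption_dir = os.path.join(root, "text_c10")
    if not glob.glob(os.path.join(caption_dir, "class_*")):
        problems.append("no class_* entries in " + caption_dir)
    return problems
```

Calling `missing_dataset_parts()` from the repository root returns an empty list when both locations are populated, and a short description of each missing part otherwise.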

Codes

  • downloads.py: downloads the Oxford-102 flower dataset and caption files (run this first).
  • data_loader.py: loads the data for further processing.
  • train_txt2im.py: trains a text-to-image model.
  • utils.py: helper functions.
  • model.py: model definitions.

References

Results

  • the flower shown has yellow anther red pistil and bright red petals.
  • this flower has petals that are yellow, white and purple and has dark lines
  • the petals on this flower are white with a yellow center
  • this flower has a lot of small round pink petals.
  • this flower is orange in color, and has petals that are ruffled and rounded.
  • the flower has yellow petals and the center of it is brown
  • this flower has petals that are blue and white.
  • these white flowers have petals that start off white in color and end in a white towards the tips.

License

Apache 2.0

Comments
  • ValueError: Object arrays cannot be loaded when allow_pickle=False

    File "train_txt2im.py", line 458, in <module>
      main_train()
    File "train_txt2im.py", line 133, in main_train
      load_and_assign_npz(sess=sess, name=net_rnn_name, model=net_rnn)
    File "/home/siddanath/importantforprojects/text-to-image/utils.py", line 20, in load_and_assign_npz
      params = tl.files.load_npz(name=name)
    File "/home/siddanath/importantforprojects/text-to-image/tensorlayer/files.py", line 600, in load_npz
      return d['params']
    File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 262, in __getitem__
      pickle_kwargs=self.pickle_kwargs)
    File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py", line 722, in read_array
      raise ValueError("Object arrays cannot be loaded when "
    ValueError: Object arrays cannot be loaded when allow_pickle=False

    opened by Siddanth-pai 2
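
    A possible workaround, assuming the failure comes from NumPy 1.16.3 and later defaulting `allow_pickle` to False in `np.load`: load the checkpoint with `allow_pickle=True`. The helper below is a hypothetical stand-in for the `load_npz` call in the traceback, not the TensorLayer function itself; the `'params'` key is taken from the traceback.

```python
import numpy as np

def load_npz_params(path):
    """Load the 'params' entry of an .npz checkpoint that stores an object array.

    allow_pickle=True is required because recent NumPy versions refuse to
    unpickle object arrays by default. Only use this on files you trust.
    """
    with np.load(path, allow_pickle=True) as data:
        return data["params"]
```

    Unpickling can execute arbitrary code, so this is only appropriate for checkpoints from a trusted source.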
  • Attempt to have a second RNNCell use the weights of a variable scope that already has weights

    I got a problem, how can I solve it?

    Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnnftxt/rnn/dynamic/rnn/basic_lstm_cell'; and the cell was not constructed as BasicLSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.

    opened by flsd201983 1
  • Next step after download.py

    What is the next step after running download.py? I tried python data_loader.py, but it failed with FileNotFoundError: [Errno 2] No such file or directory: '/home/ly/src/lib/text-to-image/102flowers/text_c10'

    opened by arisliang 0
  • ValueError: invalid literal for int() with base 10: 'e' - when making inference

    code -

    sample_sentence = (
        ["a"] * int(sample_size / ni)
        + ["e"] * int(sample_size / ni)
        + ["i"] * int(sample_size / ni)
        + ["o"] * int(sample_size / ni)
        + ["u"] * int(sample_size / ni)
    )

    for i, sentence in enumerate(sample_sentence):
        print("seed: %s" % sentence)
        sentence = preprocess_caption(sentence)
        sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize(sentence)] + [vocab.end_id]  # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')
    
    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={
        t_real_caption: sample_sentence,
        t_z: sample_seed})
    
    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')
    
    opened by Akinleyejoshua 0
  • Why are my test results on the flower dataset very different from result.png?

    import tensorflow as tf
    import tensorlayer as tl
    from tensorlayer.layers import *
    from tensorlayer.prepro import *
    from tensorlayer.cost import *
    import numpy as np
    import scipy
    from scipy.io import loadmat
    import time, os, re, nltk

    from utils import *
    from model import *
    import model
    import pickle

    ###======================== PREPARE DATA ====================================###
    print("Loading data from pickle ...")
    import pickle
    with open("_vocab.pickle", 'rb') as f:
        vocab = pickle.load(f)
    with open("_image_train.pickle", 'rb') as f:
        _, images_train = pickle.load(f)
    with open("_image_test.pickle", 'rb') as f:
        _, images_test = pickle.load(f)
    with open("_n.pickle", 'rb') as f:
        n_captions_train, n_captions_test, n_captions_per_image, n_images_train, n_images_test = pickle.load(f)
    with open("_caption.pickle", 'rb') as f:
        captions_ids_train, captions_ids_test = pickle.load(f)

    # images_train_256 = np.array(images_train_256)
    # images_test_256 = np.array(images_test_256)
    images_train = np.array(images_train)
    images_test = np.array(images_test)

    ni = int(np.ceil(np.sqrt(batch_size)))
    save_dir = "checkpoint"

    t_real_image = tf.placeholder('float32', [batch_size, image_size, image_size, 3], name='real_image')
    t_real_caption = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name='real_caption_input')
    t_z = tf.placeholder(tf.float32, [batch_size, z_dim], name='z_noise')
    generator_txt2img = model.generator_txt2img_resnet

    net_rnn = rnn_embed(t_real_caption, is_train=False, reuse=False)
    net_g, _ = generator_txt2img(t_z, net_rnn.outputs, is_train=False, reuse=False, batch_size=batch_size)

    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
    tl.layers.initialize_global_variables(sess)

    net_rnn_name = os.path.join(save_dir, 'net_rnn.npz400.npz')
    net_cnn_name = os.path.join(save_dir, 'net_cnn.npz400.npz')
    net_g_name = os.path.join(save_dir, 'net_g.npz400.npz')
    net_d_name = os.path.join(save_dir, 'net_d.npz400.npz')

    net_rnn_res = tl.files.load_and_assign_npz(sess=sess, name=net_rnn_name, network=net_rnn)
    net_g_res = tl.files.load_and_assign_npz(sess=sess, name=net_g_name, network=net_g)

    sample_size = batch_size
    sample_seed = np.random.normal(loc=0.0, scale=1.0, size=(sample_size, z_dim)).astype(np.float32)

    n = int(sample_size / ni)
    sample_sentence = (
        ["the flower shown has yellow anther red pistil and bright red petals."] * n
        + ["this flower has petals that are yellow, white and purple and has dark lines"] * n
        + ["the petals on this flower are white with a yellow center"] * n
        + ["this flower has a lot of small round pink petals."] * n
        + ["this flower is orange in color, and has petals that are ruffled and rounded."] * n
        + ["the flower has yellow petals and the center of it is brown."] * n
        + ["this flower has petals that are blue and white."] * n
        + ["these white flowers have petals that start off white in color and end in a white towards the tips."] * n
    )

    for i, sentence in enumerate(sample_sentence):
        print("seed: %s" % sentence)
        sentence = preprocess_caption(sentence)
        sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize(sentence)] + [vocab.end_id]  # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')

    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={
        t_real_caption: sample_sentence,
        t_z: sample_seed})

    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')

    opened by keqkeq 0
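
    The caption preprocessing in the snippet above (map each word to an id, append the end id, then pad on the right) can be sketched in plain Python as follows. The whitespace tokenizer, the dictionary vocabulary, and the id values are assumptions for illustration; the snippet itself uses `nltk.tokenize.word_tokenize`, the repository's `vocab` object, and `tl.prepro.pad_sequences`.

```python
def encode_caption(caption, word_to_id, end_id, unk_id):
    """Lower-case, split on whitespace, map words to ids, and append end_id."""
    words = caption.lower().replace(".", " ").replace(",", " ").split()
    return [word_to_id.get(w, unk_id) for w in words] + [end_id]

def pad_post(sequences, pad_id=0):
    """Right-pad every id list with pad_id to the length of the longest one."""
    width = max(len(s) for s in sequences)
    return [s + [pad_id] * (width - len(s)) for s in sequences]

# Hypothetical toy vocabulary: 1 = unknown word, 2 = end of caption.
toy_vocab = {"this": 3, "flower": 4, "is": 5, "pink": 6}
batch = pad_post([
    encode_caption("this flower is pink.", toy_vocab, end_id=2, unk_id=1),
    encode_caption("pink flower", toy_vocab, end_id=2, unk_id=1),
])
# batch == [[3, 4, 5, 6, 2], [6, 4, 2, 0, 0]]
```

    Every row of the padded batch has the same length, which is what lets it be fed to a placeholder of shape [batch_size, None].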
  • Tensorflow 2.1, Tensorlayer 2.2 update

    Hello,

    Are there any plans in the near future to update this repository to the latest TensorFlow and TensorLayer versions? I've been trying to make the code run in backwards-compatibility mode (tf.compat.v1. ...), but I keep running into errors that are a bit too much of a mouthful for me.

    FYI: I've successfully run the DCGAN TensorLayer implementation with TensorLayer 2.2 and a self-built TensorFlow 2.1 (compiled from source with compute capability 3.0) in Python 3.7.

    So, an update would be greatly appreciated!

    opened by SadRebel1000 0
Releases(0.2)
Owner
Hao
Assistant Professor @ Peking University