Official implementation of Self-supervised Image-to-text and Text-to-image Synthesis

Last update: Jul 31, 2022

Self-supervised Image-to-text and Text-to-image Synthesis

This is the official implementation of Self-supervised Image-to-text and Text-to-image Synthesis. [Figures: model architecture diagrams]

Dataset

We use the Caltech-UCSD Birds-200-2011 and Oxford-102 flowers datasets in this work.

Download the flower images (102flowers.zip).
Unzip 102flowers.zip, rename the extracted jpg folder to images, and put it inside a 102flowers folder.
Put the 102flowers folder inside the data folder.
Download the birds data and put it inside Data/.
Download the birds image data and extract it to Data/birds/.
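
The layout described above can be checked before training starts. The following is a minimal sketch and is not part of the repository: the helper name missing_dataset_dirs is hypothetical, and the two paths are assumptions read off the steps above (including the data/ versus Data/ capitalization), so adjust them to match your checkout.

```python
from pathlib import Path

# Assumed layout, read off the dataset steps; not confirmed by the repository.
EXPECTED_DIRS = [
    Path("data") / "102flowers" / "images",  # the renamed jpg folder
    Path("Data") / "birds",                  # the extracted birds image data
]


def missing_dataset_dirs(root="."):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dataset_dirs():
        print(f"missing: {d}")
```

Run it from the repository root; it prints one line per missing folder and nothing if both are present.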

Dependencies

pytorch
torchvision
tensorboardX
pickle (part of the Python standard library, no separate install needed)
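
A quick way to confirm the dependencies are importable is sketched below. It is not part of the repository; the helper name missing_modules is hypothetical, and the only assumption is the usual import names (PyTorch is imported as torch).

```python
import importlib.util

# Import names for the listed dependencies; PyTorch is imported as "torch".
REQUIRED_MODULES = ["torch", "torchvision", "tensorboardX", "pickle"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```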

Training

Training the image autoencoder

The driver program for training the image autoencoder is main.py.

To train the image autoencoder on the flowers dataset:

python main.py --cfg cfg/flowers_3stages.yml --gpu 0

To train the image autoencoder on the birds dataset:

python main.py --cfg cfg/birds_3stages.yml --gpu 0

Models are saved automatically after a fixed number of iterations. To restart from a failed step, edit netG_version in the respective .yml file.
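
Editing netG_version by hand is straightforward, but the change can also be scripted. The sketch below is not from the repository: the helper name set_netg_version is hypothetical, and it assumes the key appears as a `netG_version: <value>` line in the .yml file, which should be checked against the actual config before use.

```python
import re


def set_netg_version(cfg_text, version):
    """Return cfg_text with the value on the netG_version line replaced.

    Assumes the key appears as `netG_version: <value>` on its own line.
    Indentation is preserved; the text is returned unchanged if the key
    is not present.
    """
    pattern = re.compile(r"^([ \t]*netG_version[ \t]*:[ \t]*).*$", re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + str(version), cfg_text)
```

To apply it, read the config file into a string, pass it through set_netg_version with the iteration you want to resume from, and write the result back.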

Training the text autoencoder

python run_text_test.py dataset_type Input_Folder output_file.txt

For the flowers dataset use dataset_type=1; for the birds dataset use dataset_type=2. For example:

python run_text_test.py 2 /home/user/dev/unsup/data_datasets/CUB_200_2011 outbirds_n.txt
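
Because dataset_type is a bare number, it is easy to pass the wrong one. The sketch below builds the same command from a dataset name. It is not part of the repository; the helper name text_autoencoder_command and the name-to-number dictionary are illustrative, with the 1/2 mapping taken from the text above.

```python
import sys

# Mapping stated above: 1 selects the flowers dataset, 2 selects birds.
DATASET_TYPES = {"flowers": 1, "birds": 2}


def text_autoencoder_command(dataset, input_folder, output_file):
    """Build the argument list for run_text_test.py."""
    if dataset not in DATASET_TYPES:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(DATASET_TYPES)}"
        )
    return [
        sys.executable,
        "run_text_test.py",
        str(DATASET_TYPES[dataset]),
        str(input_folder),
        str(output_file),
    ]


if __name__ == "__main__":
    cmd = text_autoencoder_command(
        "birds", "/home/user/dev/unsup/data_datasets/CUB_200_2011", "outbirds_n.txt"
    )
    print(" ".join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) from the repository root to launch the script.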

Training the mapping networks

Train the GAN-based mapping network

python MappingImageText.py Dataset_folder

e.g.

python MappingImageText.py /home/user/dev/unsup/data_datasets/CUB_200_2011

Train the MMD-based mapping network

python mmd_ganTI.py --dataset /home/das/dev/data_datasets/birds_dataset/CUB_200_2011 --gpu_device 0

python mmd_ganIT.py --dataset /home/das/dev/data_datasets/birds_dataset/CUB_200_2011 --gpu_device 0
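
For readers unfamiliar with MMD (maximum mean discrepancy), the sketch below shows the standard biased estimate of squared MMD between two sample sets with a Gaussian kernel. It is a textbook illustration only and is not taken from mmd_ganTI.py or mmd_ganIT.py; the kernel choice, the bandwidth sigma, and the function names are all assumptions.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""

    def mean_kernel(a, b):
        # Average kernel value over all pairs (p, q) with p in a and q in b.
        total = sum(gaussian_kernel(p, q, sigma) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

The estimate is zero when the two sample sets are identical and grows as they move apart, which is what makes it usable as a training signal for matching two embedding distributions.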
