DAGAN - Dual Attention GANs for Semantic Image Synthesis

Related tags

Text Data & NLPDAGAN
Overview

License CC BY-NC-SA 4.0 Python 3.6 Packagist Last Commit Maintenance Contributing Ask Me Anything !

Contents

Semantic Image Synthesis with DAGAN

Dual Attention GANs for Semantic Image Synthesis
Hao Tang1, Song Bai2, Nicu Sebe13.
1University of Trento, Italy, 2University of Oxford, UK, 3Huawei Research Ireland, Ireland.
In ACM MM 2020.
The repository offers the official implementation of our paper in PyTorch.

In the meantime, check out our related CVPR 2020 paper Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation and Arxiv paper Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis.

Framework

Results of Generated Images

Cityscapes (512×256)

Facades (1024×1024)

ADE20K (256×256)

CelebAMask-HQ (512×512)

Results of Generated Segmenation Maps

License

Creative Commons License
Copyright (C) 2020 University of Trento, Italy.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use, please contact [email protected].

Installation

Clone this repo.

git clone https://github.com/Ha0Tang/DAGAN
cd DAGAN/

This code requires PyTorch 1.0 and python 3+. Please install dependencies by

pip install -r requirements.txt

This code also requires the Synchronized-BatchNorm-PyTorch rep.

cd DAGAN_v1/
cd models/networks/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
cd ../../

To reproduce the results reported in the paper, you would need an NVIDIA DGX1 machine with 8 V100 GPUs.

Dataset Preparation

Please download the datasets on the respective webpages.

  • Facades: 55.8M, here.
  • DeepFashion: 592.3M, here.
  • CelebAMask-HQ: 2.7G, here.
  • Cityscapes: 8.4G, here.
  • ADE20K: 953.7M, here.
  • COCO-Stuff: 21.5G, here.

We also provide the prepared datasets for your convience.

sh datasets/download_dagan_dataset.sh [dataset]

where [dataset] can be one of facades, deepfashion, celeba, cityscapes, ade20k, or coco_stuff.

Generating Images Using Pretrained Model

  1. Download the pretrained models using the following script,
sh scripts/download_dagan_model.sh GauGAN_DAGAN_[dataset]

where [dataset] can be one of cityscapes, ade, facades, or celeba.

  1. Change several parameter and then generate images using test_[dataset].sh. If you are running on CPU mode, append --gpu_ids -1.
  2. The outputs images are stored at ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in the directory.

Train and Test New Models

  1. Prepare dataset.
  2. Change several parameters and then run train_[dataset].sh for training. There are many options you can specify. To specify the number of GPUs to utilize, use --gpu_ids. If you want to use the second and third GPUs for example, use --gpu_ids 1,2.
  3. Testing is similar to testing pretrained models. Use --results_dir to specify the output directory. --how_many will specify the maximum number of images to generate. By default, it loads the latest checkpoint. It can be changed using --which_epoch.

Evaluation

For more details, please refer to this issue.

Acknowledgments

This source code is inspired by both GauGAN/SPADE and LGGAN.

Related Projects

EdgeGAN | LGGAN | SelectionGAN | PanoGAN | Guided-I2I-Translation-Papers

Citation

If you use this code for your research, please consider giving stars and citing our papers 🦖 :

DAGAN

@inproceedings{tang2020dual,
  title={Dual Attention GANs for Semantic Image Synthesis},
  author={Tang, Hao and Bai, Song and Sebe, Nicu},
  booktitle ={ACM MM},
  year={2020}
}

EdgeGAN

@article{tang2020edge,
  title={Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis},
  author={Tang, Hao and Qi, Xiaojuan and Xu, Dan and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2003.13898},
  year={2020}
}

LGGAN

@inproceedings{tang2019local,
  title={Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Torr, Philip HS and Sebe, Nicu},
  booktitle={CVPR},
  year={2020}
}

SelectionGAN

@inproceedings{tang2019multi,
  title={Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation},
  author={Tang, Hao and Xu, Dan and Sebe, Nicu and Wang, Yanzhi and Corso, Jason J and Yan, Yan},
  booktitle={CVPR},
  year={2019}
}

@article{tang2020multi,
  title={Multi-channel attention selection gans for guided image-to-image translation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Corso, Jason J and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2002.01048},
  year={2020}
}

Contributions

If you have any questions/comments/bug reports, feel free to open a github issue or pull a request or e-mail to the author Hao Tang ([email protected]).

Collaborations

I'm always interested in meeting new people and hearing about potential collaborations. If you'd like to work together or get in contact with me, please email [email protected]. Some of our projects are listed here.


Take a few minutes to appreciate what you have and how far you've come.

Owner
Hao Tang
To develop a complete mind: Study the science of art; Study the art of science. Learn how to see. Realize that everything connects to everything else.
Hao Tang
Installation, test and evaluation of Scribosermo speech-to-text engine

Scribosermo STT Setup Scribosermo is a LGPL licensed, open-source speech recognition engine to "Train fast Speech-to-Text networks in different langua

Florian Quirin 3 Jun 20, 2022
SpikeX - SpaCy Pipes for Knowledge Extraction

SpikeX is a collection of pipes ready to be plugged in a spaCy pipeline. It aims to help in building knowledge extraction tools with almost-zero effort.

Erre Quadro Srl 384 Dec 12, 2022
Must-read papers on improving efficiency for pre-trained language models.

Must-read papers on improving efficiency for pre-trained language models.

Tobias Lee 89 Jan 03, 2023
Statistics and Mathematics for Machine Learning, Deep Learning , Deep NLP

Stat4ML Statistics and Mathematics for Machine Learning, Deep Learning , Deep NLP This is the first course from our trio courses: Statistics Foundatio

Omid Safarzadeh 83 Dec 29, 2022
Hostapd-mac-tod-acl - Setup a hostapd AP with MAC ToD ACL

A brief explanation This script provides a quick way to setup a Time-of-day (Tod

2 Feb 03, 2022
DziriBERT: a Pre-trained Language Model for the Algerian Dialect

DziriBERT is the first Transformer-based Language Model that has been pre-trained specifically for the Algerian Dialect.

117 Jan 07, 2023
Google's Meena transformer chatbot implementation

Here's my attempt at recreating Meena, a state of the art chatbot developed by Google Research and described in the paper Towards a Human-like Open-Domain Chatbot.

Francesco Pham 94 Dec 25, 2022
Code for the paper "Flexible Generation of Natural Language Deductions"

Code for the paper "Flexible Generation of Natural Language Deductions"

Kaj Bostrom 12 Nov 11, 2022
DaCy: The State of the Art Danish NLP pipeline using SpaCy

DaCy: A SpaCy NLP Pipeline for Danish DaCy is a Danish preprocessing pipeline trained in SpaCy. At the time of writing it has achieved State-of-the-Ar

Kenneth Enevoldsen 71 Jan 06, 2023
Extract Keywords from sentence or Replace keywords in sentences.

FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install

Vikash Singh 5.3k Jan 01, 2023
A Chinese to English Neural Model Translation Project

ZH-EN NMT Chinese to English Neural Machine Translation This project is inspired by Stanford's CS224N NMT Project Dataset used in this project: News C

Zhenbang Feng 29 Nov 26, 2022
Sequence model architectures from scratch in PyTorch

This repository implements a variety of sequence model architectures from scratch in PyTorch. Effort has been put to make the code well structured so that it can serve as learning material. The train

Brando Koch 11 Mar 28, 2022
leaking paid token generator that was a shit lmao for 100$ haha

Discord-Token-Generator-Leaked leaking paid token generator that was a shit lmao for 100$ he selling it for 100$ wth here the code enjoy don't forget

Keevo 5 Apr 15, 2022
Official Stanford NLP Python Library for Many Human Languages

Official Stanford NLP Python Library for Many Human Languages

Stanford NLP 6.4k Jan 02, 2023
Unsupervised Abstract Reasoning for Raven’s Problem Matrices

Unsupervised Abstract Reasoning for Raven’s Problem Matrices This code is the implementation of our TIP paper. This is the first unsupervised abstract

Tao Zhuo 9 Dec 17, 2022
Milaan Parmar / Милан пармар / _米兰 帕尔马 170 Dec 13, 2022
Nmt - TensorFlow Neural Machine Translation Tutorial

Neural Machine Translation (seq2seq) Tutorial Authors: Thang Luong, Eugene Brevdo, Rui Zhao (Google Research Blogpost, Github) This version of the tut

6.1k Dec 29, 2022
The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text

speech-recognition-py Speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to huma

Deepangshi 1 Apr 03, 2022
Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow.

Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow. Documentation Proper documentation is available at

HUSEIN ZOLKEPLI 151 Jan 05, 2023
Contact Extraction with Question Answering.

contactsQA Extraction of contact entities from address blocks and imprints with Extractive Question Answering. Goal Input: Dr. Max Mustermann Hauptstr

Jan 2 Apr 20, 2022