ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN

Overview

ADGAN

PyTorch | project page | paper

PyTorch implementation for controllable person image synthesis.

Controllable Person Image Synthesis with Attribute-Decomposed GAN
Yifang Men, Yiming Mao, Yuning Jiang, Wei-Ying Ma, Zhouhui Lian, Peking University & ByteDance AI Lab, CVPR 2020(Oral).

Component Attribute Transfer

Pose Transfer

Requirement

  • python 3
  • pytorch(>=1.0)
  • torchvision
  • numpy
  • scipy
  • scikit-image
  • pillow
  • pandas
  • tqdm
  • dominate

Getting Started

You can directly download our generated images (in Deepfashion) from Google Drive.

Installation

  • Clone this repo:
git clone https://github.com/menyifang/ADGAN.git
cd ADGAN

Data Preperation

We use DeepFashion dataset and provide our dataset split files, extracted keypoints files and extracted segmentation files for convience.

The dataset structure is recommended as:

+—deepfashion
|   +—fashion_resize
|       +--train (files in 'train.lst')
|          +-- e.g. fashionMENDenimid0000008001_1front.jpg
|       +--test (files in 'test.lst')
|          +-- e.g. fashionMENDenimid0000056501_1front.jpg
|       +--trainK(keypoints of person images)
|          +-- e.g. fashionMENDenimid0000008001_1front.jpg.npy
|       +--testK
|          +-- e.g. fashionMENDenimid0000056501_1front.jpg.npy
|   +—semantic_merge
|   +—fashion-resize-pairs-train.csv
|   +—fashion-resize-pairs-test.csv
|   +—fashion-resize-annotation-pairs-train.csv
|   +—fashion-resize-annotation-pairs-test.csv
|   +—train.lst
|   +—test.lst
|   +—vgg19-dcbb9e9d.pth
|   +—vgg_conv.pth
...
  1. Person images
python tool/generate_fashion_datasets.py

Note: In our settings, we crop the images of DeepFashion into the resolution of 176x256 in a center-crop manner.

  1. Keypoints files
  • Download train/test pairs and train/test key points annotations from Google Drive, including fashion-resize-pairs-train.csv, fashion-resize-pairs-test.csv, fashion-resize-annotation-train.csv, fashion-resize-annotation-train.csv. Put these four files under the deepfashion directory.
  • Generate the pose heatmaps. Launch
python tool/generate_pose_map_fashion.py
  1. Segmentation files
  • Extract human segmentation results from existing human parser (e.g. Look into Person) and merge into 8 categories. Our segmentation results are provided in Google Drive, including ‘semantic_merge2’ and ‘semantic_merge3’ in different merge manner. Put one of them under the deepfashion directory.

Optionally, you can also generate these files by yourself.

  1. Keypoints files

We use OpenPose to generate keypoints.

  • Download pose estimator from Google Drive. Put it under the root folder ADGAN.
  • Change the paths input_folder and output_path in tool/compute_coordinates.py. And then launch
python2 compute_coordinates.py
  1. Dataset split files
python2 tool/create_pairs_dataset.py

Train a model

bash ./scripts/train.sh 

Test a model

Download our pretrained model from Google Drive. Modify your data path and launch

bash ./scripts/test.sh 

Evaluation

We adopt SSIM, IS, DS, CX for evaluation. This part is finished by Yiming Mao.

1) SSIM

For evaluation, Tensorflow 1.4.1(python3) is required.

python tool/getMetrics_market.py

2) DS Score

Download pretrained on VOC 300x300 model and install propper caffe version SSD. Put it in the ssd_score forlder.

python compute_ssd_score_fashion.py --input_dir path/to/generated/images

3) CX (Contextual Score)

Refer to folder ‘cx’ to compute contextual score.

Citation

If you use this code for your research, please cite our paper:

@inproceedings{men2020controllable,
  title={Controllable Person Image Synthesis with Attribute-Decomposed GAN},
  author={Men, Yifang and Mao, Yiming and Jiang, Yuning and Ma, Wei-Ying and Lian, Zhouhui},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2020 IEEE Conference on},
  year={2020}
}


Acknowledgments

Our code is based on PATN and thanks for their great work.

Owner
Men Yifang
Men Yifang
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Project Page | Paper A Shading-Guided Generative Implicit Model

Xingang Pan 115 Dec 18, 2022
Hyperbolic Image Segmentation, CVPR 2022

Hyperbolic Image Segmentation, CVPR 2022 This is the implementation of paper Hyperbolic Image Segmentation (CVPR 2022). Repository structure assets :

Mina Ghadimi Atigh 46 Dec 29, 2022
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

StyleSpeech - PyTorch Implementation PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation. Status (2021.06.13

Keon Lee 140 Dec 21, 2022
In this project, we create and implement a deep learning library from scratch.

ARA In this project, we create and implement a deep learning library from scratch. Table of Contents Deep Leaning Library Table of Contents About The

22 Aug 23, 2022
Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Adaptive Task-Relational Context (ATRC) This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense

David Brüggemann 35 Dec 05, 2022
A PaddlePaddle implementation of STGCN with a few modifications in the model architecture in order to forecast traffic jam.

About This repository contains the code of a PaddlePaddle implementation of STGCN based on the paper Spatio-Temporal Graph Convolutional Networks: A D

Tianjian Li 1 Jan 11, 2022
RLDS stands for Reinforcement Learning Datasets

RLDS RLDS stands for Reinforcement Learning Datasets and it is an ecosystem of tools to store, retrieve and manipulate episodic data in the context of

Google Research 135 Jan 01, 2023
Project for music generation system based on object tracking and CGAN

Project for music generation system based on object tracking and CGAN The project was inspired by MIDINet: A Convolutional Generative Adversarial Netw

1 Nov 21, 2021
Internship Assessment Task for BaggageAI.

BaggageAI Internship Task Problem Statement: You are given two sets of images:- background and threat objects. Background images are the background x-

Arya Shah 10 Nov 14, 2022
This repository provides the official code for GeNER (an automated dataset Generation framework for NER).

GeNER This repository provides the official code for GeNER (an automated dataset Generation framework for NER). Overview of GeNER GeNER allows you to

DMIS Laboratory - Korea University 50 Nov 30, 2022
Code for testing various M1 Chip benchmarks with TensorFlow.

M1, M1 Pro, M1 Max Machine Learning Speed Test Comparison This repo contains some sample code to benchmark the new M1 MacBooks (M1 Pro and M1 Max) aga

Daniel Bourke 348 Jan 04, 2023
Multi-modal Vision Transformers Excel at Class-agnostic Object Detection

Multi-modal Vision Transformers Excel at Class-agnostic Object Detection

Muhammad Maaz 206 Jan 04, 2023
CS50's Introduction to Artificial Intelligence Test Scripts

CS50's Introduction to Artificial Intelligence Test Scripts 🤷‍♂️ What's this? 🤷‍♀️ This repository contains Python scripts to automate tests for mos

Jet Kan 2 Dec 28, 2022
UDP++ (ECCVW 2020 Oral), (Winner of COCO 2020 Keypoint Challenge).

UDP-Pose This is the pytorch implementation for UDP++, which won the Fisrt place in COCO Keypoint Challenge at ECCV 2020 Workshop. Top-Down Results on

20 Jul 29, 2022
Code in conjunction with the publication 'Contrastive Representation Learning for Hand Shape Estimation'

HanCo Dataset & Contrastive Representation Learning for Hand Shape Estimation Code in conjunction with the publication: Contrastive Representation Lea

Computer Vision Group, Albert-Ludwigs-Universität Freiburg 38 Dec 13, 2022
Omniverse sample scripts - A guide for developing with Python scripts on NVIDIA Ominverse

Omniverse sample scripts ここでは、NVIDIA Omniverse ( https://www.nvidia.com/ja-jp/om

ft-lab (Yutaka Yoshisaka) 37 Nov 17, 2022
Code and dataset for AAAI 2021 paper FixMyPose: Pose Correctional Describing and Retrieval Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal.

FixMyPose / फिक्समाइपोज़ Code and dataset for AAAI 2021 paper "FixMyPose: Pose Correctional Describing and Retrieval" Hyounghun Kim*, Abhay Zala*, Grah

4 Sep 19, 2022
HINet: Half Instance Normalization Network for Image Restoration

HINet: Half Instance Normalization Network for Image Restoration Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, Chengpeng Chen Paper: https://arxiv.org

303 Dec 31, 2022
Official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR)

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

12 Jan 13, 2022
Explore extreme compression for pre-trained language models

Code for paper "Exploring extreme parameter compression for pre-trained language models ICLR2022"

twinkle 16 Nov 14, 2022