Face Identity Disentanglement via Latent Space Mapping [SIGGRAPH ASIA 2020]


Description

Official Implementation of the paper Face Identity Disentanglement via Latent Space Mapping for both training and evaluation.

Face Identity Disentanglement via Latent Space Mapping
Yotam Nitzan¹, Amit Bermano¹, Yangyan Li², Daniel Cohen-Or¹
¹Tel-Aviv University, ²Alibaba
https://arxiv.org/abs/2005.07728

Abstract: Learning disentangled representations of data is a fundamental problem in artificial intelligence. Specifically, disentangled latent representations allow generative models to control and compose the disentangled factors in the synthesis process. Current methods, however, require extensive supervision and training, or instead, noticeably compromise quality. In this paper, we present a method that learns how to represent data in a disentangled way, with minimal supervision, manifested solely using available pre-trained networks. Our key insight is to decouple the processes of disentanglement and synthesis, by employing a leading pre-trained unconditional image generator, such as StyleGAN. By learning to map into its latent space, we leverage both its state-of-the-art quality, and its rich and expressive latent space, without the burden of training it. We demonstrate our approach on the complex and high dimensional domain of human heads. We evaluate our method qualitatively and quantitatively, and exhibit its success with de-identification operations and with temporal identity coherency in image sequences. Through extensive experimentation, we show that our method successfully disentangles identity from other facial attributes, surpassing existing methods, even though they require more training and supervision.
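As a rough illustration of this idea (a minimal sketch, not this repository's actual architecture or code), the pipeline can be thought of as two frozen pre-trained encoders producing an identity code and an attribute code, a small trainable MLP mapping their concatenation into StyleGAN's W space, and the frozen generator decoding the result. All module names, layer sizes, and stand-in networks below are illustrative assumptions:

import torch
import torch.nn as nn

class LatentMapper(nn.Module):
    """Toy mapper: fuse an identity code and an attribute code into a W code."""
    def __init__(self, id_dim=512, attr_dim=512, w_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(id_dim + attr_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, w_dim),
        )

    def forward(self, id_code, attr_code):
        return self.mlp(torch.cat([id_code, attr_code], dim=1))

# Placeholder stand-ins for the pre-trained, frozen networks the method relies on
# (a face-recognition encoder, an attribute encoder, and StyleGAN's synthesis network).
id_encoder   = nn.Sequential(nn.Flatten(), nn.LazyLinear(512))
attr_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(512))
generator    = nn.Sequential(nn.Linear(512, 3 * 64 * 64), nn.Tanh())

mapper = LatentMapper()
id_img, attr_img = torch.randn(2, 3, 64, 64).chunk(2)   # one identity image, one attribute image
w = mapper(id_encoder(id_img), attr_encoder(attr_img))  # mapped W latent code
out = generator(w).view(-1, 3, 64, 64)                  # identity of id_img, attributes of attr_img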

Setup

To set up everything you need, check out the setup instructions.

Training

Preparing the Dataset

The dataset comprises StyleGAN-generated images and W latent codes, both generated from a single StyleGAN model.

We also use real images from FFHQ to evaluate quality at test time.

The dataset is assumed to be in the following structure:

Path                    Description
base directory          Directory for all datasets
├  real                 FFHQ image dataset
├  dataset_N            Dataset for resolution NxN
│  ├  images            Images generated by StyleGAN
│  └  ws                W latent codes generated by StyleGAN
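The training code reads this structure with its own data pipeline; purely as an illustration, a paired loader over a dataset_N directory could look like the sketch below. The per-file naming and formats (.png images, per-image .npy W codes) are assumptions, not the repository's guaranteed layout:

from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class FakeDataset(Dataset):
    """Illustrative loader pairing images/ with ws/ inside one dataset_N directory."""
    def __init__(self, root):
        self.image_dir = Path(root) / "images"
        self.w_dir = Path(root) / "ws"
        # Pair files by stem; the .png/.npy extensions are assumptions about the on-disk format.
        self.stems = sorted(p.stem for p in self.image_dir.glob("*.png"))

    def __len__(self):
        return len(self.stems)

    def __getitem__(self, idx):
        stem = self.stems[idx]
        img = np.asarray(Image.open(self.image_dir / f"{stem}.png"), dtype=np.float32)
        img = torch.from_numpy(img).permute(2, 0, 1) / 127.5 - 1.0   # HWC in [0,255] -> CHW in [-1,1]
        w = torch.from_numpy(np.load(self.w_dir / f"{stem}.npy")).float()
        return img, w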

To generate the dataset_N directory, run:

cd utils
python generate_fake_data.py \
    --resolution N \
    --batch_size BATCH_SIZE \
    --output_path OUTPUT_PATH \
    --pretrained_models_path PRETRAINED_MODELS_PATH \
    --num_images NUM_IMAGES \
    --gpu GPU

This generates an image dataset in a format similar to FFHQ.
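For intuition only, the generation step boils down to sampling random z vectors, mapping them to W with StyleGAN's mapping network, synthesizing images, and saving images and W codes side by side. The snippet below is a self-contained sketch in which the two StyleGAN sub-networks are replaced by trivial stubs so it can run; the actual script's internals may differ.

import os
import numpy as np
import torch
from PIL import Image

resolution, num_images, batch_size = 64, 8, 4      # small values for the sketch
output_path = "dataset_64"
z_dim = w_dim = 512

# Stubs standing in for a pre-trained StyleGAN's mapping (z -> w) and synthesis (w -> image) networks.
mapping = torch.nn.Linear(z_dim, w_dim)
synthesis = torch.nn.Sequential(torch.nn.Linear(w_dim, 3 * resolution * resolution), torch.nn.Tanh())

os.makedirs(os.path.join(output_path, "images"), exist_ok=True)
os.makedirs(os.path.join(output_path, "ws"), exist_ok=True)

with torch.no_grad():
    for start in range(0, num_images, batch_size):
        z = torch.randn(batch_size, z_dim)
        w = mapping(z)                                            # W latent codes
        imgs = synthesis(w).view(-1, resolution, resolution, 3)   # fake images in [-1, 1]
        imgs = ((imgs + 1) * 127.5).clamp(0, 255).to(torch.uint8).numpy()
        for i in range(batch_size):
            idx = start + i
            Image.fromarray(imgs[i]).save(os.path.join(output_path, "images", f"{idx:06d}.png"))
            np.save(os.path.join(output_path, "ws", f"{idx:06d}.npy"), w[i].numpy())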

Start training

To train the model as done in the paper, run:

python main.py \
    NAME \
    --resolution N \
    --pretrained_models_path PRETRAINED_MODELS_PATH \
    --dataset BASE_DATASET_DIR \
    --batch_size BATCH_SIZE \
    --cross_frequency 3 \
    --train_data_size 70000 \
    --results_dir RESULTS_DIR

Please run python main.py -h for more details.

Inference

For convenience, there are a few inference functions, each serving a different use case. The desired function is selected by passing its name via --test_func.

All possible combinations in dirs

Input data: Two directories, one for identity inputs and another for attribute inputs.
Inference runs over all N*M combinations of images from the two directories.

python test.py \
    NAME \
    --pretrained_models_path PRETRAINED_MODELS_PATH \
    --load_checkpoint PATH_TO_WEIGHTS \
    --id_dir DIR_OF_IMAGES_FOR_ID \
    --attr_dir DIR_OF_IMAGES_FOR_ATTR \
    --output_dir DIR_FOR_OUTPUTS \
    --test_func infer_on_dirs
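Conceptually (a sketch only; the actual logic lives in test.py), this mode amounts to a double loop over the two directories, producing one output per (identity, attribute) pair. swap_fn below is an assumed callable standing in for the model's forward pass:

from pathlib import Path

def infer_on_dirs_sketch(id_dir, attr_dir, output_dir, swap_fn):
    """Run a hypothetical id/attr swap over all N*M image combinations."""
    id_paths = sorted(Path(id_dir).iterdir())
    attr_paths = sorted(Path(attr_dir).iterdir())
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for id_path in id_paths:                       # N identity inputs
        for attr_path in attr_paths:               # M attribute inputs
            result = swap_fn(id_path, attr_path)   # assumed to return a PIL image
            result.save(out / f"{id_path.stem}_id__{attr_path.stem}_attr.png")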

Paired data

Input data: Two directories, one for identity inputs and another for attribute inputs.
The two directories are assumed to be paired: inference runs on images that share the same file name.

python test.py \
    NAME \
    --pretrained_models_path PRETRAINED_MODELS_PATH \
    --load_checkpoint PATH_TO_WEIGHTS \
    --id_dir DIR_OF_IMAGES_FOR_ID \
    --attr_dir DIR_OF_IMAGES_FOR_ATTR \
    --output_dir DIR_FOR_OUTPUTS \
    --test_func infer_pairs

Disentangled interpolation

Interpolating attributes

Interpolating identity

Input data: A directory with any number of subdirectories, each containing three images. Every image name must contain exactly one of attr or id. If a subdirectory has two attr images and one id image, attributes are interpolated; if it has one attr image and two id images, identity is interpolated.

python test.py \
    NAME \
    --pretrained_models_path PRETRAINED_MODELS_PATH \
    --load_checkpoint PATH_TO_WEIGHTS \
    --input_dir PARENT_DIR \
    --output_dir DIR_FOR_OUTPUTS \
    --test_func interpolate
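As a rough, hypothetical illustration of what disentangled interpolation means here: with one identity code fixed, the two attribute codes are blended linearly, and each blend is mapped into W and decoded (identity interpolation is symmetric). mapper and generator below are assumed callables, not this repository's API:

import torch

def interpolate_attr_sketch(id_code, attr_a, attr_b, mapper, generator, steps=5):
    """Blend two attribute codes for a fixed identity and decode each blend."""
    frames = []
    for t in torch.linspace(0.0, 1.0, steps):
        attr = (1 - t) * attr_a + t * attr_b   # linear interpolation in attribute space
        w = mapper(id_code, attr)              # map (id, attr) into StyleGAN's W space
        frames.append(generator(w))            # frozen generator synthesizes the frame
    return frames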

Checkpoints

Our pretrained 256x256 checkpoint is also available.

Citation

If you use this code for your research, please cite our paper using:

@article{Nitzan2020FaceID,
  title={Face identity disentanglement via latent space mapping},
  author={Yotam Nitzan and Amit Bermano and Yangyan Li and Daniel Cohen-Or},
  journal={ACM Transactions on Graphics (TOG)},
  year={2020},
  volume={39},
  pages={1--14}
}