Chunkmogrify: Real image inversion via Segments

Overview

Chunkmogrify: Real image inversion via Segments

Logo

Teaser video with live editing sessions can be found here

This code demonstrates the ideas discussed in arXiv submission Real Image Inversion via Segments.
http://arxiv.org/abs/2110.06269
(David Futschik, Michal Lukáč, Eli Shechtman, Daniel Sýkora)

Abstract:
We present a simple, yet effective approach to editing real images via generative adversarial networks (GAN). Unlike previous techniques, that treat all editing tasks as an operation that affects pixel values in the entire image in our approach we cut up the image into a set of smaller segments. For those segments corresponding latent codes of a generative network can be estimated with greater accuracy due to the lower number of constraints. When codes are altered by the user the content in the image is manipulated locally while the rest of it remains unaffected. Thanks to this property the final edited image better retains the original structures and thus helps to preserve natural look.

before after

before after

What do I need?

You will need a local machine with a relatively recent GPU - I wouldn't recommend trying Chunkmogrify with anything older than RTX 2080. It is technically possible to run even on CPU, but the operations become so slow that the user experience is not enjoyable.

Quick startup guide

Requirements:
Python 3.7 or newer

Note: If you are using Anaconda, I recommend creating a new environment to run this project. Packages installed with conda and pip often don't play together very nicely.

Steps to be able to successfully run the project:

  1. Clone or download the repository and open a terminal / Powershell instance in the directory.
  2. Install the required python packages by running pip install -r requirements.txt. This might take a while, since it will download a few packages which will be several hundred MBs of data. Some packages might need to compile their extensions (as well as this project itself), so a C++ compiler needs to be present. On Linux, this is typically not an issue, but running on Windows might require Visual Studio and CUDA installations to successfully setup the project.
  3. Run python app.py. When running for the first time, it will automatically download required resources, which are also several hundred megabytes. Progression of the download can be monitored in the command line window.

To see if everything installed and configured properly, load up a photo and try running a projection step. If there are no errors, you are good to go.

Possible problems:

Torch not compiled with CUDA enabled.
Run

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

Explanation of usage

Tutorial video: click below

Open an image using File -> Image from File. There is a sample image provided to check functionality.

Mask painting:
Left click paints, right click unpaints. Mouse wheel controls the size of the brush.

Projection:
Input a number of steps (100 or 200 is ok, 500 is max before LR goes to 0 currently) and press Projection Steps. Wait until projection finishes, you can observe the global image view by choosing output mode Projection Only during this process. To fine-tune, you can perform a small number of Pivotal Tuning steps.

Editing:
To add an edit, click the double arrow down icon in the Attribute Editor on the left side. Choose the type of edit (W, S, Styleclip), the direction of the edit, and drag the sliders to change the currently masked region. Usually it's necessary to increase the multiplier before noticeable changes are reflected via the direction slider.

Multiple different edits can be composed on top of each other at the same time. Their order is largely irrelevant. Currently in the default mode, only one region is being edited, and so all selected edits apply to the same region. If you would like to change the region, you can Freeze the current image, and perform a new projection, but you will lose the ability to change existing edits.

To save the current image, click the Save Current Image button. If the Unalign checkbox is active, the program will attempt to compose the aligned face back into the original image. Saved images can be found in the SavedImages directory by default. This can be changed in _config.yaml.

Keyboard shortcuts

Current keyboard shortcuts include:

Show/Hide mask :: Alt+M
Toggle mask painting :: Alt+N

W-space editing

Source for some of the basic directions:
(https://twitter.com/robertluxemburg/status/1207087801344372736)

To add your own directions, save them in a numpy pickle format as a (num_ws, 512) or (1, 512) format and specify their path in w_directions.py.

Style-space editing (S space edits)

Source:
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
(https://arxiv.org/abs/2011.12799)
(https://github.com/betterze/StyleSpace)

The presets can be found in s_presets.py, some were taken directly from the paper, others I found by manual exploration. You can perform similar exploration by choosing the Custom preset once you have a projection.

StyleCLIP editing

Source:
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
(https://arxiv.org/abs/2103.17249)
(https://github.com/orpatashnik/StyleCLIP)

Pretrained models taken from (https://github.com/orpatashnik/StyleCLIP/blob/main/utils.py) and manually removed the decoder from the state dict, since it's not used and takes up majority of file size.

PTI Optimization

Source:
Pivotal Tuning for Latent-based Editing of Real Images
(https://arxiv.org/abs/2106.05744)

This method allows you to match the target photo very closely, while retaining editing capacities.

It's often good to run 30-50 iterations of PTI to get very close matching of the source image, which won't cause a very noticeable drop in the editing capabilities.

Attribution

This repository makes use of code provided by the various repositories linked above, plus additionally code from:

styleganv2-ada-pytorch (https://github.com/NVlabs/stylegan2-ada-pytorch)
poisson-image-editing (https://github.com/PPPW/poisson-image-editing) for optional support of idempotent blend (slow implementation of blending that only changes the masked part which can be accessed by uncommenting the option in synthesis.py)

Citation

If you find this code useful for your research, please cite the arXiv submission linked above.

Owner
David Futschik
PhD student @ CTU Prague, Czech Republic.
David Futschik
Deep Dual Consecutive Network for Human Pose Estimation (CVPR2021)

Beanie - is an asynchronous ODM for MongoDB, based on Motor and Pydantic. It uses an abstraction over Pydantic models and Motor collections to work wi

295 Dec 29, 2022
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Vision Transformer Pytorch reimplementation of Google's repository for the ViT model that was released with the paper An Image is Worth 16x16 Words: T

Eunkwang Jeon 1.4k Dec 28, 2022
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Facebook Research 338 Dec 29, 2022
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr

Janghoon Han 83 Dec 20, 2022
An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

0 May 06, 2022
PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data.

Anti-Backdoor Learning PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data. Check the unlearning effect

Yige-Li 51 Dec 07, 2022
PyTorch implementation of HDN(Homography Decomposition Networks) for planar object tracking

Homography Decomposition Networks for Planar Object Tracking This project is the offical PyTorch implementation of HDN(Homography Decomposition Networ

CaptainHook 48 Dec 15, 2022
source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT

LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval This repository contains source code and pre-trained/fine-tun

Siqi 65 Dec 26, 2022
Exploring Simple 3D Multi-Object Tracking for Autonomous Driving (ICCV 2021)

Exploring Simple 3D Multi-Object Tracking for Autonomous Driving Chenxu Luo, Xiaodong Yang, Alan Yuille Exploring Simple 3D Multi-Object Tracking for

QCraft 141 Nov 21, 2022
Image Segmentation Evaluation

Image Segmentation Evaluation Martin Keršner, [email protected] Evaluation

Martin Kersner 273 Oct 28, 2022
FaceAnon - Anonymize people in images and videos using yolov5-crowdhuman

Face Anonymizer Blur faces from image and video files in /input/ folder. Require

22 Nov 03, 2022
Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion

Feature-Style Encoder for Style-Based GAN Inversion Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion. Code will

InterDigital 63 Jan 03, 2023
Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)

TTNet-Pytorch The implementation for the paper "TTNet: Real-time temporal and spatial video analysis of table tennis" An introduction of the project c

Nguyen Mau Dung 438 Dec 29, 2022
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Unsupervised Phone and Word Segmentation using Vector-Quantized Neural Networks Overview Unsupervised phone and word segmentation on speech data is pe

Herman Kamper 13 Dec 11, 2022
Hierarchical User Intent Graph Network for Multimedia Recommendation

Hierarchical User Intent Graph Network for Multimedia Recommendation This is our Pytorch implementation for the paper: Hierarchical User Intent Graph

6 Jan 05, 2023
Annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation

TensorFlow White Paper Notes Features Notes broken down section by section, as well as subsection by subsection Relevant links to documentation, resou

Sam Abrahams 437 Oct 09, 2022
Toolbox to analyze temporal context invariance of deep neural networks

PyTCI A toolbox that estimates the integration window of a sensory response using the "Temporal Context Invariance" paradigm (TCI). The TCI method Int

4 Oct 23, 2022
Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Philipp Erler 329 Jan 06, 2023
This is the source code for generating the ASL-Skeleton3D and ASL-Phono datasets. Check out the README.md for more details.

ASL-Skeleton3D and ASL-Phono Datasets Generator The ASL-Skeleton3D contains a representation based on mapping into the three-dimensional space the coo

Cleison Amorim 5 Nov 20, 2022