StarGAN2 for practice

Overview

StarGAN2 for practice

This version of StarGAN2 (coined as 'Post-modern Style Transfer') is intended mostly for fellow artists, who rarely look at scientific metrics, but rather need a working creative tool. At least, this is what I use nearly daily myself.
Here are few pieces, made with it: Terminal Blink, Occurro, etc.
Tested on Pytorch 1.4-1.8. Sequence-to-video conversions require FFMPEG. For more explicit details refer to the original implementation.

Features

  • streamlined workflow, focused on practical tasks [TBA]
  • cleaned up and simplified code for better readability
  • stricter memory management to fit bigger batches on consumer GPUs
  • models mixing (SWA) for better stability

NB: In the meantime here's only training code and some basic inference (processing). More various methods & use cases may be added later.

Presumed file structure

stargan2 root
├  _in input data for processing
├  _out generation output (sequences & videos)
├  data datasets for training
│  └  afhq [example] some dataset
│     ├  cats [example] images for training
│     │  └  test [example] images for validation
│     ├  dogs [example] images for training
│     │  └  test [example] images for validation
│     └  ⋯
├  models trained models for inference/processing
│  └  afhq-256-5-100.pkl [example] trained model file
├  src source code
└  train training folders
   └  afhq.. [example] auto-created training folder

Training

  • Prepare your multi-domain dataset as shown above. Main directory should contain folders with images of different domains (e.g. cats, dogs, ..); every such folder must contain test subfolder with validation subset. Such structure allows easy data recombination for experiments. The images may be of any sizes (they'll be randomly cropped during training), but not smaller than img_size specified for training (default is 256).

  • Train StarGAN2 on the prepared dataset (e.g. afhq):

 python src/train.py --data_dir data/afhq --model_dir train/afhq --img_size 256 --batch 8

This will run training process, according to the settings in src/train.py (check and explore those!). Models are saved under train/afhq and named as dataset-size-domaincount-kimgs, e.g. afhq-256-5-100.ckpt (required for resuming).

  • Resume training on the same dataset from the iteration 50 (thousands), presuming there's corresponding complete 3-models set (with nets and optims) in train/afhq:
 python src/train.py --data_dir data/afhq --model_dir train/afhq --img_size 256 --batch 8 --resume 50
  • Make an averaged model (only for generation) from the directory of those, e.g. train/select:
 python src/swa.py -i train/select 

Few personal findings

  1. Batch size is crucial for this network! Official settings are batch=8 for size 256, if you have large GPU RAM. One can fit batch 3 or 4 on 11gb GPU; those results are interesting, but less impressive. Batches of 2 or 1 are for the brave only.. Size is better kept as 256; the network has auto-scaling layer count, but I didn't manage to get comparable results for size 512 with batches up to 7 (max for 32gb).
  2. Model weights may seriously oscillate during training, especially for small batches (typical for Cycle- or Star- GANs), so it's better to save models frequently (there may be jewels). The best selected models can be mixed together with swa.py script for better stability. By default, Generator network is saved every 1000 iterations, and the full set - every 5000 iterations. 100k iterations (few days on a single GPU) may be enough; 200-250k would give pretty nice overfit.
  3. Lambda coefficients lambda_ds (diversity), lambda_cyc (reconstruction) and lambda_sty (style) may be increased for smaller batches, especially if the goal is stylization, rather than photo-realistic transformation. The videos above, for instance, were made with these lambdas equal 3. The reference-based generation is nearly lost with such settings, but latent-based one can make nice art.
  4. The order of domains in the training set matters a lot! I usually put some photos first (as it will be the main source imagery), and the closest to photoreal as second; but other approaches may go well too (and your mileage may vary).
  5. I particularly love this network for its' failures. Even the flawed results (when the batches are small, the lambdas are wrong, etc.) are usually highly expressive and "inventive", just the kind of "AI own art", which is so spoken about. Experimenting with such aesthetics is a great fun.

Generation

  • Transform image test.jpg with AFHQ model (can be downloaded here):
python src/test.py --source test.jpg --model models/100000_nets_ema.ckpt

This will produce 3 images (one per trained domain in the model) in the _out directory.
If source is a directory, every image in it will be processed accordingly.

  • Generate output for the domain(s), referenced by number(s):
python src/test.py --source test.jpg --model models/100000_nets_ema.ckpt --ref 2
  • Generate output with reference image for domain 1 (ref filename must start with that number):
python src/test.py --source test.jpg --model models/100000_nets_ema.ckpt --ref 1-ref.jpg

To be continued..

Credits

StarGAN2
Copyright © 2020, NAVER Corp. All rights reserved.
Made available under Creative Commons BY-NC 4.0 license.
Original paper: https://arxiv.org/abs/1912.01865

Owner
vadim epstein
vadim epstein
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

73 Nov 06, 2022
Reinforcement Learning for finance

Reinforcement Learning for Finance We apply reinforcement learning for stock trading. Fetch Data Example import utils # fetch symbols from yahoo fina

Tomoaki Fujii 159 Jan 03, 2023
[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets

[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets Introduction This repo contains the source code accompanying the paper: Well-tuned Sim

52 Jan 04, 2023
Poplar implementation of "Bundle Adjustment on a Graph Processor" (CVPR 2020)

Poplar Implementation of Bundle Adjustment using Gaussian Belief Propagation on Graphcore's IPU Implementation of CVPR 2020 paper: Bundle Adjustment o

Joe Ortiz 34 Dec 05, 2022
Code for the Paper "Diffusion Models for Handwriting Generation"

Code for the Paper "Diffusion Models for Handwriting Generation"

62 Dec 21, 2022
Code for the paper titled "Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks" (NeurIPS 2021 Spotlight).

Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks This repository contains the code and pre-trained

Hassan Dbouk 7 Dec 05, 2022
Efficiently Disentangle Causal Representations

Efficiently Disentangle Causal Representations Install dependency pip install -r requirements.txt Main experiments Causality direction prediction cd

4 Apr 01, 2022
Display, filter and search log messages in your terminal

Textualog Display, filter and search logging messages in the terminal. This project is powered by rich and textual. Some of the ideas and code in this

Rik Huygen 24 Dec 10, 2022
Official implementation of the ICML2021 paper "Elastic Graph Neural Networks"

ElasticGNN This repository includes the official implementation of ElasticGNN in the paper "Elastic Graph Neural Networks" [ICML 2021]. Xiaorui Liu, W

liuxiaorui 34 Dec 04, 2022
Imagededup - 😎 Finding duplicate images made easy

imagededup is a python package that simplifies the task of finding exact and near duplicates in an image collection.

idealo 4.3k Jan 07, 2023
This repository contains code for the paper "Decoupling Representation and Classifier for Long-Tailed Recognition", published at ICLR 2020

Classifier-Balancing This repository contains code for the paper: Decoupling Representation and Classifier for Long-Tailed Recognition Bingyi Kang, Sa

Facebook Research 820 Dec 26, 2022
LSTM-VAE Implementation and Relevant Evaluations

LSTM-VAE Implementation and Relevant Evaluations Before using any file in this repository, please create two directories under the root directory name

Lan Zhang 5 Oct 08, 2022
Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

Han Zhang 809 Dec 16, 2022
This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing.

Feedback Prize - Evaluating Student Writing This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing. The

Udbhav Bamba 41 Dec 14, 2022
“Robust Lightweight Facial Expression Recognition Network with Label Distribution Training”, AAAI 2021.

EfficientFace Zengqun Zhao, Qingshan Liu, Feng Zhou. "Robust Lightweight Facial Expression Recognition Network with Label Distribution Training". AAAI

Zengqun Zhao 119 Jan 08, 2023
Reimplementation of NeurIPS'19: "Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting" by Shu et al.

[Re] Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting Reimplementation of NeurIPS'19: "Meta-Weight-Net: Learning an Explicit Mapping

Robert Cedergren 1 Mar 13, 2020
Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training"

Saliency Guided Training Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training" by Aya Abdelsalam Ismail, Hector Cor

8 Sep 22, 2022
Syed Waqas Zamir 906 Dec 30, 2022
Python PID Tuner - Makes a model of the System from a Process Reaction Curve and calculates PID Gains

PythonPID_Tuner_SOPDT Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a r

1 Jan 18, 2022
AI-based, context-driven network device ranking

Batea A batea is a large shallow pan of wood or iron traditionally used by gold prospectors for washing sand and gravel to recover gold nuggets. Batea

Secureworks Taegis VDR 269 Nov 26, 2022