StyleGAN2-ada for practice

This version of the newest PyTorch-based StyleGAN2-ada is intended mostly for fellow artists, who rarely look at scientific metrics, but rather need a working creative tool. Tested on Python 3.7 + PyTorch 1.7.1, requires FFMPEG for sequence-to-video conversions. For more explicit details refer to the original implementations.

There is also a previous TensorFlow-based version, which produces models compatible with this one (but not vice versa).
I still prefer it for few-shot training (~100 images) and for model surgery tricks (not ported here yet).

Features

  • inference (image generation) in arbitrary resolution (finally with proper padding on both TF and Torch)
  • multi-latent inference with split-frame or masked blending
  • non-square aspect ratio support (auto-picked from dataset; resolution must be divisible by 2**n, such as 512x256, 1280x768, etc.)
  • transparency (alpha channel) support (auto-picked from dataset)
  • using plain image subfolders as conditional datasets
  • funky "digression" inference technique, ported from Aydao

Several ways to run it:

  • Windows batch-files, described below (if you're on Windows with powerful GPU)
  • local Jupyter notebook (for non-Windows platforms)
  • Colab notebook (max ease of use, requires Google drive)

Just in case, original StyleGAN2-ada charms:

  • claimed to be up to 30% faster than original StyleGAN2
  • has greatly improved training (requires 10+ times fewer samples)
  • has lots of adjustable internal training settings
  • works with plain image folders or zip archives (instead of custom datasets)
  • should be easier to tweak/debug

Training

  • Put your images in the data directory as a subfolder or zip archive. Ensure they all have the same color channels (monochrome, RGB or RGBA).
    If needed, first crop square fragments from a source video or a directory with images (a feasible method if you work with patterns or shapes, rather than compositions):
 multicrop.bat source 512 256 

This will cut every source image (or video frame) into 512x512 px fragments, overlapping with a 256 px shift along X and Y. The result will be in the source-sub directory; rename it as you wish. If you edit the images yourself (e.g. for non-square aspect ratios), make sure their sizes are correct. For a conditional model, split the data into subfolders (mydata/1, mydata/2, ..).
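To get a feel for how many fragments such cropping yields, here is a minimal Python sketch of the overlapping grid described above. It is an illustration only, not the code behind multicrop.bat: the function name crop_boxes is hypothetical, and the handling of leftover borders (here simply dropped) is an assumption.

```python
def crop_boxes(width, height, size=512, step=256):
    """Return (left, top, right, bottom) boxes of size x size px, shifted by step px.

    Assumption: fragments that would cross the image border are skipped.
    """
    xs = range(0, width - size + 1, step)
    ys = range(0, height - size + 1, step)
    return [(x, y, x + size, y + size) for y in ys for x in xs]


# A 1024x768 image gives 3 positions along X and 2 along Y.
boxes = crop_boxes(1024, 768)
print(len(boxes), boxes[0], boxes[-1])
```

Each box could then be passed to an image library's crop call; images smaller than the fragment size produce no boxes at all.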

  • Train StyleGAN2-ada on the prepared dataset (image folder or zip archive):
 train.bat mydata

This will run the training process according to the settings in src/train.py (check and explore those!!). Results (models and samples) are saved under the train directory, similar to the original Nvidia approach. For a conditional model, add the --cond option.

Please note: we save both compact models (containing only the Gs network, for inference) as <dataset>-...pkl (e.g. mydata-512-0360.pkl), and full models (containing the G/D/Gs networks, for further training) as snapshot-...pkl. The naming is for convenience only.

The length of training is defined by the --lod_kimg X argument (training duration per layer/LOD). A network with a base resolution of 1024 px will be trained for 20 such steps, one with 512 px for 18 steps, et cetera. A reasonable lod_kimg value for full training from scratch is 300-600, while 20-40 is sufficient for finetuning. One can override this approach by setting the total duration directly with --kimg X.
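As a rough way to estimate the total training length, the two step counts quoted above (1024 px: 20, 512 px: 18) are consistent with 2 * log2(resolution). The sketch below relies on that inferred rule; it is an assumption drawn from those two examples, not taken from src/train.py, and the function names are hypothetical.

```python
import math


def lod_steps(resolution):
    # Assumption: steps = 2 * log2(resolution); matches 1024 -> 20 and 512 -> 18.
    return 2 * int(math.log2(resolution))


def total_kimg(resolution, lod_kimg):
    # Total duration in kimg if every step lasts lod_kimg.
    return lod_steps(resolution) * lod_kimg


print(total_kimg(512, 300))   # full training from scratch at 512 px
print(total_kimg(1024, 20))   # short finetuning run at 1024 px
```

If the estimate looks too long or too short, setting --kimg X directly avoids the per-step arithmetic altogether.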

If you have trouble with the custom CUDA ops, try removing their cached version (C:\Users\eps\AppData\Local\torch_extensions on Windows).

  • Resume training on the mydata dataset from the last saved model in the train/000-mydata-512-.. directory:
 train_resume.bat mydata 000-mydata-512-..
  • Uptrain (finetune) a well-trained model ffhq-512.pkl on new data:
 train_resume.bat newdata ffhq-512.pkl

There is no need to count exact steps in this case; just stop when you're ok with the results (it's better to set a low lod_kimg to follow the progress).

Generation

Generated results are saved as sequences and videos (by default, under _out directory).

  • Test the model in its native resolution:
 gen.bat ffhq-1024.pkl
  • Generate custom animation between random latent points (in z space):
 gen.bat ffhq-1024 1920-1080 100-20

This will load ffhq-1024.pkl from the models directory and make a looped 1920x1080 px video of 100 frames, with an interpolation step of 20 frames between keypoints. Please note: omitting the .pkl extension loads the custom network, enabling arbitrary resolution, multi-latent blending, etc. Using the filename with the extension loads the original network from the PKL (useful for testing foreign downloaded models). There are --cubic and --gauss options for animation smoothing, and a few --scale_type choices. Add the --save_lat option to save all traversed dlatent w points as a Numpy array in a *.npy file (useful for further curating).
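The relation between the frame count and the interpolation step can be illustrated with a small Numpy sketch: 100 frames with a step of 20 means 5 random key points, visited in a closed loop. This is a simplified illustration with plain linear interpolation, not the generator script itself; the function name, the 512-dimensional z vector and the seed are assumptions, and the --cubic / --gauss smoothing is not reproduced.

```python
import numpy as np


def looped_latents(frames=100, step=20, dim=512, seed=0):
    """Linearly interpolate between random z key points, closing the loop."""
    rng = np.random.default_rng(seed)
    n_keys = frames // step                      # e.g. 100 // 20 = 5 key points
    keys = rng.standard_normal((n_keys, dim))    # random points in z space
    out = []
    for i in range(n_keys):
        a, b = keys[i], keys[(i + 1) % n_keys]   # last key blends back into the first
        for t in np.arange(step) / step:         # t = 0, 1/step, ..., (step-1)/step
            out.append((1 - t) * a + t * b)
    return np.stack(out)


latents = looped_latents()
print(latents.shape)  # one z vector per frame
```

Because the last segment interpolates back toward the first key point, playing the frames in a loop shows no visible jump.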

  • Generate more varied imagery:
 gen.bat ffhq-1024 3072-1024 100-20 -n 3-1

This will produce an animated composition of 3 independent frames, blended together horizontally (similar to the image in the repo header). The --splitfine X argument controls boundary fineness (0 = smoothest).

Instead of simple frame splitting, one can load external mask(s) from a b/w image file (or a folder with a file sequence):

 gen.bat ffhq-1024 1024-1024 100-20 --latmask _in/mask.jpg

The --digress X argument adds some animated funky displacements with strength X (by tweaking the initial const layer params). The --trunc X argument controls the truncation psi parameter, as usual.
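For readers unfamiliar with the truncation psi parameter: in StyleGAN-family models it pulls each dlatent w toward the average dlatent, trading variety for fidelity. A minimal Numpy sketch of that standard formula follows; the function name and the toy 4-dimensional vectors are illustrative assumptions, and how this repo applies it per layer is not shown here.

```python
import numpy as np


def truncate(w, w_avg, psi=0.7):
    """Standard truncation trick: move w toward the average dlatent w_avg."""
    # psi = 1 leaves w unchanged; psi = 0 collapses everything to w_avg.
    return w_avg + psi * (w - w_avg)


w_avg = np.zeros(4)                       # toy average dlatent (assumed values)
w = np.array([2.0, -2.0, 1.0, 0.0])       # toy dlatent point
print(truncate(w, w_avg, psi=0.5))
```

Lower psi values usually give more "average" but cleaner images, while values near 1 keep the full diversity of the model.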

NB: Windows batch files support only 9 command arguments; if you need more options, you have to edit the batch file itself.

  • Project external images onto StyleGAN2 model dlatent points (in w space):
 project.bat ffhq-1024.pkl photo

The results (the found dlatent points as Numpy arrays in *.npy files, and video/still previews) will be saved to the _out/proj directory.

  • Generate smooth animation between saved dlatent points (in w space):
 play_dlatents.bat ffhq-1024 dlats 25 1920-1080

This will load saved dlatent points from _in/dlats and produce a smooth looped animation between them (with a resolution of 1920x1080 and an interpolation step of 25 frames). dlats may be a file or a directory with *.npy or *.npz files. To select only a few frames from a sequence somename.npy, create a text file with comma-delimited frame numbers and save it as somename.txt in the same directory (check the examples for the FFHQ model). You can also "style" the result: setting --style_dlat blonde458.npy will load the dlatent from blonde458.npy and apply it to the higher layers, producing some visual similarity. --cubic smoothing and --digress X displacements are also applicable here.
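The frame selection via a comma-delimited text file can be pictured with the following Numpy sketch. It only illustrates the idea: the helper names are hypothetical, 0-based frame numbering is an assumption, and the stand-in array replaces what would normally come from np.load on the saved sequence.

```python
import numpy as np


def parse_frame_list(text):
    """Turn '0, 3,7' into [0, 3, 7]; empty entries are ignored."""
    return [int(tok) for tok in text.split(",") if tok.strip()]


def select_frames(dlatents, text):
    # Assumption: frame numbers are 0-based indices along the first axis.
    return dlatents[parse_frame_list(text)]


# Stand-in for a saved sequence: 10 frames, 2 values each.
sequence = np.arange(20).reshape(10, 2)
picked = select_frames(sequence, "0, 3,7\n")
print(picked.shape)
```

The selected points can then serve as the key points for the animation, in the order listed in the text file.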

  • Generate an animation from a saved point and feature directions (say, aging/smiling/etc. for the FFHQ model) in dlatent w space:
 play_vectors.bat ffhq-1024.pkl blonde458.npy vectors_ffhq

This will load the base dlatent point from _in/blonde458.npy and move it along the direction vectors from _in/vectors_ffhq, one by one. The result is saved as a looped video.

Credits

StyleGAN2: Copyright © 2021, NVIDIA Corporation. All rights reserved.
Made available under the Nvidia Source Code License-NC
Original paper: https://arxiv.org/abs/2006.06676

Owner
vadim epstein