Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Overview

Introduction

Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper

Song Park1, Sanghyuk Chun2, 3, Junbum Cha3, Bado Lee3, Hyunjung Shim1
1 School of Integrated Technology, Yonsei university
2 NAVER AI Lab
3 NAVER CLOVA

A few-shot font generation (FFG) method has to satisfy two objectives: the generated images should preserve the underlying global structure of the target character and present the diverse local reference style. Existing FFG methods aim to disentangle content and style either by extracting a universal representation style or extracting multiple component-wise style representations. However, previous methods either fail to capture diverse local styles or cannot be generalized to a character with unseen components, e.g., unseen language systems. To mitigate the issues, we propose a novel FFG method, named Multiple Localized Experts Few-shot Font Generation Network (MX-Font). MX-Font extracts multiple style features not explicitly conditioned on component labels, but automatically by multiple experts to represent different local concepts, e.g., left-side sub-glyph. Owing to the multiple experts, MX-Font can capture diverse local concepts and show the generalizability to unseen languages. During training, we utilize component labels as weak supervision to guide each expert to be specialized for different local concepts. We formulate the component assign problem to each expert as the graph matching problem, and solve it by the Hungarian algorithm. We also employ the independence loss and the content-style adversarial loss to impose the content-style disentanglement. In our experiments, MX-Font outperforms previous state-of-the-art FFG methods in the Chinese generation and cross-lingual, e.g., Chinese to Korean, generation.

You can find more related projects on the few-shot font generation at the following links:


Prerequisites

conda install numpy scipy scikit-image tqdm jsonlib-python3 fonttools

Usage

Note that, we only provide the example font files; not the font files used for the training the provided weight (generator.pth). The example font files are downloaded from here.

Preparing Data

  • The examples of datasets are in (./data)

Font files (.ttf)

  • Prepare the TrueType font files(.ttf) to use for the training and the validation.
  • Put the training font files and validation font files into separate directories.

The text files containing the available characters of .ttf files (.txt)

  • If you have the available character list of a .ttf file, save its available characters list to a text file (.txt) with the same name in the same directory with the ttf file.
    • (example) TTF file: data/ttfs/train/MaShanZheng-Regular.ttf, its available characters: data/ttfs/train/MaShanZheng-Regular.txt
  • You can also generate the available characters files automatically using the get_chars_from_ttf.py
# Generating the available characters file

python get_chars_from_ttf.py --root_dir path/to/ttf/dir
  • --root_dir: The root directory to find the .ttf files. All the .ttf files under this directory and its subdirectories will be processed.

The json files with decomposition information (.json)

  • The files for the decomposition information are needed.
    • The files for the Chinese characters are provided. (data/chn_decomposition.json, data/primals.json)
    • If you want to train the model with a language other than Chinese, the files for the decomposition rule (see below) are also needed.
      • Decomposition rule
        • structure: dict (in json format)
        • format: {char: [list of components]}
        • example: {'㐬': ['亠', '厶', '川'], '㐭': ['亠', '囗', '口']}
      • Primals
        • structure: list (in json format)
        • format: [All the components in the decomposition rule file]
        • example: ['亠', '厶', '川', '囗', '口']

Training

Modify the configuration file (cfgs/train.yaml)

- use_ddp:  whether to use DataDistributedParallel, for multi-GPUs.
- port:  the port for the DataDistributedParallel training.

- work_dir:  the directory to save checkpoints, validation images, and the log.
- decomposition:  path to the "decomposition rule" file.
- primals:  path to the "primals" file.

- dset:  (leave blank)
  - train:  (leave blank)
    - data_dir : path to .ttf files for the training
  - val: (leave blank)
    - data_dir : path to .ttf files for the validation
    - source_font : path to .ttf file used as the source font during the validation

Run training

python train.py cfgs/train.yaml
  • arguments
    • path/to/config (first argument): path to configration file.
    • --resume (optional) : path to checkpoint to resume.

Test

Preparing the reference images

  • Prepare the reference images and the .ttf file to use as the source font.
  • The reference images are should be placed in this format:
    * data_dir
    |-- font1
        |-- char1.png
        |-- char2.png
        |-- char3.png
    |-- font2
        |-- char1.png
        |-- char2.png
            .
            .
            .
  • The names of the directory or the image files are not important, however, the images with the same reference style are should be grouped with the same directory.
  • If you want to generate only specific characters, prepare the file containing the list of the characters to generate.
    • The example file is provided. (data/chn_gen.json)

Modify the configuration file (cfgs/eval.yaml)

- dset:  (leave blank)
  - test:  (leave blank)
    - data_dir: path to reference images
    - source_font: path to .ttf file used as the source font during the generation
    - gen_chars_file: path to file of the characters to generate. Leave blank if you want to generate all the available characters in the source font.

Run test

python eval.py \
    cfgs/eval.yaml \
    --weight generator.pth \
    --result_dir path/to/save/images
  • arguments
    • path/to/config (first argument): path to configration file.
    • --weight : path to saved weight to test.
    • --result_dir: path to save generated images.

Code license

This project is distributed under MIT license, except modules.py which is adopted from https://github.com/NVlabs/FUNIT.

MX-Font
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

Acknowledgement

This project is based on clovaai/dmfont and clovaai/lffont.

How to cite

@article{park2021mxfont,
    title={Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts},
    author={Park, Song and Chun, Sanghyuk and Cha, Junbum and Lee, Bado and Shim, Hyunjung},
    year={2021},
    journal={arXiv preprint arXiv:2104.00887},
}
Owner
Clova AI Research
Open source repository of Clova AI Research, NAVER & LINE
Clova AI Research
Check out the StyleGAN repo and place it in the same directory hierarchy as the present repo

Variational Model Inversion Attacks Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani Most commands are in run_scripts. W

Jackson Wang 15 Dec 26, 2022
CTF Challenge for CSAW Finals 2021

Terminal Velocity Misc CTF Challenge for CSAW Finals 2021 This is a challenge I've had in mind for almost 15 years and never got around to building un

Jordan 6 Jul 30, 2022
A spatial genome aligner for analyzing multiplexed DNA-FISH imaging data.

jie jie is a spatial genome aligner. This package parses true chromatin imaging signal from noise by aligning signals to a reference DNA polymer model

Bojing Jia 9 Sep 29, 2022
Transformer part of 12th place solution in Riiid! Answer Correctness Prediction

kaggle_riiid Transformer part of 12th place solution in Riiid! Answer Correctness Prediction. Please see here for more information. Execution You need

Sakami Kosuke 2 Apr 23, 2022
A full pipeline AutoML tool for tabular data

HyperGBM Doc | 中文 We Are Hiring! Dear folks,we are offering challenging opportunities located in Beijing for both professionals and students who are k

DataCanvas 240 Jan 03, 2023
Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

A Unified Objective for Novel Class Discovery This is the official repository for the paper: A Unified Objective for Novel Class Discovery Enrico Fini

Enrico Fini 118 Dec 26, 2022
Annotate with anyone, anywhere.

h h is the web app that serves most of the https://hypothes.is/ website, including the web annotations API at https://hypothes.is/api/. The Hypothesis

Hypothesis 2.6k Jan 08, 2023
Compressed Video Action Recognition

Compressed Video Action Recognition Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl. In CVPR, 2018. [Proj

Chao-Yuan Wu 479 Dec 26, 2022
This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".

Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories This repo is the code release of EMNLP 2021 con

12 Nov 22, 2022
Optical machine for senses sensing using speckle and deep learning

# Senses-speckle [Remote Photonic Detection of Human Senses Using Secondary Speckle Patterns](https://doi.org/10.21203/rs.3.rs-724587/v1) paper Python

Zeev Kalyuzhner 0 Sep 26, 2021
This repo is customed for VisDrone.

Object Detection for VisDrone(无人机航拍图像目标检测) My environment 1、Windows10 (Linux available) 2、tensorflow = 1.12.0 3、python3.6 (anaconda) 4、cv2 5、ensemble

53 Jul 17, 2022
Learning to trade under the reinforcement learning framework

Trading Using Q-Learning In this project, I will present an adaptive learning model to trade a single stock under the reinforcement learning framework

Uirá Caiado 470 Nov 28, 2022
Reinforcement Learning with Q-Learning Algorithm on gym's frozen lake environment implemented in python

Reinforcement Learning with Q Learning Algorithm Q learning algorithm is trained on the gym's frozen lake environment. Libraries Used gym Numpy tqdm P

1 Nov 10, 2021
This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

BUPT GAMMA Lab 519 Jan 02, 2023
Prevent `CUDA error: out of memory` in just 1 line of code.

🐨 Koila Koila solves CUDA error: out of memory error painlessly. Fix it with just one line of code, and forget it. 🚀 Features 🙅 Prevents CUDA error

RenChu Wang 1.7k Jan 02, 2023
TLoL (Python Module) - League of Legends Deep Learning AI (Research and Development)

TLoL-py - League of Legends Deep Learning Library TLoL-py is the Python component of the TLoL League of Legends deep learning library. It provides a s

7 Nov 29, 2022
Localization Distillation for Object Detection

Localization Distillation for Object Detection This repo is based on mmDetection. This is the code for our paper: Localization Distillation

274 Dec 26, 2022
PyTorch implementation for ComboGAN

ComboGAN This is our ongoing PyTorch implementation for ComboGAN. Code was written by Asha Anoosheh (built upon CycleGAN) [ComboGAN Paper] If you use

Asha Anoosheh 139 Dec 20, 2022
CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors   In order to facilitate the res

yujmo 11 Dec 12, 2022
Generating synthetic mobility data for a realistic population with RNNs to improve utility and privacy

lbs-data Motivation Location data is collected from the public by private firms via mobile devices. Can this data also be used to serve the public goo

Alex 11 Sep 22, 2022