An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix

Overview

An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix, with glyphs based on cwTeXFangSong. The font is optimised for vertical typesetting. Below is a sample:

The font contains roughly 13,000 glyphs, mostly for traditional Chinese.

I created the font for one of my own projects. The font is admittedly not perfect, but nevertheless have many ineteresting features; therefore I am sharing the font file and programs used to generate it.

Download

Download the font directly at dist/tkFangSong.ttf.

The name of the font is 剔骨仿宋 (thek-kwot-fang-song, So named because the algorithm that created it resembles "deboning"). It is licensed under SIL open font license. If you wish to credit the author, you might use my name 黃令東/黄令东, or the romanization "Lingdong Huang".

Features

The font builds on the elegant shapes from cwTeXFangSong to add more hand-made look and feel reminiscent of the aesthetics of old woodblock printed books.

The font has a wider proportion compared to the original cwTeXFangSong, and is further widened towards the bottom, to accentuate the finishing strokes. The "center of mass" is also moved downwards:

Above right is a visualization of the base function used to warp the skeleton.

The height of a glyph is additionally tweaked based on its vertical complexity, computed with Sobel operator and taking the max of each pixel row.

Many fonts are optimized for horizontal typesetting, and as such, when arranged vertically, the center of mass shifts left and right, giving a jagged look. This font attempts to solve the problem by computing centroids (via image moments) and aligning them.

The font has rich textures. Some of them are artifacts produced by pix2pix network; others are fine-tuned noises delibrately added.

It is to be noted that, as an automated process, it doesn't always produce optimal results; some characters might end up looking ugly, or use the wrong caligraphic movement for certain strokes; For some caligraphers, some strokes might appear too "weak" to their tastes.

Process

The medial axis (skeleton) is computed for each raster rendering of the glyphs in the original font. (The resultant hershey font can be found at ./dist/CWFS64J.HF.TXT)

Pairs of images: the original rendering vs the skeletons are sent to pix2pix for training. pix2pix learns the correspondance and becomes capable of turning skeletons to glyphs.

New skeletons are generated by warping the originals according to my (questionable) taste.

All the new skeletons are fed into the trained network to obtain the new glyphs. The new glyphs are warped in structure, but the weight and shape of the strokes still look legit.

Some post-processing is applied, and potrace is used to re-vectorize the glyphs. Finally, fontforge is used to create a TTF file.

Building the font

Note that to use the font, you can simply download it here. This sections is for reproducing the results from scratch.

The scripts used to build the font are included in the workflow/ folder. Note that making the font is a quite involved process (especially the part of training the neural net). You might also need to modify the scripts to fit your system/folder configuration, but here are some rough steps:

  • Get OpenCV, tensorflow<=1.13.1, numpy and friends
  • First install swig version of skeleton-tracing for python, then obtain cwTeXFangSong from here.
  • Run skel.py > CWFS64.HF.TXT, then join.py > CWFS64J.HF.TXT
  • Modify pairs.py to read CWFS64J.HF.TXT and output to a folder you will create.
  • Download pix2pix-tensorflow and train on the images created in the previous step. (Good luck getting a prehistoric tensorflow project running, you'll need it. Once you do it's pretty straightforward, follow their README)
  • I trained 40K steps on 1.3K images, afterwhich the quality didn't seem to improve much, your mileage might vary.
  • Run warp.py > CWFS64W3.HF.TXT. Modify pairs.py to read from it, and create an output folder like before. Run pairs.py.
  • Evaluate the new image pairs with pix2pix. Edit any of the images that have too much defect with Photoshop or software of your choice. Put the edited ones in a separated folder, say retouched/.
  • Modify refine.py to read from the images and retouched images folders, create an output folder fine/ for it, and run the script.
  • Download potrace, make it runnable from commandline, and run trace_all.py.
  • Run forgefont.py to create a TTF from SVGs generated in the previous step.
  • Done! You can also preview the glyphs with preview.py > index.html, or preview the skeleton with preview_hf.py > index.html.

A PDF containing all the glyphs can be found here. If you find this font not bad, you might also enjoy qiji-font, a more authentic reproduction of a historical typeface.

Owner
Lingdong Huang
Lingdong Huang
Um simulador de caixa registradora com database usando arquivos .txt

🛒 Caixa Registradora V2 ❓ - Como usar? Execute o caixa-registradora.py, nele vai ter um menu interativo, você pode cadastrar diversos produtos em um

Gabriel 0 Sep 25, 2022
汉字转拼音(pypinyin)

汉字拼音转换工具(Python 版) 将汉字转为拼音。可以用于汉字注音、排序、检索(Russian translation) 。 基于 hotoo/pinyin 开发。 Documentation: http://pypinyin.rtfd.io/ GitHub: https://github.co

Huang Huang 4.2k Jan 03, 2023
Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files

cdvpp Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files Reads a Digital Vaccination Pass PDF file as in

Esteban Borai 1 Nov 17, 2021
A pipeline for making highlighted text stand-alone.

title emoji colorFrom colorTo sdk app_file pinned decontextualizer 📤 green gray streamlit main.py false Decontextualizer As a second step in improvin

Paul Bricman 26 Dec 17, 2022
This is an AI that is supposed to say you if your text is formal or not

This is an AI that is supposed to say you if your text is formal or not. It's written in Python 3 and has some german examples (because I'm german yk) in the text.json file. This file contains the te

1 Jan 12, 2022
PyNews 📰 Simple newsletter made with python 🐍🗞️

PyNews 📰 Simple newsletter made with python Install dependencies This project has some dependencies (see requirements.txt) that are not included in t

Luciano Felix 4 Aug 21, 2022
Text Summarizationcls app with python

Text Summarizationcls app This is the repo for the Text Summarization AI Project. It makes use of pre-trained Hugging Face models Packages Used The pa

Edem Gold 1 Oct 23, 2021
Python Lex-Yacc

PLY (Python Lex-Yacc) Copyright (C) 2001-2020 David M. Beazley (Dabeaz LLC) All rights reserved. Redistribution and use in source and binary forms, wi

David Beazley 2.4k Dec 31, 2022
A Python3 script that simulates the user typing a text on their keyboard.

A Python3 script that simulates the user typing a text on their keyboard. (control the speed, randomness, rate of typos and more!)

Jose Gracia Berenguer 3 Feb 22, 2022
Convert text to morse code and play morse code sound.

Convert text(english) to morse codes and play morse sound!

Mohammad Dori 5 Jul 15, 2022
CowExcept - Spice up those exceptions with cowexcept!

CowExcept - Spice up those exceptions with cowexcept!

James Ansley 41 Jun 30, 2022
Paranoid text spacing in Python

pangu.py Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width charact

Vinta Chen 194 Nov 19, 2022
A minimal python script for generating multiple onetime use bip39 seed phrases

seed_signer_ontimes WARNING This project has mainly been used for local development, and creation should be ran on a air-gapped machine. A minimal pyt

CypherToad 4 Sep 12, 2022
This repository contains scripts to control a RGB text fan attached to a Raspberry Pi.

RGB Text Fan Controller This repository contains scripts to control a RGB text fan attached to a Raspberry Pi. Setup The Raspberry Pi and RGB text fan

Luke Prior 1 Oct 01, 2021
基于Pytex的数学建模工具,实现将md文件转换成pdf/tex文档的前后端

Pytex-for-MCM 基于Pytex的数学建模工具,实现将md文件转换成pdf/tex文档的前后端。

3 May 17, 2021
The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

Harry 5 Mar 27, 2022
RSS Reader application for the Emacs Application Framework.

EAF RSS Reader RSS Reader application for the Emacs Application Framework. Load application (add-to-list 'load-path "~/.emacs.d/site-lisp/eaf-rss-read

EAF 15 Dec 07, 2022
Extract price amount and currency symbol from a raw text string

price-parser is a small library for extracting price and currency from raw text strings.

Scrapinghub 252 Dec 31, 2022
Tools to extract questionaire of finalexam.eu and provide interactive questionaire with summary

AskMe This script is completely terminal based. No user interface is added. You can get the command line options by using the --help argument. Make su

David Loewe 1 Nov 09, 2021
This is REST-API for Indonesian Text Summarization using Non-Negative Matrix Factorization for the algorithm to summarize documents and FastAPI for the framework.

Indonesian Text Summarization Using FastAPI This is REST-API for Indonesian Text Summarization using Non-Negative Matrix Factorization for the algorit

Viqi Nurhaqiqi 2 Nov 03, 2022