Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

Last update: Dec 29, 2022

Overview

Python-Zhuyin (pyzhuyin): Zhuyin and Pinyin Conversion

Introduction

pyzhuyin is an open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo).

Installation

pip install pyzhuyin

Usage

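from pyzhuyin import pinyin_to_zhuyin, zhuyin_to_pinyin

# Pinyin (numeric tone) to zhuyin
assert pinyin_to_zhuyin("lu3") == "ㄌㄨˇ"
assert pinyin_to_zhuyin("dan4") == "ㄉㄢˋ"
# map() returns an iterator in Python 3, so wrap it in list() before comparing
assert list(map(pinyin_to_zhuyin, ["lu3", "dan4"])) == ["ㄌㄨˇ", "ㄉㄢˋ"]

# Zhuyin to pinyin; the neutral tone comes back as 5
assert zhuyin_to_pinyin("ㄌㄩˊ") == "lü2"
assert zhuyin_to_pinyin("˙ㄗ") == "zi5"
# u_to_v=True writes ü as v
assert list(map(lambda z: zhuyin_to_pinyin(z, u_to_v=True), ["ㄌㄩˊ", "˙ㄗ"])) == ["lv2", "zi5"]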
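
Both converters raise ValueError for input they cannot map (see Notes), so a batch conversion may want to collect failures instead of stopping at the first one. The following is a minimal sketch of that pattern, not part of pyzhuyin: `convert_all` is a hypothetical helper that takes the converter as an argument, and `demo_convert` is a stand-in with a two-entry table so the example runs without the library installed.

```python
from typing import Callable, Iterable, List, Tuple


def convert_all(
    syllables: Iterable[str], convert: Callable[[str], str]
) -> Tuple[List[str], List[str]]:
    """Apply `convert` to each syllable, separating successes from failures."""
    converted, failed = [], []
    for syllable in syllables:
        try:
            converted.append(convert(syllable))
        except ValueError:
            # The converter signals an unknown syllable with ValueError.
            failed.append(syllable)
    return converted, failed


# Stand-in converter for demonstration only (hypothetical, not pyzhuyin itself).
_DEMO_TABLE = {"lu3": "ㄌㄨˇ", "dan4": "ㄉㄢˋ"}


def demo_convert(syllable: str) -> str:
    try:
        return _DEMO_TABLE[syllable]
    except KeyError:
        raise ValueError(f"no zhuyin for {syllable!r}") from None


ok, bad = convert_all(["lu3", "dan4", "xyz9"], demo_convert)
print(ok, bad)
```

With pyzhuyin installed, passing `pinyin_to_zhuyin` or `zhuyin_to_pinyin` in place of `demo_convert` should work the same way, assuming they raise ValueError as the Notes describe.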

Testing

Run the following command at the root of the project to test the library:

python3 -m unittest

Notes

Only numeric tones are supported for pinyin
- e.g. "lu3" instead of "lǔ"
The neutral tone is represented as 5
- e.g. "˙ㄗ" -> "zi5"
For pinyin_to_zhuyin:
- raises ValueError if no corresponding zhuyin is found
- internally converts every v to ü
For zhuyin_to_pinyin:
- raises ValueError if no corresponding pinyin is found
Erhua (兒化音, the "-r" suffix) is not supported because it cannot be represented in the zhuyin system as a single "combo" syllable
- e.g. "公園兒" -> "gong1 yuanr2" -> "ㄍㄨㄥㄩㄢㄦˊ" (not allowed)
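
Because only numeric tones are accepted, tone-marked pinyin such as "lǔ" has to be rewritten before conversion. The following is a minimal sketch of one way to do that with the standard library. `to_numeric_tone` is a hypothetical helper, not part of pyzhuyin; it assumes one syllable per call and treats a syllable without a tone mark as neutral tone (5).

```python
import unicodedata

# Combining diacritics used as pinyin tone marks, mapped to tone numbers.
_TONE_MARKS = {
    "\u0304": "1",  # macron, first tone
    "\u0301": "2",  # acute, second tone
    "\u030c": "3",  # caron, third tone
    "\u0300": "4",  # grave, fourth tone
}


def to_numeric_tone(syllable: str) -> str:
    """Convert one tone-marked pinyin syllable to numeric-tone form."""
    # Leave input that already ends in a digit untouched.
    if syllable[-1:].isdigit():
        return syllable
    # Decompose so each tone mark becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = "5"  # assumption: no tone mark means neutral tone
    kept = []
    for ch in decomposed:
        if ch in _TONE_MARKS:
            tone = _TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so that u + diaeresis becomes a single "ü" again.
    return unicodedata.normalize("NFC", "".join(kept)) + tone


print(to_numeric_tone("lǔ"), to_numeric_tone("dàn"))
```

The result can then be passed to `pinyin_to_zhuyin`. Splitting a full sentence into syllables is a separate problem and is not attempted here.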

Data Sources

Ministry of Education, R.O.C. Revised Mandarin Chinese Dictionary (重編國語辭典修訂本), version 2015_20210928.

URL: https://dict.revised.moe.edu.tw/

Licensed under CC BY-ND 3.0 TW

Author

Raymond Ku
