DCL - An easy to use diacritic library used for diacritic and accent manipulation.

Related tags

Audiopython-dcl
Overview

Diacritics Library

Code Style: Black Imports: isort PRs welcome

This library is used for adding, and removing diacritics from strings.

Getting started

Start by importing the module:

import dcl

DCL currently supports a multitude of diacritics:

  • acute
  • breve
  • caron
  • cedilla
  • grave
  • interpunct
  • macron
  • ogonek
  • ring
  • ring_and_acute
  • slash
  • stroke
  • stroke_and_acute
  • tilde
  • tittle
  • umlaut/diaresis
  • umlaut_and_macron

Each accent has their own attribute which is directly accessible from the dcl module.

dcl.acute('a')
>>> 'á'

These attributes return a Character object, which is essentially just a handy "wrapper" around our diacritic, which we can use to access various attributes to retrieve further information about the diacritic we're focusing on.

" char.character # the same as str(char) >>> 'ą' char.diacritic # some return >>> '˛' char.diacritic_name >>> 'ogonek' char.raw # returns the raw representation of our character >>> '\U00000105' char.raw_diacritic >>> '\U000002db' ">
char = dcl.ogonek('a')

repr(char)
>>> ""

char.character  # the same as str(char)
>>> 'ą'

char.diacritic  # some return 
>>> '˛'

char.diacritic_name
>>> 'ogonek'

char.raw  # returns the raw representation of our character
>>> '\U00000105'

char.raw_diacritic 
>>> '\U000002db'

Some functions can't take certain letters. For example, the letter h cannot take a cedilla diacritic. In this case, an exception is raised named DiacriticError. You can access this exception via dcl.errors.DiacriticError.

from dcl.errors import DiacriticError

try:
    char = dcl.cedilla('h')
except DiacriticError as e:
    print(e)
else:
    print(repr(char))

>>> 'Character h cannot take a cedilla diacritic'

If you want to, you may also use the DiacriticApplicant object from dcl.objects. The functions you see above use this object too, and it's virtually the same principle, except from the fact that we use properties to get the diacritic, and the class simply holds the string and it's properties. Alas with the functions above, this object also returns the same Character object through it's properties.

" ">
from dcl.objects import DiacriticApplicant

da = DiacriticApplicant('a')
repr(da.ogonek)
>>> ""

There is also the clean_diacritics function, accessible straight from the dcl module. This function allows us to completely clean a string from any diacritics.

>> 'Kreusada' dcl.clean_diacritics("Café") >>> 'Cafe' ">
dcl.clean_diacritics("Krëûšàdå")
>>> 'Kreusada'

dcl.clean_diacritics("Café")
>>> 'Cafe'

Along with this function, there's also count_diacritics, get_diacritics and has_diacritics.

The has_diacritics function simply checks if the string contains a character with a diacritic.

>> True dcl.has_diacritics("dcl") >>> False ">
dcl.has_diacritics("Café")
>>> True

dcl.has_diacritics("dcl")
>>> False

The get_diacritics function is used to get all the diacritics in a string. It returns a dictionary. For each diacritic in the string, the key will show the diacritic's index in the string, and the value will show the Character representation.

>> {3: } dcl.get_diacritics("Krëûšàdå") >>> {2: , 3: , 4: , 5: , 7: } ">
dcl.get_diacritics("Café")
>>> {3: <acute 'é'>}

dcl.get_diacritics("Krëûšàdå")
>>> {2: <umlaut 'ë'>, 3: <circumflex 'û'>, 4: <caron 'š'>, 5: <grave 'à'>, 7: <ring 'å'>}

The count_diacritics function counts the number of diacritics in a string. The actual implementation of this simply returns the dictionary length from get_diacritics.

>> 1 ">
dcl.count_diacritics("Café")
>>> 1

Creating an end user program

Creating a program would be pretty simple for this, and I'd love to be able to help you out with a base idea. Have a look at this for example:

import dcl
import string

from dcl.errors import DiacriticError

char = str(input("Enter a character: "))
if not char in string.ascii_letters:
    print("Please enter a letter from a-Z.")
else:
    accent = str(input("Enter an accent, you can choose from the following: " + ", ".join(dcl.diacritic_list)))
    if not dcl.isdiacritictype(accent):
        print("That was not a valid accent.")
    else:
        try:
            function = getattr(dcl, accent)  # or dcl.objects.DiacriticApplicant
            output = function(char)
        except DiacriticError as e:
            print(e)
        else:
            print(str(output))

It's worth checking if the provided accent is a diacritic type. If it is, then you can use getattr. Without checking, the user could provide a default global such as __file__.

You can also create a program which can remove diacritics from a string. It's made easy!

import dcl

string = str(input("Enter the string which you want to be cleared from diacritics: "))
print("Here is your cleaned string: " + dcl.clean_diacritics(string))

Or perhaps your program wants to count the number of diacritics contained within your string.

import dcl

string = str(input("This program will count the number of diacritics contained in your input. Enter a string: "))
count = dcl.count_diacritics(string)
if count == 1:
    grammar = "is"
else:
    grammar = "are"
print(f"There {grammar} {count} diacritics/accent in your string.")
Owner
Kreus Amredes
Python developer, contributor and maintainer. 🐍
Kreus Amredes
Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums)

LAKH MuseNet MIDI Dataset Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums) Bonus: Choir on Channel 10 Please CC

Alex 6 Nov 20, 2022
Multi-Track Music Generation with the Transfomer and the Johann Sebastian Bach Chorales dataset

MMM: Exploring Conditional Multi-Track Music Generation with the Transformer and the Johann Sebastian Bach Chorales Dataset. Implementation of the pap

102 Dec 08, 2022
Audio Retrieval with Natural Language Queries: A Benchmark Study

Audio Retrieval with Natural Language Queries: A Benchmark Study Paper | Project page | Text-to-audio search demo This repository is the implementatio

21 Oct 31, 2022
Mina - A Telegram Music Bot 5 mandatory Assistant written in Python using Pyrogram and Py-Tgcalls

Mina - A Telegram Music Bot 5 mandatory Assistant written in Python using Pyrogram and Py-Tgcalls

3 Feb 07, 2022
Python game programming in Jupyter notebooks.

Jupylet Jupylet is a Python library for programming 2D and 3D games, graphics, music and sound synthesizers, interactively in a Jupyter notebook. It i

Nir Aides 178 Dec 09, 2022
Library for Python 3 to communicate with the Google Chromecast.

pychromecast Library for Python 3.6+ to communicate with the Google Chromecast. It currently supports: Auto discovering connected Chromecasts on the n

Home Assistant Libraries 2.4k Jan 02, 2023
Terminal-based audio-to-text converter

att Terminal-based audio-to-text converter Project description A terminal-based audio-to-text converter written in python, enabling you to convert .wa

Sven Eschlbeck 4 Dec 15, 2022
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats

SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats Note Neither this, or PyTgCalls are fully

SU Projects 58 Jan 02, 2023
Open-Source Tools & Data for Music Source Separation: A Pragmatic Guide for the MIR Practitioner

Open-Source Tools & Data for Music Source Separation: A Pragmatic Guide for the MIR Practitioner

IELab@ Korea University 0 Nov 12, 2021
Some utils for auto speech recognition

About Some utils for auto speech recognition. Utils Util Description Script Reset audio Reset sample rate, sample width, etc of audios.

1 Jan 24, 2022
[Singing Log] Let your program learn to sing!

[Singing Log] Let your program learn to sing! You must have thought this was changelog when you saw the English title, but it's not, it's chànggēlog. What it does is allow your program to print logs

黄巍 22 Sep 03, 2022
A python program to cut longer MP3 files (i.e. recordings of several songs) into the individual tracks.

I'm writing a python script to cut longer MP3 files (i.e. recordings of several songs) into the individual tracks called ReCut. So far there are two

Dönerspiess 1 Oct 27, 2021
DaisyXmusic ❤ A bot that can play music on Telegram Group and Channel Voice Chats

DaisyXmusic ❤ is the best and only Telegram VC player with playlists, Multi Playback, Channel play and more

TeamOfDaisyX 34 Oct 22, 2022
An audio digital processing toolbox based on a workflow/pipeline principle

AudioTK Audio ToolKit is a set of audio filters. It helps assembling workflows for specific audio processing workloads. The audio workflow is split in

Matthieu Brucher 238 Oct 18, 2022
Pyrogram bot to automate streaming music in voice chats

Pyrogram bot to automate streaming music in voice chats Help If you face an error, want to discuss this project or get support for it, join it's group

Roj 124 Oct 21, 2022
A simple python script to play bell sound in your system infinitely, just for fun and experimental purposes

A simple python script to play bell sound in your system infinitely, just for fun and experimental purposes

نافع الهلالي 1 Oct 29, 2021
:sound: Play and Record Sound with Python :snake:

Play and Record Sound with Python This Python module provides bindings for the PortAudio library and a few convenience functions to play and record Nu

spatialaudio.net 750 Dec 31, 2022
Algorithmic and AI MIDI Drums Generator Implementation

Algorithmic and AI MIDI Drums Generator Implementation

Tegridy Code 8 Dec 30, 2022
digital audio workstation, instrument and effect plugins, wave editor

digital audio workstation, instrument and effect plugins, wave editor

306 Jan 05, 2023
A telegram bot for which is help to play songs in vc 🥰 give 🌟 and fork this repo before use 😏

TamilVcMusic 🌟 TamilVCMusicBot 🌟 Give your 💙 Before clicking on deploy to heroku just click on fork and star just below How to deploy Click the bel

TamilBots 150 Dec 13, 2022