An anthology of a variety of tools for the Persian language in Python

Last update: Nov 08, 2022

Overview

Persian Tools


Installation

pip install persian-tools

Modules

digits
separator
ordinal suffix
bank
1. card number
2. sheba
national id
phone number

Usage

The examples below show how each module of persian-tools is used.

digits

This module normalizes digits: it converts any mix of Persian, Arabic, and English digits into a single one of those digit sets.

from persian_tools import digits

digits.convert_to_fa(123)          # '۱۲۳'
digits.convert_to_fa('123')        # '۱۲۳'
digits.convert_to_fa('123٤٥٦')     # '۱۲۳۴۵۶'
digits.convert_to_fa('sth 123٤٥٦') # 'sth ۱۲۳۴۵۶'

digits.convert_to_en('۱۲۳')        # '123'
digits.convert_to_en('۱۲۳٤٥٦')     # '123456'
digits.convert_to_en('sth ۱۲۳٤٥٦') # 'sth 123456'

digits.convert_to_ar(123)          # '۱۲۳'
digits.convert_to_ar('123')        # '۱۲۳'
digits.convert_to_ar('sth 123۴۵۶') # 'sth ۱۲۳٤٥٦'
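To make the idea of digit normalization concrete, here is a minimal sketch built on `str.translate`. It is not the library's implementation: the helper names (`build_table`, `to_english_digits`, `to_persian_digits`) are hypothetical, and the only facts it relies on are the Unicode code points of the Persian digits (U+06F0 to U+06F9) and the Arabic-Indic digits (U+0660 to U+0669).

```python
# Hypothetical helpers for illustration; these are NOT part of persian-tools.
PERSIAN_ZERO = 0x06F0  # code point of the Persian digit zero
ARABIC_ZERO = 0x0660   # code point of the Arabic-Indic digit zero
ENGLISH_ZERO = ord("0")


def build_table(target_zero):
    """Map English, Persian and Arabic digits to the set starting at target_zero."""
    table = {}
    for i in range(10):
        target = chr(target_zero + i)
        for source_zero in (ENGLISH_ZERO, PERSIAN_ZERO, ARABIC_ZERO):
            table[source_zero + i] = target
    return table


TO_EN = build_table(ENGLISH_ZERO)
TO_FA = build_table(PERSIAN_ZERO)


def to_english_digits(value):
    # str() lets the helper accept integers as well as strings.
    return str(value).translate(TO_EN)


def to_persian_digits(value):
    return str(value).translate(TO_FA)


# Persian one and two followed by an Arabic three.
print(to_english_digits("sth \u06f1\u06f2\u0663"))  # sth 123
```

Characters that are not digits are left untouched, which matches the behaviour shown for the `'sth ...'` inputs above.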

persian-tools also has a function that converts numbers to words; pass ordinal=True to get the result in ordinal form.

from persian_tools import digits

digits.convert_to_word(500443)                  # پانصد هزار و چهارصد و چهل و سه
digits.convert_to_word(-500443)                 # منفی پانصد هزار و چهارصد و چهل و سه
digits.convert_to_word(500443, ordinal=True)    # پانصد هزار و چهارصد و چهل و سوم
digits.convert_to_word(30000000000)             # سی میلیارد

separator

This module adds or removes thousands separators. The default separator is ',' and can be changed with the second argument.

from persian_tools import separator

separator.add(300)                 # '300'
separator.add(3000000)             # '3,000,000'
separator.add(3000000.0003)        # '3,000,000.0003'
separator.add(3000000, '/')        # '3/000/000'
separator.add('۳۰۰۰۰')             # '۳۰,۰۰۰'

separator.remove('300')            # '300'
separator.remove('3,000,000')      # '3000000'
separator.remove('3/000/000', '/') # '3000000'
separator.remove('۳۰,۰۰۰')         # '۳۰۰۰۰'
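The grouping itself can be sketched in a few lines. The version below is an illustration under stated assumptions, not the library's code: `add_separator` and `remove_separator` are hypothetical names, and the sketch works purely on the string form of the input so that Persian digits are grouped the same way as English ones.

```python
# Hypothetical helpers for illustration; these are NOT part of persian-tools.
def add_separator(number, sep=","):
    """Insert sep between groups of three digits in the integer part."""
    text = str(number)
    sign = ""
    if text.startswith("-"):
        sign, text = "-", text[1:]
    # Only the integer part is grouped; the fraction is kept as-is.
    integer, dot, fraction = text.partition(".")
    groups = []
    while len(integer) > 3:
        groups.insert(0, integer[-3:])
        integer = integer[:-3]
    groups.insert(0, integer)
    return sign + sep.join(groups) + dot + fraction


def remove_separator(text, sep=","):
    """Drop every occurrence of sep from the text."""
    return text.replace(sep, "")


print(add_separator(3000000))             # 3,000,000
print(add_separator(3000000, "/"))        # 3/000/000
print(remove_separator("3,000,000"))      # 3000000
```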

ordinal suffix

This module adds or removes the ordinal suffix of Persian numbers written in words.

from persian_tools import ordinal_suffix

ordinal_suffix.add('بیست')          # 'بیستم'
ordinal_suffix.add('سی و سه')       # 'سی و سوم'
ordinal_suffix.add('سی')            # 'سی اُم'

ordinal_suffix.remove('دومین')      # 'دو'
ordinal_suffix.remove('سی و سوم')   # 'سی و سه'
ordinal_suffix.remove('بیستم')      # 'بیست'
ordinal_suffix.remove('سی اُم')      # 'سی'

bank

card number

This module provides useful functions for bank card numbers, such as:

validating a card number
finding the bank data of a card number
extracting card numbers from a text

from persian_tools.bank import card_number

card_number.validate('6037701689095443')    # True
card_number.validate('6219861034529007')    # True
card_number.validate('6219861034529008')    # False

card_number.bank_data('6219861034529007')
# {'nickname': 'saman', 'name': 'Saman Bank', 'persian_name': 'بانک سامان', 'card_prefix': ['621986'], 'sheba_code': ['056']}
card_number.bank_data('6037701689095443')
# {'nickname': 'keshavarzi', 'name': 'Keshavarzi', 'persian_name': 'بانک کشاورزی', 'card_prefix': ['603770', '639217'], 'sheba_code': ['016']}



card_number.extract_card_numbers('''شماره کارتم رو برات نوشتم:
                                     6219-8610-3452-9007
                                     اینم یه شماره کارت دیگه ای که دارم
                                    ۵۰۲۲-۲۹۱۰-۷۰۸۷-۳۴۶۶                                     
                                    5022291070873466''',                # first argument: the text to search
                                    check_validation=True,              # whether card numbers are checked for validity, default: True
                                    detect_bank_name=True,              # adds the bank data to each result, default: False
                                    filter_valid_card_numbers=True)     # keeps only valid card numbers in the result; `check_validation` must also be True, default: True
# result
# [
#    {'pure': '6219861034529007', 'base': '6219-8610-3452-9007', 'index': 1, 'is_valid': True,
#     'bank_data': {
#         'nickname': 'saman',
#         'name': 'Saman Bank',
#         'persian_name': 'بانک سامان',
#         'card_prefix': ['621986'],
#         'sheba_code': ['056'],
#     }},
#    {'pure': '5022291070873466', 'base': '5022291070873466', 'index': 3, 'is_valid': True,
#     'bank_data': {
#         'nickname': 'pasargad',
#         'name': 'Pasargad Bank',
#         'persian_name': 'بانک پاسارگاد',
#         'card_prefix': ['502229', '639347'],
#         'sheba_code': ['057'],
#     }},
# ]
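Bank card numbers are commonly checked with the Luhn checksum, and the three sample numbers passed to `card_number.validate` above are consistent with it. The sketch below shows that check. Whether persian-tools uses exactly this procedure internally is an assumption, and `luhn_is_valid` is a hypothetical name.

```python
# Hypothetical helper for illustration; this is NOT part of persian-tools.
def luhn_is_valid(card_number):
    """Return True if a 16-digit card number passes the Luhn checksum."""
    if len(card_number) != 16 or not card_number.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(card_number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_is_valid("6219861034529007"))  # True
print(luhn_is_valid("6219861034529008"))  # False
```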

sheba

The sheba module contains 2 functions:

validating a sheba number
finding the bank data of a sheba number

from persian_tools.bank import sheba

sheba.validate('IR820540102680020817909002')    # True
sheba.validate('IR01234567890123456789')        # False

sheba.bank_data('IR820540102680020817909002')
# {
#     'nickname': 'parsian',
#     'name': 'Parsian Bank',
#     'persian_name': 'بانک پارسیان',
#     'card_prefix': ['622106', '627884'],
#     'sheba_code': ['054'],
#     'account_number': '020817909002',
#     'formatted_account_number': '002-00817909-002'
# }
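A sheba number is an Iranian IBAN, so one plausible validation is the standard IBAN mod-97 check together with a length check (26 characters for `IR`). The sketch below illustrates that check. It is an assumption that the library validates this way, and `sheba_is_valid` is a hypothetical name.

```python
# Hypothetical helper for illustration; this is NOT part of persian-tools.
def sheba_is_valid(sheba):
    """Check an Iranian IBAN with the standard mod-97 rule."""
    sheba = sheba.replace(" ", "").upper()
    if len(sheba) != 26 or not sheba.startswith("IR"):
        return False
    if not (sheba[2:].isascii() and sheba[2:].isdigit()):
        return False
    # Move the country code and check digits to the end.
    rearranged = sheba[4:] + sheba[:4]
    # Letters become numbers: A=10 ... Z=35 (base-36 value of each character).
    numeric = "".join(str(int(char, 36)) for char in rearranged)
    return int(numeric) % 97 == 1


print(sheba_is_valid("IR820540102680020817909002"))  # True
print(sheba_is_valid("IR01234567890123456789"))      # False
```

The second sample fails on length alone, before the checksum is computed.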

national id

This module provides useful functions for the Iranian national id (code-e melli), such as:

validating a national id
generating a random one
finding the place a national id belongs to from its prefix

from persian_tools import national_id

national_id.validate('0499370899')      # True
national_id.validate('0684159415')      # False

national_id.generate_random()           # '0458096784'
national_id.generate_random()           # '1156537101'

national_id.find_place('0906582709')    # {'code': ['089', '090'], 'city': 'کاشمر', 'province': 'خراسان رضوی'}
national_id.find_place('0643005846')    # {'code': ['064', '065'], 'city': 'بیرجند', 'province': 'خراسان جنوبی'}
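The commonly described check-digit rule for the 10-digit Iranian national id is a weighted sum modulo 11, and both `validate` samples above agree with it. The sketch below shows that rule. It is an assumption that the library applies the same rule (it may add further checks), and `national_id_is_valid` is a hypothetical name.

```python
# Hypothetical helper for illustration; this is NOT part of persian-tools.
def national_id_is_valid(code):
    """Check the last digit of a 10-digit national id against a mod-11 sum."""
    if len(code) != 10 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(char) for char in code]
    # The first nine digits are weighted 10, 9, ..., 2.
    weighted_sum = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    remainder = weighted_sum % 11
    expected = remainder if remainder < 2 else 11 - remainder
    return digits[9] == expected


print(national_id_is_valid("0499370899"))  # True
print(national_id_is_valid("0684159415"))  # False
```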

phone number

This module can validate a phone number and give you some data about it.

from persian_tools import phone_number

phone_number.validate('09123456789')        # True
phone_number.validate('+989123456789')      # True
phone_number.validate('989123456789')       # True
phone_number.validate('98912345678')        # False


phone_number.operator_data('09123456789')
# {'province': ['البرز', 'زنجان', 'سمنان', 'قزوین', 'قم', 'برخی از شهرستان های استان مرکزی'], 'base': 'تهران', 'type': ['permanent'], 'operator': 'همراه اول'}
phone_number.operator_data('09303456789')
# {'province': [], 'base': 'کشوری', 'type': ['permanent', 'credit'], 'operator': 'ایرانسل'}
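A pattern check that reproduces the four `validate` results above can be sketched with a regular expression. The accepted prefixes (`+98`, `98`, `0`) and the fixed length are inferred from those samples only. The library's real rules (for example, which operator prefixes it accepts) may be stricter, and `mobile_is_valid` is a hypothetical name.

```python
# Hypothetical helper for illustration; this is NOT part of persian-tools.
import re

# A country/trunk prefix, then a mobile number: 9 followed by nine digits.
MOBILE_PATTERN = re.compile(r"(?:\+98|98|0)9[0-9]{9}")


def mobile_is_valid(number):
    """Return True if the whole string looks like an Iranian mobile number."""
    return MOBILE_PATTERN.fullmatch(number) is not None


print(mobile_is_valid("+989123456789"))  # True
print(mobile_is_valid("98912345678"))    # False
```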

Comments

Economic national id (Shenas-e melli)

Hi there,

First of all, thanks a lot for your valuable and useful project and your clean code. The project lacks an economic national id (shenas-e melli) validator and generator, and although there are few sources, I found a website that explains how to validate an economic national id, so I tried to implement it following the structure of your project.

It would be great if you could read my code and, if everything is fine, accept my PR and add this module to your project.

Best Regards, M. A

opened by mabedis 3

Problem with datetime.timezone

import datetime

import pytz
from persiantools.jdatetime import JalaliDateTime

now2 = datetime.datetime.now(tz=pytz.utc)
now = datetime.datetime.now(tz=datetime.timezone.utc)

JalaliDateTime.to_jalali(now2).strftime("%d", locale="en")  # works
JalaliDateTime.to_jalali(now).strftime("%d", locale="en")   # fails with the TypeError below

TypeError: tzname(dt) argument must be a datetime instance or None, not JalaliDateTime

help wanted question

opened by moosavimaleki 0


Releases(v0.0.10)

v0.0.10(Dec 17, 2021)

v0.0.9(Mar 22, 2021)

v0.0.8(Feb 27, 2021)

v0.0.7(Feb 22, 2021)

v0.0.6(Feb 21, 2021)


Owner

Persian Tools

PersianTools.js is a standalone, library-agnostic JavaScript library that provides Persian-language features for use in JavaScript.

