JSON and CSV data for Swahili dictionary with over 16600+ words

Overview

kamusi

JSON and CSV data for swahili dictionary with over 16600+ words.

This repo consists of data from swahili dictionary with about 16683 words together with their meaning, synonyms and conjugations.

This repo couldn't exist without Kamusi-Mobile, Thanks to great effort done by Jack Siro.

So how this data was generated ?

This data is result of webscraping done to kamusi with help of selenium and BeautifulSoup.

There are basically two scripts, one for scraping app.py while the other one to_json serves a purpose of converting scraped CSV data into json tha can easily be used by others.

Gathering data

I'm currently gathering and organizing swahili data mainly for doing NLP purposes, if you now any other places that we can scrap useful data in swahili please raise an issue for it.

Looking forward to see what you're going to build with it.

Give it a star

Was this useful to you ? Then give it a star so that more people can manke use of this.

Credits

All the credits to:

Owner
Jordan Kalebu
Python Developer | Rust | Javascript | C/C++ | MachineLearning | OpenSource
Jordan Kalebu
py-trans is a Free Python library for translate text into different languages.

Free Python library to translate text into different languages.

I'm Not A Bot #Left_TG 13 Aug 27, 2022
Making simplex testing clean and simple

Making Simplex Project Testing - Clean and Simple What does this repo do? It organizes the python stack for the coding project What do I need to do in

Mohit Mahajan 1 Jan 30, 2022
Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

SeatGeek 1.2k Jan 01, 2023
汉字转拼音(pypinyin)

汉字拼音转换工具(Python 版) 将汉字转为拼音。可以用于汉字注音、排序、检索(Russian translation) 。 基于 hotoo/pinyin 开发。 Documentation: http://pypinyin.rtfd.io/ GitHub: https://github.co

Huang Huang 4.2k Jan 03, 2023
This is an AI that is supposed to say you if your text is formal or not

This is an AI that is supposed to say you if your text is formal or not. It's written in Python 3 and has some german examples (because I'm german yk) in the text.json file. This file contains the te

1 Jan 12, 2022
Word and phrase lists in CSV

Word Lists Word and phrase lists in CSV, collected from different sources. Oxford Word Lists: oxford-5k.csv - Oxford 3000 and 5000 oxford-opal.csv - O

Anton Zhiyanov 14 Oct 14, 2022
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

Contents Maintainer wanted Introduction Installation Documentation License History Source code Authors Maintainer wanted I am looking for a new mainta

Antti Haapala 1.2k Dec 16, 2022
Um simulador de caixa registradora com database usando arquivos .txt

🛒 Caixa Registradora V2 ❓ - Como usar? Execute o caixa-registradora.py, nele vai ter um menu interativo, você pode cadastrar diversos produtos em um

Gabriel 0 Sep 25, 2022
An anthology of a variety of tools for the Persian language in Python

An anthology of a variety of tools for the Persian language in Python

Persian Tools 106 Nov 08, 2022
A minimal code sceleton for a textadveture parser written in python.

Textadventure sceleton written in python Use with a map file generated on https://www.trizbort.io Use the following Sockets for walking directions: n

1 Jan 06, 2022
Translate .sbv subtitle files

deepl4subtitle Deeplを使って字幕ファイル(.sbv)を翻訳します。タイムスタンプも含めて出力しますが、翻訳時はタイムスタンプは文の一部とは切り離されるので、.sbvファイルをそのまま翻訳機に突っ込むよりも高精度な翻訳ができるはずです。 つかいかた 入力する.sbvファイルの前処理

Yasunori Toshimitsu 1 Oct 20, 2021
Fuzz a language by mixing up only few words.

afasi Fuzz a language by mixing up only few words. Status Beta. Note: The default branch is default. Use Examples Version General Help Translate Help

Stefan Hagen 2 Dec 14, 2022
A username generator made from French Canadian most common names.

This script is used to generate a username list using the most common first and last names in Quebec in different formats. It can generate some passwords using specific patterns such as Tremblay2020.

5 Nov 26, 2022
从flomo导出的笔记中生成词云

flomo-word-cloud 从flomo导出的笔记中生成词云 如何使用? 将本项目克隆到你的电脑上,使用如下的命令,安装所需python库 pip install -r requirements.txt 在项目里新建一个file文件夹,把所有从flomo导出的html文件放入其中 运行main

Hannnk 9 Dec 30, 2022
A program that looks through entered text and replaces certain commands with mathematical symbols

TextToSymbolConverter A program that looks through entered text and replaces certain commands with mathematical symbols Example: Syntax: Enter text in

1 Jan 02, 2022
Answer some questions and get your brawler csvs ready!

BRAWL-STARS-V11-BRAWLER-MAKER-TOOL Answer some questions and get your brawler csvs ready! HOW TO RUN on android: Install pydroid3 from playstore, and

9 Jan 07, 2023
Python flexible slugify function

awesome-slugify Python flexible slugify function PyPi: https://pypi.python.org/pypi/awesome-slugify Github: https://github.com/dimka665/awesome-slugif

Dmitry Voronin 471 Dec 20, 2022
LazyText is inspired b the idea of lazypredict, a library which helps build a lot of basic models without much code.

LazyText is inspired b the idea of lazypredict, a library which helps build a lot of basic models without much code. LazyText is for text what lazypredict is for numeric data.

Jay Vala 13 Nov 04, 2022
Auto translate Localizable.strings for multiple languages in Xcode

auto_localize Auto translate Localizable.strings for multiple languages in Xcode Usage put your origin Localizable.strings file in folder pip3 install

Wesley Zhang 13 Nov 22, 2022
A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.

Python User Agents user_agents is a Python library that provides an easy way to identify/detect devices like mobile phones, tablets and their capabili

Selwin Ong 1.3k Dec 22, 2022