🏆 • 5050 most frequent words in 109 languages

Last update: Nov 24, 2022

Overview

🏆 Most Common Words Multilingual

5000 most frequent words in 109 languages. Uses wordfrequency.info as a source.

🔗 License

source code license
The data is released under different license(s), as it is taken from online sources. Feel free to contribute your own data!

🌐 Language	📁 File
Afrikaans (af)	.txt
Albanian (sq)	.txt
Amharic (am)	.txt
Arabic (ar)	.txt
Armenian (hy)	.txt
Azerbaijani (az)	.txt
Basque (eu)	.txt
Belarusian (be)	.txt
Bengali (bn)	.txt
Bosnian (bs)	.txt
Bulgarian (bg)	.txt
Catalan (ca)	.txt
Cebuano (ceb)	.txt
Chichewa (ny)	.txt
Chinese (simplified) (zh-CN)	.txt
Chinese (traditional) (zh-TW)	.txt
Corsican (co)	.txt
Croatian (hr)	.txt
Czech (cs)	.txt
Danish (da)	.txt
Dutch (nl)	.txt
English (en)	.txt
Esperanto (eo)	.txt
Estonian (et)	.txt
Filipino (tl)	.txt
Finnish (fi)	.txt
French (fr)	.txt
Frisian (fy)	.txt
Galician (gl)	.txt
Georgian (ka)	.txt
German (de)	.txt
Greek (el)	.txt
Gujarati (gu)	.txt
Haitian creole (ht)	.txt
Hausa (ha)	.txt
Hawaiian (haw)	.txt
Hebrew (iw)	.txt
Hindi (hi)	.txt
Hmong (hmn)	.txt
Hungarian (hu)	.txt
Icelandic (is)	.txt
Igbo (ig)	.txt
Indonesian (id)	.txt
Irish (ga)	.txt
Italian (it)	.txt
Japanese (ja)	.txt
Javanese (jw)	.txt
Kannada (kn)	.txt
Kazakh (kk)	.txt
Khmer (km)	.txt
Kinyarwanda (rw)	.txt
Korean (ko)	.txt
Kurdish (ku)	.txt
Kyrgyz (ky)	.txt
Lao (lo)	.txt
Latin (la)	.txt
Latvian (lv)	.txt
Lithuanian (lt)	.txt
Luxembourgish (lb)	.txt
Macedonian (mk)	.txt
Malagasy (mg)	.txt
Malay (ms)	.txt
Malayalam (ml)	.txt
Maltese (mt)	.txt
Maori (mi)	.txt
Marathi (mr)	.txt
Mongolian (mn)	.txt
Myanmar (my)	.txt
Nepali (ne)	.txt
Norwegian (no)	.txt
Odia (or)	.txt
Pashto (ps)	.txt
Persian (fa)	.txt
Polish (pl)	.txt
Portuguese (pt)	.txt
Punjabi (pa)	.txt
Romanian (ro)	.txt
Russian (ru)	.txt
Samoan (sm)	.txt
Scots gaelic (gd)	.txt
Serbian (sr)	.txt
Sesotho (st)	.txt
Shona (sn)	.txt
Sindhi (sd)	.txt
Sinhala (si)	.txt
Slovak (sk)	.txt
Slovenian (sl)	.txt
Somali (so)	.txt
Spanish (es)	.txt
Sundanese (su)	.txt
Swahili (sw)	.txt
Swedish (sv)	.txt
Tajik (tg)	.txt
Tamil (ta)	.txt
Tatar (tt)	.txt
Telugu (te)	.txt
Thai (th)	.txt
Turkish (tr)	.txt
Turkmen (tk)	.txt
Ukrainian (uk)	.txt
Urdu (ur)	.txt
Uyghur (ug)	.txt
Uzbek (uz)	.txt
Vietnamese (vi)	.txt
Welsh (cy)	.txt
Xhosa (xh)	.txt
Yiddish (yi)	.txt
Yoruba (yo)	.txt
Zulu (zu)	.txt
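
The table above lists one .txt file per language. As a minimal sketch of how such a list might be read, the snippet below assumes each file holds one word per line, ordered from most to least frequent, and is encoded as UTF-8. The README does not confirm any of this, and the file name `en.txt` and the helpers `load_words` and `rank_table` are hypothetical names used only for illustration.

```python
import tempfile
from pathlib import Path


def load_words(path, limit=None):
    """Read one word per line, skipping blank lines.

    Assumption: the file is UTF-8 with one word per line, most frequent first.
    Stops once `limit` words have been collected (if a limit is given).
    """
    words = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            word = line.strip()
            if word:
                words.append(word)
            if limit is not None and len(words) >= limit:
                break
    return words


def rank_table(words):
    """Map each word to its 1-based position; the first occurrence wins."""
    ranks = {}
    for rank, word in enumerate(words, start=1):
        ranks.setdefault(word, rank)
    return ranks


if __name__ == "__main__":
    # Stand-in file so the sketch runs without the repository checked out.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "en.txt"  # hypothetical file name
        sample.write_text("the\nof\nand\n\nto\n", encoding="utf-8")
        top = load_words(sample, limit=3)
        print(top)
        print(rank_table(top))
```

If the real files use a different layout (for example a word and a count on each line), only the line-parsing step inside `load_words` would need to change.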

Comments

build(deps): bump certifi from 2021.10.8 to 2022.12.7
Bumps certifi from 2021.10.8 to 2022.12.7.

Commits

9e9e840 2022.12.07

b81bdb2 2022.09.24

939a28f 2022.09.14

aca828a 2022.06.15.2

de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...

b8eb5e9 2022.06.15.1

47fb7ab Fix deprecation warning on Python 3.11 (#199)

b0b48e0 fixes #198 -- update link in license

9d514b4 2022.06.15

4151e88 Add py.typed to MANIFEST.in to package in sdist (#196)

dependencies
opened by dependabot[bot] 0
build(deps): bump numpy from 1.21.4 to 1.22.0
Bumps numpy from 1.21.4 to 1.22.0.

Release notes

Sourced from numpy's releases.

v1.22.0

NumPy 1.22.0 Release Notes

NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.

A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.

NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.

New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.

A new configurable allocator for use by downstream projects.

These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

Expired deprecations

Deprecated numeric style dtype strings have been removed

Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

(gh-19539)

Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

(gh-19615)

... (truncated)

Commits

4adc87d Merge pull request #20685 from charris/prepare-for-1.22.0-release

fd66547 REL: Prepare for the NumPy 1.22.0 release.

125304b wip

c283859 Merge pull request #20682 from charris/backport-20416

5399c03 Merge pull request #20681 from charris/backport-20954

f9c45f8 Merge pull request #20680 from charris/backport-20663

794b36f Update armccompiler.py

d93b14e Update test_public_api.py

7662c07 Update init.py

311ab52 Update armccompiler.py

dependencies
opened by dependabot[bot] 0

Releases (0.1.0)

0.1.0 (Dec 19, 2021)

Data comes from wordfrequency.info.
Source code (tar.gz)
Source code (zip)
all.json (13.20 MB)
most-common-words.zip (2.24 MB)

Owner

🃏 effectively learn new languages by using cool methods, such as flashcards and most common words!

