trstop

Turkish Stop Words (Türkçe Dolgu Sözcükleri)

In this repository I have collected the Turkish stop words that appear among the 10,000 most frequent Turkish words. So that new candidate words can be tested in the future, I have also added a small Python script and a list of the 10,000 highest-frequency words.

Another list of Turkish stop words is available at https://github.com/sgsinclair/trombone/blob/master/src/main/resources/org/voyanttools/trombone/keywords/stop.tr.turkish-lucene.txt. However, some of the stop words in that list are not among the 10,000 most frequent words.
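The frequency check for candidate words described above could look roughly like the sketch below. It is not the script shipped in this repository: the function names, the one-word-per-line format of the frequency list, and the sample data are all assumptions made for illustration.

```python
from itertools import islice


def load_top_words(lines, limit=10_000):
    """Collect the first `limit` non-empty words from a frequency-ordered
    iterable of lines (assumed format: one word per line, most frequent first)."""
    words = (line.strip() for line in lines)
    non_empty = (word for word in words if word)
    return set(islice(non_empty, limit))


def split_candidates(candidates, top_words):
    """Split candidate stop words into those inside and outside the top list."""
    accepted = [word for word in candidates if word.strip() in top_words]
    rejected = [word for word in candidates if word.strip() not in top_words]
    return accepted, rejected


# Tiny in-memory stand-in for the real 10,000-word list (hypothetical data).
sample_lines = ["ve\n", "bir\n", "bu\n", "\n", "için\n"]
top_words = load_top_words(sample_lines, limit=3)

accepted, rejected = split_candidates(["bu", "için", "ancak"], top_words)
print(accepted)  # ['bu']
print(rejected)  # ['için', 'ancak']
```

With a real word list, the lines would come from an open file instead of `sample_lines`. The sketch compares words exactly as written; Turkish-aware case folding (for example I/ı and İ/i) is deliberately left out.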

To use the module:

import trstop

word = "ve"  # any Turkish word to check
print(trstop.is_stop_word(word))
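A typical use of such a check is filtering stop words out of a token list. The sketch below assumes that `is_stop_word` accepts a single word and returns a truthy value for stop words; the helper `remove_stop_words` and the stand-in predicate are hypothetical and exist only so the example runs without the package installed.

```python
def remove_stop_words(tokens, is_stop_word):
    """Return the tokens for which the given predicate is false, in order."""
    return [token for token in tokens if not is_stop_word(token)]


# Stand-in predicate with made-up data, used only for this demonstration.
# With the module installed, trstop.is_stop_word could be passed instead
# (assuming it takes one word and returns a truthy/falsy value).
_demo_stop_words = {"ve", "bir", "bu"}


def demo_is_stop_word(word):
    return word in _demo_stop_words


tokens = "bu bir deneme ve örnek".split()
filtered = remove_stop_words(tokens, demo_is_stop_word)
print(filtered)  # ['deneme', 'örnek']
```

Passing the predicate as an argument keeps the filtering logic independent of any particular stop word list.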

Contributors:

Ahmet Aksoy
Toprak Öztürk

Stop words (Turkish: dolgu sözcükleri) are words that are used frequently but whose removal does not significantly change the meaning of the sentence they are taken from.

I use "dolgu sözcükleri" as the Turkish term for "stop words". If there is a better option, I am ready to change it. The "turkce-stop-words-dict.py" script included in the repository can be used to check how frequently a word is used when we want to add new words to the list in the future.


Last updated: 29.06.2018
