मराठी भाषा वाचविण्याचा एक प्रयास. इंग्रजी ते मराठीचा शब्दकोश. An attempt to preserve the Marathi language. A lightweight and ad-free English-to-Marathi thesaurus.

Overview

For English, scroll down

मराठी शब्द

मराठी भाषा वाचवण्यासाठी मी हा ओपन सोर्स प्रोजेक्ट सुरू केला आहे.

माझ्या मते, आपली भाषा हळूहळू आणि कोणाच्याही लक्षात न येता एका मृत भाषेच्या दिशेने वाटचाल करत आहे. या उपक्रमात सगळ्यांचे स्वागत आहे, ज्यांना कोणाला हा एक गंभीर विषय वाटतो व त्यात काही सुधारणा करण्याची गरज आहे असे वाटते.

अगदी सोप्या रीतीने सांगायचं झालं तर खालील उदाहरणे पहा -

१. मराठी वाक्यांमधील इंग्रजी शब्दांचा जास्त आणि अनावश्यक वापर.

  • अयोग्य - "फार bore झालंय. चला एखादा picture बघूया."
  • योग्य - "फार कंटाळा आलाय. चला एखादा चित्रपट बघूया."

२. देवनागरीऐवजी लॅटिन अक्षरे वापरून मराठी टायपिंग / लिहिणे

  • अयोग्य - "me tujhya sobat marathit bolat ahe."
  • योग्य - "मी तुझ्या सोबत मराठीत बोलत आहे."

अधिक माहितीसाठी खालील इंग्रजी मजकूर वाचा. आपण सॉफ्टवेअर अभियंते जरी नसाल तरीही आपण योगदान करू शकता.

योगदान करण्यासाठी

१. "Github" वर आपले खाते बनवा

२. "Discussions" पृष्ठावरील आपल्या कल्पना, टिप्पण्या इ. वर चर्चा करा.

Marathi shabd

About

This project is being developed as part of an effort to help save the Marathi language from its gradual and largely unnoticed decline towards becoming a dying language.

Goal


(This is the goal of the overall idea and not just this project.)

Revive the use of the Marathi language in its original, unadulterated form in day-to-day life, both spoken and written.

How to do it?


  1. Make people realise that these problems exist
  2. Motivate them to work towards fixing them
  3. Provide them with resources (this project is part of this step)
  4. Ask them to actually put this into practice in their daily lives

This will be done with a combination of videos, blogs and software tools such as this one. (Contributions to all of these are welcome.)

Overview of this project

The idea is to have a static website (ad-free, bloat-free and fast) where people looking to improve their Marathi vocabulary can search for an English word or phrase and quickly find its Marathi equivalent, along with a usage example wherever possible.

Words can also be categorised into topics (tags) so that words used in the same context can be found together, helping to build vocabulary for those particular topics. More features can be added in the future if necessary.

So basically, it will be an ad-free and fast English-to-Marathi thesaurus for day-to-day words, with some additional features.
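As a rough illustration of the lookup and tagging idea described above, here is a minimal Python sketch. The project has not fixed a data model yet, so the `Entry` fields, the tag names and the helper functions (`build_index`, `lookup`, `by_tag`) are all assumptions made for this example; the Marathi words and usage sentences are taken from the examples in this readme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One thesaurus entry (hypothetical structure, not the project's schema)."""
    english: str
    marathi: str
    example: str = ""
    tags: tuple = ()


# Sample data; the tag names are illustrative assumptions.
ENTRIES = [
    Entry("boredom", "कंटाळा", "फार कंटाळा आलाय.", ("feelings",)),
    Entry("movie", "चित्रपट", "चला एखादा चित्रपट बघूया.", ("entertainment",)),
]


def build_index(entries):
    """Map each lower-cased English word/phrase to its list of entries."""
    index = {}
    for entry in entries:
        index.setdefault(entry.english.strip().lower(), []).append(entry)
    return index


def lookup(index, query):
    """Case-insensitive exact match; returns an empty list if nothing is found."""
    return index.get(query.strip().lower(), [])


def by_tag(entries, tag):
    """Return all entries that carry the given topic tag."""
    return [entry for entry in entries if tag in entry.tags]
```

With this sketch, `lookup(build_index(ENTRIES), "Movie")` would return the entry for चित्रपट, and `by_tag(ENTRIES, "feelings")` would return the entries grouped under that topic. A real implementation may well need fuzzy or prefix matching rather than exact matching.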

Development and contribution

It is currently at a very early stage: I am conceptualising it and looking for contributors (developers as well as people well versed in the Marathi language).

Some areas where you can contribute

  • Database update - adding English words with Marathi equivalents
  • Static website creation - parsing the database and generating an output Markdown file with all the content. This file will be used on the github.io static website page.
    • note - I would particularly like help in this area, as it is new to me.
  • Adding or correcting Marathi-language content in this project's documentation (readme, website pages etc.)

(This is the current plan and can be improved.)

Please share your ideas, comments etc. on the "Discussions" page.

I also have several other ideas for creating Marathi-language resources, which I plan to start on once this project's website is ready at a usable level.
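The static website step in the list above could look roughly like the following Python sketch. The database format has not been decided, so the CSV layout (columns `english`, `marathi`, `example`, `tags`) and the function names `load_entries` and `render_markdown` are assumptions made purely for illustration.

```python
import csv
import io

# Hypothetical database content; in practice this would be read from a file.
SAMPLE_CSV = """english,marathi,example,tags
movie,चित्रपट,चला एखादा चित्रपट बघूया.,entertainment
boredom,कंटाळा,फार कंटाळा आलाय.,feelings
"""


def load_entries(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def render_markdown(entries):
    """Render entries as a Markdown table, sorted by English word."""
    lines = [
        "| English | Marathi | Example | Tags |",
        "| --- | --- | --- | --- |",
    ]
    for entry in sorted(entries, key=lambda e: e["english"].lower()):
        cells = [entry["english"], entry["marathi"], entry["example"], entry["tags"]]
        # Escape pipes so a cell cannot break the table layout.
        escaped = [cell.replace("|", "\\|") for cell in cells]
        lines.append("| " + " | ".join(escaped) + " |")
    return "\n".join(lines) + "\n"
```

Calling `render_markdown(load_entries(SAMPLE_CSV))` returns the table as a string, which could then be written to a Markdown file (with UTF-8 encoding) for the static site to publish. A single table is only one possible layout; per-tag pages would follow the same pattern.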

What is the need to do this?

As I see it, there are two main problems, which are explained below -

  1. Excessive use of English words in Marathi sentences.

Simply stated, this means using a lot of English words in our sentences where we could easily use Marathi words. Example -

  • Not OK - "फार bore झालंय. चला एखादा picture बघूया."
  • OK - "फार कंटाळा आलाय. चला एखादा चित्रपट बघूया."

The direct consequence of this is that we are losing our grip on Marathi vocabulary. The problem keeps growing like a snowball, and it needs outside effort and motivation to fix. It exists in both the spoken and the written form. While it is particularly serious among the urban population, it may also spread to rural areas as the reach of English schools and the internet widens.

This project currently addresses the above problem only.

  2. Typing/writing Marathi using the Latin alphabet instead of Devanagari.

This means typing Marathi like this -

  • Not OK - "me tujhya sobat marathit bolat ahe."
  • OK - "मी तुझ्या सोबत मराठीत बोलत आहे."

I feel this problem should not exist today, as we now have good keyboards for typing Marathi in Devanagari on all platforms, mobile and desktop alike. It persists, however, because people find it easier to type in the Latin alphabet on a QWERTY keyboard.

Owner
मुक्त स्त्रोत