NLP tool to extract emotional phrases from tweets 🤩

Last update: Oct 17, 2022

Overview

Emotional phrase extractor

Extracts the phrase in a given text that expresses its sentiment. Capturing sentiment in language matters at a time when decisions and reactions are formed and updated in seconds. But which words actually carry the sentiment? This project aims to answer that question.

Powered by PyTorch + Hugging Face 🤗

Try it out.

git clone https://github.com/shahules786/twitter-emotions.git

cd twitter-emotions

sudo docker build --tag twitter-emotions:api .

sudo docker run -p 9999:9999  -it twitter-emotions:api python twitteremotions/app.py

The server will start running on port 9999 of localhost.

Example

Installation for development

git clone https://github.com/shahules786/twitter-emotions.git

cd twitter-emotions

pip install -r requirements.txt

Train Model on your data

from twitteremotions.emotions import TwitterEmotions
emotions = TwitterEmotions()
emotions.train(train_path="data/train.csv", epochs=10, batch_size=32, max_len=168, test_size=0.25)
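
As a rough guide to what the arguments above imply, the sketch below estimates the train/validation split and the number of optimizer steps for a dataset of a given size. It assumes that test_size is the held-out fraction, rounded up in the style of scikit-learn's train_test_split, and that the last partial batch is kept; neither is confirmed by this README. The helper training_plan is hypothetical and is not part of twitteremotions.

```python
import math


def training_plan(n_rows, test_size=0.25, batch_size=32, epochs=10):
    """Estimate split sizes and step counts (illustrative helper, not a twitteremotions API)."""
    # Assumption: the held-out set is ceil(n_rows * test_size) rows.
    n_valid = math.ceil(n_rows * test_size)
    n_train = n_rows - n_valid
    # Assumption: the final, smaller batch is not dropped.
    steps_per_epoch = math.ceil(n_train / batch_size)
    return {
        "n_train": n_train,
        "n_valid": n_valid,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
    }


# Example: a hypothetical train.csv with 10,000 rows.
plan = training_plan(10_000)
print(plan)
```

Under these assumptions, a 10,000-row file leaves 7,500 rows for training, which is 235 batches per epoch at batch_size=32.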

Contributing

All contributions are welcome 👋


Comments

Avoid confusion: end_tokens instead of start_tokens

Replace start_tokens with end_tokens as the fourth argument to the loss function to avoid confusion :)

While reviewing your amazing project, I noticed that the EmotionData class in dataloader.py returns:

{
    ...
    # start_tokens
    "start_tokens": torch.tensor(start_tokens, dtype=torch.long),
    # end_tokens
    "end_tokens": torch.tensor(end_tokens, dtype=torch.long),
}

But in engine.py, start_tokens is passed as both the third and fourth arguments of loss_fn():

loss = loss_fn(
    start,
    end,
    torch.argmax(data["start_tokens"], axis=1),
    torch.argmax(data["start_tokens"], axis=1),
)

The fourth argument should be end_tokens. According to the commenter, this minor change will not affect the output of loss_fn(), since the two are equal in all cases [=1]. But, to respect conventions and avoid confusion, it would be better to pass end_tokens as the fourth argument.

opened by zekaouinoureddine
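
Whether the two targets in the loss_fn() call quoted above are always equal depends on how the labels are encoded, which the excerpt does not show. The sketch below assumes that start_tokens and end_tokens are one-hot position vectors and that the loss is the sum of a cross-entropy term for the start position and one for the end position. Both are assumptions, and span_loss is a hypothetical stand-in written in plain Python rather than the project's loss_fn().

```python
import math


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def cross_entropy(logits, target):
    """Negative log-softmax of `logits` at index `target`."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def span_loss(start_logits, end_logits, start_target, end_target):
    """Hypothetical stand-in for loss_fn(): start loss plus end loss."""
    return cross_entropy(start_logits, start_target) + cross_entropy(end_logits, end_target)


# Assumed one-hot labels for a 5-token input whose phrase covers tokens 1..3.
start_tokens = [0, 1, 0, 0, 0]
end_tokens = [0, 0, 0, 1, 0]

# Made-up model scores that favour the correct start and end positions.
start_logits = [0.1, 2.0, 0.3, 0.2, 0.1]
end_logits = [0.1, 0.2, 0.3, 2.5, 0.1]

# Fourth argument taken from end_tokens, as the comment proposes.
proposed = span_loss(start_logits, end_logits, argmax(start_tokens), argmax(end_tokens))
# Fourth argument taken from start_tokens, as in the quoted engine.py call.
as_quoted = span_loss(start_logits, end_logits, argmax(start_tokens), argmax(start_tokens))

print(proposed, as_quoted)
```

Under these assumptions, the two calls agree only when the phrase is a single token; for longer phrases the end term is scored against the start position, so the argument order affects the loss value and not only readability.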

Releases(v1.0.0)

v1.0.0(May 17, 2021)

Trained RoBERTa base weights for twitter-emotions.
Source code(tar.gz)
Source code(zip)
emotion_torch.pth(475.54 MB)
pytorch_model.bin(477.98 MB)

Owner

Shahul ES

Data Scientist | Kaggle Grandmaster (Rank 20) | Open source @mljar

