Community and sentiment analysis based on tweets

Last update: Nov 17, 2022

Overview

Social Media Analytics project

The project aims to analyze the opinions and interactions of Italian users, as expressed in Twitter posts on the day the new measures came into force. In particular, we look for the reference hubs in the network and study people's sentiment and emotions regarding the new restrictions.

Motivation

One of the hottest topics in Italy in the last months of 2021 was the introduction of the Super Green Pass, required to access indoor venues, events, gyms, etc. This measure came into force on 6 December 2021 and effectively denies access to various services to those who have not completed the vaccination cycle. For this reason we decided to analyze the Italian Twitter community's reaction to the Super Green Pass, to understand which users write and interact on the platform and whether specific communities exist among the users who commented on its introduction. We also want to identify the influential nodes in the network and examine the sentiment around them.

Data

The data was collected from Twitter using its API and the Tweepy Python package. All tweets were written on December 6th, 2021, in Italian.
In the data folder you can find the .csv file with all the collected tweets (here), along with two extra files containing the sentiment extracted for each tweet (here) and the aggregated sentiment per cluster (here).
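
To illustrate how a per-tweet sentiment file relates to an aggregated per-cluster file, here is a minimal sketch using only the Python standard library. The field names `cluster` and `sentiment`, the label values, and the function `aggregate_sentiment` are assumptions made for this example; they are not taken from the project's .csv files, and the notebook may compute the aggregation differently.

```python
from collections import Counter, defaultdict


def aggregate_sentiment(rows):
    """Return, for each cluster, the share of tweets per sentiment label.

    `rows` is an iterable of dicts with hypothetical keys
    "cluster" and "sentiment" (assumed names, not the real schema).
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["cluster"]][row["sentiment"]] += 1

    shares = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        shares[cluster] = {label: n / total for label, n in counter.items()}
    return shares


# Toy rows standing in for the per-tweet sentiment file.
rows = [
    {"cluster": "0", "sentiment": "negative"},
    {"cluster": "0", "sentiment": "neutral"},
    {"cluster": "0", "sentiment": "negative"},
    {"cluster": "1", "sentiment": "positive"},
]
shares = aggregate_sentiment(rows)
```

With real data, the rows could be read with `csv.DictReader` after adapting the two key names to the actual column headers.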

Files

All the code is in the file Code.ipynb. You can also find the report and the presentation prepared for the exam, both in Italian.

How to run code?

We recommend running all the code on Google Colaboratory. All notebooks are already set up to import the necessary packages. If you have any questions, feel free to contact me!

Graph visualization

In the Pyvis_export folder you can find two exported interactive visualizations of the network graph. Static versions are also available as .jpg files if you want a quick look (the HTML versions are quite slow to open).

Results

We found that the hubs are not famous people. This may be an expected result given the particular context of the no-vax discussion, where ideas and content matter more than the celebrity of the author.
Focusing on sentiment analysis, we noticed that the vast majority of tweets are neutral or negative. This is a far cry from the real-world picture, where most people have been vaccinated and are not that disappointed with the new rules.
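
As a rough illustration of what looking for hubs can mean in practice, the sketch below ranks users by in-degree, i.e. by how many interactions point at them. This is only one possible definition: the README does not state how the graph was built or which centrality measure the notebook uses, so the edge format `(source_user, target_user)`, the toy user names, and the function `top_hubs` are all assumptions made for this example.

```python
from collections import Counter


def top_hubs(edges, k=3):
    """Rank nodes by in-degree (number of incoming interactions).

    `edges` is a list of (source_user, target_user) pairs; this
    format is an assumption, not the project's actual data layout.
    """
    in_degree = Counter(target for _source, target in edges)
    return in_degree.most_common(k)


# Toy interaction list (e.g. retweets or mentions) with made-up users.
edges = [
    ("anna", "marco"),
    ("luca", "marco"),
    ("sara", "marco"),
    ("anna", "luca"),
    ("marco", "luca"),
]
hubs = top_hubs(edges, k=2)
print(hubs)  # [('marco', 3), ('luca', 2)]
```

In-degree is chosen here only because it is simple to compute; other measures such as betweenness or PageRank could give a different ranking.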

About us

Riccardo Confalonieri - Data Science Student @ University of Milano-Bicocca

Justin Armanini - Data Science Student @ University of Milano-Bicocca

Chiara Cormio - Data Science Student @ University of Milano-Bicocca
