Code for EMNLP 2021 paper: "Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training"

Overview

SCAPT-ABSA

Code for EMNLP2021 paper: "Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training"

Overview

In this repository, we provide code for Superived ContrAstive Pre-Training (SCAPT) and aspect-aware fine-tuning, retrieved sentiment corpora from YELP/Amazon reviews, and SemEval2014 Restaurant/Laptop with addtional implicit_sentiment labeling.

SCAPT aims to tackle implicit sentiments expression in aspect-based sentiment analysis(ABSA). In our work, we define implicit sentiment as sentiment expressions that contain no polarity markers but still convey clear human-aware sentiment polarity.

Here are examples for explicit and implicit sentiment in ABSA:

examples

SCAPT

SCAPT gives an aligned representation of sentiment expressions with the same sentiment label, which consists of three objectives:

  • Supervised Contrastive Learning (SCL)
  • Review Reconstruction (RR)
  • Masked Aspect Prediction (MAP)
SCAPT

Aspect-aware Fine-tuning

Sentiment representation and aspect-based representation are taken into account for sentiment prediction in aspect-aware fine-tuning.

Aspect_fine-tuning

Requirement

  • cuda 11.0
  • python 3.7.9
    • lxml 4.6.2
    • numpy 1.19.2
    • pytorch 1.8.0
    • pyyaml 5.3.1
    • tqdm 4.55.0
    • transformers 4.2.2

Data Preparation & Preprocessing

For Pre-training

Retrieved sentiment corpora contain millions-level reviews, we provide download links for original corpora and preprocessed data. Download if you want to do pre-training and further use them:

File Google Drive Link Baidu Wangpan Link Baidu Wangpan Code
scapt_yelp_json.zip link link q7fs
scapt_amazon_json.zip link link i1da
scapt_yelp_pkl.zip link link j9ce
scapt_amazon_pkl.zip link link 3b8t

These pickle files can also be generated from json files by the preprocessing method:

bash preprocess.py --pretrain

For Fine-tuning

We have already combined the opinion term labeling to the original SemEval2014 datasets. For example:

    <sentence id="1634">
        <text>The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not.</text>
        <aspectTerms>
            <aspectTerm term="food" polarity="positive" from="4" to="8" implicit_sentiment="False" opinion_words="exceptional"/>
            <aspectTerm term="kitchen" polarity="positive" from="55" to="62" implicit_sentiment="False" opinion_words="capable"/>
            <aspectTerm term="menu" polarity="neutral" from="141" to="145" implicit_sentiment="True"/>
        </aspectTerms>
        <aspectCategories>
            <aspectCategory category="food" polarity="positive"/>
        </aspectCategories>
    </sentence>

implicit_sentiment indicates whether it is an implicit sentiment expression and yield opinion_words if not implicit. The opinion_words lebaling is credited to TOWE.

Both original and extended fine-tuning data and preprocessed dumps are uploaded to this repository.

Consequently, the structure of your data directory should be:

├── Amazon
│   ├── amazon_laptops.json
│   └── amazon_laptops_preprocess_pretrain.pkl
├── laptops
│   ├── Laptops_Test_Gold_Implicit_Labeled_preprocess_finetune.pkl
│   ├── Laptops_Test_Gold_Implicit_Labeled.xml
│   ├── Laptops_Test_Gold.xml
│   ├── Laptops_Train_v2_Implicit_Labeled_preprocess_finetune.pkl
│   ├── Laptops_Train_v2_Implicit_Labeled.xml
│   └── Laptops_Train_v2.xml
├── MAMS
│   ├── test_preprocess_finetune.pkl
│   ├── test.xml
│   ├── train_preprocess_finetune.pkl
│   ├── train.xml
│   ├── val_preprocess_finetune.pkl
│   └── val.xml
├── restaurants
│   ├── Restaurants_Test_Gold_Implicit_Labeled_preprocess_finetune.pkl
│   ├── Restaurants_Test_Gold_Implicit_Labeled.xml
│   ├── Restaurants_Test_Gold.xml
│   ├── Restaurants_Train_v2_Implicit_Labeled_preprocess_finetune.pkl
│   ├── Restaurants_Train_v2_Implicit_Labeled.xml
│   └── Restaurants_Train_v2.xml
└── YELP
    ├── yelp_restaurants.json
    └── yelp_restaurants_preprocess_pretrain.pkl

Pre-training

The pre-training is conducted on multiple GPUs.

  • Pre-training [TransEnc|BERT] on [YELP|Amazon]:

    python -m torch.distributed.launch --nproc_per_node=${THE_CARD_NUM_YOU_HAVE} multi_card_train.py --config config/[yelp|amazon]_[TransEnc|BERT]_pretrain.yml

Model checkpoints are saved in results.

Fine-tuning

  • Directly train [TransEnc|BERT] on [Restaurants|Laptops|MAMS] As [TransEncAsp|BERTAsp]:

    python train.py --config config/[restaurants|laptops|mams]_[TransEnc|BERT]_finetune.yml
  • Fine-tune the pre-trained [TransEnc|BERT] on [Restaurants|Laptops|MAMS] As [TransEncAsp+SCAPT|BERTAsp+SCAPT]:

    python train.py --config config/[restaurants|laptops|mams]_[TransEnc|BERT]_finetune.yml --checkpoint PATH/TO/MODEL_CHECKPOINT

Model checkpoints are saved in results.

Evaluation

  • Evaluate [TransEnc|BERT]-based model on [Restaurants|Laptops|MAMS] dataset:

    python evaluate.py --config config/[restaurants|laptops|mams]_[TransEnc|BERT]_finetune.yml --checkpoint PATH/TO/MODEL_CHECKPOINT

Our model parameters:

Model Dataset File Google Drive Link Baidu Wangpan Link Baidu Wangpan Code
TransEncAsp+SCAPT SemEval2014 Restaurant TransEnc_restaurants.zip link link 5e5c
TransEncAsp+SCAPT SemEval2014 Laptop TransEnc_laptops.zip link link 8amq
TransEncAsp+SCAPT MAMS TransEnc_MAMS.zip link link bf2x
BERTAsp+SCAPT SemEval2014 Restaurant BERT_restaurants.zip link link 1w2e
BERTAsp+SCAPT SemEval2014 Laptop BERT_laptops.zip link link zhte
BERTAsp+SCAPT MAMS BERT_MAMS.zip link link 1iva

Citation

If you found this repository useful, please cite our paper:

@inproceedings{li-etal-2021-learning-implicit,
    title = "Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training",
    author = "Li, Zhengyan  and
      Zou, Yicheng  and
      Zhang, Chong  and
      Zhang, Qi  and
      Wei, Zhongyu",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.22",
    pages = "246--256",
    abstract = "Aspect-based sentiment analysis aims to identify the sentiment polarity of a specific aspect in product reviews. We notice that about 30{\%} of reviews do not contain obvious opinion words, but still convey clear human-aware sentiment orientation, which is known as implicit sentiment. However, recent neural network-based approaches paid little attention to implicit sentiment entailed in the reviews. To overcome this issue, we adopt Supervised Contrastive Pre-training on large-scale sentiment-annotated corpora retrieved from in-domain language resources. By aligning the representation of implicit sentiment expressions to those with the same sentiment label, the pre-training process leads to better capture of both implicit and explicit sentiment orientation towards aspects in reviews. Experimental results show that our method achieves state-of-the-art performance on SemEval2014 benchmarks, and comprehensive analysis validates its effectiveness on learning implicit sentiment.",
}
Owner
Zhengyan Li
Zhengyan Li
Official code for NeurIPS 2021 paper "Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN"

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN Official code for NeurIPS 2021 paper "Towards Scalable Unpaired Virtu

68 Dec 21, 2022
VOLO: Vision Outlooker for Visual Recognition

VOLO: Vision Outlooker for Visual Recognition, arxiv This is a PyTorch implementation of our paper. We present Vision Outlooker (VOLO). We show that o

Sea AI Lab 876 Dec 09, 2022
Read and write layered TIFF ImageSourceData and ImageResources tags

Read and write layered TIFF ImageSourceData and ImageResources tags Psdtags is a Python library to read and write the Adobe Photoshop(r) specific Imag

Christoph Gohlke 4 Feb 05, 2022
App customer segmentation cohort rfm clustering

CUSTOMER SEGMENTATION COHORT RFM CLUSTERING TỔNG QUAN VỀ HỆ THỐNG DỮ LIỆU Nên chuyển qua theme màu dark thì sẽ nhìn đẹp hơn https://customer-segmentat

hieulmsc 3 Dec 18, 2021
Measuring if attention is explanation with ROAR

NLP ROAR Interpretability Official code for: Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Toke

Andreas Madsen 19 Nov 13, 2022
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdh

Ayan Kumar Bhunia 22 Aug 27, 2022
An open-source project for applying deep learning to medical scenarios

Auto Vaidya An open source solution for creating end-end web app for employing the power of deep learning in various clinical scenarios like implant d

Smaranjit Ghose 18 May 29, 2022
Baseline for the Spoofing-aware Speaker Verification Challenge 2022

Introduction This repository contains several materials that supplements the Spoofing-Aware Speaker Verification (SASV) Challenge 2022 including: calc

40 Dec 28, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022
Automatically erase objects in the video, such as logo, text, etc.

Video-Auto-Wipe Read English Introduction:Here   本人不定期的基于生成技术制作一些好玩有趣的算法模型,这次带来的作品是“视频擦除”方向的应用模型,它实现的功能是自动感知到视频中我们不想看见的部分(譬如广告、水印、字幕、图标等等)然后进行擦除。由于图标擦

seeprettyface.com 141 Dec 26, 2022
Video Background Music Generation with Controllable Music Transformer (ACM MM 2021 Oral)

CMT Code for paper Video Background Music Generation with Controllable Music Transformer (ACM MM 2021 Best Paper Award) [Paper] [Site] Directory Struc

Zhaokai Wang 198 Dec 27, 2022
Multi-Glimpse Network With Python

Multi-Glimpse Network Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention arXiv Require

9 May 10, 2022
🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series

🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series (optical and radar) The PASTIS Dataset Dataset presentation PASTIS is a benchmark dataset for

86 Jan 04, 2023
The code for 'Deep Residual Fourier Transformation for Single Image Deblurring'

Deep Residual Fourier Transformation for Single Image Deblurring Xintian Mao, Yiming Liu, Wei Shen, Qingli Li and Yan Wang code will be released soon

145 Dec 13, 2022
Repository for Driving Style Recognition algorithms for Autonomous Vehicles

Driving Style Recognition Using Interval Type-2 Fuzzy Inference System and Multiple Experts Decision Making Created by Iago Pachêco Gomes at USP - ICM

Iago Gomes 9 Nov 28, 2022
Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

Segmentation from Natural Language Expressions This repository contains the code for the following paper: R. Hu, M. Rohrbach, T. Darrell, Segmentation

Ronghang Hu 88 May 24, 2022
A library to inspect itermediate layers of PyTorch models.

A library to inspect itermediate layers of PyTorch models. Why? It's often the case that we want to inspect intermediate layers of a model without mod

archinet.ai 380 Dec 28, 2022
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr

43 Nov 21, 2022
Embeddinghub is a database built for machine learning embeddings.

Embeddinghub is a database built for machine learning embeddings.

Featureform 1.2k Jan 01, 2023
DM-ACME compatible implementation of the Arm26 environment from Mujoco

ACME-compatible implementation of Arm26 from Mujoco This repository contains a customized implementation of Mujoco's Arm26 model, that can be used wit

1 Dec 24, 2021