An example project using OpenPrompt under pytorch-lightning for a prompt-based SST2 sentiment analysis model

Last update: Oct 21, 2022

pl_prompt_sst

An example project that uses OpenPrompt within the pytorch-lightning framework to train a prompt-based text classification model on the SST2 sentiment analysis dataset. It leverages pytorch-lightning features such as logging, gradient accumulation, and early stopping, and can be used as a template for further development.

Run

Install requirements

pip install -r requirements.txt

Set up the prompt to use in sst2/prompt_config.json

{
    "template_text": "{\"placeholder\": \"text_a\"} In summary, the film was {\"mask\"}.",
    "label_words": [["bad"], ["good"]]
}
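
Before training, it can help to sanity-check the prompt config. The following is a minimal sketch, not code from this repository: the helper names `load_prompt_config` and `check_prompt_config` are hypothetical, and the checks only reflect the shape of the example above (a `text_a` placeholder, at least one mask slot, and one non-empty list of label words per class).

```python
import json
import re


def check_prompt_config(config):
    """Run basic sanity checks on a parsed prompt config (hypothetical helper)."""
    template = config["template_text"]
    label_words = config["label_words"]

    # The template needs a mask slot for the model to fill in.
    if '{"mask"}' not in template:
        raise ValueError('template_text must contain a {"mask"} slot')

    # The input sentence is injected through the text_a placeholder.
    if not re.search(r'\{"placeholder":\s*"text_a"\}', template):
        raise ValueError('template_text must contain a text_a placeholder')

    # The i-th inner list holds the label words for class i.
    if not label_words or any(not words for words in label_words):
        raise ValueError("label_words needs one non-empty list per class")

    return template, label_words


def load_prompt_config(path):
    """Read the JSON file at `path` and validate it."""
    with open(path, encoding="utf-8") as f:
        return check_prompt_config(json.load(f))


# Same content as the example config above, parsed from a string.
example = json.loads(r'''
{
    "template_text": "{\"placeholder\": \"text_a\"} In summary, the film was {\"mask\"}.",
    "label_words": [["bad"], ["good"]]
}
''')
template, label_words = check_prompt_config(example)
print(template)
print(len(label_words), "classes")
```

With the real file, `load_prompt_config("./sst2/prompt_config.json")` would return the same pair, which could then be handed to whatever builds the template and verbalizer in main.py.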

Adjust the arguments in run.sh or in the command below to suit your needs, then run it.

CUDA_VISIBLE_DEVICES=0 python -u main.py --input_dir ./sst2 \
                                         --prompt_config_dir ./sst2/prompt_config.json \
                                         --model_class bert \
                                         --model_name_or_path prajjwal1/bert-tiny \
                                         --lr 2e-4 \
                                         --bs 32 \
                                         --max_seq_length 64 \
                                         --patience 4 \
                                         --accumulation 2 \
                                         --seed 666
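
For readers adapting the command, here is a rough sketch of how these flags could be parsed. main.py's actual parser is not shown here, so the types and defaults below are assumptions taken from the example command, and `effective_batch_size` is a hypothetical helper. It also assumes `--accumulation` is the number of gradient accumulation steps, in which case each optimizer step sees `bs * accumulation` examples.

```python
import argparse


def build_parser():
    """Parser mirroring the flags in the example command (types are assumed)."""
    parser = argparse.ArgumentParser(description="Prompt-based SST2 training")
    parser.add_argument("--input_dir", default="./sst2")
    parser.add_argument("--prompt_config_dir", default="./sst2/prompt_config.json")
    parser.add_argument("--model_class", default="bert")
    parser.add_argument("--model_name_or_path", default="prajjwal1/bert-tiny")
    parser.add_argument("--lr", type=float, default=2e-4)
    parser.add_argument("--bs", type=int, default=32)
    parser.add_argument("--max_seq_length", type=int, default=64)
    parser.add_argument("--patience", type=int, default=4)
    parser.add_argument("--accumulation", type=int, default=2)
    parser.add_argument("--seed", type=int, default=666)
    return parser


def effective_batch_size(args):
    """Examples per optimizer step, assuming gradient accumulation."""
    return args.bs * args.accumulation


# Parse an explicit argument list so the sketch does not read sys.argv.
args = build_parser().parse_args(["--lr", "2e-4", "--bs", "32", "--accumulation", "2"])
print(effective_batch_size(args))  # 64
```

Under that assumption, the settings above give an effective batch size of 64 while keeping only 32 examples in memory per forward pass.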

In my preliminary experiment with the settings above, the model achieved 0.822 F1, compared to 0.820 without a prompt.

Note

This project can only be run after applying this fix to state_dict()

Owner

Zhiling Zhang
