Well-formed Limericks and Haikus with GPT2

📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation

In collaboration with Matthew Korahais & Daniel Korsunsky

Abstract

We explore the capabilities and limits of GPT-2 on well-formed poems, specifically limericks and haikus. We hypothesized that GPT-2 trained without phonetic annotations would be unable to systematically learn and generate syllabic patterns and rhyme schemes, since these features are grounded in real-world acoustic representations. Our model trained with list-of-rhymes annotations outperformed the baselines, generating perfect-scoring limericks 33% of the time. Our best haiku model generated valid haikus in 29% of cases, with an average syllable error rate below 0.4. Our work invites further research into methods of combining text and phonetic data for more convincing text generation.

Limericks Colab here ->

Haiku Colab here ->

Evaluation data here: https://docs.google.com/spreadsheets/d/1rd1qCbCcTX1zHa0Dvh1q8OJ2iidxxrifTJlYWg3MMes

Examples (Find more in the repo):

Limericks

To the one grading our research, I'd say,
that a lot of work's been done today.
our paper's been checked,
And our work is all correct.
We're not mired in conjecture today.

The Indians' chief deity, they say,
Was a god of the earth all day.
But the gods he made
Were the ones who would fade
As they were replaced by a new way.

A large, thick, thick, and thickly cut tree
(A weeping cedar) will please me.
It's a tree that's known
As a cedar it's own,
And it's named for a bird that I see.
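
As a rough illustration of why rhyme is a phonetic property rather than a spelling property, the sketch below labels the rhyme scheme of a limerick from the pronunciation of each line's last word. It is not the scoring code used in this project. The `PRON` table is a tiny hand-written, ARPAbet-style stand-in for a real pronouncing dictionary (an assumption made so the example is self-contained), and `rhyme_part`, `last_word`, and `rhyme_scheme` are hypothetical helper names.

```python
import re
from typing import Dict, List, Tuple

# Hand-written ARPAbet-style pronunciations ("1" marks primary stress).
# ASSUMPTION: a stand-in for a real pronouncing dictionary, covering only
# the line-final words of the second limerick above.
PRON: Dict[str, List[str]] = {
    "say": ["S", "EY1"],
    "day": ["D", "EY1"],
    "way": ["W", "EY1"],
    "made": ["M", "EY1", "D"],
    "fade": ["F", "EY1", "D"],
}


def rhyme_part(word: str) -> Tuple[str, ...]:
    """Return the phonemes from the last primary-stressed vowel to the end."""
    phones = PRON[word]  # raises KeyError for words outside the toy table
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return tuple(phones[i:])
    return tuple(phones)


def last_word(line: str) -> str:
    """Return the final word of a line, lowercased and without punctuation."""
    words = re.findall(r"[a-z']+", line.lower())
    return words[-1].strip("'")


def rhyme_scheme(poem: str) -> str:
    """Label each line with a letter; lines sharing a rhyme part share a letter."""
    labels: Dict[Tuple[str, ...], str] = {}
    scheme = []
    for line in poem.strip().splitlines():
        if not line.strip():
            continue
        key = rhyme_part(last_word(line))
        if key not in labels:
            labels[key] = chr(ord("A") + len(labels))
        scheme.append(labels[key])
    return "".join(scheme)


limerick = """The Indians' chief deity, they say,
Was a god of the earth all day.
But the gods he made
Were the ones who would fade
As they were replaced by a new way."""

print(rhyme_scheme(limerick))
```

A limerick is expected to follow the AABBA scheme, so a check like `rhyme_scheme(poem) == "AABBA"` is one simple way to test the form. Note that this sketch treats a repeated word (such as "today" ending two lines) as a rhyme, which a stricter scorer might penalize.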

Haiku

The only thing that
gets me going is you So
let's keep this going

Saw a duck come in
from the woods and now i know
what a duck is lol

the only thing I
wanna say to you is good
bye don't disappoint
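
The haiku results in the abstract depend on counting syllables per line. The sketch below shows one plausible way to check the 5-7-5 pattern and compute a per-line syllable error. It is an assumption-laden illustration, not the project's evaluation code: the source does not define its syllable error metric, and `count_syllables` is a crude spelling-based heuristic (it miscounts words such as "going"), whereas a real evaluation would more likely rely on a pronouncing dictionary.

```python
import re
from typing import List

TARGET = (5, 7, 5)  # syllables per line in the traditional haiku form


def count_syllables(word: str) -> int:
    """Heuristic count: vowel groups, minus a trailing silent 'e'."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    groups = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and groups > 1:
        groups -= 1
    return max(groups, 1)


def line_syllables(line: str) -> int:
    """Sum the heuristic syllable counts of the words in a line."""
    return sum(count_syllables(w) for w in line.split())


def syllable_errors(poem: str) -> List[int]:
    """Absolute deviation from 5-7-5 for each line (ASSUMED metric)."""
    lines = [ln for ln in poem.strip().splitlines() if ln.strip()]
    if len(lines) != len(TARGET):
        raise ValueError("a haiku must have exactly three lines")
    return [abs(line_syllables(ln) - t) for ln, t in zip(lines, TARGET)]


def is_valid_haiku(poem: str) -> bool:
    """True when the poem has three lines matching 5-7-5 under the heuristic."""
    try:
        return sum(syllable_errors(poem)) == 0
    except ValueError:
        return False


haiku = """the only thing I
wanna say to you is good
bye don't disappoint"""

print(syllable_errors(haiku), is_valid_haiku(haiku))
```

Averaging the per-line errors over many generated poems would give a number comparable in spirit to a syllable error rate, though whether that matches the figure reported above is not something the source confirms.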

Owner

Bardia Shahrestani
