This is a project built for the FALLABOUT2021 event under SRMMIC; it deals with NLP poetry generation.

Overview

FALLABOUT-SRMMIC 21

POETRY-GENERATION

HINGLISH

DESCRIPTION

We have developed an NLP (natural language processing) model which automatically generates a poem based on the initial/prompt text given as input by the user.

Motivation

The result of most ML/DL models is usually judged by training/validation accuracy and loss. An NLP text-generation model is one of the models that does not depend on either accuracy or loss: irrespective of the accuracy, the generated text may or may not make sense. Sometimes the accuracy can be very high and still give unsatisfactory results, or the model ends up in a loop. So the model can only be judged by looking at the generated text after many trials and training runs.

Uses

  1. Can be used for creative and fun purposes.
  2. Can sometimes be used for reproducing or generating text for larger datasets.
  3. Literature purposes, such as understanding and analysing a certain poetic style.

What's unique?

  1. Unlike many poetry generation projects, we also built a Hindi poetry text generation model.
  2. We provide an analysis of LSTM layers and transformers, with an example for better understanding.

Built with

  1. Streamlit for the front-end
  2. TensorFlow Keras for the Hindi poetry model
  3. aitextgen for the English poetry model

Deeper into the project

The English poetry generation is developed with the help of an open-source library known as aitextgen. The famous GPT-2 transformer is used in this project, fine-tuned on Shakespeare's poems and sonnets alone. The Hindi poetry generation is built with TensorFlow Keras. The front-end is handled simply with Streamlit.
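As a rough illustration of the English pipeline, here is a minimal sketch of fine-tuning GPT-2 on a Shakespeare text file with aitextgen. The corpus file name and the hyperparameters are placeholders, not the exact values used in this project.

```python
# Minimal fine-tuning sketch (assumed values, not the project's exact script).
from aitextgen import aitextgen

ai = aitextgen()  # downloads the default small GPT-2 model

ai.train(
    "shakespeare.txt",   # plain-text corpus of poems and sonnets (placeholder name)
    num_steps=2000,      # increase for better results
    generate_every=500,  # print sample output while training
    save_every=500,      # checkpoints are written to ./trained_model by default
)

# Generate a poem continuation from a prompt
ai.generate(n=1, prompt="Shall I compare thee", max_length=100)
```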

The sketch above illustrates how aitextgen is fine-tuned; the sketch below shows how you could train your own model using TensorFlow Keras.
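This minimal sketch uses a word-level tokenizer plus an LSTM trained on n-gram prefixes of each line. The corpus file, vocabulary handling and layer sizes are illustrative assumptions, not the exact configuration of the Hindi model in this repo.

```python
# Minimal word-level LSTM text-generation sketch (assumed configuration).
import os
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

lines = open("hindi_poems.txt", encoding="utf-8").read().splitlines()  # placeholder corpus

tokenizer = Tokenizer()
tokenizer.fit_on_texts(lines)
vocab_size = len(tokenizer.word_index) + 1

# Every prefix of every line becomes one training sample (n-gram sequences)
sequences = []
for line in lines:
    tokens = tokenizer.texts_to_sequences([line])[0]
    for i in range(1, len(tokens)):
        sequences.append(tokens[: i + 1])

max_len = max(len(s) for s in sequences)
sequences = pad_sequences(sequences, maxlen=max_len, padding="pre")
X, y = sequences[:, :-1], sequences[:, -1]  # predict the next word from the prefix

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 100, input_length=max_len - 1),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(150)),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=50)

os.makedirs("trained_model_hindi", exist_ok=True)
model.save("trained_model_hindi/model.h5")  # the tokenizer should also be saved for inference
```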

A peek into our project

hindiNLP (Hindi poetry demo)

EnglishNLP (English poetry demo)

Installation

Download the app.py file and download the English model from this link; the trained_model folder path should point to your downloaded model. Similarly, download trained_model_hindi from this link and specify its path in the same way. The trained_model_hindi folder contains the trained model, the tokenizer, etc.; the trained_model folder for English likewise contains the model and uses the default built-in GPT-2 transformer. Finally, run streamlit run app.py in your terminal and enjoy the app.
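For orientation, here is a hypothetical sketch of how app.py might load the fine-tuned English model and serve it through Streamlit; the folder name trained_model comes from the instructions above, everything else (widget labels, generation parameters) is an assumption, and the Hindi model would be loaded analogously with tf.keras.models.load_model from the trained_model_hindi folder.

```python
# Hypothetical sketch of the app layout, not the project's exact app.py.
import streamlit as st
from aitextgen import aitextgen

st.title("Poetry Generation")
prompt = st.text_input("Enter a starting line")

if st.button("Generate") and prompt:
    # Load the fine-tuned GPT-2 from the downloaded folder (adjust the path)
    ai = aitextgen(model_folder="trained_model")
    st.write(ai.generate_one(prompt=prompt, max_length=120))
```

Launch it with streamlit run app.py.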

FALLABOUT SRM

This is how the app should look while running locally.

Future works

  1. Planning on including a translator to switch easily between languages.
  2. Introduce more poet-based models in more languages.

Authors

  1. Paras Rawat
  2. Daketi Yatin