Natural language processing summarizer using three state-of-the-art Transformer models: BERT, GPT-2, and T5

Last update: Feb 07, 2022

NLP-Summarizer

This project aimed to explain the current limitations of Natural Language Processing models by exploring the Transformer, the latest state-of-the-art NLP architecture, and to discuss possible use cases for such tools in domestic and workplace environments. It provided an in-depth explanation of the architecture, the limitations it aims to solve, and how it can be applied to various tasks. It also explored numerous NLP use cases and how tools such as this can have a large impact on today's society, both domestically and in the workplace. Three specific Transformer models (BERT, GPT-2, and T5) were implemented behind a GUI to evaluate their effectiveness. The final artefact lets a user interact with the models to perform document summarisation with variable output lengths.

Working Example

The following example was created using another student's project introduction; the original word count was ~1000.

Initial GUI

After Summarization

Getting Started

All code was run using Python version 3.8.8.
Running the artefact in its entirety requires ~20GB of available disk space to download the pre-trained models.
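
The two requirements above can be checked before any model download starts. The following is a minimal sketch, not part of the project: the function name `preflight`, its parameters, and the choice of Python 3.8 as a minimum (rather than exactly 3.8.8) are assumptions made for illustration. The `path` argument should point at whichever drive the models will be downloaded to.

```python
import shutil
import sys

# ~20GB, the figure quoted in this README (hypothetical constant name).
REQUIRED_BYTES = 20 * 1024**3


def preflight(path=".", required_bytes=REQUIRED_BYTES, min_python=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Compare only (major, minor) so that 3.8.8 satisfies a (3, 8) minimum.
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ expected, found {found}")

    # shutil.disk_usage reports total, used and free bytes for the drive
    # that contains `path`.
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        problems.append(
            f"{free / 1024**3:.1f} GB free at {path!r}, "
            f"{required_bytes / 1024**3:.0f} GB needed"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```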

!pip install transformers
!pip install spacy==2.0.12
!pip install torch
!pip install tk
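
With these packages installed, a summary of variable length could be produced along the following lines. This is a sketch under stated assumptions, not the project's own code: it uses the Hugging Face `pipeline("summarization")` API as found in the 4.x `transformers` releases, the `t5-small` checkpoint is chosen only because it is small, and the helper names `length_bounds` and `summarise` are hypothetical. Note that `min_length` and `max_length` are counted in tokens by the library, so deriving them from a word count is only an approximation.

```python
def length_bounds(word_count, ratio=0.2, floor=10):
    """Turn a target ratio of the input length into (min_length, max_length).

    `ratio` and `floor` are illustrative knobs, not values from the project.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    max_length = max(floor, int(word_count * ratio))
    min_length = max(1, max_length // 2)
    return min_length, max_length


def summarise(text, ratio=0.2, model_name="t5-small"):
    """Summarise `text` to roughly `ratio` of its original length."""
    # Imported lazily so length_bounds can be used without transformers.
    from transformers import pipeline

    # The first call downloads the checkpoint, which needs network access.
    summariser = pipeline("summarization", model=model_name)
    min_length, max_length = length_bounds(len(text.split()), ratio)
    result = summariser(
        text,
        min_length=min_length,
        max_length=max_length,
        truncation=True,  # clip inputs longer than the model's limit
    )
    return result[0]["summary_text"]
```

For a ~1000-word input such as the working example, `length_bounds(1000, 0.2)` gives a minimum of 100 and a maximum of 200, which `summarise` then passes to the model.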

The runtime of each summarisation is printed to the console.
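
One simple way to report a runtime like this is to wrap the summarisation call in a timer. The sketch below is an assumption about how that could be done, not the project's implementation; `timed` is a hypothetical helper, and a trivial stand-in function takes the place of a real model so that the example runs without any downloads.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Runtime: {elapsed:.2f} s")
    return result, elapsed


def first_sentence(text):
    """Stand-in for a real summariser: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


if __name__ == "__main__":
    summary, seconds = timed(first_sentence, "Transformers are useful. They are large.")
    print(summary)
```

`time.perf_counter` is used instead of `time.time` because it is a monotonic, high-resolution clock intended for measuring short durations.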

Owner

Samuel Sharkey
