Google's Meena transformer chatbot implementation

Last update: Dec 25, 2022

Overview

Meena chatbot

Here's my attempt at recreating Meena, a state-of-the-art chatbot developed by Google Research and described in the paper "Towards a Human-like Open-Domain Chatbot".

This implementation uses the tensor2tensor deep learning library with an Evolved Transformer model, as described in the paper.

The training set is the Italian-language portion of the OpenSubtitles corpus. Many other languages are available.

Model

Similarly to the work done in the paper, this model consists of 1 encoder block and 12 decoder blocks, for a total of 108M parameters. The optimizer is Adafactor, with the same learning rate schedule as described in the paper.

Training

Here are the results after training the model on 40M sentences of the OpenSubtitles dataset in Italian. The learning rate starts at 0.01 and remains constant for 10k steps, then decays with the inverse square root of the number of steps.
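The schedule described above can be sketched in a few lines of plain Python. This is only an illustration, not the tensor2tensor code used for training: the function name and constants are made up for the example, and it assumes the decay is scaled so that the learning rate stays continuous at step 10k.

```python
import math

BASE_LR = 0.01           # constant learning rate during the first phase
CONSTANT_STEPS = 10_000  # number of steps before the decay starts


def learning_rate(step: int) -> float:
    """Constant for the first 10k steps, then inverse square root decay.

    Assumption: the decay is normalized so that the value at step 10k
    still equals BASE_LR (no jump at the boundary).
    """
    if step <= CONSTANT_STEPS:
        return BASE_LR
    return BASE_LR * math.sqrt(CONSTANT_STEPS / step)


for step in (1, 10_000, 40_000, 200_000):
    print(step, learning_rate(step))
```

Under that assumption the rate has halved by step 40k, since the square root of 10k/40k is 0.5.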

Here's the plot of the evaluation loss during training.

The final perplexity score is 10.4, which is very close to the 10.2 achieved by Google's Meena chatbot.
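To relate the loss plot to these numbers: perplexity is the exponential of the mean per-token cross-entropy. The sketch below assumes the evaluation loss is reported in nats per token (natural logarithm); the helper names are illustrative only.

```python
import math


def perplexity(mean_token_loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(mean_token_loss)


def loss_from_perplexity(ppl: float) -> float:
    """Inverse mapping: the loss (in nats) that yields a given perplexity."""
    return math.log(ppl)


# Under the nats assumption, perplexities of 10.4 and 10.2 correspond to
# losses of roughly 2.34 and 2.32.
print(round(loss_from_perplexity(10.4), 2))
print(round(loss_from_perplexity(10.2), 2))
```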

The paper shows a correlation between perplexity and the Sensibleness and Specificity Average (SSA), a human-evaluation metric that tracks the "human likeness" of a chatbot. On that basis, our perplexity score suggests that this bot compares favorably with other chatbots such as Cleverbot and DialoGPT.
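For reference, SSA is the plain average of two fractions computed from human labels: the share of responses judged sensible and the share judged specific, where a response that is not sensible is also counted as not specific. A minimal sketch, with a hypothetical label format and made-up example labels:

```python
def ssa(labels):
    """Sensibleness and Specificity Average over labeled responses.

    `labels` is a list of (sensible, specific) booleans, one pair per
    response. This input format is an assumption made for the example.
    """
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    # A response only counts as specific if it is also sensible.
    specificity = sum(1 for sensible, specific in labels if sensible and specific) / n
    return (sensibleness + specificity) / 2


# Made-up labels for four responses, purely illustrative.
example = [(True, True), (True, False), (False, False), (True, True)]
print(ssa(example))
```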

The dataset, however, is not a good representation of ordinary conversations between humans. Its advantage is that OpenSubtitles provides very large datasets in many languages.

Run pretrained model

Simply run the notebook meena_chatbot_inference.ipynb.

Otherwise, download the following model and extract it, set MODEL_DIR and CHECKPOINT_NAME in predict.py accordingly, then run main.py.

Pretrained model checkpoint

Italian, 108M parameters, 200k steps, 40M sentences

Train a new model

To train, simply run the IPython notebook on Google Colab; the model will be saved to Google Drive. At the end of the run you can interact with the chatbot.

Export the model

The model can be exported by copying the following files into a folder:

hparams.json
The trained model checkpoint
The vocabulary .subwords file

Then run main.py after setting the proper model directory.
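The copy step could be scripted roughly as follows. This is a sketch, not part of the repository: the function name, its arguments, and the paths in the usage note are hypothetical. It relies on the fact that a TensorFlow checkpoint is stored as several files sharing one prefix.

```python
import shutil
from pathlib import Path


def export_model(train_dir, vocab_file, export_dir, checkpoint_name):
    """Copy hparams.json, one checkpoint and the vocabulary into export_dir."""
    train_dir, vocab_file, export_dir = Path(train_dir), Path(vocab_file), Path(export_dir)

    # A checkpoint is several files with a common prefix,
    # for example <prefix>.index and <prefix>.data-00000-of-00001.
    checkpoint_files = sorted(train_dir.glob(checkpoint_name + ".*"))
    if not checkpoint_files:
        raise FileNotFoundError(f"no files for checkpoint {checkpoint_name!r} in {train_dir}")

    sources = [train_dir / "hparams.json", *checkpoint_files, vocab_file]
    missing = [str(path) for path in sources if not path.is_file()]
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")

    export_dir.mkdir(parents=True, exist_ok=True)
    for src in sources:
        shutil.copy2(src, export_dir / src.name)
    # Return the exported file names so the caller can check the result.
    return sorted(path.name for path in export_dir.iterdir())
```

A call might look like `export_model("train", "data/vocab.subwords", "export", "model.ckpt-200000")`, where every path and the checkpoint name are placeholders to be replaced with your own.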

Serving

server.py provides a simple HTTP API for serving the chatbot.
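To give an idea of what such an API can look like, here is a minimal sketch built only on the Python standard library. It is not the code in server.py: the JSON fields `message` and `reply`, the `make_server` helper, and the `reply_fn` callback (standing in for the model's inference call) are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_server(reply_fn, host="127.0.0.1", port=8080):
    """Build an HTTP server that answers POSTed chat messages.

    `reply_fn` maps the user's message to the bot's reply; in a real
    deployment it would wrap the model's inference call (hypothetical).
    """

    class ChatHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length) or b"{}")
                message = payload["message"]
            except (ValueError, KeyError, TypeError):
                self.send_error(400, "expected a JSON body with a 'message' field")
                return
            body = json.dumps({"reply": reply_fn(message)}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the console quiet; remove to restore request logs

    return HTTPServer((host, port), ChatHandler)
```

Calling `make_server(my_reply_fn).serve_forever()` would then accept POST requests with a body such as `{"message": "ciao"}` and answer with a JSON object containing the reply.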

Owner

Francesco Pham
