Findings of ACL 2021

Last update: Feb 24, 2022

Overview

Assessing Dialogue Systems with Distribution Distances

We propose to measure the performance of a dialogue system by computing the distribution-wise distance between its generated conversations and real-world conversations.
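As an illustration of what a distribution-wise distance can look like, the sketch below computes the Fréchet distance (the quantity behind FID [1]) between Gaussians fitted to two sets of feature vectors. It is a minimal sketch and not the code in this repository: the function name `frechet_distance` is hypothetical, and the random arrays stand in for sentence features that would normally come from a pretrained encoder such as `roberta-base`.

```python
import numpy as np


def frechet_distance(real_feats, gen_feats):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are arrays of shape (num_samples, feature_dim).
    Hypothetical helper for illustration only.
    """
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real_feats, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_feats, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g; these are real and non-negative for
    # positive semi-definite inputs, up to numerical noise.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


# Placeholder features: random vectors instead of real encoder outputs.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
generated = real + 3.0  # same spread, shifted mean
print(frechet_distance(real, real))       # close to zero for identical sets
print(frechet_distance(real, generated))  # grows with the mean shift
```

A smaller distance means the generated conversations are distributed more like the real ones in feature space.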

To appear in Findings of ACL 2021.

Note that this is not an officially supported Tencent product.

1. Configuration

This repository requires the following packages:

pytorch
huggingface/transformers
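A quick way to confirm that both requirements are importable before running anything is sketched below. The helper `missing_packages` is hypothetical and not part of this repository; it only relies on the standard import names `torch` and `transformers`.

```python
import importlib.util

# Maps the package names listed above to their Python import names.
REQUIRED = {"pytorch": "torch", "huggingface/transformers": "transformers"}


def missing_packages(required=REQUIRED):
    """Return the listed packages whose module cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```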

2. Usage

To evaluate how well a metric correlates with human judgments at the system level:

python eval_metric.py \
  --data_path ./datasets/convai2_annotation.json \
  --metric fbd \
  --sample_num 10 \
  --model_type roberta-base \
  --batch_size 32

Currently, our repo supports the common metrics used in the text generation field, including bleu, meteor, rouge, greedy, average, extrema, bert_score, fbd and prd.
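The idea of a system-level correlation with human judgments can be sketched as follows: each dialogue system gets one metric score and one averaged human rating, and the two lists are correlated. This is a minimal sketch under assumptions, not the logic of `eval_metric.py`: the choice of Pearson and Spearman coefficients, the helper names, and the numbers are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up scores for four systems (one value per system).
metric_scores = [0.9, 0.5, 0.7, 0.2]  # e.g. a distance, lower is better
human_scores = [2.1, 3.0, 2.6, 3.9]   # mean human rating, higher is better
print("Pearson: ", pearson(metric_scores, human_scores))
print("Spearman:", spearman(metric_scores, human_scores))
```

For a distance-style metric, where lower values are better, a strong negative correlation with human ratings indicates good agreement.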

Here are some details of the six corpora compared in the main paper:

File Name	Dataset Name	Num. of Samples	Reference
`personam_annotation.json`	Persona(M)	60	Shikib/usr
`dailyh_annotation.json`	Daily(H)	150	li3cmz/GRADE
`convai2_annotation.json`	Convai2	150	li3cmz/GRADE
`empathetic_annotation.json`	Empathetic	150	li3cmz/GRADE
`dailyz_annotation.json`	Daily(Z)	100	ZHAOTING/dialog-processing
`personaz_annotation.json`	Persona(Z)	150	ZHAOTING/dialog-processing
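To run the same evaluation on every corpus in the table, the command from the usage section can be assembled in a loop. The sketch below only builds and prints the commands. It assumes that all six files sit under `./datasets/` like `convai2_annotation.json` and that the same flags apply to each of them; `build_command` is a hypothetical helper.

```python
import shlex

DATASET_FILES = [
    "personam_annotation.json",
    "dailyh_annotation.json",
    "convai2_annotation.json",
    "empathetic_annotation.json",
    "dailyz_annotation.json",
    "personaz_annotation.json",
]


def build_command(dataset_file, metric="fbd", sample_num=10,
                  model_type="roberta-base", batch_size=32):
    """Build the eval_metric.py argument list for one annotation file."""
    return [
        "python", "eval_metric.py",
        # Assumption: every annotation file is stored under ./datasets/.
        "--data_path", f"./datasets/{dataset_file}",
        "--metric", metric,
        "--sample_num", str(sample_num),
        "--model_type", model_type,
        "--batch_size", str(batch_size),
    ]


for name in DATASET_FILES:
    # Pass the list to subprocess.run(..., check=True) to execute it.
    print(shlex.join(build_command(name)))
```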

Citation

If you use this research/codebase/dataset, please cite our paper:

@article{xiang2021assessing,
  title={Assessing Dialogue Systems with Distribution Distances},
  author={Xiang, Jiannan and Liu, Yahui and Cai, Deng and Li, Huayang and Lian, Defu and Liu, Lemao},
  journal={arXiv preprint arXiv:2105.02573},
  year={2021}
}

Other related papers:

[1] FID, GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, NIPS 2017
[2] PRD, Assessing Generative Models via Precision and Recall, NIPS 2018
[3] BERTScore, BERTScore: Evaluating Text Generation with BERT, ICLR 2020
