Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Overview

Introduction

This is a PyTorch implementation of the following research papers:

The code is developed by Facebook AI Research.

The code trains neural networks to hold negotiations in natural language, and allows reinforcement learning self play and rollout-based planning.

Citation

If you want to use this code in your research, please cite:

@inproceedings{DBLP:conf/icml/YaratsL18,
  author    = {Denis Yarats and
               Mike Lewis},
  title     = {Hierarchical Text Generation and Planning for Strategic Dialogue},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning,
               {ICML} 2018, Stockholmsm{\"{a}}ssan, Stockholm, Sweden, July
               10-15, 2018},
  pages     = {5587--5595},
  year      = {2018},
  crossref  = {DBLP:conf/icml/2018},
  url       = {http://proceedings.mlr.press/v80/yarats18a.html},
  timestamp = {Fri, 13 Jul 2018 14:58:25 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/icml/YaratsL18},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Dataset

We release our dataset together with the code, you can find it under data/negotiate. This dataset consists of 5808 dialogues, based on 2236 unique scenarios. Take a look at §2.3 of the paper to learn about data collection.

Each dialogue is converted into two training examples in the dataset, showing the complete conversation from the perspective of each agent. The perspectives differ on their input goals, output choice, and in special tokens marking whether a statement was read or written. See §3.1 for the details on data representation.

# Perspective of Agent 1
<input> 1 4 4 1 1 2 </input>
<dialogue> THEM: i would like 4 hats and you can have the rest . <eos> YOU: deal <eos> THEM: <selection> </dialogue>
<output> item0=1 item1=0 item2=1 item0=0 item1=4 item2=0 </output> 
<partner_input> 1 0 4 2 1 2 </partner_input>

# Perspective of Agent 2
<input> 1 0 4 2 1 2 </input>
<dialogue> YOU: i would like 4 hats and you can have the rest . <eos> THEM: deal <eos> YOU: <selection> </dialogue>
<output> item0=0 item1=4 item2=0 item0=1 item1=0 item2=1 </output>
<partner_input> 1 4 4 1 1 2 </partner_input>

Setup

All code was developed with Python 3.0 on CentOS Linux 7, and tested on Ubuntu 16.04. In addition, we used PyTorch 1.0.0, CUDA 9.0, and Visdom 0.1.8.4.

We recommend to use Anaconda. In order to set up a working environment follow the steps below:

# Install anaconda
conda create -n py30 python=3 anaconda
# Activate environment
source activate py30
# Install PyTorch
conda install pytorch torchvision cuda90 -c pytorch
# Install Visdom if you want to use visualization
pip install visdom

Usage

Supervised Training

Action Classifier

We use an action classifier to compare performance of various models. The action classifier is described in section 3 of (2). It can be trained by running the following command:

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.1 \
--init_range 0.2 \
--lr 0.001 \
--max_epoch 7 \
--min_lr 1e-05 \
--model_type selection_model \
--momentum 0.1 \
--nembed_ctx 128 \
--nembed_word 128 \
--nhid_attn 128 \
--nhid_ctx 64 \
--nhid_lang 128 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--skip_values \
--sep_sel \
--model_file selection_model.th

Baseline RNN Model

This is the baseline RNN model that we describe in (1):

python train.py \
--cuda \
--bsz 16 \
--clip 0.5 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.1 \
--model_type rnn_model \
--init_range 0.2 \
--lr 0.001 \
--max_epoch 30 \
--min_lr 1e-07 \
--momentum 0.1 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_attn 64 \
--nhid_ctx 64 \
--nhid_lang 128 \
--nhid_sel 128 \
--sel_weight 0.6 \
--unk_threshold 20 \
--sep_sel \
--model_file rnn_model.th

Hierarchical Latent Model

In this section we provide guidelines on how to train the hierarchical latent model from (2). The final model requires two sub-models: the clustering model, which learns compact representations over intents; and the language model, which translates intent representations into language. Please read sections 5 and 6 of (2) for more details.

Clustering Model

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.2 \
--init_range 0.3 \
--lr 0.001 \
--max_epoch 15 \
--min_lr 1e-05 \
--model_type latent_clustering_model \
--momentum 0.1 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_ctx 64 \
--nhid_lang 256 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--num_clusters 50 \
--sep_sel \
--skip_values \
--nhid_cluster 256 \
--selection_model_file selection_model.th \
--model_file clustering_model.th

Language Model

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.1 \
--init_range 0.2 \
--lr 0.001 \
--max_epoch 15 \
--min_lr 1e-05 \
--model_type latent_clustering_language_model \
--momentum 0.1 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_ctx 64 \
--nhid_lang 256 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--num_clusters 50 \
--sep_sel \
--nhid_cluster 256 \
--skip_values \
--selection_model_file selection_model.th \
--cluster_model_file clustering_model.th \
--model_file clustering_language_model.th

Full Model

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.2 \
--init_range 0.3 \
--lr 0.001 \
--max_epoch 10 \
--min_lr 1e-05 \
--model_type latent_clustering_prediction_model \
--momentum 0.2 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_ctx 64 \
--nhid_lang 256 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--num_clusters 50 \
--sep_sel \
--selection_model_file selection_model.th \
--lang_model_file clustering_language_model.th \
--model_file full_model.th

Selfplay

If you want to have two pretrained models to negotiate against each another, use selfplay.py. For example, lets have two rnn models to play against each other:

python selfplay.py \
--cuda \
--alice_model_file rnn_model.th \
--bob_model_file rnn_model.th \
--context_file data/negotiate/selfplay.txt  \
--temperature 0.5 \
--selection_model_file selection_model.th

The script will output generated dialogues, as well as some statistics. For example:

================================================================================
Alice : book=(count:3 value:1) hat=(count:1 value:5) ball=(count:1 value:2)
Bob   : book=(count:3 value:1) hat=(count:1 value:1) ball=(count:1 value:6)
--------------------------------------------------------------------------------
Alice : i would like the hat and the ball . <eos>
Bob   : i need the ball and the hat <eos>
Alice : i can give you the ball and one book . <eos>
Bob   : i can't make a deal without the ball <eos>
Alice : okay then i will take the hat and the ball <eos>
Bob   : okay , that's fine . <eos>
Alice : <selection>
Alice : book=0 hat=1 ball=1 book=3 hat=0 ball=0
Bob   : book=3 hat=0 ball=0 book=0 hat=1 ball=1
--------------------------------------------------------------------------------
Agreement!
Alice : 7 points
Bob   : 3 points
--------------------------------------------------------------------------------
dialog_len=4.47 sent_len=6.93 agree=86.67% advantage=3.14 time=2.069s comb_rew=10.93 alice_rew=6.93 alice_sel=60.00% alice_unique=26 bob_rew=4.00 bob_sel=40.00% bob_unique=25 full_match=0.78 
--------------------------------------------------------------------------------
debug: 3 1 1 5 1 2 item0=0 item1=1 item2=1
debug: 3 1 1 1 1 6 item0=3 item1=0 item2=0
================================================================================

Reinforcement Learning

To fine-tune a pretrained model with RL use the reinforce.py script:

python reinforce.py \
--cuda \
--alice_model_file rnn_model.th \
--bob_model_file rnn_model.th \
--output_model_file rnn_rl_model.th \
--context_file data/negotiate/selfplay.txt  \
--temperature 0.5 \
--verbose \
--log_file rnn_rl.log \
--sv_train_freq 4 \
--nepoch 4 \
--selection_model_file selection_model.th  \
--rl_lr 0.00001 \
--rl_clip 0.0001 \
--sep_sel

License

This project is licenced under CC-by-NC, see the LICENSE file for details.

Owner
Facebook Research
Facebook Research
Tokenizer - Module python d'analyse syntaxique et de grammaire, tokenization

Tokenizer Le Tokenizer est un analyseur lexicale, il permet, comme Flex and Yacc par exemple, de tokenizer du code, c'est à dire transformer du code e

Manolo 1 Aug 15, 2022
GSoC'2021 | TensorFlow implementation of Wav2Vec2

GSoC'2021 | TensorFlow implementation of Wav2Vec2

Vasudev Gupta 73 Nov 28, 2022
Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

Realistic Few-Shot Relation Extraction This repository contains code to reproduce the results in the paper "Towards Realistic Few-Shot Relation Extrac

Bloomberg 8 Nov 09, 2022
Behavioral Testing of Clinical NLP Models

Behavioral Testing of Clinical NLP Models This repository contains code for testing the behavior of clinical prediction models based on patient letter

Betty van Aken 2 Sep 20, 2022
vits chinese, tts chinese, tts mandarin

vits chinese, tts chinese, tts mandarin 史上训练最简单,音质最好的语音合成系统

AmorTX 12 Dec 14, 2022
Arabic speech recognition, classification and text-to-speech.

klaam Arabic speech recognition, classification and text-to-speech using many advanced models like wave2vec and fastspeech2. This repository allows tr

ARBML 177 Dec 27, 2022
Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

1.1k Dec 27, 2022
An official repository for tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a University of Edinburgh master's course.

PMR computer tutorials on HMMs (2021-2022) This is a repository for computer tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a Univer

Vaidotas Šimkus 10 Dec 06, 2022
Pipeline for fast building text classification TF-IDF + LogReg baselines.

Text Classification Baseline Pipeline for fast building text classification TF-IDF + LogReg baselines. Usage Instead of writing custom code for specif

Dani El-Ayyass 57 Dec 07, 2022
This library is testing the ethics of language models by using natural adversarial texts.

prompt2slip This library is testing the ethics of language models by using natural adversarial texts. This tool allows for short and simple code and v

9 Dec 28, 2021
Code for paper: An Effective, Robust and Fairness-awareHate Speech Detection Framework

BiQQLSTM_HS Code and data for paper: Title: An Effective, Robust and Fairness-awareHate Speech Detection Framework. Authors: Guanyi Mou and Kyumin Lee

Guanyi Mou 2 Dec 27, 2022
Machine learning models from Singapore's NLP research community

SG-NLP Machine learning models from Singapore's natural language processing (NLP) research community. sgnlp is a Python package that allows you to eas

AI Singapore | AI Makerspace 21 Dec 17, 2022
Python utility library for compositing PDF documents with reportlab.

pdfdoc-py Python utility library for compositing PDF documents with reportlab. Installation The pdfdoc-py package can be installed directly from the s

Michael Gale 1 Jan 06, 2022
New Modeling The Background CodeBase

Modeling the Background for Incremental Learning in Semantic Segmentation This is the updated official PyTorch implementation of our work: "Modeling t

Fabio Cermelli 9 Dec 28, 2022
Flexible interface for high-performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra.

Flexible interface for high performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra. What is Lightning Tran

Pytorch Lightning 581 Dec 21, 2022
用Resnet101+GPT搭建一个玩王者荣耀的AI

基于pytorch框架用resnet101加GPT搭建AI玩王者荣耀 本源码模型主要用了SamLynnEvans Transformer 的源码的解码部分。以及pytorch自带的预训练模型"resnet101-5d3b4d8f.pth"

冯泉荔 2.2k Jan 03, 2023
CLIPfa: Connecting Farsi Text and Images

CLIPfa: Connecting Farsi Text and Images OpenAI released the paper Learning Transferable Visual Models From Natural Language Supervision in which they

Sajjad Ayoubi 66 Dec 14, 2022
A python script that will use hydra to get user and password to login to ssh, ftp, and telnet

Hydra-Auto-Hack A python script that will use hydra to get user and password to login to ssh, ftp, and telnet Project Description This python script w

2 Jan 16, 2022
Ceaser-Cipher - The Caesar Cipher technique is one of the earliest and simplest method of encryption technique

Ceaser-Cipher The Caesar Cipher technique is one of the earliest and simplest me

Lateefah Ajadi 2 May 12, 2022
This repository describes our reproducible framework for assessing self-supervised representation learning from speech

LeBenchmark: a reproducible framework for assessing SSL from speech Self-Supervised Learning (SSL) using huge unlabeled data has been successfully exp

49 Aug 24, 2022