PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

Overview

UMS for Multi-turn Response Selection

PWC

Implements the model described in the following paper Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection.

@inproceedings{whang2021ums,
  title={Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection},
  author={Whang, Taesun and Lee, Dongyub and Oh, Dongsuk and Lee, Chanhee and Han, Kijong and Lee, Dong-hun and Lee, Saebyeok},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2021}
}

This code is reimplemented as a fork of huggingface/transformers and taesunwhang/BERT-ResSel.

alt text

Setup and Dependencies

This code is implemented using PyTorch v1.6.0, and provides out of the box support with CUDA 10.1 and CuDNN 7.6.5.

Anaconda / Miniconda is the recommended to set up this codebase.

Anaconda or Miniconda

Clone this repository and create an environment:

git clone https://www.github.com/taesunwhang/UMS-ResSel
conda create -n ums_ressel python=3.7

# activate the environment and install all dependencies
conda activate ums_ressel
cd UMS-ResSel

# https://pytorch.org
pip install torch==1.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

Preparing Data and Checkpoints

Pre- and Post-trained Checkpoints

We provide following pre- and post-trained checkpoints.

sh scripts/download_pretrained_checkpoints.sh

Data pkls for Fine-tuning (Response Selection)

Original version for each dataset is availble in Ubuntu Corpus V1, Douban Corpus, and E-Commerce Corpus, respectively.

sh scripts/download_datasets.sh

Domain-specific Post-Training

Post-training Creation

Data for post-training BERT
#Ubuntu Corpus V1
sh scripts/create_bert_post_data_creation_ubuntu.sh
#Douban Corpus
sh scripts/create_bert_post_data_creation_douban.sh
#E-commerce Corpus
sh scripts/create_bert_post_data_creation_e-commerce.sh
Data for post-training ELECTRA
sh scripts/download_electra_post_training_pkl.sh

Post-training Examples

BERT+ (e.g., Ubuntu Corpus V1)
python3 main.py --model bert_post_training --task_name ubuntu --data_dir data/ubuntu_corpus_v1 --bert_pretrained bert-base-uncased --bert_checkpoint_path bert-base-uncased-pytorch_model.bin --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --training_type post_training
ELECTRA+ (e.g., Douban Corpus)
python3 main.py --model electra_post_training --task_name douban --data_dir data/electra_post_training --bert_pretrained electra-base-chinese --bert_checkpoint_path electra-base-chinese-pytorch_model.bin --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --training_type post_training

Training Response Selection Models

Model Arguments

BERT-Base
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 bert-base-uncased bert-base-uncased-pytorch_model.bin
douban
e-commerce
data/douban
data/e-commerce
bert-base-wwm-chinese bert-base-wwm-chinese_model.bin
BERT-Post
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 bert-post-uncased bert-post-uncased-pytorch_model.pth
douban data/douban bert-post-douban bert-post-douban-pytorch_model.pth
e-commerce data/e-commerce bert-post-ecommerce bert-post-ecommerce-pytorch_model.pth
ELECTRA-Base
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 electra-base electra-base-pytorch_model.bin
douban
e-commerce
data/douban
data/e-commerce
electra-base-chinese electra-base-chinese-pytorch_model.bin
ELECTRA-Post
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 electra-post electra-post-pytorch_model.pth
douban data/douban electra-post-douban electra-post-douban-pytorch_model.pth
e-commerce data/e-commerce electra-post-ecommerce electra-post-ecommerce-pytorch_model.pth

Fine-tuning Examples

BERT+ (e.g., Ubuntu Corpus V1)
python3 main.py --model bert_post --task_name ubuntu --data_dir data/ubuntu_corpus_v1 --bert_pretrained bert-post-uncased --bert_checkpoint_path bert-post-uncased-pytorch_model.pth --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir
UMS BERT+ (e.g., Douban Corpus)
python3 main.py --model bert_post --task_name douban --data_dir data/douban --bert_pretrained bert-post-douban --bert_checkpoint_path bert-post-douban-pytorch_model.pth --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --multi_task_type "ins,del,srch"
UMS ELECTRA (e.g., E-Commerce)
python3 main.py --model electra_base --task_name e-commerce --data_dir data/e-commerce --bert_pretrained electra-base-chinese --bert_checkpoint_path electra-base-chinese-pytorch_model.bin --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --multi_task_type "ins,del,srch"

Evaluation

To evaluate the model, set --evaluate to /path/to/checkpoints

UMS BERT+ (e.g., Ubuntu Corpus V1)
python3 main.py --model bert_post --task_name ubuntu --data_dir data/ubuntu_corpus_v1 --bert_pretrained bert-post-uncased --bert_checkpoint_path bert-post-uncased-pytorch_model.pth --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --evaluate /path/to/checkpoints --multi_task_type "ins,del,srch"

Performance

We provide model checkpoints of UMS-BERT+, which obtained new state-of-the-art, for each dataset.

Ubuntu [email protected] [email protected] [email protected]
UMS-BERT+ 0.875 0.942 0.988
Douban MAP MRR [email protected] [email protected] [email protected] [email protected]
UMS-BERT+ 0.625 0.664 0.499 0.318 0.482 0.858
E-Commerce [email protected] [email protected] [email protected]
UMS-BERT+ 0.762 0.905 0.986
Owner
Taesun Whang
Interested in NLP, Dialogue System, Multimodal Learning. Currently attending Master's course in Dept. of Computer Science and Engineering, Korea University.
Taesun Whang
Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

68 Dec 14, 2022
Implementation of the paper Scalable Intervention Target Estimation in Linear Models (NeurIPS 2021), and the code to generate simulation results.

Scalable Intervention Target Estimation in Linear Models Implementation of the paper Scalable Intervention Target Estimation in Linear Models (NeurIPS

0 Oct 25, 2021
Detectron2-FC a fast construction platform of neural network algorithm based on detectron2

What is Detectron2-FC Detectron2-FC a fast construction platform of neural network algorithm based on detectron2. We have been working hard in two dir

董晋宗 9 Jun 06, 2022
Code for "Unsupervised State Representation Learning in Atari"

Unsupervised State Representation Learning in Atari Ankesh Anand*, Evan Racah*, Sherjil Ozair*, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm This

Mila 217 Jan 03, 2023
Official Repository for Machine Learning class - Physics Without Frontiers 2021

PWF 2021 Física Sin Fronteras es un proyecto del Centro Internacional de Física Teórica (ICTP) en Trieste Italia. El ICTP es un centro dedicado a fome

36 Aug 06, 2022
A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku.

Automatic_Background_Remover A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku. 👉 https:

Gaurav 16 Oct 29, 2022
Repository for GNSS-based position estimation using a Deep Neural Network

Code repository accompanying our work on 'Improving GNSS Positioning using Neural Network-based Corrections'. In this paper, we present a Deep Neural

32 Dec 13, 2022
An implementation of the 1. Parallel, 2. Streaming, 3. Randomized SVD using MPI4Py

PYPARSVD This implementation allows for a singular value decomposition which is: Distributed using MPI4Py Streaming - data can be shown in batches to

Romit Maulik 44 Dec 31, 2022
Pytoydl: A toy deep learning framework built upon numpy.

Documents: https://pytoydl.readthedocs.io/zh/latest/ Pytoydl A toy deep learning framework built upon numpy. You can star this repository to keep trac

28 Dec 10, 2022
Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch

Bootstrap Your Own Latent (BYOL), in Pytorch Practical implementation of an astoundingly simple method for self-supervised learning that achieves a ne

Phil Wang 1.4k Dec 29, 2022
Graph Convolutional Networks for Temporal Action Localization (ICCV2019)

Graph Convolutional Networks for Temporal Action Localization This repo holds the codes and models for the PGCN framework presented on ICCV 2019 Graph

Runhao Zeng 318 Dec 06, 2022
HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events globally on daily to subseasonal timescales.

HeatNet HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events glob

Google Research 6 Jul 07, 2022
Collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

Jun Chen 139 Dec 21, 2022
Medical image analysis framework merging ANTsPy and deep learning

ANTsPyNet A collection of deep learning architectures and applications ported to the python language and tools for basic medical image processing. Bas

Advanced Normalization Tools Ecosystem 118 Dec 24, 2022
Experiments with Fourier layers on simulation data.

Factorized Fourier Neural Operators This repository contains the code to reproduce the results in our NeurIPS 2021 ML4PS workshop paper, Factorized Fo

Alasdair Tran 57 Dec 25, 2022
A simple python program that can be used to implement user authentication tokens into your program...

token-generator A simple python module that can be used by developers to implement user authentication tokens into your program... code examples creat

octo 6 Apr 18, 2022
Improving Contrastive Learning by Visualizing Feature Transformation, ICCV 2021 Oral

Improving Contrastive Learning by Visualizing Feature Transformation This project hosts the codes, models and visualization tools for the paper: Impro

Bingchen Zhao 83 Dec 15, 2022
[ICML 2021] "Graph Contrastive Learning Automated" by Yuning You, Tianlong Chen, Yang Shen, Zhangyang Wang

Graph Contrastive Learning Automated PyTorch implementation for Graph Contrastive Learning Automated [talk] [poster] [appendix] Yuning You, Tianlong C

Shen Lab at Texas A&M University 80 Nov 23, 2022
Nb workflows - A workflow platform which allows you to run parameterized notebooks programmatically

NB Workflows Description If SQL is a lingua franca for querying data, Jupyter sh

Xavier Petit 6 Aug 18, 2022
The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".

BMC The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing". BibTex entry available here. B

Orange 383 Dec 16, 2022