Official repository for "PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation"

Overview

pair-emnlp2020

Official repository for the paper:

Xinyu Hua and Lu Wang: PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation

If you find our work useful, please cite:

@inproceedings{hua-wang-2020-pair,
    title = "PAIR: Planning and Iterative Refinement in Pre-trained Transformersfor Long Text Generation",
    author = "Hua, Xinyu  and
      Wang, Lu",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
}

Requirements

  • Python 3.7
  • PyTorch 1.4.0
  • PyTorchLightning 0.9.0
  • transformers 3.3.0
  • numpy
  • tqdm
  • pycorenlp (for preprocessing nytimes data)
  • nltk (for preprocessing nytimes data)

Data

We release the data sets in the following link(1.2G uncompressed) Please download and uncompress the file, and put under ./data directory. For opinion and news domains, the The New York Times Annotated Corpus is licensed by LDC. We therefore only provide the ids for train/dev/test. Please follow the instructions to generate the dataset.

Text Planning

To train a BERT planner:

cd planning
python train.py \
    --data-path=../data/ \
    --domain=[arggen,opinion,news] \
    --exp-name=demo \
    --save-interval=1 \ # how frequent to save checkpoints 
    --max-epoch=30 \
    --lr=5e-4 \
    --warmup-updates=5000 \
    --train-set=train \
    --valid-set=dev \
    --tensorboard-logdir=tboard/ \
    --predict-keyphrase-offset \
    --max-samples=32 \ # max number of samples per batch
    [--quiet] \ # whether to print intermediate information

The checkpoints will be dumped to checkpoints/planning/[domain]/[exp-name]. Tensorboard will be available under planning/tboard/.

To run inference using a trained model, with greedy decoding:

cd planning
python decode.py \
    --data-path=../data/ \
    --domain=arggen \
    --test-set=test \
    --max-samples=32 \
    --predict-keyphrase-offset \
    --exp-name=demo \
    [--quiet]

The results will be saved to planning/output/.

Iterative Refinement

We provide implementations for four different setups:

  • Seq2seq: prompt -> tgt
  • KPSeq2seq: prompt + kp-set -> tgt
  • PAIR-light: prompt + kp-plan + masks -> tgt
  • PAIR-full: prompt + kp-plan + template -> tgt

To train a model:

cd refinement
python train.py \
    --domain=[arggen,opinion,news] \
    --setup=[seq2seq,kpseq2seq,pair-light,pair-full] \
    --train-set=train \
    --valid-set=dev \
    --train-batch-size=10 \
    --valid-batch-size=5 \
    --num-train-epochs=20 \
    --ckpt-dir=../checkpoints/[domain]/[setup]/demo \
    --tensorboard-dir=demo \
    [--quiet]

To run iterative refinement:

cd refinement
python generate.py \
    --domain=[arggen,opinion,news] \
    --setup=[seq2seq,kpseq2seq,pair-light,pair-full] \
    --test-set=test \
    --output-name=test_demo \
    --enforce-template-strategy=flexible \
    --do-sampling \
    --sampling-topk=100 \
    --sampling-topp=0.9 \
    --sample-times=3 \
    --ckpt-dir=../checkpoints/[domain]/[setup]/demo

Contact

Xinyu Hua (hua.x [at] northeastern.edu)

License

See the LICENSE file for details.

Owner
Xinyu Hua
PhD student at Northeastern University
Xinyu Hua
Not Suitable for Work (NSFW) classification using deep neural network Caffe models.

Open nsfw model This repo contains code for running Not Suitable for Work (NSFW) classification deep neural network Caffe models. Please refer our blo

Yahoo 5.6k Jan 05, 2023
Instance-Dependent Partial Label Learning

Instance-Dependent Partial Label Learning Installation pip install -r requirements.txt Run the Demo benchmark-random mnist python -u main.py --gpu 0 -

17 Dec 29, 2022
Official code for the ICLR 2021 paper Neural ODE Processes

Neural ODE Processes Official code for the paper Neural ODE Processes (ICLR 2021). Abstract Neural Ordinary Differential Equations (NODEs) use a neura

Cristian Bodnar 50 Oct 28, 2022
Display, filter and search log messages in your terminal

Textualog Display, filter and search logging messages in the terminal. This project is powered by rich and textual. Some of the ideas and code in this

Rik Huygen 24 Dec 10, 2022
《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

42 Nov 24, 2022
这是一个利用facenet和retinaface实现人脸识别的库,可以进行在线的人脸识别。

Facenet+Retinaface:人脸识别模型在Keras当中的实现 目录 注意事项 Attention 所需环境 Environment 文件下载 Download 预测步骤 How2predict 参考资料 Reference 注意事项 该库中包含了两个网络,分别是retinaface和fa

Bubbliiiing 31 Nov 15, 2022
This is the repository of shape matching algorithm Iterative Rotations and Assignments (IRA)

Description This is the repository of shape matching algorithm Iterative Rotations and Assignments (IRA), described in the publication [1]. Directory

MAMMASMIAS Consortium 6 Nov 14, 2022
Code base for reproducing results of I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

Learning to Execute (L2E) Official code base for completely reproducing all results reported in I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learnin

3 May 18, 2022
Deep-learning-roadmap - All You Need to Know About Deep Learning - A kick-starter

Deep Learning - All You Need to Know Sponsorship To support maintaining and upgrading this project, please kindly consider Sponsoring the project deve

Instill AI 4.4k Dec 26, 2022
This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning 🎆 🎆 🎆 Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab 398 Dec 30, 2022
PuppetGAN - Cross-Domain Feature Disentanglement and Manipulation just got way better! 🚀

Better Cross-Domain Feature Disentanglement and Manipulation with Improved PuppetGAN Quite cool... Right? Introduction This repo contains a TensorFlow

Giorgos Karantonis 5 Aug 25, 2022
Assginment for UofT CSC420: Intro to Image Understanding

Run the code Open edge_detection.ipynb in google colab. Upload image1.jpg,image2.jpg and my_image.jpg to '/content/drive/My Drive'. chooose 'Run all'

Ziyi-Zhou 1 Feb 24, 2022
A PyTorch implementation of "DGC-Net: Dense Geometric Correspondence Network"

DGC-Net: Dense Geometric Correspondence Network This is a PyTorch implementation of our work "DGC-Net: Dense Geometric Correspondence Network" TL;DR A

191 Dec 16, 2022
Distributionally robust neural networks for group shifts

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization This code implements the g

151 Dec 25, 2022
Awesome Human Pose Estimation

Human Pose Estimation Related Publication

Zhe Wang 1.2k Dec 26, 2022
GARCH and Multivariate LSTM forecasting models for Bitcoin realized volatility with potential applications in crypto options trading, hedging, portfolio management, and risk management

Bitcoin Realized Volatility Forecasting with GARCH and Multivariate LSTM Author: Chi Bui This Repository Repository Directory ├── README.md

Chi Bui 113 Dec 29, 2022
Bib-parser - Convenient script to parse .bib files with the ACM Digital Library like metadata

Bib Parser Convenient script to parse .bib files with the ACM Digital Library li

Mehtab Iqbal (Shahan) 1 Jan 26, 2022
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-quality pre-trained models from torchvision, MMLabs, and soon Pytorch Image Models. It or

airctic 789 Dec 29, 2022
Official PyTorch implementation of "Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning" (ICCV2021 Oral)

MeTAL - Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning (ICCV2021 Oral) Sungyong Baik, Janghoon Choi, Heewon Kim, Dohee Cho, Jaes

Sungyong Baik 44 Dec 29, 2022
Neural network-based build time estimation for additive manufacturing

Neural network-based build time estimation for additive manufacturing Oh, Y., Sharp, M., Sprock, T., & Kwon, S. (2021). Neural network-based build tim

Yosep 1 Nov 15, 2021