"Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021

Overview

undirected-generation-dev

This repo contains the source code of the models described in the following paper

  • "Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021. (paper).

The basic code structure was adapted from the NYU dl4mt-seqgen. We also use the pybleu from fairseq to calculate BLEU scores during the reinforcement learning.

0. Preparation

0.1 Dependencies

  • PyTorch 1.4.0/1.6.0/1.8.0

0.2 Data

The WMT'14 De-En data and the pretrained De-En MLM model are provided in the dl4mt-seqgen.

  • Download WMT'14 De-En valid/test data.
  • Then organize the data in data/ and make sure it follows such a structure:
------ data
--------- de-en
------------ train.de-en.de.pth
------------ train.de-en.en.pth
------------ valid.de-en.de.pth
------------ valid.de-en.en.pth
------------ test.de-en.de.pth
------------ test.de-en.en.pth
  • Download pretrained models.
  • Then organize the pretrained masked language models in models/ make sure it follows such a structure:
------ models
--------- best-valid_en-de_mt_bleu.pth
--------- best-valid_de-en_mt_bleu.pth

2. Training the order policy network with reinforcement learning

Train a policy network to predict the generation order for a pretrained De-En masked language model:

./train_scripts/train_order_rl_deen.sh
  • By defaults, the model checkpoints will be saved in models/learned_order_deen_uniform_4gpu/00_maxlen30_minlen5_bsz32.
  • By using this script, we are only training the model on De-En sentence pairs where both the German and English sentences with a maximum length of 30 and a minimum length of 5. You can change the training parameters max_len and min_len to change the length limits.

3. Decode the undirected generation model with learned orders

  • Set the MODEL_CKPT parameter to the corresponding path found under models/00_maxlen30_minlen5_bsz32. For example:
export MODEL_CKPT=wj8oc8kab4/checkpoint_epoch30+iter96875.pth
  • Evaluate the model on the SCAN MCD1 splits by running:
export MODEL_CKPT=...
./eval_scripts/generate-order-deen.sh $MODEL_CKPT

4. Decode the undirected generation model with heuristic orders

  • Left2Right
./eval_scripts/generate-deen.sh left_right_greedy_1iter
  • Least2Most
./eval_scripts/generate-deen.sh least_most_greedy_1iter
  • EasyFirst
./eval_scripts/generate-deen.sh easy_first_greedy_1iter
  • Uniform
./eval_scripts/generate-deen.sh uniform_greedy_1iter

Citation

@inproceedings{jiang-bansal-2021-learning-analyzing,
    title = "Learning and Analyzing Generation Order for Undirected Sequence Models",
    author = "Jiang, Yichen  and
      Bansal, Mohit",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.298",
    pages = "3513--3523",
}
Owner
Yichen Jiang
Yichen Jiang
Ensembling Off-the-shelf Models for GAN Training

Vision-aided GAN video (3m) | website | paper Can the collective knowledge from a large bank of pretrained vision models be leveraged to improve GAN t

345 Dec 28, 2022
Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Legged Robots that Keep on Learning Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, whic

Laura Smith 70 Dec 07, 2022
curl-impersonate: A special compilation of curl that makes it impersonate Chrome & Firefox

curl-impersonate A special compilation of curl that makes it impersonate real browsers. It can impersonate the four major browsers: Chrome, Edge, Safa

lwthiker 1.9k Jan 03, 2023
Shuffle Attention for MobileNetV3

SA-MobileNetV3 Shuffle Attention for MobileNetV3 Train Run the following command for train model on your own dataset: python train.py --dataset mnist

Sajjad Aemmi 36 Dec 28, 2022
Static Features Classifier - A static features classifier for Point-Could clusters using an Attention-RNN model

Static Features Classifier This is a static features classifier for Point-Could

ABDALKARIM MOHTASIB 1 Jan 25, 2022
Official implementation of NPMs: Neural Parametric Models for 3D Deformable Shapes - ICCV 2021

NPMs: Neural Parametric Models Project Page | Paper | ArXiv | Video NPMs: Neural Parametric Models for 3D Deformable Shapes Pablo Palafox, Aljaz Bozic

PabloPalafox 109 Nov 22, 2022
📚 A collection of all the Deep Learning Metrics that I came across which are not accuracy/loss.

📚 A collection of all the Deep Learning Metrics that I came across which are not accuracy/loss.

Rahul Vigneswaran 1 Jan 17, 2022
Pointer-generator - Code for the ACL 2017 paper Get To The Point: Summarization with Pointer-Generator Networks

Note: this code is no longer actively maintained. However, feel free to use the Issues section to discuss the code with other users. Some users have u

Abi See 2.1k Jan 04, 2023
Smart edu-autobooking - Johnson @ DMI-UNICT study room self-booking system

smart_edu-autobooking Sistema di autoprenotazione per l'aula studio [email protected]

Davide Carnemolla 17 Jun 20, 2022
Turning SymPy expressions into PyTorch modules.

sympytorch A micro-library as a convenience for turning SymPy expressions into PyTorch Modules. All SymPy floats become trainable parameters. All SymP

Patrick Kidger 89 Dec 13, 2022
Totally Versatile Miscellanea for Pytorch

Totally Versatile Miscellania for PyTorch Thomas Viehmann [email protected] Thi

Thomas Viehmann 428 Dec 28, 2022
Continual reinforcement learning baselines: experiment specifications, implementation of existing methods, and common metrics. Easily extensible to new methods.

Continual Reinforcement Learning This repository provides a simple way to run continual reinforcement learning experiments in PyTorch, including evalu

55 Dec 24, 2022
Semantic Segmentation in Pytorch

PyTorch Semantic Segmentation Introduction This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to

Hengshuang Zhao 1.2k Jan 01, 2023
Code for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in Deep Latent Space"

SRHEN This is a better and simpler implementation for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in

1 Oct 28, 2022
Easily Process a Batch of Cox Models

ezcox: Easily Process a Batch of Cox Models The goal of ezcox is to operate a batch of univariate or multivariate Cox models and return tidy result. ⏬

Shixiang Wang 15 May 23, 2022
Pytorch-3dunet - 3D U-Net model for volumetric semantic segmentation written in pytorch

pytorch-3dunet PyTorch implementation 3D U-Net and its variants: Standard 3D U-Net based on 3D U-Net: Learning Dense Volumetric Segmentation from Spar

Adrian Wolny 1.3k Dec 28, 2022
Implement face detection, and age and gender classification, and emotion classification.

YOLO Keras Face Detection Implement Face detection, and Age and Gender Classification, and Emotion Classification. (image from wider face dataset) Ove

Chloe 10 Nov 14, 2022
Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration

Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration Project Page | Paper Yifan Peng*, Suyeon Choi*, Jongh

Stanford Computational Imaging Lab 19 Dec 11, 2022
One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking This is an official implementation for NEAS presented in CVPR

Multimedia Research 19 Sep 08, 2022
(ICCV 2021) ProHMR - Probabilistic Modeling for Human Mesh Recovery

ProHMR - Probabilistic Modeling for Human Mesh Recovery Code repository for the paper: Probabilistic Modeling for Human Mesh Recovery Nikos Kolotouros

Nikos Kolotouros 209 Dec 13, 2022