On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021)

Last update: Nov 25, 2022

Overview

PTvsBT

Citation

Please cite as:

@inproceedings{liu-etal-2021-complementarity-pre,
    title = "On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation",
    author = "Liu, Xuebo  and
      Wang, Longyue  and
      Wong, Derek F.  and
      Ding, Liang  and
      Chao, Lidia S.  and
      Shi, Shuming  and
      Tu, Zhaopeng",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.247",
    pages = "2900--2907",
    abstract = "Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT). This paper takes the first step to investigate the complementarity between PT and BT. We introduce two probing tasks for PT and BT respectively and find that PT mainly contributes to the encoder module while BT brings more benefits to the decoder. Experimental results show that PT and BT are nicely complementary to each other, establishing state-of-the-art performances on the WMT16 English-Romanian and English-Russian benchmarks. Through extensive analyses on sentence originality and word frequency, we also demonstrate that combining Tagged BT with PT is more helpful to their complementarity, leading to better translation quality. Source code is freely available at https://github.com/SunbowLiu/PTvsBT.",
}

Requirements and Installation

This implementation is based on fairseq (v0.10.2).

PyTorch version >= 1.5.0
Python version >= 3.6

git clone https://github.com/SunbowLiu/PTvsBT
cd PTvsBT
git -C scripts clone https://github.com/moses-smt/mosesdecoder --depth 1
git -C scripts clone https://github.com/rsennrich/wmt16-scripts.git
git clone --branch v0.10.2 https://github.com/pytorch/fairseq.git
cd fairseq
pip install --editable .
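
After installation, it can help to confirm that the environment meets the minimum versions listed above. The following is a minimal sketch, not part of the repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the comparison only looks at the leading numeric part of each version string.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.10.0+cu113' into (1, 10, 0).

    Only the leading dotted-number part is used; build suffixes are ignored.
    """
    numeric = re.match(r"\d+(\.\d+)*", str(version))
    if numeric is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in numeric.group(0).split("."))


def meets_minimum(installed, required):
    """Return True if `installed` is at least `required`."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '3.6' and '3.6.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Compare the running interpreter and PyTorch with the README minimums."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python >= 3.6": meets_minimum(python_version, "3.6")}
    try:
        import torch  # imported lazily so the check also runs without PyTorch
        report["torch >= 1.5.0"] = meets_minimum(torch.__version__, "1.5.0")
    except ImportError:
        report["torch >= 1.5.0"] = False
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(f"{requirement}: {'OK' if ok else 'NOT satisfied'}")
```

Any line reported as not satisfied points to a package that should be upgraded before running the scripts below.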

Prepare pre-trained mBART and WMT16 Ro-En data from scratch with `prepare.sh`

sh prepare.sh

Train and test the model with `run.sh`

sh run.sh

We used 4 A100 GPUs (40 GB each). The effective batch size per update step is 32k tokens, i.e., max-tokens * update-freq * num-of-gpus = 32k.
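
If fewer GPUs are available, the same effective batch size can be kept by raising fairseq's `--update-freq`. The sketch below works through that arithmetic. It rests on assumptions: "32k" is read as 32 * 1024 tokens, `--max-tokens` is taken to be 4096 purely for illustration (the value actually set in `run.sh` is not stated here), and the function name `required_update_freq` is hypothetical.

```python
def required_update_freq(target_tokens, max_tokens, num_gpus):
    """Solve max_tokens * update_freq * num_gpus = target_tokens for update_freq."""
    tokens_per_pass = max_tokens * num_gpus
    if target_tokens % tokens_per_pass != 0:
        raise ValueError(
            "target_tokens is not a multiple of max_tokens * num_gpus; "
            "pick a different max_tokens"
        )
    return target_tokens // tokens_per_pass


# Assumption: "32k" means 32 * 1024 tokens per update step.
TARGET_TOKENS = 32 * 1024
# Assumption: illustrative per-GPU token budget, not taken from run.sh.
MAX_TOKENS = 4096

for num_gpus in (4, 2, 1):
    freq = required_update_freq(TARGET_TOKENS, MAX_TOKENS, num_gpus)
    print(f"{num_gpus} GPU(s): --max-tokens {MAX_TOKENS} --update-freq {freq}")
```

Under these assumptions, halving the number of GPUs doubles the required `--update-freq`; training takes longer per update, but each update sees the same number of tokens.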

Final Result

The model is expected to reach a BLEU score of about 41.6.

Owner

Sunbow Liu
