The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Last update: Dec 14, 2022

Graformer

Graformer (named BridgeTransformer in the code) is a sequence-to-sequence model, mainly for neural machine translation. It improves multilingual translation by grafting together pre-trained language models: a pre-trained masked language model (BERT) as the encoder and a pre-trained language model (GPT) as the decoder. The code is based on Fairseq.

Examples

Start from run/run.sh, which needs some minor modification. Each of the following scripts covers one step:

Pre-train the BERT (multilingual masked language model):
    run_arnold_multilingual_masked_lm_6e6d.sh

Pre-train the GPT (multilingual language model):
    run_arnold_multilingual_lm_6e6d.sh

Train a Graformer:
    run_arnold_multilingual_graft_transformer_12e12d_ted.sh

Run inference with a Graformer:
    run_arnold_multilingual_graft_inference_ted.sh
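
The four steps above can be chained with a small driver. The following is a minimal sketch, not part of the repository: it assumes the scripts sit in the run/ directory next to run.sh, that they can be invoked with bash without arguments, and that they should run in the order listed (pre-training first, then Graformer training, then inference). The stage names and the helper functions build_commands and run_pipeline are hypothetical. By default it only prints the commands, since the scripts need local modification before they will run.

```python
from pathlib import Path
import subprocess

# Assumed location of the scripts: the README only names run/run.sh explicitly.
RUN_DIR = Path("run")

# Script names are copied from the list above; the stage labels are
# hypothetical and used only for logging.
STAGES = [
    ("pretrain_bert", "run_arnold_multilingual_masked_lm_6e6d.sh"),
    ("pretrain_gpt", "run_arnold_multilingual_lm_6e6d.sh"),
    ("train_graformer", "run_arnold_multilingual_graft_transformer_12e12d_ted.sh"),
    ("inference", "run_arnold_multilingual_graft_inference_ted.sh"),
]


def build_commands(run_dir=RUN_DIR, stages=STAGES):
    """Return one (stage, ["bash", <script path>]) pair per stage, in order."""
    return [
        (name, ["bash", (Path(run_dir) / script).as_posix()])
        for name, script in stages
    ]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for name, cmd in build_commands():
        print(f"[{name}] {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Calling run_pipeline(dry_run=False) would execute the scripts one after another; whether they work unmodified depends on the paths and cluster settings inside each script.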

Released Models

We release our pre-trained mBERT and mGPT, along with the trained Graformer model, here.

TensorFlow Version

We will provide a TensorFlow version in NeurST, a popular toolkit for sequence processing.

Citation

Please cite as:

@inproceedings{sun2021mulilingual,
    title = "Multilingual Translation via Grafting Pre-trained Language Models",
    author = "Sun, Zewei and Wang, Mingxuan and Li, Lei",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    year = "2021"
}

Contact

If you have any questions, please feel free to contact me: [email protected]
