The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Last update: Nov 21, 2022

Overview

Language Models are Few-shot Multilingual Learners

Paper

This is the source code of the paper [Arxiv] [ACL Anthology]:

This code has been written using PyTorch. If you use source codes or datasets included in this toolkit in your work, please cite the following paper:

@inproceedings{winata-etal-2021-language,
    title = "Language Models are Few-shot Multilingual Learners",
    author = "Winata, Genta Indra  and
      Madotto, Andrea  and
      Lin, Zhaojiang  and
      Liu, Rosanne  and
      Yosinski, Jason  and
      Fung, Pascale",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.1",
    pages = "1--15",
}

Setup Environment

GPU Machine

pip install -r requirements.txt

GPU Machine for Running GPT-J 6B Model

apt install zstd

# the "slim" version contain only bf16 weights and no optimizer parameters, which minimizes bandwidth and memory
wget -c https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd

tar -I zstd -xf step_383500_slim.tar.zstd

pip install -r mesh_transformer_jax/requirements.txt

# jax 0.2.12 is required due to a regression with xmap in 0.2.13
pip install mesh-transformer-jax/ jax==0.2.12

# cuda[your_cuda_version]
pip install jaxlib==0.1.67+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html

How to run

Zero-shot Cross-task

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py  --dataset snips --model_checkpoint facebook/bart-large-mnli --cuda --length 5 --label_type value --src_lang en --tgt_lang en --seed 42 --use_log_prob --use_confidence --is_cross_task

Finetune

❱❱❱ CUDA_VISIBLE_DEVICES=0 python finetune.py  --dataset snips --model_checkpoint bert-base-multilingual-uncased --cuda --label_type value --src_lang en --tgt_lang en --seed 42

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Related tags

Overview

Language Models are Few-shot Multilingual Learners

Paper

Setup Environment

GPU Machine

GPU Machine for Running GPT-J 6B Model

How to run

Zero-shot Cross-task

Finetune

Owner

Genta Indra Winata

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

Toward a Visual Concept Vocabulary for GAN Latent Space, ICCV 2021

Chinese segmentation library

Easy to use, state-of-the-art Neural Machine Translation for 100+ languages

Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR

Code for the ACL 2021 paper "Structural Guidance for Transformer Language Models"

Official Stanford NLP Python Library for Many Human Languages

A natural language modeling framework based on PyTorch

A minimal code for fairseq vq-wav2vec model inference.

Uses Google's gTTS module to easily create robo text readin' on command.

Big Bird: Transformers for Longer Sequences

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Signature remover is a NLP based solution which removes email signatures from the rest of the text.

Plugin repository for Macast

Use PaddlePaddle to reproduce the paper：mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

Tensorflow implementation of paper: Learning to Diagnose with LSTM Recurrent Neural Networks.

An open-source NLP library: fast text cleaning and preprocessing.

Search with BERT vectors in Solr and Elasticsearch

SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples

Blender addon - Scrub timeline from viewport with a shortcut