"Investigating the Limitations of Transformers with Simple Arithmetic Tasks", 2021

Last update: Nov 16, 2022

transformers-arithmetic

This repository contains the code to reproduce the experiments from the paper:

Nogueira, Jiang, and Lin. "Investigating the Limitations of Transformers with Simple Arithmetic Tasks", 2021.

First, install the required packages:

pip install -r requirements.txt

The command below trains and evaluates a T5-base model on the task of adding numbers with up to 15 digits:

python main.py \
    --output_dir=. \
    --model_name_or_path=t5-base \
    --operation=addition \
    --orthography=10ebased \
    --balance_train \
    --balance_val \
    --train_size=100000 \
    --val_size=10000 \
    --test_size=10000 \
    --min_digits_train=2 \
    --max_digits_train=15 \
    --min_digits_test=2 \
    --max_digits_test=15 \
    --base_number=10 \
    --seed=1 \
    --train_batch_size=4 \
    --accumulate_grad_batches=32 \
    --val_batch_size=32 \
    --max_seq_length=512 \
    --num_workers=4 \
    --gpus=1 \
    --optimizer=AdamW \
    --lr=3e-4 \
    --weight_decay=5e-5 \
    --scheduler=StepLR \
    --t_0=2 \
    --t_mult=2 \
    --gamma=1.0 \
    --step_size=1000 \
    --max_epochs=20 \
    --check_val_every_n_epoch=2 \
    --amp_level=O0 \
    --precision=32 \
    --gradient_clip_val=1.0
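
The `--orthography=10ebased` flag selects how numbers are written before they are fed to the model. As an illustration only, here is a minimal sketch of the 10e-based representation as the paper describes it, where each digit is followed by a marker for its power of ten. The function name `to_10e_based` is hypothetical and is not taken from the repository, whose own conversion code may differ in details.

```python
def to_10e_based(number: int) -> str:
    """Write each digit followed by its power-of-ten marker.

    Hypothetical helper for illustration; not the repository's code.
    Assumes a non-negative integer.
    """
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    digits = str(number)
    tokens = []
    for position, digit in enumerate(digits):
        # The leftmost digit carries the highest exponent.
        exponent = len(digits) - position - 1
        tokens.append(f"{digit} 10e{exponent}")
    return " ".join(tokens)


if __name__ == "__main__":
    print(to_10e_based(832))  # 8 10e2 3 10e1 2 10e0
```

Marking the position of every digit explicitly means the model does not have to infer a digit's significance from where it falls in the token sequence.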

Training should take about 10 hours on a single V100 GPU.
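
The command above pairs a small per-step batch (`--train_batch_size=4`) with gradient accumulation (`--accumulate_grad_batches=32`). Assuming these flags carry their usual PyTorch Lightning meaning and a single GPU is used, the arithmetic below sketches the resulting effective batch size. The variable names are illustrative only.

```python
# Values copied from the command-line flags above.
train_batch_size = 4
accumulate_grad_batches = 32
train_size = 100_000

# Gradients from 32 consecutive batches of 4 examples are summed
# before each optimizer update (assumption: standard accumulation).
effective_batch_size = train_batch_size * accumulate_grad_batches

# Number of forward/backward passes needed to see the training set once.
batches_per_epoch = train_size // train_batch_size

print(effective_batch_size)  # 128
print(batches_per_epoch)     # 25000
```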

The exact match score on the test set should be 1.0:

--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test_exact_match': 1.0000}
--------------------------------------------------------------------------------
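
For readers unfamiliar with the metric, exact match is the fraction of test examples whose generated answer is identical to the reference answer. The sketch below is an assumption about how such a score can be computed, not the repository's implementation. The function name `exact_match` and the whitespace stripping are illustrative choices.

```python
from typing import Sequence


def exact_match(predictions: Sequence[str], targets: Sequence[str]) -> float:
    """Return the fraction of predictions equal to their targets.

    Hypothetical helper for illustration; surrounding whitespace is
    ignored, which the repository's metric may or may not do.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    matches = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return matches / len(targets)


if __name__ == "__main__":
    print(exact_match(["12", "40", "7"], ["12", "41", "7"]))
```

A score of 1.0 therefore means every predicted sum matched its reference string in full.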

Owner

Castorini
