Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context


This repository contains the code, in both PyTorch and TensorFlow, for our paper:

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov (*: equal contribution)

Preprint 2018

TensorFlow

  • The source code is in the tf/ folder and supports (1) single-node multi-GPU training and (2) multi-host TPU training.
  • Besides the source code, we also provide pretrained TensorFlow models with the state-of-the-art (SoTA) performance reported in the paper.
  • Please refer to tf/README.md for details.

PyTorch

  • The source code is in the pytorch/ folder and supports single-node multi-GPU training via the nn.DataParallel module.
  • Please refer to pytorch/README.md for details.

Results

Transformer-XL achieves new state-of-the-art results on multiple language modeling benchmarks. It is also the first model to break through the 1.0 bits-per-character barrier on character-level language modeling. Below is a summary.

Method          enwiki8 (bpc)  text8 (bpc)  One Billion Word (ppl)  WT-103 (ppl)  PTB (ppl, w/o finetuning)
Previous Best   1.06           1.13         23.7                    20.5          55.5
Transformer-XL  0.99           1.08         21.8                    18.3          54.5
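
The numbers in the summary above use two different units: bits per character for the character-level benchmarks and perplexity for the word-level ones. The following is a minimal sketch, not part of this repository, of how both units relate to an average cross-entropy loss measured in nats, and of the relative improvement implied by each pair of results. The function names and the RESULTS dictionary are hypothetical and exist only for illustration; the values are copied from the summary.

```python
import math


def nats_to_bpc(loss_nats: float) -> float:
    """Convert an average per-character cross-entropy (nats) to bits per character."""
    return loss_nats / math.log(2)


def nats_to_perplexity(loss_nats: float) -> float:
    """Convert an average per-token cross-entropy (nats) to perplexity."""
    return math.exp(loss_nats)


def relative_reduction(previous: float, new: float) -> float:
    """Fraction by which `new` is lower than `previous`."""
    return (previous - new) / previous


# (previous best, Transformer-XL), copied from the summary above.
# Illustrative container only; the repository does not define it.
RESULTS = {
    "enwiki8 (bpc)": (1.06, 0.99),
    "text8 (bpc)": (1.13, 1.08),
    "One Billion Word (ppl)": (23.7, 21.8),
    "WT-103 (ppl)": (20.5, 18.3),
    "PTB (ppl)": (55.5, 54.5),
}

if __name__ == "__main__":
    for name, (previous, new) in RESULTS.items():
        pct = 100 * relative_reduction(previous, new)
        print(f"{name}: {previous} -> {new} ({pct:.1f}% lower)")
```

Under these definitions, a model at exactly 1.0 bpc has an average loss of ln 2 (about 0.693) nats per character, so a result below 1.0 bpc corresponds to a loss below that value.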

Acknowledgement

A large portion of the getdata.sh script comes from the awd-lstm repo. Happy Language Modeling :)
