Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch

Last update: Dec 29, 2022

Overview

N-Grammer - Pytorch

Install

$ pip install n-grammer-pytorch

Usage

import torch
from n_grammer_pytorch import VQNgrammer

vq_ngram = VQNgrammer(
    num_clusters = 1024,             # number of clusters
    dim_per_head = 32,               # dimension per head
    num_heads = 16,                  # number of heads
    ngram_vocab_size = 768 * 256,    # ngram vocab size
    ngram_emb_dim = 16,              # ngram embedding dimension
    decay = 0.999                    # exponential moving average decay value
)

x = torch.randn(1, 1024, 32 * 16)   # (batch, sequence length, dim_per_head * num_heads)
out = vq_ngram(x)                   # (1, 1024, 32 * 16)
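
The num_clusters and ngram_vocab_size settings above hint at how latent n-grams can be indexed: each position is assigned a cluster id, neighbouring ids are combined into a bigram id, and that id is folded into a fixed-size embedding table. The following is a minimal pure-Python sketch of that idea, not the library's actual code. The helper name bigram_ids, the padding choice, and the use of a plain modulo in place of a real hash function are all assumptions made for illustration.

```python
def bigram_ids(cluster_ids, num_clusters, ngram_vocab_size):
    """Combine each cluster id with its predecessor and fold the result into a fixed vocab.

    Hypothetical helper for illustration only; not part of n_grammer_pytorch.
    """
    ids = []
    prev = 0  # assumption: the position before the sequence start is treated as cluster 0
    for cur in cluster_ids:
        combined = cur + prev * num_clusters      # distinct id for every (prev, cur) pair
        ids.append(combined % ngram_vocab_size)   # plain modulo as a stand-in for hashing
        prev = cur
    return ids


ids = bigram_ids([3, 7, 7, 1], num_clusters=1024, ngram_vocab_size=768 * 256)
```

Every resulting id is a valid row index into an embedding table with ngram_vocab_size rows, which is why the table size can be chosen independently of num_clusters squared.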

Learning Rates

As with product key memories, the Ngrammer parameters need a higher learning rate than the rest of the model (the paper recommends 1e-2). The repository provides helpers for building the parameter groups.

from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_parameters

# for your root model (here `transformer`), this helper finds all the VQNgrammer modules and their embedding parameters
ngrammer_parameters, other_parameters = get_ngrammer_parameters(transformer)

optim = Adam([
    {'params': other_parameters},
    {'params': ngrammer_parameters, 'lr': 1e-2}
], lr = 3e-4)

Or, even more simply:

from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_param_groups

# automatically creates the list of parameter groups, with the learning rate set to 1e-2 for the ngrammer parameters
param_groups = get_ngrammer_param_groups(model)
optim = Adam(param_groups, lr = 3e-4)
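
For readers curious what such a split involves, here is a minimal sketch in plain PyTorch. It is not the implementation of get_ngrammer_parameters. ToyNgrammer, ToyModel and split_parameters are hypothetical names invented for this example, and the assumption is simply that the "special" parameters are those owned by submodules of a given type.

```python
from torch import nn
from torch.optim import Adam


class ToyNgrammer(nn.Module):
    """Hypothetical stand-in for VQNgrammer; it only holds an embedding table."""
    def __init__(self):
        super().__init__()
        self.ngram_embeds = nn.Embedding(8, 4)


class ToyModel(nn.Module):
    """Hypothetical root model containing one ngrammer-like module and one ordinary layer."""
    def __init__(self):
        super().__init__()
        self.ngrammer = ToyNgrammer()
        self.proj = nn.Linear(4, 4)


def split_parameters(module, special_type):
    """Return (special, other): parameters owned by submodules of `special_type`, and the rest."""
    special = {}
    for m in module.modules():
        if isinstance(m, special_type):
            for p in m.parameters():
                special[id(p)] = p  # keyed by id so nested modules do not add duplicates
    other = [p for p in module.parameters() if id(p) not in special]
    return list(special.values()), other


model = ToyModel()
ngrammer_params, other_params = split_parameters(model, ToyNgrammer)

# the second group overrides the default learning rate, as in the snippets above
optim = Adam([
    {'params': other_params},
    {'params': ngrammer_params, 'lr': 1e-2},
], lr=3e-4)
```

Groups that do not set 'lr' inherit the default passed to Adam, so only the ngrammer-like group trains at 1e-2.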

Citations

@inproceedings{thai2020using,
    title   = {N-grammer: Augmenting Transformers with latent n-grams},
    author  = {Anonymous},
    year    = {2021},
    url     = {https://openreview.net/forum?id=GxjCYmQAody}
}

Comments

Error when passing `concat_ngrams=False`

In the assert statement below, the condition inside not(...) is actually the required condition, so the assertion fails for a valid configuration:

assert not (not concat_ngrams and dim_per_head == ngram_emb_dim), 'unigram head dimension must be equal to ngram embedding dimension if not concatting'

https://github.com/lucidrains/n-grammer-pytorch/blob/main/n_grammer_pytorch/n_grammer_pytorch.py#L149
Opened by yiyixuxu
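
One reading of the report is that the check should fail only when the n-grams are not concatenated and the two dimensions differ. The sketch below expresses that reading. check_ngram_dims is a hypothetical helper written for this example, not the fix applied in the repository.

```python
def check_ngram_dims(concat_ngrams, dim_per_head, ngram_emb_dim):
    """Hypothetical helper: reject mismatched dims only when n-grams are not concatenated."""
    # equivalent to: not (not concat_ngrams and dim_per_head != ngram_emb_dim)
    assert concat_ngrams or dim_per_head == ngram_emb_dim, \
        'unigram head dimension must be equal to ngram embedding dimension if not concatting'


check_ngram_dims(concat_ngrams=False, dim_per_head=32, ngram_emb_dim=32)  # passes: dims match
check_ngram_dims(concat_ngrams=True, dim_per_head=32, ngram_emb_dim=16)   # passes: concatenating
```

With concat_ngrams=False and differing dimensions the helper raises an AssertionError, which is the behaviour the error message describes.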

Releases (0.0.14a)

0.0.14a (Dec 4, 2022)
0.0.14 (Dec 4, 2022)
0.0.12 (Dec 4, 2021)
0.0.11 (Dec 4, 2021)
0.0.10 (Dec 4, 2021)
0.0.9 (Dec 4, 2021)
0.0.8 (Dec 4, 2021)
0.0.7 (Dec 4, 2021)
0.0.6 (Dec 3, 2021)
0.0.5 (Dec 3, 2021)
0.0.4 (Dec 3, 2021)
0.0.3 (Dec 3, 2021)
0.0.2 (Dec 3, 2021)
0.0.1a (Dec 3, 2021)

Owner

Phil Wang
