Code for the Findings of NAACL 2022 (Long Paper): AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

Last update: Nov 12, 2022

AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

arXiv link: upcoming

To be published in Findings of NAACL 2022

Authors: Chin-Lun Fu*, Zih-Ching Chen*, Yun-Ru Lee, Hung-yi Lee

Overview

This study proposes AdapterBias, a surprisingly simple yet effective adapter architecture. AdapterBias adds a token-dependent shift to the hidden output of transformer layers, adapting to downstream tasks with only a vector and a linear layer.
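
To make the idea concrete, here is a minimal pure-Python sketch of a token-dependent shift. It assumes the linear layer maps each token's hidden state to a single scalar weight (called alpha here, after the --share_alpha option) and that the shift is that weight times the shared vector. The function name, the argument names, and the optional bias term b are hypothetical and are not taken from the repository code.

```python
from typing import List

Vector = List[float]


def adapter_bias_shift(hidden: List[Vector], v: Vector, w: Vector, b: float = 0.0) -> List[Vector]:
    """Add a token-dependent shift alpha_i * v to each token's hidden state.

    hidden: one hidden-state vector per token
    v:      the shared shift vector (same for every token)
    w, b:   weights and bias of an assumed linear layer producing one scalar per token
    """
    shifted = []
    for h in hidden:
        # Assumed linear layer: one scalar weight alpha_i per token.
        alpha = sum(w_j * h_j for w_j, h_j in zip(w, h)) + b
        # Every token moves along the same direction v, scaled by its own alpha_i.
        shifted.append([h_j + alpha * v_j for h_j, v_j in zip(h, v)])
    return shifted


hidden = [[1.0, 0.0], [0.0, 2.0]]  # two tokens, hidden size 2
v = [0.5, -0.5]
w = [1.0, 1.0]
out = adapter_bias_shift(hidden, v, w)
print(out)  # [[1.5, -0.5], [1.0, 1.0]]
```

In this sketch the only trainable quantities are v, w and b, which is why the parameter count stays small compared with the frozen PLM.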

Dataset

We use the GLUE benchmark as our dataset. You can download all datasets from the GLUE website.

Training

cd src
python exp.py \
    --adapter True \
    --GLUE_path <your_GLUE_path> \
    --output_path <output_path> \
    --model <model name> \
    --task <the task you want to run> \
    --epoch 100 \
    --lr 0.0001 \
    --max_len 512 \
    --batch_size 32

-s or --seed specifies the random seed.
-g or --GLUE_path specifies the path of your GLUE dataset.
-o or --output_path specifies the directory for the saved model and the prediction files.
-m or --model specifies the pre-trained language model (PLM) used for training.
- Some examples: bert-base, bert-large, roberta-base, roberta-large
-t or --task specifies the downstream task.
- Some examples: cola, mnli, qnli, qqp, mrpc, rte, sst, sts
-a or --adapter specifies whether to add AdapterBias to the PLM.
--share_alpha specifies whether to share the same alpha in AdapterBias across all transformer layers.
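
For reference, the options above could be declared roughly as follows with argparse. This is a hypothetical sketch, not the actual contents of exp.py: the flag names come from this README, but the types, the defaults, and the str2bool helper are assumptions.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse 'True'/'False'-style strings, as in `--adapter True` (assumed helper)."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # All types and defaults below are assumptions made for illustration.
    parser = argparse.ArgumentParser(description="Hypothetical mirror of the exp.py options")
    parser.add_argument("-s", "--seed", type=int, default=0)
    parser.add_argument("-g", "--GLUE_path", required=True)
    parser.add_argument("-o", "--output_path", required=True)
    parser.add_argument("-m", "--model", default="bert-base")
    parser.add_argument("-t", "--task", default="cola")
    parser.add_argument("-a", "--adapter", type=str2bool, default=False)
    parser.add_argument("--share_alpha", type=str2bool, default=False)
    parser.add_argument("--epoch", type=int, default=100)
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--max_len", type=int, default=512)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser


args = build_parser().parse_args(
    ["-g", "data/glue", "-o", "out", "-m", "roberta-base", "-t", "rte", "--adapter", "True"]
)
print(args.model, args.task, args.adapter, args.share_alpha)  # roberta-base rte True False
```

Parsing booleans through a helper like str2bool matters because argparse's built-in `type=bool` would treat the string "False" as true.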

Inference

After training finishes, the prediction file is written automatically to <output_path>/result/, and the saved model to <output_path>/model/.

After running all nine tasks of the GLUE benchmark, you can submit the prediction files to the GLUE website.
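
If the prediction files need to be uploaded as a single archive, a small helper along these lines can bundle them. It is only a sketch: it assumes the predictions are plain files directly inside <output_path>/result/, and the function name and the archive name submission.zip are hypothetical. Check the submission instructions on the website for the exact file names and format expected.

```python
import zipfile
from pathlib import Path


def bundle_predictions(output_path: str, archive_name: str = "submission.zip") -> Path:
    """Zip every file found directly in <output_path>/result/ into one archive."""
    result_dir = Path(output_path) / "result"
    archive = Path(output_path) / archive_name
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(result_dir.iterdir()):
            if f.is_file():
                # Store files flat, without the result/ directory prefix.
                zf.write(f, arcname=f.name)
    return archive
```

Calling `bundle_predictions("<output_path>")` after all tasks have finished would produce `<output_path>/submission.zip`.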
