Accelerating BERT Inference for Sequence Labeling via Early-Exit

Overview

Sequence-Labeling-Early-Exit

Code for ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit

Requirement:

Please refer to requirements.txt

How to run?

For ontonotes (CN):

you should claim your dataset path in paths.py, and then

For the first stage training:

python -u main.py --device 0  --seed 100 --fast_ptm_name bert --lr 5e-5  --use_crf 0 --dataset ontonotes_cn --fix_ptm_epoch 2 --warmup_step 3000 --use_fastnlp_bert 0 --sampler bucket  --after_bert linear --use_char 0 --use_bigram 0 --gradient_clip_norm_other 5 --gradient_clip_norm_bert 1 --train_mode joint --test_mode joint --if_save 1 --warmup_schedule inverse_square --epoch 20 --joint_weighted 1 --ptm_lr_rate 0.1 --cls_common_lr_scale 0

Then find the exp_path in the corresponding fitlog entry, and self-sampling further train the model.

For the self-sampling training:

python -u further_train.py --seed 100 --msg fuxian --if_save 1 --warmup_schedule inverse_square --epoch 30 --keep_norm_same 1 --sandwich_small 2 --sandwich_full 4 --max_t_level_t -0.5 --train_mode joint_sample_copy --further 0 --flooding 1 --flooding_bias 0 --lr 1e-4 --ptm_lr_rate 0.1 --fix_ptm_epoch 2 --min_win_size 5 --copy_wordpiece all --ckpt_epoch 7 --exp_path 05_11_22_20_52.210103 --device 2 --max_threshold 0.25 --max_threshold_2 0.5

Then find the exp_path and best epoch in the corresponding fitlog entry, and use it for early-exit inference as:

speed 2X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 15 --threshold 0.1 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
speed 3X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.15 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
speed 4X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.25 --ckpt_epoch [ckpt_path] --exp_path [exp_path]


Other datasets' scripts coming soon

If you have any question, do not hesitate to ask it in issue. (English or Chinese both ok)

Owner
李孝男
a little bird
李孝男
Repo 4 basic seminar §How to make human machine readable"

WORK IN PROGRESS... Notebooks from the Seminar: Human Machine Readable WS21/22 Introduction into programming Georg Trogemann, Christian Heck, Mattis

experimental-informatics 3 May 29, 2022
Implementation of parameterized soft-exponential activation function.

Soft-Exponential-Activation-Function: Implementation of parameterized soft-exponential activation function. In this implementation, the parameters are

Shuvrajeet Das 1 Feb 23, 2022
Small-bets - Ergodic Experiment With Python

Ergodic Experiment Based on this video. Run this experiment with this command: p

Michael Brant 3 Jan 11, 2022
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval

CLIP4CMR A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval The original data and pre-calculate

24 Dec 26, 2022
Offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation

Shunted Transformer This is the offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation by Sucheng Ren, Daquan Zhou, Shengf

156 Dec 27, 2022
Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

Daniele Grattarola 2.2k Jan 08, 2023
A high-level Python library for Quantum Natural Language Processing

lambeq About lambeq is a toolkit for quantum natural language processing (QNLP). Documentation: https://cqcl.github.io/lambeq/ User support: lambeq-su

Cambridge Quantum 315 Jan 01, 2023
[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis [arxiv|pdf|v

Yinan He 78 Dec 22, 2022
FANet - Real-time Semantic Segmentation with Fast Attention

FANet Real-time Semantic Segmentation with Fast Attention Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko , Stan Sc

Ping Hu 42 Nov 30, 2022
PERIN is Permutation-Invariant Semantic Parser developed for MRP 2020

PERIN: Permutation-invariant Semantic Parsing David Samuel & Milan Straka Charles University Faculty of Mathematics and Physics Institute of Formal an

ÚFAL 40 Jan 04, 2023
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Build Type Linux MacOS Windows Build Status OpenPose has represented the first real-time multi-person system to jointly detect human body, hand, facia

25.7k Jan 09, 2023
PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data.

Anti-Backdoor Learning PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data. The Anti-Backdoor Learning

Yige-Li 51 Dec 07, 2022
Automatic caption evaluation metric based on typicality analysis.

SeMantic and linguistic UndeRstanding Fusion (SMURF) Automatic caption evaluation metric described in the paper "SMURF: SeMantic and linguistic UndeRs

Joshua Feinglass 6 Jan 09, 2022
Distributed Deep learning with Keras & Spark

Elephas: Distributed Deep Learning with Keras & Spark Elephas is an extension of Keras, which allows you to run distributed deep learning models at sc

Max Pumperla 1.6k Jan 05, 2023
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
Api's bulid in Flask perfom to manage Todo Task.

Citymall-task Api's bulid in Flask perfom to manage Todo Task. Installation Requrements : Python: 3.10.0 MongoDB create .env file with variables DB_UR

Aisha Tayyaba 1 Dec 17, 2021
A demo of how to use JAX to create a simple gravity simulation

JAX Gravity This repo contains a demo of how to use JAX to create a simple gravity simulation. It uses JAX's experimental ode package to solve the dif

Cristian Garcia 16 Sep 22, 2022
CT Based COVID 19 Diagnose by Image Processing and Deep Learning

This project proposed the deep learning and image processing method to undertake the diagnosis on 2D CT image and 3D CT volume.

1 Feb 08, 2022
《Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching》(CVPR 2020)

This contains the codes for cross-view geo-localization method described in: Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching, CVPR2020.

41 Oct 27, 2022
some academic posters as references. May we have in-person poster session soon!

some academic posters as references. May we have in-person poster session soon!

Bolei Zhou 472 Jan 06, 2023