Source code for paper: Knowledge Inheritance for Pre-trained Language Models

Overview

Knowledge-Inheritance

Source code paper: Knowledge Inheritance for Pre-trained Language Models (preprint). The trained model parameters (in Fairseq format) can be downloaded from Tsinghua Cloud. You can use convert_fairseq_to_huggingface.py to convert the Fairseq format into Huggingface's transformers format easily.

We refer the downstream performance evaluation to the implementation of Fairseq (GLUE tasks) and Don't Stop Pre-training (ACL-ARC / CHEMPROT).

If you have any question, feel free to contact us ([email protected]).

1. Available Pretrained Models

WB domain: Wikipedia + BookCorpus; CS domain: computer science papers; BIO domain: biomedical papers;

Models trained by self-learning

RoBERTa_WB_H_4
RoBERTa_WB_H_6
RoBERTa_WB_H_8
RoBERTa_WB_H_10
RoBERTa_WB_D_288
RoBERTa_WB_D_384
RoBERTa_WB_D_480
RoBERTa_WB_D_576
RoBERTa_WB_D_672
RoBERTa_WB_BASE
RoBERTa_WB_MEDIUM
RoBERTa_WB_BASE_PLUS
RoBERTa_WB_LARGE
GPT_WB_MEDIUM
GPT_WB_BASE
GPT_WB_BASE_PLUS
RoBERTa_CS_MEDIUM
RoBERTa_CS_BASE
RoBERTa_BIO_MEDIUM
RoBERTa_BIO_BASE

Models trained by Knowledge Inheritance

RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS
RoBERTa_WB_BASE -> RoBERTa_WB_LARGE
RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE
RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE

arch

Owner
THUNLP
Natural Language Processing Lab at Tsinghua University
THUNLP
Over-the-Air Ensemble Inference with Model Privacy

Over-the-Air Ensemble Inference with Model Privacy This repository contains simulations for our private ensemble inference method. Installation Instal

Selim Firat Yilmaz 1 Jun 29, 2022
GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.

The GT4SD (Generative Toolkit for Scientific Discovery) is an open-source platform to accelerate hypothesis generation in the scientific discovery process. It provides a library for making state-of-t

Generative Toolkit 4 Scientific Discovery 142 Dec 24, 2022
A PyTorch implementation of "Graph Wavelet Neural Network" (ICLR 2019)

Graph Wavelet Neural Network ⠀⠀ A PyTorch implementation of Graph Wavelet Neural Network (ICLR 2019). Abstract We present graph wavelet neural network

Benedek Rozemberczki 490 Dec 16, 2022
Automatic caption evaluation metric based on typicality analysis.

SeMantic and linguistic UndeRstanding Fusion (SMURF) Automatic caption evaluation metric described in the paper "SMURF: SeMantic and linguistic UndeRs

Joshua Feinglass 6 Jan 09, 2022
A modern pure-Python library for reading PDF files

pdf A modern pure-Python library for reading PDF files. The goal is to have a modern interface to handle PDF files which is consistent with itself and

6 Apr 06, 2022
Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Region Proportion Regularized Inference (RePRI) for Few-Shot Segmentation In this repo, we provide the code for our paper : "Few-Shot Segmentation Wit

Malik Boudiaf 138 Dec 12, 2022
Self-Supervised Image Denoising via Iterative Data Refinement

Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S

Zhang Yi 72 Jan 01, 2023
Self Driving RC Car Code

Derp Learning Derp Learning is a Python package that collects data, trains models, and then controls an RC car for track racing. Hardware You will nee

Not Karol 39 Dec 07, 2022
The code used for the free [email protected] Webinar series on Reinforcement Learning in Finance

Reinforcement Learning in Finance [email protected] Webinar This repository provides the code f

Yves Hilpisch 62 Dec 22, 2022
This is an unofficial PyTorch implementation of Meta Pseudo Labels

This is an unofficial PyTorch implementation of Meta Pseudo Labels. The official Tensorflow implementation is here.

Jungdae Kim 320 Jan 08, 2023
Code for the paper titled "Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks" (NeurIPS 2021 Spotlight).

Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks This repository contains the code and pre-trained

Hassan Dbouk 7 Dec 05, 2022
Chess reinforcement learning by AlphaGo Zero methods.

About Chess reinforcement learning by AlphaGo Zero methods. This project is based on these main resources: DeepMind's Oct 19th publication: Mastering

Samuel 2k Dec 29, 2022
Code for PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing

PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing CVPR 2021. Project page: https://kai-46.github.io/

Kai Zhang 141 Dec 14, 2022
Using image super resolution models with vapoursynth and speeding them up with TensorRT

vs-RealEsrganAnime-tensorrt-docker Using image super resolution models with vapoursynth and speeding them up with TensorRT. Also a docker image since

4 Aug 23, 2022
Repository for "Improving evidential deep learning via multi-task learning," published in AAAI2022

Improving evidential deep learning via multi task learning It is a repository of AAAI2022 paper, “Improving evidential deep learning via multi-task le

deargen 11 Nov 19, 2022
We are More than Our JOints: Predicting How 3D Bodies Move

We are More than Our JOints: Predicting How 3D Bodies Move Citation This repo contains the official implementation of our paper MOJO: @inproceedings{Z

72 Oct 20, 2022
Parametric Contrastive Learning (ICCV2021)

Parametric-Contrastive-Learning This repository contains the implementation code for ICCV2021 paper: Parametric Contrastive Learning (https://arxiv.or

DV Lab 156 Dec 21, 2022
The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Website | ArXiv | Get Start | Video PIRenderer The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic

Ren Yurui 261 Jan 09, 2023
Very deep VAEs in JAX/Flax

Very Deep VAEs in JAX/Flax Implementation of the experiments in the paper Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on I

Jamie Townsend 42 Dec 12, 2022
A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Segnet is deep fully convolutional neural network architecture for semantic pixel-wise segmentation. This is implementation of http://arxiv.org/pdf/15

Pradyumna Reddy Chinthala 190 Dec 15, 2022