Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

Last update: Dec 29, 2022

Overview

Code repository for the paper "Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations".

Dependencies

torch=1.8.1
transformers=4.9.0
sentence-transformers=2.0.0

See requirements.txt for more details.
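To confirm that an environment matches the three versions listed above, a small check along the following lines may help. This is only a sketch: the names PINNED, installed_version and find_mismatches are hypothetical and not part of this repo, and treating a local build tag such as "+cu111" as a match is an assumption.

```python
from importlib import metadata

# Versions copied from the Dependencies list above.
PINNED = {
    "torch": "1.8.1",
    "transformers": "4.9.0",
    "sentence-transformers": "2.0.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        # Assumption: a local build tag such as "1.8.1+cu111" counts as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found}")
```

An empty result means all three packages are installed at the listed versions.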

Train

Self-distillation:

>> bash train_self_distill.sh 0

The argument 0 is the GPU device index.

Mutual-distillation (two GPUs needed):

>> bash train_mutual_distill.sh 1,2

Train with your custom corpus:

>> CUDA_VISIBLE_DEVICES=0,1 python src/mutual_distill_parallel.py \
         --batch_size_bi_encoder 128 \
         --batch_size_cross_encoder 64 \
         --num_epochs_bi_encoder 10 \
         --num_epochs_cross_encoder 1 \
         --cycle 3 \
         --bi_encoder1_pooling_mode cls \
         --bi_encoder2_pooling_mode cls \
         --init_with_new_models \
         --task custom \
         --random_seed 2021 \
         --custom_corpus_path CORPUS_PATH

CORPUS_PATH should point to your custom corpus file, in which every line is one sentence pair in the form sent1||sent2.
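A corpus file in the sent1||sent2 format can be produced and sanity-checked with a few lines of Python. The sketch below is illustrative only: write_corpus and find_malformed_lines are hypothetical helpers, not part of this repo, and it assumes UTF-8 text and that a sentence must not itself contain "||". How the training script treats malformed lines is not stated here, so checking the file beforehand is a precaution rather than a documented requirement.

```python
from pathlib import Path

SEPARATOR = "||"  # delimiter between the two sentences of a pair


def write_corpus(pairs, path):
    """Write (sent1, sent2) tuples to path, one 'sent1||sent2' pair per line."""
    lines = []
    for sent1, sent2 in pairs:
        for sent in (sent1, sent2):
            # A separator or newline inside a sentence would corrupt the format.
            if SEPARATOR in sent or "\n" in sent:
                raise ValueError(f"sentence contains a separator or newline: {sent!r}")
        lines.append(f"{sent1.strip()}{SEPARATOR}{sent2.strip()}")
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")


def find_malformed_lines(path):
    """Return 1-based numbers of lines that are not exactly two non-empty sentences."""
    bad = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            parts = line.rstrip("\n").split(SEPARATOR)
            if len(parts) != 2 or not all(part.strip() for part in parts):
                bad.append(number)
    return bad
```

For example, write_corpus([("A man plays a guitar.", "Someone plays an instrument.")], "corpus.txt") writes a one-line file, and find_malformed_lines("corpus.txt") then returns an empty list. The resulting path is what --custom_corpus_path expects.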

Evaluate

>> python src/eval.py

Authors

Fangyu Liu: Main contributor

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Owner

Amazon
