EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

Related tags

Deep LearningEncT5
Overview

EncT5

(Unofficial) Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

About

  • Finetune T5 model for classification & regression by only using the encoder layers.
  • Implemented of Tokenizer and Model for EncT5.
  • Add BOS Token () for tokenizer, and use this token for classification & regression.
    • Need to resize embedding as vocab size is changed. (model.resize_token_embeddings())
  • BOS and EOS token will be automatically added as below.
    • single sequence: X
    • pair of sequences: A B

Requirements

Highly recommend to use the same version of transformers.

transformers==4.15.0
torch==1.8.1
sentencepiece==0.1.96
datasets==1.17.0
scikit-learn==0.24.2

How to Use

from enc_t5 import EncT5ForSequenceClassification, EncT5Tokenizer

model = EncT5ForSequenceClassification.from_pretrained("t5-base")
tokenizer = EncT5Tokenizer.from_pretrained("t5-base")

# Resize embedding size as we added bos token
if model.config.vocab_size < len(tokenizer.get_vocab()):
    model.resize_token_embeddings(len(tokenizer.get_vocab()))

Finetune on GLUE

Setup

  • Use T5 1.1 base for finetuning.
  • Evaluate on TPU. See run_glue_tpu.sh for more details.
  • Use AdamW optimizer instead of Adafactor.
  • Check best checkpoint on every epoch by using EarlyStoppingCallback.

Results

Metric Result (Paper) Result (Implementation)
CoLA Matthew 53.1 52.4
SST-2 Acc 94.0 94.5
MRPC F1/Acc 91.5/88.3 91.7/88.0
STS-B PCC/SCC 80.5/79.3 88.0/88.3
QQP F1/Acc 72.9/89.8 88.4/91.3
MNLI Mis/Matched 88.0/86.7 87.5/88.1
QNLI Acc 93.3 93.2
RTE Acc 67.8 69.7
You might also like...
Black-Box-Tuning - Black-Box Tuning for Language-Model-as-a-Service

Black-Box-Tuning Source code for paper "Black-Box Tuning for Language-Model-as-a

Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning.

xTune Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning. Environment DockerFile: dancingsoul/pytorch:xTune Install the f

 Cartoon-StyleGan2 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation
Cartoon-StyleGan2 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation

Fine-tuning StyleGAN2 for Cartoon Face Generation

Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Legged Robots that Keep on Learning Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, whic

Fine-tuning StyleGAN2 for Cartoon Face Generation
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).
This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).

Core-tuning This repository is the official implementation of ``Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regular

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker
Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker This repository contai

Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning This repository is official Tensorflow implementation of paper: Ensemb

Comments
  • Enable tokenizer to be loaded by sentence-transformer

    Enable tokenizer to be loaded by sentence-transformer

    🚀 Feature Request

    Integration into sentence-transformer library.

    📎 Additional context

    I tried to load this tokenizer with sentence-transformer library but it failed. AutoTokenizer couldn't load this tokenizer. So, I simply added code to override save_pretrained and its dependencies so that this tokenizer is saved as T5Tokenizer, its super class.

            def save_pretrained(
            self,
            save_directory,
            legacy_format: Optional[bool] = None,
            filename_prefix: Optional[str] = None,
            push_to_hub: bool = False,
            **kwargs,
        ):
            if os.path.isfile(save_directory):
                logger.error(f"Provided path ({save_directory}) should be a directory, not a file")
                return
    
            if push_to_hub:
                commit_message = kwargs.pop("commit_message", None)
                repo = self._create_or_get_repo(save_directory, **kwargs)
    
            os.makedirs(save_directory, exist_ok=True)
    
            special_tokens_map_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + SPECIAL_TOKENS_MAP_FILE
            )
            tokenizer_config_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + TOKENIZER_CONFIG_FILE
            )
    
            tokenizer_config = copy.deepcopy(self.init_kwargs)
            if len(self.init_inputs) > 0:
                tokenizer_config["init_inputs"] = copy.deepcopy(self.init_inputs)
            for file_id in self.vocab_files_names.keys():
                tokenizer_config.pop(file_id, None)
    
            # Sanitize AddedTokens
            def convert_added_tokens(obj: Union[AddedToken, Any], add_type_field=True):
                if isinstance(obj, AddedToken):
                    out = obj.__getstate__()
                    if add_type_field:
                        out["__type"] = "AddedToken"
                    return out
                elif isinstance(obj, (list, tuple)):
                    return list(convert_added_tokens(o, add_type_field=add_type_field) for o in obj)
                elif isinstance(obj, dict):
                    return {k: convert_added_tokens(v, add_type_field=add_type_field) for k, v in obj.items()}
                return obj
    
            # add_type_field=True to allow dicts in the kwargs / differentiate from AddedToken serialization
            tokenizer_config = convert_added_tokens(tokenizer_config, add_type_field=True)
    
            # Add tokenizer class to the tokenizer config to be able to reload it with from_pretrained
            ############################################################################
            tokenizer_class = self.__class__.__base__.__name__
            ############################################################################
            # Remove the Fast at the end unless we have a special `PreTrainedTokenizerFast`
            if tokenizer_class.endswith("Fast") and tokenizer_class != "PreTrainedTokenizerFast":
                tokenizer_class = tokenizer_class[:-4]
            tokenizer_config["tokenizer_class"] = tokenizer_class
            if getattr(self, "_auto_map", None) is not None:
                tokenizer_config["auto_map"] = self._auto_map
            if getattr(self, "_processor_class", None) is not None:
                tokenizer_config["processor_class"] = self._processor_class
    
            # If we have a custom model, we copy the file defining it in the folder and set the attributes so it can be
            # loaded from the Hub.
            if self._auto_class is not None:
                custom_object_save(self, save_directory, config=tokenizer_config)
    
            with open(tokenizer_config_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(tokenizer_config, ensure_ascii=False))
            logger.info(f"tokenizer config file saved in {tokenizer_config_file}")
    
            # Sanitize AddedTokens in special_tokens_map
            write_dict = convert_added_tokens(self.special_tokens_map_extended, add_type_field=False)
            with open(special_tokens_map_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(write_dict, ensure_ascii=False))
            logger.info(f"Special tokens file saved in {special_tokens_map_file}")
    
            file_names = (tokenizer_config_file, special_tokens_map_file)
    
            save_files = self._save_pretrained(
                save_directory=save_directory,
                file_names=file_names,
                legacy_format=legacy_format,
                filename_prefix=filename_prefix,
            )
    
            if push_to_hub:
                url = self._push_to_hub(repo, commit_message=commit_message)
                logger.info(f"Tokenizer pushed to the hub in this commit: {url}")
    
            return save_files
    
    enhancement 
    opened by kwonmha 0
Releases(v1.0.0)
  • v1.0.0(Jan 22, 2022)

    What’s Changed

    :rocket: Features

    • Add GLUE Trainer (#2) @monologg
    • Add Template & EncT5 model and tokenizer (#1) @monologg

    :pencil: Documentation

    • Add readme & script (#3) @monologg
    Source code(tar.gz)
    Source code(zip)
Owner
Jangwon Park
Jangwon Park
Algebraic effect handlers in Python

PyEffect: Algebraic effects in Python What IDK. Usage effects.handle(operation, handlers=None) effects.set_handler(effect, handler) Supported effects

Greg Werbin 5 Dec 27, 2021
Learned image compression

Overview Pytorch code of our recent work A Unified End-to-End Framework for Efficient Deep Image Compression. We first release the code for Variationa

Jiaheng Liu 163 Dec 04, 2022
Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model

Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model. Designed sample dashboard with insights and recommendation for

Yash 2 Apr 07, 2022
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking

RDA: Robust Domain Adaptation via Fourier Adversarial Attacking Updates 08/2021: check out our domain adaptation for video segmentation paper Domain A

17 Nov 30, 2022
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program

50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.

komal_lamba 22 Dec 09, 2022
Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".

:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

Amirsina Torfi 114 Dec 18, 2022
A Light in the Dark: Deep Learning Practices for Industrial Computer Vision

A Light in the Dark: Deep Learning Practices for Industrial Computer Vision This is the repository for our Paper/Contribution to the WI2022 in Nürnber

Maximilian Harl 6 Jan 17, 2022
Deep Learning GPU Training System

DIGITS DIGITS (the Deep Learning GPU Training System) is a webapp for training deep learning models. The currently supported frameworks are: Caffe, To

NVIDIA Corporation 4.1k Jan 03, 2023
RRL: Resnet as representation for Reinforcement Learning

Resnet as representation for Reinforcement Learning (RRL) is a simple yet effective approach for training behaviors directly from visual inputs. We demonstrate that features learned by standard image

Meta Research 21 Dec 07, 2022
Deformable DETR is an efficient and fast-converging end-to-end object detector.

Deformable DETR: Deformable Transformers for End-to-End Object Detection.

2k Jan 05, 2023
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation Ported from https://github.com/hzwer/arXiv2020-RIFE Dependencies NumPy

49 Jan 07, 2023
Self-supervised Product Quantization for Deep Unsupervised Image Retrieval - ICCV2021

Self-supervised Product Quantization for Deep Unsupervised Image Retrieval Pytorch implementation of SPQ Accepted to ICCV 2021 - paper Young Kyun Jang

Young Kyun Jang 71 Dec 27, 2022
Film review classification

Film review classification Решение задачи классификации отзывов на фильмы на положительные и отрицательные с помощью рекуррентных нейронных сетей 1. З

Nikita Dukin 3 Jan 21, 2022
Implementation of the GBST block from the Charformer paper, in Pytorch

Charformer - Pytorch Implementation of the GBST (gradient-based subword tokenization) module from the Charformer paper, in Pytorch. The paper proposes

Phil Wang 105 Dec 26, 2022
PyTorch reimplementation of Diffusion Models

PyTorch pretrained Diffusion Models A PyTorch reimplementation of Denoising Diffusion Probabilistic Models with checkpoints converted from the author'

Patrick Esser 265 Jan 01, 2023
MediaPipe is a an open-source framework from Google for building multimodal

MediaPipe is a an open-source framework from Google for building multimodal (eg. video, audio, any time series data), cross platform (i.e Android, iOS, web, edge devices) applied ML pipelines. It is

Bhavishya Pandit 3 Sep 30, 2022
FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware.

FIRM-AFL FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware. FIRM-AFL addresses two fundamental problems in IoT fuzzing. First, it

356 Dec 23, 2022
Official code of "Mitigating the Mutual Error Amplification for Semi-Supervised Object Detection"

CrossTeaching-SSOD 0. Introduction Official code of "Mitigating the Mutual Error Amplification for Semi-Supervised Object Detection" This repo include

Bruno Ma 9 Nov 29, 2022
Research - dataset and code for 2016 paper Learning a Driving Simulator

the people's comma the paper Learning a Driving Simulator the comma.ai driving dataset 7 and a quarter hours of largely highway driving. Enough to tra

comma.ai 4.1k Jan 02, 2023
Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation (RA-L/ICRA 2020)

Aerial Depth Completion This work is described in the letter "Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation", by Lucas

ETHZ V4RL 70 Dec 22, 2022