Reproduce results and replicate training fo T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)

Related tags

Deep Learningt-zero
Overview

T-Zero

This repository serves primarily as codebase and instructions for training, evaluation and inference of T0.

T0 is the model developed in Multitask Prompted Training Enables Zero-Shot Task Generalization. In this paper, we demonstrate that massive multitask prompted fine-tuning is extremely effective to obtain task zero-shot generalization. T0 outperforms or matches GPT-3 while being 16x smaller.

While the codebase in this repository mainly reproduces and replicates the training and evaluation of T0, it will be useful for future research on massively multitask fine-tuning.

Contents

  • Training: reproducing (or replicating) the massively multitask fine-tuning
  • Evaluation: reproducing the main results reported in the paper
  • Inference: running inference with T0

Citation

If you find this resource useful, please cite the paper introducing T0:

@misc{sanh2021multitask,
      title={Multitask Prompted Training Enables Zero-Shot Task Generalization},
      author={Victor Sanh and Albert Webson and Colin Raffel and Stephen H. Bach and Lintang Sutawika and Zaid Alyafeai and Antoine Chaffin and Arnaud Stiegler and Teven Le Scao and Arun Raja and Manan Dey and M Saiful Bari and Canwen Xu and Urmish Thakker and Shanya Sharma Sharma and Eliza Szczechla and Taewoon Kim and Gunjan Chhablani and Nihal Nayak and Debajyoti Datta and Jonathan Chang and Mike Tian-Jian Jiang and Han Wang and Matteo Manica and Sheng Shen and Zheng Xin Yong and Harshit Pandey and Rachel Bawden and Thomas Wang and Trishala Neeraj and Jos Rozen and Abheesht Sharma and Andrea Santilli and Thibault Fevry and Jason Alan Fries and Ryan Teehan and Stella Biderman and Leo Gao and Tali Bers and Thomas Wolf and Alexander M. Rush},
      year={2021},
      eprint={2110.08207},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Owner
BigScience Workshop
Research workshop on large language models - The Summer of Language Models 21
BigScience Workshop
TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

TransZero++ This repository contains the testing code for the paper "TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning" submitted

Shiming Chen 6 Aug 16, 2022
Liver segmentation using MONAI and pytorch

Machine Learning use case in the field of Healthcare. In this project MONAI and pytorch frameworks are used for 3D Liver segmentation.

Abhishek Gajbhiye 2 May 30, 2022
Keras attention models including botnet,CoaT,CoAtNet,CMT,cotnet,halonet,resnest,resnext,resnetd,volo,mlp-mixer,resmlp,gmlp,levit

Keras_cv_attention_models Keras_cv_attention_models Usage Basic Usage Layers Model surgery AotNet ResNetD ResNeXt ResNetQ BotNet VOLO ResNeSt HaloNet

319 Dec 28, 2022
Hyper-parameter optimization for sklearn

hyperopt-sklearn Hyperopt-sklearn is Hyperopt-based model selection among machine learning algorithms in scikit-learn. See how to use hyperopt-sklearn

1.4k Jan 01, 2023
A treasure chest for visual recognition powered by PaddlePaddle

简体中文 | English PaddleClas 简介 飞桨图像识别套件PaddleClas是飞桨为工业界和学术界所准备的一个图像识别任务的工具集,助力使用者训练出更好的视觉模型和应用落地。 近期更新 2021.11.1 发布PP-ShiTu技术报告,新增饮料识别demo 2021.10.23 发

4.6k Dec 31, 2022
DFFNet: An IoT-perceptive Dual Feature Fusion Network for General Real-time Semantic Segmentation

DFFNet Paper DFFNet: An IoT-perceptive Dual Feature Fusion Network for General Real-time Semantic Segmentation. Xiangyan Tang, Wenxuan Tu, Keqiu Li, J

4 Sep 23, 2022
SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

SPLADE 🍴 + 🥄 = 🔎 This repository contains the weights for four models as well as the code for running inference for our two papers: [v1]: SPLADE: S

NAVER 170 Dec 28, 2022
MPI-IS Mesh Processing Library

Perceiving Systems Mesh Package This package contains core functions for manipulating meshes and visualizing them. It requires Python 3.5+ and is supp

Max Planck Institute for Intelligent Systems 494 Jan 06, 2023
Worktory is a python library created with the single purpose of simplifying the inventory management of network automation scripts.

Worktory is a python library created with the single purpose of simplifying the inventory management of network automation scripts.

Renato Almeida de Oliveira 18 Aug 31, 2022
Predict stock movement with Machine Learning and Deep Learning algorithms

Project Overview Stock market movement prediction using LSTM Deep Neural Networks and machine learning algorithms Software and Library Requirements Th

Naz Delam 46 Sep 13, 2022
An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

Luke Tonin 195 Dec 17, 2022
Code for "Universal inference meets random projections: a scalable test for log-concavity"

How to use this repository This repository contains code to replicate the results of "Universal inference meets random projections: a scalable test fo

Robin Dunn 0 Nov 21, 2021
Rohit Ingole 2 Mar 24, 2022
[Link]deep_portfolo - Use Reforcemet earg ad Supervsed learg to Optmze portfolo allocato []

rl_portfolio This Repository uses Reinforcement Learning and Supervised learning to Optimize portfolio allocation. The goal is to make profitable agen

Deepender Singla 165 Dec 02, 2022
Secure Distributed Training at Scale

Secure Distributed Training at Scale This repository contains the implementation of experiments from the paper "Secure Distributed Training at Scale"

Yandex Research 9 Jul 11, 2022
Implementation of various Vision Transformers I found interesting

Implementation of various Vision Transformers I found interesting

Kim Seonghyeon 78 Dec 06, 2022
DIRL: Domain-Invariant Representation Learning

DIRL: Domain-Invariant Representation Learning Domain-Invariant Representation Learning (DIRL) is a novel algorithm that semantically aligns both the

Ajay Tanwani 30 Nov 07, 2022
A package related to building quasi-fibration symmetries

qf A package related to building quasi-fibration symmetries. If you'd like to learn more about how it works, see the brief explanation and References

Paolo Boldi 1 Dec 01, 2021
[SIGGRAPH 2021 Asia] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning

DeepVecFont This is the official Pytorch implementation of the paper: Yizhi Wang and Zhouhui Lian. DeepVecFont: Synthesizing High-quality Vector Fonts

Yizhi Wang 146 Dec 18, 2022
FairMOT - A simple baseline for one-shot multi-object tracking

FairMOT - A simple baseline for one-shot multi-object tracking

Yifu Zhang 3.6k Jan 08, 2023