Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Overview

Optimum Transformers

Tests License PyPI

Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX runtime.

Installation:

With PyPI:

pip install optimum-transformers

Or directly from GitHub:

pip install git+https://github.com/AlekseyKorshuk/optimum-transformers

Usage:

The pipeline API is similar to transformers pipeline with just a few differences which are explained below.

Just provide the path/url to the model, and it'll download the model if needed from the hub and automatically create onnx graph and run inference.

from optimum_transformers import pipeline

# Initialize a pipeline by passing the task name and 
# set onnx to True (default value is also True)
nlp = pipeline("sentiment-analysis", use_onnx=True)
nlp("Transformers and onnx runtime is an awesome combo!")
# [{'label': 'POSITIVE', 'score': 0.999721109867096}]  

Or provide a different model using the model argument.

from optimum_transformers import pipeline

nlp = pipeline("question-answering", model="deepset/roberta-base-squad2", use_onnx=True)
nlp(question="What is ONNX Runtime ?",
         context="ONNX Runtime is a highly performant single inference engine for multiple platforms and hardware")
# {'answer': 'highly performant single inference engine for multiple platforms and hardware', 'end': 94,
# 'score': 0.751201868057251, 'start': 18}
from optimum_transformers import pipeline

nlp = pipeline("ner", model="mys/electra-base-turkish-cased-ner", use_onnx=True, optimize=True,
                    grouped_entities=True)
nlp("adana kebap ülkemizin önemli lezzetlerinden biridir.")
# [{'entity_group': 'B-food', 'score': 0.869149774312973, 'word': 'adana kebap'}]

Set use_onnx to False for standard torch inference. Set optimize to True for quantize with ONNX. ( set use_onnx to True)

Supported pipelines

You can create Pipeline objects for the following down-stream tasks:

  • feature-extraction: Generates a tensor representation for the input sequence
  • ner and token-classification: Generates named entity mapping for each word in the input sequence.
  • sentiment-analysis: Gives the polarity (positive / negative) of the whole input sequence. Can be used for any text classification model.
  • question-answering: Provided some context and a question referring to the context, it will extract the answer to the question in the context.
  • text-classification: Classifies sequences according to a given number of classes from training.
  • zero-shot-classification: Classifies sequences according to a given number of classes directly in runtime.
  • fill-mask: The task of masking tokens in a sequence with a masking token, and prompting the model to fill that mask with an appropriate token.
  • text-generation: The task of generating text according to the previous text provided.

Calling the pipeline for the first time loads the model, creates the onnx graph, and caches it for future use. Due to this, the first load will take some time. Subsequent calls to the same model will load the onnx graph automatically from the cache.

Benchmarks

Note: For some reason, onnx is slow on colab notebook, so you won't notice any speed-up there. Benchmark it on your own hardware.

Check our example of benchmarking: example.

For detailed benchmarks and other information refer to this blog post and notebook.

Note: These results were collected on my local machine. So if you have high performance machine to benchmark, please contact me.

Benchmark sentiment-analysis pipeline

Benchmark zero-shot-classification pipeline

Benchmark token-classification pipeline

Benchmark question-answering pipeline

Benchmark fill-mask pipeline

About

Built by Aleksey Korshuk

Follow

Follow

Follow

🚀 If you want to contribute to this project OR create something cool together — contact me: link

Star this repository:

GitHub stars

Resources

You might also like...
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences an

Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel order of RGB and BGR. Simple Channel Converter for ONNX.
Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel order of RGB and BGR. Simple Channel Converter for ONNX.

scc4onnx Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel

A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier for ONNX.
A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier for ONNX.

sam4onnx A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier for

A repository that shares tuning results of trained models generated by TensorFlow / Keras. Post-training quantization (Weight Quantization, Integer Quantization, Full Integer Quantization, Float16 Quantization), Quantization-aware training. TensorFlow Lite. OpenVINO. CoreML. TensorFlow.js. TF-TRT. MediaPipe. ONNX. [.tflite,.h5,.pb,saved_model,tfjs,tftrt,mlmodel,.xml/.bin, .onnx] ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX
ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX
ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or simply to separate onnx files to any size you want.
A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or simply to separate onnx files to any size you want.

sne4onnx A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or

Simple ONNX operation generator. Simple Operation Generator for ONNX.
Simple ONNX operation generator. Simple Operation Generator for ONNX.

sog4onnx Simple ONNX operation generator. Simple Operation Generator for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools Key concept V

Simple tool to combine(merge) onnx models.  Simple Network Combine Tool for ONNX.
Simple tool to combine(merge) onnx models. Simple Network Combine Tool for ONNX.

snc4onnx Simple tool to combine(merge) onnx models. Simple Network Combine Tool for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools 1.

Releases(v0.2.1-upd)
Owner
Aleksey Korshuk
Aleksey Korshuk
This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge.

Data-Science-Intern-Challenge This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge. Summer 2022 Data Science Inte

1 Jan 11, 2022
[CVPR 2022 Oral] Balanced MSE for Imbalanced Visual Regression https://arxiv.org/abs/2203.16427

Balanced MSE Code for the paper: Balanced MSE for Imbalanced Visual Regression Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu CVPR 2022 (Oral) News

Jiawei Ren 267 Jan 01, 2023
Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at [email protected]

TableParser Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at DS3 Lab 11 Dec 13, 2022

Predicting Student Attentiveness using OpenCV

Predicting-Student-Attentiveness-using-OpenCV The model will predict if a student is attentive or not through facial parameter received through the st

Johann Pinto 2 Aug 20, 2022
Episodic-memory - Ego4D Episodic Memory Benchmark

Ego4D Episodic Memory Benchmark EGO4D is the world's largest egocentric (first p

3 Feb 18, 2022
A pytorch &keras implementation and demo of Fastformer.

Fastformer Notes from the authors Pytorch/Keras implementation of Fastformer. The keras version only includes the core fastformer attention part. The

153 Dec 28, 2022
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue. This

290 Dec 29, 2022
MonoScene: Monocular 3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page] Anh-Quan Cao, Rao

298 Jan 08, 2023
Deep Reinforcement Learning with pytorch & visdom

Deep Reinforcement Learning with pytorch & visdom Sample testings of trained agents (DQN on Breakout, A3C on Pong, DoubleDQN on CartPole, continuous A

Jingwei Zhang 783 Jan 04, 2023
Research code of ICCV 2021 paper "Mesh Graphormer"

MeshGraphormer ✨ ✨ This is our research code of Mesh Graphormer. Mesh Graphormer is a new transformer-based method for human pose and mesh reconsructi

Microsoft 251 Jan 08, 2023
Modification of convolutional neural net "UNET" for image segmentation in Keras framework

ZF_UNET_224 Pretrained Model Modification of convolutional neural net "UNET" for image segmentation in Keras framework Requirements Python 3.*, Keras

209 Nov 02, 2022
Yolact-keras实例分割模型在keras当中的实现

Yolact-keras实例分割模型在keras当中的实现 目录 性能情况 Performance 所需环境 Environment 文件下载 Download 训练步骤 How2train 预测步骤 How2predict 评估步骤 How2eval 参考资料 Reference 性能情况 训练数

Bubbliiiing 11 Dec 26, 2022
This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Intro This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales Vehicle Sam

39 Jul 21, 2022
Explore the Expression: Facial Expression Generation using Auxiliary Classifier Generative Adversarial Network

Explore the Expression: Facial Expression Generation using Auxiliary Classifier Generative Adversarial Network This is the official implementation of

azad 2 Jul 09, 2022
Generic ecosystem for feature extraction from aerial and satellite imagery

Note: Robosat is neither maintained not actively developed any longer by Mapbox. See this issue. The main developers (@daniel-j-h, @bkowshik) are no l

Mapbox 1.9k Jan 06, 2023
Library for fast text representation and classification.

fastText fastText is a library for efficient learning of word representations and sentence classification. Table of contents Resources Models Suppleme

Facebook Research 24.1k Jan 01, 2023
Self-Supervised Learning for Domain Adaptation on Point-Clouds

Self-Supervised Learning for Domain Adaptation on Point-Clouds Introduction Self-supervised learning (SSL) allows to learn useful representations from

Idan Achituve 66 Dec 20, 2022
MediaPipeのPythonパッケージのサンプルです。2020/12/11時点でPython実装のある4機能(Hands、Pose、Face Mesh、Holistic)について用意しています。

mediapipe-python-sample MediaPipeのPythonパッケージのサンプルです。 2020/12/11時点でPython実装のある以下4機能について用意しています。 Hands Pose Face Mesh Holistic Requirement mediapipe 0.

KazuhitoTakahashi 217 Dec 12, 2022
Educational 2D SLAM implementation based on ICP and Pose Graph

slam-playground Educational 2D SLAM implementation based on ICP and Pose Graph How to use: Use keyboard arrow keys to navigate robot. Press 'r' to vie

Kirill 19 Dec 17, 2022
Do Smart Glasses Dream of Sentimental Visions? Deep Emotionship Analysis for Eyewear Devices

EMOShip This repository contains the EMO-Film dataset described in the paper "Do Smart Glasses Dream of Sentimental Visions? Deep Emotionship Analysis

1 Nov 18, 2022