PyTorch Infer Utils

This package simplifies exporting PyTorch models to ONNX and TensorRT, and provides base interfaces for model inference.

To install

git clone https://github.com/gorodnitskiy/pytorch_infer_utils.git
pip install /path/to/pytorch_infer_utils/

Export PyTorch model to ONNX

  • Check the model for denormal weights to improve performance. Use the load_weights_rounded_model function to load the model with its weights rounded:
    from pytorch_infer_utils import load_weights_rounded_model
    
    model = ModelClass()  # your torch.nn.Module subclass
    load_weights_rounded_model(
        model,
        "/path/to/model_state_dict",
        map_location=map_location,  # e.g. "cpu"
    )
    
  • Use the ONNXExporter.torch2onnx method to export a PyTorch model to ONNX:
    import torch
    from pytorch_infer_utils import ONNXExporter
    
    model = ModelClass()  # your torch.nn.Module subclass
    model.load_state_dict(
        torch.load("/path/to/model_state_dict", map_location=map_location)
    )
    model.eval()  # switch to inference mode before exporting
    
    exporter = ONNXExporter()
    input_shapes = [-1, 3, 224, 224]  # -1 marks a dynamic dimension
    exporter.torch2onnx(model, "/path/to/model.onnx", input_shapes)
    
  • Use the ONNXExporter.optimize_onnx method to optimize the ONNX model with onnxoptimizer:
    from pytorch_infer_utils import ONNXExporter
    
    exporter = ONNXExporter()
    exporter.optimize_onnx("/path/to/model.onnx", "/path/to/optimized_model.onnx")
    
  • Use the ONNXExporter.optimize_onnx_sim method to optimize the ONNX model with onnx-simplifier. Take care not to lose dynamic shapes when using onnx-simplifier:
    from pytorch_infer_utils import ONNXExporter
    
    exporter = ONNXExporter()
    exporter.optimize_onnx_sim("/path/to/model.onnx", "/path/to/optimized_model.onnx")
    
  • ONNXExporter.torch2optimized_onnx combines the methods above in a single call:
    import torch
    from pytorch_infer_utils import ONNXExporter
    
    model = ModelClass()  # your torch.nn.Module subclass
    model.load_state_dict(
        torch.load("/path/to/model_state_dict", map_location=map_location)
    )
    model.eval()  # switch to inference mode before exporting
    
    exporter = ONNXExporter()
    input_shapes = [-1, 3, -1, -1]  # -1 marks a dynamic dimension
    exporter.torch2optimized_onnx(model, "/path/to/model.onnx", input_shapes)
    
  • Other parameters accepted by the class constructor:
    • default_shapes: default shapes if dimension is dynamic, default = [1, 3, 224, 224]
    • onnx_export_params:
      • export_params: store the trained parameter weights inside the model file, default = True
      • do_constant_folding: whether to execute constant folding for optimization, default = True
      • input_names: the model's input names, default = ["input"]
      • output_names: the model's output names, default = ["output"]
      • opset_version: the ONNX opset version to export the model to, default = 11
    • onnx_optimize_params:
      • fixed_point: apply the passes repeatedly until the model stops changing, default = False
      • passes: optimization passes, default = [ "eliminate_deadend", "eliminate_duplicate_initializer", "eliminate_identity", "eliminate_if_with_const_cond", "eliminate_nop_cast", "eliminate_nop_dropout", "eliminate_nop_flatten", "eliminate_nop_monotone_argmax", "eliminate_nop_pad", "eliminate_nop_transpose", "eliminate_unused_initializer", "extract_constant_to_initializer", "fuse_add_bias_into_conv", "fuse_bn_into_conv", "fuse_consecutive_concats", "fuse_consecutive_log_softmax", "fuse_consecutive_reduce_unsqueeze", "fuse_consecutive_squeezes", "fuse_consecutive_transposes", "fuse_matmul_add_bias_into_gemm", "fuse_pad_into_conv", "fuse_transpose_into_gemm", "lift_lexical_references", "nop" ]
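
As a rough illustration of how the -1 entries in input_shapes could be combined with default_shapes, the sketch below replaces each dynamic dimension with its default to get a concrete shape for a dummy export input, and lists which axes are dynamic. Both helpers are hypothetical and not part of pytorch_infer_utils; the package's internal logic may differ.

```python
from typing import List, Sequence


def resolve_dynamic_shape(
    input_shape: Sequence[int],
    default_shape: Sequence[int],
) -> List[int]:
    """Replace every -1 (dynamic) dimension with the matching default size.

    Hypothetical helper for illustration only; not part of pytorch_infer_utils.
    """
    if len(input_shape) != len(default_shape):
        raise ValueError("input_shape and default_shape must have the same rank")
    return [
        default if dim == -1 else dim
        for dim, default in zip(input_shape, default_shape)
    ]


def dynamic_axes(input_shape: Sequence[int]) -> List[int]:
    """Return the indices of the dimensions marked as dynamic (-1)."""
    return [axis for axis, dim in enumerate(input_shape) if dim == -1]


concrete = resolve_dynamic_shape([-1, 3, -1, -1], [1, 3, 224, 224])
axes = dynamic_axes([-1, 3, -1, -1])
print(concrete, axes)
```

Axis indices of this kind are what torch.onnx.export accepts in its dynamic_axes argument, keyed by input name.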

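Denormal (subnormal) values are so close to zero that many CPUs process them on a much slower path, which is why rounding them away can speed up inference. The sketch below shows the idea on plain Python floats: any magnitude below the smallest normal float32 value is set to zero. The helper flush_denormals is hypothetical; how load_weights_rounded_model actually rounds weights is not shown here and may differ.

```python
from typing import Iterable, List

# Smallest positive normal float32 value (2**-126); nonzero magnitudes below it are denormal.
FLOAT32_MIN_NORMAL = 2.0 ** -126


def flush_denormals(weights: Iterable[float]) -> List[float]:
    """Set float32-denormal magnitudes to exactly 0.0 and leave other values unchanged.

    Hypothetical helper for illustration only; not part of pytorch_infer_utils.
    """
    return [
        0.0 if 0.0 < abs(w) < FLOAT32_MIN_NORMAL else w
        for w in weights
    ]


weights = [0.5, 1e-40, -3e-42, 0.0, -2.0]
cleaned = flush_denormals(weights)
print(cleaned)
```
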
Export ONNX to TensorRT

  • Check that TensorRT works with the check_tensorrt_health function.
  • Use the TRTEngineBuilder.build_engine method to build a TensorRT engine from an ONNX model:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    # return the engine object
    engine = exporter.build_engine("/path/to/model.onnx")
    # or also save the engine to /path/to/model.trt
    exporter.build_engine("/path/to/model.onnx", engine_path="/path/to/model.trt")
    
  • fp16_mode is available:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    engine = exporter.build_engine("/path/to/model.onnx", fp16_mode=True)
    
  • int8_mode is available. It requires calibration_set (a set of images as List[Any]), load_image_func (a function that reads and preprocesses an image), and max_image_shape (the maximum image size as [C, H, W], used to allocate enough memory):
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    engine = exporter.build_engine(
        "/path/to/model.onnx",
        int8_mode=True,
        calibration_set=calibration_set,
        max_image_shape=max_image_shape,
        load_image_func=load_image_func,
    )
    
  • Additional parameters for the builder config (builder.create_builder_config) can be passed as kwargs.
  • Other parameters accepted by the class constructor:
    • opt_shape_dict: optimal shapes, default = {'input': [[1, 3, 224, 224], [1, 3, 224, 224], [1, 3, 224, 224]]}
    • max_workspace_size: max workspace size, default = [1, 30]
    • stream_batch_size: batch size for forward passes during int8 calibration, default = 100
    • cache_file: int8_mode cache filename, default = "model.trt.int8calibration"
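
To make the role of stream_batch_size more concrete, the sketch below splits a calibration set into consecutive batches of at most that size, which is roughly how a calibrator would feed images through the network. The helper iter_calibration_batches and the placeholder file names are hypothetical; the package's own calibrator is not shown in this README and may batch differently.

```python
from typing import Any, Iterator, List, Sequence


def iter_calibration_batches(
    calibration_set: Sequence[Any],
    stream_batch_size: int = 100,
) -> Iterator[List[Any]]:
    """Yield consecutive chunks of at most stream_batch_size items.

    Hypothetical helper for illustration only; not part of pytorch_infer_utils.
    """
    if stream_batch_size <= 0:
        raise ValueError("stream_batch_size must be positive")
    for start in range(0, len(calibration_set), stream_batch_size):
        yield list(calibration_set[start:start + stream_batch_size])


# Placeholder file names stand in for real calibration images.
calibration_set = [f"image_{i}.jpg" for i in range(250)]
batch_sizes = [len(batch) for batch in iter_calibration_batches(calibration_set)]
print(batch_sizes)
```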

Inference via onnxruntime on CPU and onnx_tensorrt on GPU

  • The __init__ method of the ONNXWrapper base class is structured as follows:
    def __init__(
        self,
        onnx_path: str,
        gpu_device_id: Optional[int] = None,
        intra_op_num_threads: Optional[int] = 0,
        inter_op_num_threads: Optional[int] = 0,
    ) -> None:
        """
        :param onnx_path: path to the ONNX file, required
        :param gpu_device_id: GPU device id to use; None runs on CPU
            via onnxruntime, default = None
        :param intra_op_num_threads: ort_session_options.intra_op_num_threads;
            set to 0 to let onnxruntime choose, default = 0
        :param inter_op_num_threads: ort_session_options.inter_op_num_threads;
            set to 0 to let onnxruntime choose, default = 0
        :type onnx_path: str
        :type gpu_device_id: int
        :type intra_op_num_threads: int
        :type inter_op_num_threads: int
        """
        if gpu_device_id is None:
            import onnxruntime
    
            self.is_using_tensorrt = False
            ort_session_options = onnxruntime.SessionOptions()
            ort_session_options.intra_op_num_threads = intra_op_num_threads
            ort_session_options.inter_op_num_threads = inter_op_num_threads
            self.ort_session = onnxruntime.InferenceSession(
                onnx_path, ort_session_options
            )
    
        else:
            import onnx
            import onnx_tensorrt.backend as backend
    
            self.is_using_tensorrt = True
            model_proto = onnx.load(onnx_path)
            for gr_input in model_proto.graph.input:
                gr_input.type.tensor_type.shape.dim[0].dim_value = 1
    
            self.engine = backend.prepare(
                model_proto, device=f"CUDA:{gpu_device_id}"
            )
    
  • The ONNXWrapper.run method is expected to follow this structure:
    img = self._process_img_(img)
    if self.is_using_tensorrt:
        preds = self.engine.run(img)
    else:
        ort_inputs = {self.ort_session.get_inputs()[0].name: img}
        preds = self.ort_session.run(None, ort_inputs)
    
    preds = self._process_preds_(preds)
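
The run structure above suggests that subclasses supply their own _process_img_ and _process_preds_ hooks. The sketch below shows that pattern with stand-in classes so it runs without onnxruntime or a GPU: BaseWrapper, FakeSession, and ThresholdClassifier are all hypothetical and only mirror the preprocess, infer, postprocess flow. They are not the package's ONNXWrapper.

```python
from typing import Any, List, Sequence


class FakeSession:
    """Stand-in for an inference session; returns the sum of its inputs as one logit."""

    def run(self, inputs: Sequence[float]) -> List[float]:
        return [sum(inputs)]


class BaseWrapper:
    """Hypothetical base class mirroring the flow: preprocess, infer, postprocess."""

    def __init__(self) -> None:
        self.session = FakeSession()

    def _process_img_(self, img: Any) -> Any:
        raise NotImplementedError

    def _process_preds_(self, preds: Any) -> Any:
        raise NotImplementedError

    def run(self, img: Any) -> Any:
        img = self._process_img_(img)
        preds = self.session.run(img)
        return self._process_preds_(preds)


class ThresholdClassifier(BaseWrapper):
    """Example subclass: scales pixel values to [0, 1] and thresholds the output."""

    def _process_img_(self, img: Sequence[int]) -> List[float]:
        return [pixel / 255.0 for pixel in img]

    def _process_preds_(self, preds: Sequence[float]) -> bool:
        return preds[0] > 1.0


result = ThresholdClassifier().run([255, 255, 0])
print(result)
```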
    

Inference via onnxruntime on CPU and TensorRT on GPU

  • The __init__ method of the TRTWrapper base class is structured as follows:
    def __init__(
        self,
        onnx_path: Optional[str] = None,
        trt_path: Optional[str] = None,
        gpu_device_id: Optional[int] = None,
        intra_op_num_threads: Optional[int] = 0,
        inter_op_num_threads: Optional[int] = 0,
        fp16_mode: bool = False,
    ) -> None:
        """
        :param onnx_path: path to the ONNX file, default = None
        :param trt_path: path to the TensorRT engine file, default = None
        :param gpu_device_id: GPU device id to use; None runs on CPU
            via onnxruntime, default = None
        :param intra_op_num_threads: ort_session_options.intra_op_num_threads;
            set to 0 to let onnxruntime choose, default = 0
        :param inter_op_num_threads: ort_session_options.inter_op_num_threads;
            set to 0 to let onnxruntime choose, default = 0
        :param fp16_mode: build the engine in FP16 when the class is
            initialized on GPU with only onnx_path, default = False
        :type onnx_path: str
        :type trt_path: str
        :type gpu_device_id: int
        :type intra_op_num_threads: int
        :type inter_op_num_threads: int
        :type fp16_mode: bool
        """
        if gpu_device_id is None:
            import onnxruntime
    
            self.is_using_tensorrt = False
            ort_session_options = onnxruntime.SessionOptions()
            ort_session_options.intra_op_num_threads = intra_op_num_threads
            ort_session_options.inter_op_num_threads = inter_op_num_threads
            self.ort_session = onnxruntime.InferenceSession(
                onnx_path, ort_session_options
            )
    
        else:
            self.is_using_tensorrt = True
            if trt_path is None:
                builder = TRTEngineBuilder()
                trt_path = builder.build_engine(onnx_path, fp16_mode=fp16_mode)
    
            self.trt_session = TRTRunWrapper(trt_path)
    
  • The TRTWrapper.run method is expected to follow this structure:
    img = self._process_img_(img)
    if self.is_using_tensorrt:
        preds = self.trt_session.run(img)
    else:
        ort_inputs = {self.ort_session.get_inputs()[0].name: img}
        preds = self.ort_session.run(None, ort_inputs)
    
    preds = self._process_preds_(preds)
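
The branching in the TRTWrapper constructor can be summarized as a small decision function. The sketch below is a hypothetical helper, not part of the package: it only reports which branch would run for a given combination of arguments. The explicit ValueError checks are an added assumption; the constructor shown above does not validate its arguments this way.

```python
from typing import Optional


def select_backend(
    onnx_path: Optional[str] = None,
    trt_path: Optional[str] = None,
    gpu_device_id: Optional[int] = None,
) -> str:
    """Describe which constructor branch the given arguments would take."""
    if gpu_device_id is None:
        # CPU branch: only the ONNX file is used.
        if onnx_path is None:
            raise ValueError("CPU inference needs onnx_path")  # assumed check
        return "onnxruntime session on CPU"
    if trt_path is not None:
        # GPU branch with an engine that was built beforehand.
        return "TensorRT session from a prebuilt engine"
    if onnx_path is None:
        raise ValueError("GPU inference needs trt_path or onnx_path")  # assumed check
    # GPU branch that first builds the engine from the ONNX file.
    return "TensorRT session from an engine built from ONNX"


cpu_choice = select_backend(onnx_path="/path/to/model.onnx")
prebuilt_choice = select_backend(trt_path="/path/to/model.trt", gpu_device_id=0)
built_choice = select_backend(onnx_path="/path/to/model.onnx", gpu_device_id=0)
print(cpu_choice, prebuilt_choice, built_choice, sep="\n")
```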
    

Environment

TensorRT

  • See NVIDIA's TensorRT installation guide for setup instructions.
  • Requires the CUDA runtime and the CUDA Toolkit.
  • Also requires additional Python packages that are not listed in setup.cfg (the versions depend on the CUDA environment):
    • pycuda
    • nvidia-tensorrt
    • nvidia-pyindex
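
Because these packages are installed separately, it can help to check which ones the current interpreter can see before building engines. The sketch below uses the standard library's importlib for that. The helper find_available is hypothetical, and the module names are the usual import names (nvidia-tensorrt is imported as tensorrt); nvidia-pyindex is left out because it only configures pip's package index and is not imported.

```python
import importlib.util
from typing import Dict, Iterable


def find_available(modules: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be imported here."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in modules
    }


status = find_available(["onnx", "onnxruntime", "tensorrt", "pycuda"])
for name, available in status.items():
    print(f"{name}: {'found' if available else 'missing'}")
```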

onnx_tensorrt

  • onnx_tensorrt requires the CUDA runtime and TensorRT.
  • To install:
    git clone --depth 1 --branch 21.02 https://github.com/onnx/onnx-tensorrt.git
    cd onnx-tensorrt
    # adjust the target directory to match your Python version and environment
    cp -r onnx_tensorrt /usr/local/lib/python3.8/dist-packages
    cd ..
    rm -rf onnx-tensorrt
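
The copy target above is hard-coded for Python 3.8. As a hedged alternative, the directory for the active interpreter can be looked up with the standard library. The helper site_packages_dir is hypothetical, and the exact directory it reports depends on the distribution and on whether a virtual environment is active.

```python
import sysconfig


def site_packages_dir() -> str:
    """Return the directory where pure-Python packages are installed for this interpreter."""
    return sysconfig.get_paths()["purelib"]


target_dir = site_packages_dir()
print(target_dir)
```

The printed path can then be used as the destination of the cp command instead of the fixed python3.8 path.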
    
Owner
Alex Gorodnitskiy
Computer Vision Engineer 🤖