Serving PyTorch 1.0 Models as a Web Server in C++

Last update: Jan 04, 2023

Overview

Serving PyTorch Models in C++

This repository contains examples of performing inference with the PyTorch C++ API.
Clone it with: git clone https://github.com/Wizaron/pytorch-cpp-inference

Environment

Dockerfiles are in the docker directory: one for CPU and one for CUDA 10. To build a Docker image, go to the docker/cpu or docker/cuda10 directory and run docker build -t <docker-image-name> . (the trailing dot is the build context).
After building the image, create a container with docker run -v <directory-that-this-repository-resides>:<target-directory-in-docker-container> -p 8181:8181 -it <docker-image-name> (port 8181 is used to serve the PyTorch C++ model).
Inside the container, go to the directory where this repository is mounted.
Download libtorch from the PyTorch website (CPU: https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip - CUDA 10: https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.3.1.zip).
Unzip the archive with unzip. This creates a libtorch directory that contains the torch shared libraries and headers.
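The download and unzip steps can also be scripted. The following is a minimal sketch using only the Python standard library; the URLs are the ones listed above, while the function names and the local file name libtorch.zip are hypothetical and not part of this repository.

```python
import urllib.request
import zipfile
from pathlib import Path

# URLs copied from the instructions above (libtorch 1.3.1).
LIBTORCH_URLS = {
    "cpu": "https://download.pytorch.org/libtorch/cpu/"
           "libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip",
    "cuda10": "https://download.pytorch.org/libtorch/cu101/"
              "libtorch-cxx11-abi-shared-with-deps-1.3.1.zip",
}


def libtorch_url(variant: str) -> str:
    """Return the libtorch archive URL for 'cpu' or 'cuda10'."""
    try:
        return LIBTORCH_URLS[variant]
    except KeyError:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(LIBTORCH_URLS)}"
        ) from None


def fetch_libtorch(variant: str, dest: Path = Path(".")) -> Path:
    """Download the archive into dest, extract it, and return the libtorch directory."""
    archive = dest / "libtorch.zip"  # hypothetical local file name
    urllib.request.urlretrieve(libtorch_url(variant), archive)
    with zipfile.ZipFile(archive) as zf:
        # Per the instructions above, the archive unpacks into a libtorch directory.
        zf.extractall(dest)
    return dest / "libtorch"
```

Calling fetch_libtorch("cpu") from the repository root should leave a libtorch directory next to models and utils. The archives are large, so the unzip command described above remains the simpler route for a one-off setup.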

Code Structure

The models directory stores PyTorch models.
The libtorch directory stores the C++ torch headers and shared libraries used to link the inference code against PyTorch.
The utils directory stores utility functions for performing inference in C++.
The inference-cpp directory stores the inference code.

Exporting PyTorch ScriptModule

To export a torch.jit.ScriptModule of ResNet18 for C++ inference, go to the models/resnet directory and run python3 resnet.py. The script downloads a ResNet18 model pretrained on ImageNet and creates models/resnet/resnet_model_cpu.pth and (optionally) models/resnet/resnet_model_gpu.pth, which are used for C++ inference.
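For readers who want to see what such an export might look like, here is a minimal sketch. It assumes the model is exported by tracing with a 1x3x224x224 example input; the repository's resnet.py may do this differently. The helper names export_path and export_resnet18 are hypothetical. The pretrained=True argument matches the torchvision releases contemporary with libtorch 1.3.1 and is deprecated in newer releases.

```python
from pathlib import Path


def export_path(out_dir: Path, device: str) -> Path:
    """Build the output file name: resnet_model_cpu.pth or resnet_model_gpu.pth."""
    if device not in ("cpu", "gpu"):
        raise ValueError("device must be 'cpu' or 'gpu'")
    return out_dir / f"resnet_model_{device}.pth"


def export_resnet18(out_dir: Path, device: str = "cpu") -> Path:
    """Trace a pretrained ResNet18 and save it as a TorchScript module."""
    # Imported lazily so export_path stays usable without PyTorch installed.
    import torch
    import torchvision

    torch_device = torch.device("cuda" if device == "gpu" else "cpu")
    model = torchvision.models.resnet18(pretrained=True).to(torch_device).eval()

    # Assumption: a single 3x224x224 image, the usual ImageNet input size.
    example = torch.rand(1, 3, 224, 224, device=torch_device)
    with torch.no_grad():
        traced = torch.jit.trace(model, example)

    path = export_path(out_dir, device)
    traced.save(str(path))
    return path
```

A file saved this way can be loaded on the C++ side with torch::jit::load. Exporting separate CPU and GPU files keeps the device placement fixed at export time, which is one plausible reason for the two file names.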

Serving the C++ Model

We can serve the model either as a single executable or as a web server.

Single Executable

To build a single executable for inference:
1. Go to the inference-cpp/cnn-classification directory.
2. Run ./build.sh to build the executable, named predict.
3. Run the executable with ./predict <path-to-image> <path-to-exported-script-module> <path-to-labels-file> <gpu-flag{true/false}>.
4. Example: ./predict image.jpeg ../../models/resnet/resnet_model_cpu.pth ../../models/resnet/labels.txt false
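If the executable needs to be called from a script, for example to classify a batch of images, the argument order above can be wrapped in a small helper. This is a sketch only: predict_command and run_predict are hypothetical names, and the format of the program's output is not specified here, so the text is returned unparsed.

```python
import subprocess
from typing import List


def predict_command(image: str, model: str, labels: str, use_gpu: bool,
                    binary: str = "./predict") -> List[str]:
    """Assemble the arguments in the order listed above: image, model, labels, GPU flag."""
    return [binary, image, model, labels, "true" if use_gpu else "false"]


def run_predict(image: str, model: str, labels: str, use_gpu: bool = False) -> str:
    """Run the executable and return whatever it prints to standard output."""
    result = subprocess.run(
        predict_command(image, model, labels, use_gpu),
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with image paths that contain spaces.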

Web Server

To build a web server for production:
1. Go to the inference-cpp/cnn-classification/server directory.
2. Run ./build.sh to build the web server, named predict.
3. Run the binary with ./predict <path-to-exported-script-module> <path-to-labels-file> <gpu-flag{true/false}> (it serves the model at http://localhost:8181/predict).
4. Example: ./predict ../../../models/resnet/resnet_model_cpu.pth ../../../models/resnet/labels.txt false
5. To make a request, open a new terminal and run python test_api.py (it sends a request to localhost:8181/predict).
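A client for the endpoint could look roughly like the sketch below, written with the standard library only. The request format is an assumption: it posts JSON with a base64-encoded image under an "image" key, which may not match what the server or test_api.py actually uses, so check test_api.py before relying on it. The function names are hypothetical.

```python
import base64
import json
import urllib.request

PREDICT_URL = "http://localhost:8181/predict"  # endpoint from the steps above


def build_payload(image_bytes: bytes) -> bytes:
    """Encode an image as a JSON body. The {"image": <base64>} shape is an assumption."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def request_prediction(image_path: str, url: str = PREDICT_URL,
                       timeout: float = 10.0) -> str:
    """POST the image to the server and return the raw response body as text."""
    with open(image_path, "rb") as f:
        body = build_payload(f.read())
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, request_prediction("image.jpeg") would return the response body as a string; the response is left unparsed because its structure is not described here.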

Owner

Onur Kaplan
