
Overview


What is it?

Inferoxy is a service for quickly deploying and using dockerized Computer Vision models. It's the core of EORA's Computer Vision platform, Vision Hub, which runs on top of AWS EKS.

Why use it?

You should use it if:

  • You want to simplify deploying Computer Vision models with an appropriate Data Science stack to production: all you need to do is build a Docker image with your model, including any pre- and post-processing steps, and push it to an accessible registry
  • You have only one machine or cluster for inference (CPU/GPU)
  • You want automatic batching for multi-GPU/multi-node setup
  • You need model versioning

Architecture


Inferoxy is built around a message broker pattern.

  • Roughly speaking, it accepts user requests through different interfaces which we call "bridges". Multiple bridges can run simultaneously. Currently supported bridges are REST API, gRPC, and ZeroMQ
  • The requests are carefully split into batches and processed on a single multi-GPU machine or a multi-node cluster
  • The models to be deployed are managed through the Model Manager, which communicates with Redis to store/retrieve model information such as the Docker image URL, the maximum batch size, etc. (a conceptual sketch of such a record follows this list)
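
As an illustration of the kind of model information kept in Redis, the sketch below writes and reads back a single model record with redis-py. The key name and field names here are assumptions made for this example, not Inferoxy's actual schema.

# Conceptual sketch only: the key and field names are assumptions,
# not Inferoxy's actual Redis schema.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Store metadata for a model under a hypothetical key
r.hset(
    "model:stub",
    mapping={
        "address": "public.registry.visionhub.ru/models/stub:v5",
        "batch_size": 256,
        "stateless": 1,
    },
)

# Read it back the way a model manager might
info = {k.decode(): v.decode() for k, v in r.hgetall("model:stub").items()}
print(info["address"], int(info["batch_size"]))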

Batching


One of Inferoxy's core features is its batching mechanism.

  • Batch processing takes into account that different models can use different batch sizes, and that some models process a series of batches from a specific user, e.g. for video processing tasks. The latter are called "stateful" models, while models that don't depend on user state are called "stateless"
  • Multiple copies of the same model can run on different machines, while only one copy can run on a given GPU device. So, to increase model efficiency, it's recommended to set the batch size as high as possible (a conceptual sketch of batch grouping follows this list)
  • A user of a stateful model reserves a whole copy of the model and releases it when their task is finished
  • Users of stateless models can share the same copy of the model simultaneously
  • NumPy tensors of RGB images, together with metadata, are sent to the models over ZeroMQ, and the results are read back from a ZeroMQ socket
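
To make the batching idea concrete, here is a minimal, Inferoxy-independent sketch of how incoming requests could be grouped into batches that never exceed a model's maximum batch size. It illustrates the concept only and is not the actual scheduler code.

from typing import Any, Iterable, Iterator


def make_batches(requests: Iterable[Any], max_batch_size: int) -> Iterator[list]:
    """Group incoming requests into batches of at most max_batch_size items."""
    batch = []
    for request in requests:
        batch.append(request)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:  # flush the last, possibly smaller, batch
        yield batch


# Example: 10 requests for a model with max batch size 4 -> batches of 4, 4 and 2
for batch in make_batches(range(10), max_batch_size=4):
    print(len(batch))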

Cluster management


Cluster management consists of keeping track of the running copies of the models, load analysis, health checking, and alerting.

Requirements

You can run Inferoxy locally on a single machine or on a Kubernetes (k8s) cluster. To run Inferoxy, you need at least 4 GB of RAM and a CPU or GPU device, depending on your speed/cost trade-off.

Basic commands

Local run

To run locally, use the Inferoxy Docker image. You can find the latest version here.

docker pull public.registry.visionhub.ru/inferoxy:v1.0.4

After the image is pulled, we need to create a basic configuration using a .env file:

# .env
CLOUD_CLIENT=docker
TASK_MANAGER_DOCKER_CONFIG_NETWORK=inferoxy
TASK_MANAGER_DOCKER_CONFIG_REGISTRY=
TASK_MANAGER_DOCKER_CONFIG_LOGIN=
TASK_MANAGER_DOCKER_CONFIG_PASSWORD=
MODEL_STORAGE_DATABASE_HOST=redis
MODEL_STORAGE_DATABASE_PORT=6379
MODEL_STORAGE_DATABASE_NUMBER=0
LOGGING_LEVEL=INFO

The next step is to create the inferoxy Docker network:

docker network create inferoxy

Now we should run Redis in this network. Redis is needed to store information about your models (add the -d flag if you want it to run in the background).

docker run --network inferoxy --name redis redis:latest 

Create a models.yaml file with a simple set of models. You can read more about models.yaml in the documentation:

stub:
  address: public.registry.visionhub.ru/models/stub:v5
  batch_size: 256
  run_on_gpu: False
  stateless: True

Now we can start Inferoxy (set INFEROXY_VERSION to the version you pulled, e.g. v1.0.4):

docker run --env-file .env \
	-v /var/run/docker.sock:/var/run/docker.sock \
	-p 7787:7787 -p 7788:7788 -p 8000:8000 -p 8698:8698 \
	--name inferoxy --rm \
	--network inferoxy \
	-v $(pwd)/models.yaml:/etc/inferoxy/models.yaml \
	public.registry.visionhub.ru/inferoxy:${INFEROXY_VERSION}
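
Once the container is up, you can send a request to the stub model through one of the bridges. The sketch below goes through the REST bridge using the requests library; the port, endpoint path, and payload layout are assumptions made for illustration, so check the documentation for the exact REST API.

# Hypothetical client sketch: the port, path and payload format below are
# assumptions, not the documented Inferoxy REST API.
import base64

import requests

with open("image.jpg", "rb") as f:
    payload = {
        "model": "stub",
        "image": base64.b64encode(f.read()).decode(),
    }

response = requests.post("http://localhost:8000/infer", json=payload, timeout=30)
response.raise_for_status()
print(response.json())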

Documentation

You can find the full documentation here.

Discord

Join our community on the Discord server to discuss Inferoxy usage and development.
