The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it inside a loop of Design, Model Development and Operations.

Overview

GitHub Contributors Image

MLOps

The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it inside a loop of Design, Model Development and Operations.

In this paradigm, teams can easily collaborate in models, with clear tracking of the data throughout the process of cleaning, processing, and feature creation. Automating every repetitive process avoids human error and reduces the delivery time, ensuring the team keeps focusing on the Business Problem.

Some benefits:

  • Versioning data and code, making models to be auditable and reproducible.

  • Automated tests and building ensuring quality functioning of artifacts and availability for the delivery pipelines.

  • Makes it easier and faster the deployment of new models by using an automated cycle.

The MLOps Project

The MLOps project is a path to learning how to implement a study case aiming to be testable and reproducible within the CI/CD methodology, using the best programming practices.

The scope of this project is delimited as you can see in the image below.

We will select the best tool to implement every step, integrate them, and build a Machine Learning Orchestrator. That said, in the end, new ML experiments will be easily made, and delivered as simples as typing a terminal command or clicking on a button!


Prerequisites

For mlops_project to work correctly, first, you should install the prerequisites

Contributing

Have an idea of how to improve this project but don't know how to start, try to contribute

You can understand the project organization here

How to use?

If you are interested just in using this package, follow the steps below.

  1. Clone the repository

    Open a terminal (if you are using Windows, make sure of using the git bash) navigate to the desired destination folder and clone the repository,

    git clone https://github.com/Schots/mlops_project.git

    The Makefile on the root folder defines a set of functions needed to automate repetitive processes in this project. Type "make" in the terminal and see the available functions.


  1. Create an environment & Install requirements

    Create a Python virtual environment for the MLOps project on your local machine. Use any tool you desire. Activate the environment and install the requirements using make:

    make requirements
  2. Download data

    To download the raw dataset, use the get_data

    make get_data

    type the dataset name when prompted. The zip file with data will be downloaded and unzipped under the data/raw folder


Project based on the cookiecutter data science project template. #cookiecutterdatascience

Owner
Maykon Schots
Maykon Schots
Applied Machine Learning for Graduate Program in Computer Science (PPGCC)

Applied Machine Learning for Graduate Program in Computer Science (PPGCC) - Federal University of Santa Catarina

Jônatas Negri Grandini 1 Dec 22, 2021
Banpei is a Python package of the anomaly detection.

Banpei Banpei is a Python package of the anomaly detection. Anomaly detection is a technique used to identify unusual patterns that do not conform to

Hirofumi Tsuruta 282 Jan 03, 2023
Implementations of Machine Learning models, Regularizers, Optimizers and different Cost functions.

Linear Models Implementations of LinearRegression, LassoRegression and RidgeRegression with appropriate Regularizers and Optimizers. Linear Regression

Keivan Ipchi Hagh 1 Nov 22, 2021
Python library for multilinear algebra and tensor factorizations

scikit-tensor is a Python module for multilinear algebra and tensor factorizations

Maximilian Nickel 394 Dec 09, 2022
Machine-care - A simple python script to take care of simple maintenance tasks

Machine care An simple python script to take care of simple maintenance tasks fo

2 Jul 10, 2022
A Pythonic framework for threat modeling

pytm: A Pythonic framework for threat modeling Introduction Traditional threat modeling too often comes late to the party, or sometimes not at all. In

Izar Tarandach 644 Dec 20, 2022
Retrieve annotated intron sequences and classify them as minor (U12-type) or major (U2-type)

(intron I nterrogator and C lassifier) intronIC is a program that can be used to classify intron sequences as minor (U12-type) or major (U2-type), usi

Graham Larue 4 Jul 26, 2022
A Python package to preprocess time series

Disclaimer: This package is WIP. Do not take any APIs for granted. tspreprocess Time series can contain noise, may be sampled under a non fitting rate

Maximilian Christ 57 Dec 17, 2022
A Time Series Library for Apache Spark

Flint: A Time Series Library for Apache Spark The ability to analyze time series data at scale is critical for the success of finance and IoT applicat

Two Sigma 970 Jan 04, 2023
虚拟货币(BTC、ETH)炒币量化系统项目。在一版本的基础上加入了趋势判断

🎉 第二版本 🎉 (现货趋势网格) 介绍 在第一版本的基础上 趋势判断,不在固定点位开单,选择更优的开仓点位 优势: 🎉 简单易上手 安全(不用将api_secret告诉他人) 如何启动 修改app目录下的authorization文件

幸福村的码农 250 Jan 07, 2023
Customers Segmentation with RFM Scores and K-means

Customer Segmentation with RFM Scores and K-means RFM Segmentation table: K-Means Clustering: Business Problem Rule-based customer segmentation machin

5 Aug 10, 2022
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

Seldon Core: Blazing Fast, Industry-Ready ML An open source platform to deploy your machine learning models on Kubernetes at massive scale. Overview S

Seldon 3.5k Jan 01, 2023
A Python step-by-step primer for Machine Learning and Optimization

early-ML Presentation General Machine Learning tutorials A Python step-by-step primer for Machine Learning and Optimization This github repository gat

Dimitri Bettebghor 8 Dec 01, 2022
A collection of neat and practical data science and machine learning projects

Data Science A collection of neat and practical data science and machine learning projects Explore the docs » Report Bug · Request Feature Table of Co

Will Fong 2 Dec 10, 2021
Greykite: A flexible, intuitive and fast forecasting library

The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.

LinkedIn 1.4k Jan 15, 2022
Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models.

Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow scikit-learn's functionality wit

Soledad Galli 33 Dec 27, 2022
Bayesian optimization in JAX

Bayesian optimization in JAX

Predictive Intelligence Lab 26 May 11, 2022
[HELP REQUESTED] Generalized Additive Models in Python

pyGAM Generalized Additive Models in Python. Documentation Official pyGAM Documentation: Read the Docs Building interpretable models with Generalized

daniel servén 747 Jan 05, 2023
Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray

A unified Data Analytics and AI platform for distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray What is Analytics Zoo? Analytics Zo

2.5k Dec 28, 2022
Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máquina.

Estatistica para Ciência de Dados e Machine Learning Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máqui

Renan Barbosa 1 Jan 10, 2022