Model Training as a CI/CD System

This project demonstrates machine learning model training as a CI/CD system on Google Cloud Platform (GCP). The sections below describe the workflows in detail. In short, the currently deployed machine learning pipeline is rebuilt and redeployed (continuous integration) whenever the code changes. Such changes can occur in the training data, the data preprocessing logic, the model architecture and training code, custom pipeline components, and so on.

Workflow #1

1. We write the initial code, or make changes to the existing pipeline codebase.
2. The changes from step 1 trigger a GitHub Action, which starts a Cloud Build process.
3. Cloud Build runs unit tests to check that the components work without errors.
4. If the tests pass, there are two common sub-workflows from this point:
   - Cloud Build containerizes the current codebase. This step is optional: if no custom components have changed, it can be skipped.
     - Cloud Build compiles a new pipeline, builds an updated Docker image, and uploads the image to GCR.
   - If code in the data preprocessing, modeling, or training steps has changed, only those source files need to be uploaded to the designated GCS bucket.
5. As its final step, Cloud Build executes a pipeline run on Vertex AI.
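
The branching in this workflow can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the project's actual build configuration: the directory names (`pipeline/components`, `modules`) and the step labels are hypothetical placeholders.

```python
from pathlib import PurePosixPath

# Hypothetical repository layout; the real project may organize files differently.
CUSTOM_COMPONENT_DIRS = ("pipeline/components",)
MODULE_DIRS = ("modules",)


def _is_under(path, roots):
    """Return True if `path` lives inside any of the directories in `roots`."""
    parents = PurePosixPath(path).parents
    return any(PurePosixPath(root) in parents for root in roots)


def plan_build_steps(changed_files):
    """Return the ordered build steps (hypothetical labels) for a set of changed files."""
    steps = ["run_unit_tests"]

    # Optional sub-workflow: only needed when custom components changed.
    if any(_is_under(f, CUSTOM_COMPONENT_DIRS) for f in changed_files):
        steps += ["compile_pipeline", "build_and_push_image"]

    # Lightweight sub-workflow: upload only the changed source files.
    if any(_is_under(f, MODULE_DIRS) for f in changed_files):
        steps.append("upload_modules_to_gcs")

    # The build always ends with a pipeline run.
    steps.append("run_pipeline_on_vertex_ai")
    return steps


if __name__ == "__main__":
    print(plan_build_steps(["modules/train.py"]))
    print(plan_build_steps(["pipeline/components/custom.py", "modules/train.py"]))
```

When only module files change, the plan skips the image build entirely, which is the point of keeping the two sub-workflows separate.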

Workflow #2

Workflow in a nutshell

1. We write the initial code, or make changes to the existing codebase for the modules.
2. The changes from step 1 trigger a GitHub Action, which starts a Cloud Build process.
3. Cloud Build runs unit tests to check that the components work without errors.
4. If the tests pass, the build continues as follows:
   - If code in the data preprocessing or model modules has changed, only those source files need to be uploaded to the designated GCS bucket.
5. As its final step, Cloud Build executes a pipeline run on Vertex AI. The Trainer and Transform TFX components look up the changed modules accordingly.
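
As a rough sketch of how the module lookup could work: the TFX Transform and Trainer components accept a `module_file` argument, which can point at a file in GCS. The helper below only builds such paths. The bucket name, prefix, and file names are hypothetical and are not taken from this project.

```python
import posixpath


def module_file_uri(bucket, prefix, filename):
    """Return the gs:// URI of a module file stored under bucket/prefix."""
    return "gs://" + posixpath.join(bucket, prefix.strip("/"), filename)


# Hypothetical values; substitute the designated bucket and real module names.
MODULE_BUCKET = "my-ml-bucket"
MODULE_PREFIX = "modules"

transform_module = module_file_uri(MODULE_BUCKET, MODULE_PREFIX, "preprocessing.py")
trainer_module = module_file_uri(MODULE_BUCKET, MODULE_PREFIX, "train.py")

# In a TFX pipeline definition these URIs would then be passed along the lines of:
#   Transform(..., module_file=transform_module)
#   Trainer(..., module_file=trainer_module)

if __name__ == "__main__":
    print(transform_module)
    print(trainer_module)
```

Because the components read the module from a fixed location at run time, uploading a new version of the file is enough to change the pipeline's behavior without rebuilding the Docker image.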

Acknowledgements

Thanks to the ML-GDE program for providing GCP credits.

Owner

Chansung Park