Cloud-based recommendation system

Overview

Cloud-based recommendation system

This project is based on cloud services to create data lake, ETL process, train and deploy learning model to implement a recommendation system.

Purpose

One Web app can return if the consumer will buy the product or not when providing user ID and corresponding product SKU.

Services

This project will use services:

AWS: lambda function, Step functions, Glue (job,notebook,crawler), Athena, SNS, S3, Sagemaker, IAM, Dynamodb, API Gateway.

Confluent cloud (kafka) for streaming data.

Project description

  1. Create a bucket on S3 as the storage location of the data lake, store the raw data in the bucket (raw data zone), and then return the data after ETL to the same bucket (curated zone).

  2. Preview the data, determine the data is useful and meaningful for our project. Use AWS Glue crawler to grab corresponding data catalog (in created database and generated table info). Use Athena to do SQL query. This like Apache Hive, it does not change raw data, but do operations above the raw data.

  3. Create and store stream data. Create a kafka topic on Clonfluent cloud and set schema registry for the corresponding stream data, schema sets as confluent_cloud_kafka-->confluent_kafka_topic_schema.json. Set the kafka producer as confluent_cloud_kafka-->confluent_kafka_producer_lambda.py to push stream data to corresponding kafka topic in different partitions (because this project does not have exact source giving real stream data, we produce stream data manually). Set the consumer (confluent connector with AWS lambda) as confluent_cloud_kafka-->confluent_kafka_consumer_lambda.py to poll the stream data in kafka topic and store them in Dynamodb table.

  4. ETL process. Use lambda function to do data transformation operations based on SQL, corresponding scripts in file lambda_functions(ETL). Create Glue job to integrate new dataset and store in curated zone in data lake, scripts is in glue_job-->glue_job_ETL.py. Use step fuctions to orchestrate ETL workflow based on above lambda functions, ASL script is in step_function(workflow)-->step_functions_for_curated.json.

    This part is based on spark, and it is similar with the project in repo: https://github.com/Yi-Ding111/spark-ETL-based-databricks-aws.

  5. Train learning model (XGBoost). Use sagemaker notebook instance to do some kinds more operations like: EDA and feature engineering, use XGBoost framework to train the data, adjust parameters and try different attributes combinations to find the best one. Scripts is in sagemaker-->xgboost_deploy_sagemaker.ipynb.

  6. Deploy learning model. Get deploy endpoint after machine learning. Create lambda function to invoke the sagemaker endpoint to use the trained model, scripts is in sagemaker-->endpoint_interact_lambda.py. Let the lambda function integrate with API gatway (proxy integration) as the backend. Deploy the API gatewat and use the invoked URL for web applications to do interactions.

  7. Store the application output. Use SNS to publish the output to lambda and update the information into Dynamodb table, scripts is in sagemaker-->prediction_store_dynamodb.py


Acknowledgement

This project is completed with the guidance from Leo Lee (JR academy)


Author: YI DING, Leo Lee

Created at: Dec 2021

Contact: [email protected]

Owner
Yi Ding
Yi Ding
Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

57 Nov 03, 2022
Recommender systems are the systems that are designed to recommend things to the user based on many different factors

Recommender systems are the systems that are designed to recommend things to the user based on many different factors. The recommender system deals with a large volume of information present by filte

Happy N. Monday 3 Feb 15, 2022
Plex-recommender - Get movie recommendations based on your current PleX library

plex-recommender Description: Get movie/tv recommendations based on your current

5 Jul 19, 2022
Recommendation System to recommend top books from the dataset

recommendersystem Recommendation System to recommend top books from the dataset Introduction The recom.py is the main program code. The dataset is als

Vishal karur 1 Nov 15, 2021
Self-supervised Graph Learning for Recommendation

SGL This is our Tensorflow implementation for our SIGIR 2021 paper: Jiancan Wu, Xiang Wang, Fuli Feng, Xiangnan He, Liang Chen, Jianxun Lian,and Xing

151 Dec 20, 2022
Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

COTREC Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'. Requirements: Python 3.7, Pytorch 1.6.0 Best Hype

Xin Xia 43 Jan 04, 2023
Fast Python Collaborative Filtering for Implicit Feedback Datasets

Implicit Fast Python Collaborative Filtering for Implicit Datasets. This project provides fast Python implementations of several different popular rec

Ben Frederickson 3k Dec 31, 2022
Cross-Domain Recommendation via Preference Propagation GraphNet.

PPGN Codes for CIKM 2019 paper Cross-Domain Recommendation via Preference Propagation GraphNet. Citation Please cite our paper if you find this code u

Information Retrieval Group, Wuhan University, China 20 Dec 15, 2022
A library of Recommender Systems

A library of Recommender Systems This repository provides a summary of our research on Recommender Systems. It includes our code base on different rec

MilaGraph 980 Jan 05, 2023
Codes for AAAI'21 paper 'Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation'

DHCN Codes for AAAI 2021 paper 'Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation'. Please note that the default link

Xin Xia 124 Dec 14, 2022
Movies/TV Recommender

recommender Movies/TV Recommender. Recommends Movies, TV Shows, Actors, Directors, Writers. Setup Create file API_KEY and paste your TMDB API key in i

Aviem Zur 3 Apr 22, 2022
Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks

SR-HGNN ICDM-2020 《Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks》 Environments python 3.8 pytorch-1.6 DGL 0.5.

xhc 9 Nov 12, 2022
fastFM: A Library for Factorization Machines

Citing fastFM The library fastFM is an academic project. The time and resources spent developing fastFM are therefore justified by the number of citat

1k Dec 24, 2022
This is our implementation of GHCF: Graph Heterogeneous Collaborative Filtering (AAAI 2021)

GHCF This is our implementation of the paper: Chong Chen, Weizhi Ma, Min Zhang, Zhaowei Wang, Xiuqiang He, Chenyang Wang, Yiqun Liu and Shaoping Ma. 2

Chong Chen 53 Dec 05, 2022
Spotify API Recommnder System

This project will access your last listened songs on Spotify using its API, then it will request the user to select 5 favorite songs in that list, on which the API will proceed to make 50 recommendat

Kevin Luke 1 Dec 14, 2021
Deep recommender models using PyTorch.

Spotlight uses PyTorch to build both deep and shallow recommender models. By providing both a slew of building blocks for loss functions (various poin

Maciej Kula 2.8k Dec 29, 2022
Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction

MGNN-SPred This is our Tensorflow implementation for the paper: WenWang,Wei Zhang, Shukai Liu, Qi Liu, Bo Zhang, Leyu Lin, and Hongyuan Zha. 2020. Bey

Wen Wang 18 Jan 02, 2023
Code for MB-GMN, SIGIR 2021

MB-GMN Code for MB-GMN, SIGIR 2021 For Beibei data, run python .\labcode.py For Tmall data, run python .\labcode.py --data tmall --rank 2 For IJCAI

32 Dec 04, 2022
A Python scikit for building and analyzing recommender systems

Overview Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data. Surprise was designed with th

Nicolas Hug 5.7k Jan 01, 2023
An Efficient and Effective Framework for Session-based Social Recommendation

SEFrame This repository contains the code for the paper "An Efficient and Effective Framework for Session-based Social Recommendation". Requirements P

Tianwen CHEN 23 Oct 26, 2022