Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Modern Data Lake Storage Layers

This repository contains supporting assets for my research into modern data lake storage layers such as Apache Hudi, Apache Iceberg, and Delta Lake.

Specifically, it includes a CloudFormation template that creates an EMR cluster and an EMR Studio with the necessary requirements, along with Jupyter notebooks containing the example walkthroughs.

You can view the corresponding blog post and video

Pre-requisites

You'll need an AWS account in which you have administrator privileges and the ability to deploy a CloudFormation template. The template creates an EMR cluster and an S3 bucket, both of which incur charges, so be sure to either shut down the cluster when you're done or delete the CloudFormation stack. Before you can delete the CloudFormation stack, you'll need to:

  • Manually delete any EMR Studio Workspaces you created
  • Manually empty the S3 bucket created by CloudFormation
  • Manually delete the VPC created by CloudFormation due to auto-created rules
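The bucket and stack parts of this cleanup can also be scripted. The following is only a minimal sketch, assuming boto3-style S3 and CloudFormation clients; the function names and the bucket and stack names in the usage comment are hypothetical and not part of this repository. It does not delete EMR Studio Workspaces or the VPC, and it does not remove old object versions from a versioned bucket.

```python
def empty_bucket(s3_client, bucket_name):
    """Delete every current object in the bucket and return how many were deleted."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # Each page holds at most 1000 keys, which is also the delete_objects limit.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted


def delete_stack_and_wait(cfn_client, stack_name):
    """Request stack deletion and block until CloudFormation reports completion."""
    cfn_client.delete_stack(StackName=stack_name)
    cfn_client.get_waiter("stack_delete_complete").wait(StackName=stack_name)


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   empty_bucket(boto3.client("s3"), "my-stack-bucket")
#   delete_stack_and_wait(boto3.client("cloudformation"), "my-data-lake-stack")
```

Both functions take the client as an argument, so they can be tried against a stub before being pointed at a real account.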

Overview

The included CloudFormation template creates a new VPC and EMR Cluster for you to be able to run the notebooks. An EMR Studio is also created and you can find the Studio URL in the Outputs tab of your CloudFormation Stack.
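If you prefer not to open the console, the stack outputs can be read programmatically. This is a hedged sketch assuming a boto3-style CloudFormation client; the helper name and the stack name in the usage comment are hypothetical, and the exact output key that holds the Studio URL is not stated here, so the sketch returns every output and leaves the lookup to you.

```python
def stack_outputs(cfn_client, stack_name):
    """Return the stack's outputs as a {OutputKey: OutputValue} dict."""
    response = cfn_client.describe_stacks(StackName=stack_name)
    # describe_stacks returns a list; a lookup by name yields a single stack.
    outputs = response["Stacks"][0].get("Outputs", [])
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


# Example usage (requires boto3 and AWS credentials; the stack name is a placeholder):
#   import boto3
#   for key, value in stack_outputs(boto3.client("cloudformation"), "my-data-lake-stack").items():
#       print(key, value)
```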

Once the stack has finished creating, navigate to EMR Studio and create a new Workspace attached to the "data-lakes" cluster.

Inside the Workspace, you can either upload each notebook individually from the notebooks/ folder or connect to this repository using the "Git" icon on the left-hand side.
