Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

Overview

time-series-kafka-demo

img

Mock stream producer for time series data using Kafka.

I walk through this tutorial and others here on GitHub and on my Medium blog. Here is a friend link for open access to the article on Towards Data Science: Make a mock “real-time” data stream with Python and Kafka. I'll always add friend links on my GitHub tutorials for free Medium access if you don't have a paid Medium membership (referral link).

If you find any of this useful, I always appreciate contributions to my Saturday morning fancy coffee fund!

This repo demos how to convert a csv file of timestamped data into a real-time stream useful for testing streaming analytics. An example input file with random time series data and a script for generating the file are included in the data directory.

The producer and consumer Python scripts use Confluent's Kafka client for Python, which is installed in the Docker image built with the accompanying Dockerfile, if you choose to use it.

Requires Docker and Docker Compose.

Usage

Clone repo and cd into directory.

git clone https://github.com/mtpatter/time-series-kafka-demo.git
cd time-series-kafka-demo

Start the Kafka broker

docker compose up --build

Build a Docker image (optionally, for the producer and consumer)

From the main root directory:

docker build -t "kafkacsv" .

If you want to use Docker for the python scripts, this should now work:

docker run -it --rm kafkacsv python bin/sendStream.py -h

Start a consumer

To start a consumer for printing all messages in real-time from the stream "my-stream":

python bin/processStream.py my-stream

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/processStream.py my-stream

Produce a time series stream

Send time series from data/data.csv to topic “my-stream”, and speed it up by a factor of 10.

python bin/sendStream.py data/data.csv my-stream --speed 10

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/sendStream.py data/data.csv my-stream --speed 10

Shut down and clean up

Stop the consumer with Return and Ctrl+C.

Shutdown Kafka broker system:

docker compose down
Software engineering course project. Secondhand trading system.

PigeonSale Software engineering course project. Secondhand trading system. Documentation API doumenatation: list of APIs Backend documentation: notes

Harry Lee 1 Sep 01, 2022
Workbench to integrate pyoptools with freecad, that means basically optics ray tracing capabilities for FreeCAD.

freecad-pyoptools Workbench to integrate pyoptools with freecad, that means basically optics ray tracing capabilities for FreeCAD. Requirements It req

Combustión Ingenieros SAS 12 Nov 16, 2022
The blazing-fast Discord bot.

Wavy Wavy is an open-source multipurpose Discord bot built with pycord. Wavy is still in development, so use it at your own risk. Tools and services u

Wavy 7 Dec 27, 2022
Repository for learning Python (Python Tutorial)

Repository for learning Python (Python Tutorial) Languages and Tools 🧰 Overview 📑 Repository for learning Python (Python Tutorial) Languages and Too

Swiftman 2 Aug 22, 2022
Generate a backend and frontend stack using Python and json-ld, including interactive API documentation.

d4 - Base Project Generator Generate a backend and frontend stack using Python and json-ld, including interactive API documentation. d4? What is d4 fo

Markus Leist 3 May 03, 2022
Python Deep Dive Course - Accompanying Materials

Python Deep Dive Various Jupyter notebooks and Python sources associated with my Udemy Python 3 Deep Dive course series: Part 1: Mainly functional pro

Fred Baptiste 1.1k Dec 30, 2022
Dev Centric Tools for Mkdocs Based Documentation

docutools MkDocs Documentation Tools For Developers This repo is providing a set of plugins for mkdocs material compatible documentation. It is meant

Axiros GmbH 14 Sep 10, 2022
Deduplicating archiver with compression and authenticated encryption.

More screencasts: installation, advanced usage What is BorgBackup? BorgBackup (short: Borg) is a deduplicating backup program. Optionally, it supports

BorgBackup 9k Jan 09, 2023
Swagger Documentation Generator for Django REST Framework: deprecated

Django REST Swagger: deprecated (2019-06-04) This project is no longer being maintained. Please consider drf-yasg as an alternative/successor. I haven

Marc Gibbons 2.6k Jan 03, 2023
Python Eacc is a minimalist but flexible Lexer/Parser tool in Python.

Python Eacc is a parsing tool it implements a flexible lexer and a straightforward approach to analyze documents.

Iury de oliveira gomes figueiredo 60 Nov 16, 2022
Speed up Sphinx builds by selectively removing toctrees from some pages

Remove toctrees from Sphinx pages Improve your Sphinx build time by selectively removing TocTree objects from pages. This is useful if your documentat

Executable Books 8 Jan 04, 2023
A repository of links with advice related to grad school applications, research, phd etc

A repository of links with advice related to grad school applications, research, phd etc

Shaily Bhatt 946 Dec 30, 2022
This programm checks your knowlege about the capital of Japan

Introduction This programm checks your knowlege about the capital of Japan. Now, what does it actually do? After you run the programm you get asked wh

1 Dec 16, 2021
A module filled with many useful functions and modules in various subjects.

Usefulpy Check out the Usefulpy site Usefulpy site is not always up to date Download and Import download and install with with pip download usefulpyth

Austin Garcia 1 Dec 28, 2021
A Material Design theme for MkDocs

A Material Design theme for MkDocs Create a branded static site from a set of Markdown files to host the documentation of your Open Source or commerci

Martin Donath 12.3k Jan 04, 2023
A website for courses of Major Computer Science, NKU

A website for courses of Major Computer Science, NKU

Sakura 0 Oct 06, 2022
Plover jyutping - Plover plugin for Jyutping input

Plover plugin for Jyutping Installation Navigate to the repo directory: cd plove

Samuel Lo 1 Mar 17, 2022
Xanadu Quantum Codebook is an experimental, exercise-based introduction to quantum computing using PennyLane.

Xanadu Quantum Codebook The Xanadu Quantum Codebook is an experimental, exercise-based introduction to quantum computing using PennyLane. This reposit

Xanadu 43 Dec 09, 2022
Automated Integration Testing and Live Documentation for your API

Automated Integration Testing and Live Documentation for your API

ScanAPI 1.3k Dec 30, 2022
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

applied-ml Curated papers, articles, and blogs on data science & machine learning in production. ⚙️ Figuring out how to implement your ML project? Lea

Eugene Yan 22.1k Jan 03, 2023