This repository contains a set of benchmarks of different implementations of Parquet (storage format) <-> Arrow (in-memory format).

Overview

Parquet benchmarks

This repository contains a set of benchmarks of different implementations of Parquet (storage format) <-> Arrow (in-memory format).

The results on Azure's Standard D4s v3 (4 vcpus, 16 GiB memory) are available here.

Read uncompressed

read uncompressed i64

read uncompressed bool

read uncompressed utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

read uncompressed dict utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

Read compressed (snappy)

read compressed i64

read compressed bool

read compressed utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

Write uncompressed

write uncompressed i64

write uncompressed bool

write uncompressed utf8

Write compressed (snappy)

write compressed i64

write compressed bool

write compressed utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

Run benchmarks

To reproduce, use

python3 -m venv venv
venv/bin/pip install -U pip
venv/bin/pip install pyarrow

# create files
venv/bin/python write_parquet.py

# run benchmarks
venv/bin/python run.py

# print results to stdout as csv
venv/bin/python summarize.py

Details

The benchmark reads a single column from a file pre-loaded into memory, decompresses and deserializes the column to an arrow array.

The benchmark includes different configurations:

  • dictionary-encoded vs plain encoding
  • single page vs multiple pages
  • compressed vs uncompressed
  • different types:
    • i64
    • bool
    • utf8
Test utility for validating OpenAPI documentation

DRF OpenAPI Tester This is a test utility to validate DRF Test Responses against OpenAPI 2 and 3 schema. It has built-in support for: OpenAPI 2/3 yaml

snok 103 Dec 21, 2022
A web scraping using Selenium Webdriver

Savee - Images Downloader Project using Selenium Webdriver to download images from someone's profile on https:www.savee.it website. Usage The project

Caio Eduardo Lobo 1 Dec 17, 2021
Python Projects - Few Python projects with Testing using Pytest

Python_Projects Few Python projects : Fast_API_Docker_PyTest- Just a simple auto

Tal Mogendorff 1 Jan 22, 2022
Testing - Instrumenting Sanic framework with Opentelemetry

sanic-otel-splunk Testing - Instrumenting Sanic framework with Opentelemetry Test with python 3.8.10, sanic 20.12.2 Step to instrument pip install -r

Donler 1 Nov 26, 2021
Sixpack is a language-agnostic a/b-testing framework

Sixpack Sixpack is a framework to enable A/B testing across multiple programming languages. It does this by exposing a simple API for client libraries

1.7k Dec 24, 2022
Mypy static type checker plugin for Pytest

pytest-mypy Mypy static type checker plugin for pytest Features Runs the mypy static type checker on your source files as part of your pytest test run

Dan Bader 218 Jan 03, 2023
UX Analytics & A/B Testing

UX Analytics & A/B Testing

Marvin EDORH 1 Sep 07, 2021
The Good Old Days. | Testing Out A New Module-

The-Good-Old-Days. The Good Old Days. | Testing Out A New Module- Installation Asciimatics supports Python versions 2 & 3. For the precise list of tes

Syntax. 2 Jun 08, 2022
A automated browsing experience.

browser-automation This app is an automated browsing technique where one has to enter the required information, it's just like searching for Animals o

Ojas Barawal 3 Aug 04, 2021
Implement unittest, removing all global variable and returning values

Implement unittest, removing all global variable and returning values

Placide 1 Nov 01, 2021
Airspeed Velocity: A simple Python benchmarking tool with web-based reporting

airspeed velocity airspeed velocity (asv) is a tool for benchmarking Python packages over their lifetime. It is primarily designed to benchmark a sing

745 Dec 28, 2022
A Simple Unit Test Matcher Library for Python 3

pychoir - Python Test Matchers for humans Super duper low cognitive overhead matching for Python developers reading or writing tests. Implemented in p

Antti Kajander 15 Sep 14, 2022
Penetration testing

Penetration testing

3 Jan 11, 2022
WomboAI Art Generator

WomboAI Art Generator Automate AI art generation using wombot.art. Also integrated into SnailBot for you to try out. Setup Install Python Go to the py

nbee 7 Dec 03, 2022
Selenium Manager

SeleniumManager I'm fed up with always having to struggle unnecessarily when I have to use Selenium on a new machine, so I made this little python mod

Victor Vague 1 Dec 24, 2021
API Test Automation with Requests and Pytest

api-testing-requests-pytest Install Make sure you have Python 3 installed on your machine. Then: 1.Install pipenv sudo apt-get install pipenv 2.Go to

Sulaiman Haque 2 Nov 21, 2021
Command line driven CI frontend and development task automation tool.

tox automation project Command line driven CI frontend and development task automation tool At its core tox provides a convenient way to run arbitrary

tox development team 3.1k Jan 04, 2023
A simple asynchronous TCP/IP Connect Port Scanner in Python 3

Python 3 Asynchronous TCP/IP Connect Port Scanner A simple pure-Python TCP Connect port scanner. This application leverages the use of Python's Standa

70 Jan 03, 2023
Lightweight, scriptable browser as a service with an HTTP API

Splash - A javascript rendering service Splash is a javascript rendering service with an HTTP API. It's a lightweight browser with an HTTP API, implem

Scrapinghub 3.8k Jan 03, 2023
Akulaku Create NewProduct Automation using Selenium Python

Akulaku-Create-NewProduct-Automation Akulaku Create NewProduct Automation using Selenium Python Usage: 1. Install Python 3.9 2. Open CMD on Bot Folde

Rahul Joshua Damanik 1 Nov 22, 2021