Material for my PyConDE & PyData Berlin 2022 Talk "5 Steps to Speed Up Your Data-Analysis on a Single Core"

Last update: Dec 12, 2022


Overview

5 Steps to Speed Up Your Data-Analysis on a Single Core

Material for my talk at PyConDE & PyData Berlin 2022.

Description

Your data analysis pipeline works. Nice.
Could it be faster? Probably.
Do you need to parallelize? Not yet.

We'll go through optimization steps that boost the performance of your data analysis pipeline on a single core, reducing time and cost. This walkthrough shows tools and strategies for identifying and mitigating bottlenecks and demonstrates them on an example. The 5 steps cover:

Identifying bottlenecks: Profiling
Efficient IO
Vectorization
Memory & Precision Tradeoffs
JIT compilation with Numba
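
As an illustration of the first step, here is a minimal profiling sketch that uses only the standard library's `cProfile` and `pstats`. The functions `pipeline` and `slow_sum_of_squares` are hypothetical stand-ins for a real analysis pipeline and are not taken from the talk material.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop (hypothetical pipeline stage)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def pipeline(n=200_000):
    """Hypothetical pipeline with one cheap and one expensive stage."""
    values = list(range(10))          # cheap stage
    result = slow_sum_of_squares(n)   # expensive stage
    return len(values), result


def profile_top_functions(func, limit=5):
    """Run func under cProfile and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_top_functions(pipeline)
print(report)
```

In the printed report, the expensive stage should appear near the top, which is the signal for where to start optimizing.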

This talk is suited for data scientists at a beginner or intermediate level, typically working with a NumPy/SciPy/… stack or similar. It offers strategies and concrete suggestions on how to speed up an existing analysis pipeline, demonstrated on a practical example that shows the speedup gained at each step.
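
To give a feel for how per-step improvements can be measured, the following sketch compares a pure-Python loop with a vectorized NumPy expression and shows the memory and precision effect of switching from float64 to float32. It assumes NumPy is installed. The function names and the array size are illustrative choices, not the example used in the talk.

```python
import timeit

import numpy as np


def center_loop(values):
    """Pure-Python baseline: subtract the mean element by element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def center_vectorized(values):
    """Vectorized equivalent: one NumPy expression over the whole array."""
    return values - values.mean()


rng = np.random.default_rng(seed=0)
data64 = rng.random(100_000)           # float64 by default
data32 = data64.astype(np.float32)     # half the memory, lower precision
data_list = data64.tolist()            # plain list input for the loop baseline

# Time both variants on the same data; absolute numbers depend on the machine.
loop_time = timeit.timeit(lambda: center_loop(data_list), number=5)
vec_time = timeit.timeit(lambda: center_vectorized(data64), number=5)

# Largest absolute difference introduced by storing the data as float32.
max_error = np.abs(data64 - data32).max()

print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
print(f"float64: {data64.nbytes} bytes, float32: {data32.nbytes} bytes")
print(f"max abs error from float32: {max_error:.2e}")
```

Whether the float32 error is acceptable depends on the analysis, so the tradeoff is worth checking against the accuracy the pipeline actually needs.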

Installation & Usage

python3 -m pip install poetry
poetry install
poetry run python -m jupyterlab

Dev

./format.sh

Owner

Jonathan Striebel
