opendata

Finds, downloads, parses, and standardizes public bikeshare data into a common pandas DataFrame format.

import asyncio
from opendata.sources.bikeshare.bay_wheels import trips as bay_wheels

trips_df, _ = asyncio.run(bay_wheels.async_load(trip_sample_rate=1000))

len(trips_df.index)
# 8731

trips_df.columns
# Index(['started_at', 'ended_at', 'start_station_id', 'end_station_id',
#        'start_station_name', 'end_station_name', 'rideable_type', 'ride_id',
#        'start_lat', 'start_lng', 'end_lat', 'end_lng', 'gender', 'user_type',
#        'bike_id', 'birth_year'],
#       dtype='object')

An example analysis can be found here: https://observablehq.com/@brady/bikeshare

Supports sampling and local file caching to improve performance.
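
For example, trip_sample_rate appears to control how aggressively trips are down-sampled (a lower value keeps more rows), and repeated loads should be served from the local file cache. A minimal sketch, assuming the same async_load signature shown above:

import asyncio
from opendata.sources.bikeshare.bay_wheels import trips as bay_wheels

# A lower trip_sample_rate keeps a denser sample of trips (assumption:
# the rate roughly means "1 in N trips"). The first call downloads the
# source files; later calls should be faster thanks to the local cache.
dense_df, _ = asyncio.run(bay_wheels.async_load(trip_sample_rate=100))
print(len(dense_df.index))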

Markets supported

import opendata.sources.bikeshare.bay_wheels
import opendata.sources.bikeshare.bixi
import opendata.sources.bikeshare.divvy
import opendata.sources.bikeshare.capital_bikeshare
import opendata.sources.bikeshare.citi_bike
import opendata.sources.bikeshare.cogo
import opendata.sources.bikeshare.niceride
import opendata.sources.bikeshare.bluebikes
import opendata.sources.bikeshare.metro_bike_share
import opendata.sources.bikeshare.indego
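
Each market module is assumed to expose the same trips.async_load interface as bay_wheels above; under that assumption, several markets can be loaded concurrently and combined into one dataframe, for example:

import asyncio
import pandas as pd
from opendata.sources.bikeshare.citi_bike import trips as citi_bike
from opendata.sources.bikeshare.divvy import trips as divvy

async def load_markets():
    # Fetch both markets concurrently; each call returns a (dataframe, extra)
    # tuple, and only the dataframe is kept here, as in the example above.
    results = await asyncio.gather(
        citi_bike.async_load(trip_sample_rate=1000),
        divvy.async_load(trip_sample_rate=1000),
    )
    return pd.concat([df for df, _ in results], ignore_index=True)

all_trips = asyncio.run(load_markets())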

Bootstrap

Set up your environment

brew install chromedriver
brew install python3
python3 -m pip install pre-commit
pre-commit install --install-hooks
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Entering virtualenv

python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Usage

Try the test export to CSV:

python3 test.py
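
test.py is not reproduced here, but an equivalent export can be written directly with pandas; a minimal sketch, with an arbitrary output filename:

import asyncio
from opendata.sources.bikeshare.bay_wheels import trips as bay_wheels

# Load a sampled set of Bay Wheels trips and write them to a CSV file.
trips_df, _ = asyncio.run(bay_wheels.async_load(trip_sample_rate=1000))
trips_df.to_csv("bay_wheels_trips.csv", index=False)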

Updating pip requirements

pip-compile

Pre-commit setup

pre-commit install --install-hooks

Bikeshare markets to add

USA

  • 119k/yr Pittsburgh (Google Drive links)
  • 180k/yr Austin (date and time are in separate fields)

World

  • 3,868k/yr Ecobici (needs a station CSV)
  • 2,900k/yr Toronto (needs more investigation)
  • 650k/yr Vancouver (Google Drive links)
Owner

Brady Law, previously SWE @lyft and @apple