Falcon: Interactive Visual Analysis for Big Data

Last update: Dec 27, 2022

Crossfilter millions of records without latency. This project is a work in progress and is not documented yet. Please get in touch if you have questions.

The largest experiments we have run so far are 10M flights in the browser and ~180M flights or ~1.7B stars when connected to OmniSciDB (formerly known as MapD).

We have written a paper about the research behind Falcon. Please cite us if you use Falcon in a publication.

@inproceedings{moritz2019falcon,
  doi = {10.1145/3290605},
  year  = {2019},
  publisher = {{ACM} Press},
  author = {Dominik Moritz and Bill Howe and Jeffrey Heer},
  title = {Falcon: Balancing Interactive Latency and Resolution Sensitivity for Scalable Linked Visualizations},
  booktitle = {Proceedings of the 2019 {CHI} Conference on Human Factors in Computing Systems  - {CHI} {\textquotesingle}19}
}

Demos

1M flights in the browser: https://vega.github.io/falcon/flights/
7M flights in OmniSci Core: https://vega.github.io/falcon/flights-mapd/
500k weather records: https://vega.github.io/falcon/weather/

Usage

Install with yarn add falcon-vis. Falcon supports two query engines. The first, ArrowDB, reads data from Apache Arrow; it runs entirely in the browser and scales up to ten million rows. The second, MapDDB, connects to OmniSci Core. The indexes are created as ndarrays. Check out the examples to see how to set up an app with your own data. More documentation will follow.
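This README does not describe how the indexes are laid out. As a rough illustration only, the sketch below follows the general idea from the Falcon paper: precompute cumulative counts per brush pixel so that the counts for any brush range are the difference of two rows. It assumes a flat typed array standing in for a 2D ndarray. The names CumulativeIndex, buildIndex, and brushCounts are hypothetical and are not part of the falcon-vis API.

```typescript
// Hypothetical sketch, not the falcon-vis API.
// Row p of the index holds, for each bin of the passive view, the number of
// records whose pixel position in the active (brushed) view is < p.
interface CumulativeIndex {
  data: Uint32Array; // flat (numPixels + 1) x numBins array, row-major
  numPixels: number;
  numBins: number;
}

function buildIndex(
  activePixel: ArrayLike<number>, // pixel of each record in the active view
  passiveBin: ArrayLike<number>, // bin of each record in the passive view
  numPixels: number,
  numBins: number
): CumulativeIndex {
  // One extra row so that row 0 is all zeros and the last row holds totals.
  const data = new Uint32Array((numPixels + 1) * numBins);

  // Step 1: count each record into row (pixel + 1).
  for (let i = 0; i < activePixel.length; i++) {
    data[(activePixel[i] + 1) * numBins + passiveBin[i]]++;
  }

  // Step 2: prefix-sum down the rows to make the counts cumulative.
  for (let p = 1; p <= numPixels; p++) {
    for (let b = 0; b < numBins; b++) {
      data[p * numBins + b] += data[(p - 1) * numBins + b];
    }
  }
  return { data, numPixels, numBins };
}

// Counts in the passive view for records brushed in pixels [start, end).
function brushCounts(index: CumulativeIndex, start: number, end: number): Uint32Array {
  const { data, numBins } = index;
  const out = new Uint32Array(numBins);
  for (let b = 0; b < numBins; b++) {
    out[b] = data[end * numBins + b] - data[start * numBins + b];
  }
  return out;
}

// Six made-up records, four brush pixels, two passive bins.
const index = buildIndex([0, 1, 1, 2, 3, 3], [0, 1, 0, 1, 1, 0], 4, 2);
const filtered = brushCounts(index, 1, 3); // brush covers pixels 1 and 2
const unfiltered = brushCounts(index, 0, 4); // brush covers everything
```

With this layout, moving the brush only requires two row lookups and a subtraction per bin, regardless of how many records were counted when the index was built.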

Features

Zoom

You can zoom histograms. Falcon automatically re-bins the data.
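As a minimal sketch of what re-binning means here, the function below counts values into equal-width bins over a given extent; zooming then amounts to calling it again with a narrower extent and the same number of bins. The function name binCounts and the sample data are assumptions for illustration and do not reflect Falcon's internal implementation.

```typescript
// Hypothetical helper, not part of falcon-vis.
// Counts values into numBins equal-width bins over [min, max).
// Values outside the extent are ignored.
function binCounts(
  values: ArrayLike<number>,
  extent: [number, number],
  numBins: number
): Uint32Array {
  const [min, max] = extent;
  const step = (max - min) / numBins;
  const counts = new Uint32Array(numBins);
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v < min || v >= max) continue;
    // Clamp to guard against floating-point rounding at the upper edge.
    const bin = Math.min(numBins - 1, Math.floor((v - min) / step));
    counts[bin]++;
  }
  return counts;
}

// Made-up sample values.
const delays = [0, 5, 12, 18, 25, 31, 47, 58, 64, 90];

// Overview: four bins of width 25 over [0, 100).
const overview = binCounts(delays, [0, 100], 4);

// After zooming to [0, 40): same number of bins, now of width 10.
const zoomed = binCounts(delays, [0, 40], 4);
```

Keeping the bin count fixed while the extent shrinks is what gives a zoomed histogram finer resolution over the visible range.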

Show and hide unfiltered data

The original counts without filters can be displayed behind the filtered counts to provide context. Hiding the unfiltered data shows the relative distribution of the data.

With unfiltered data.

Without unfiltered data.

Circles or Color Heatmap

Heatmap with circles (default). This mode can also show the data without filters.

Heatmap with colored cells.

Vertical bar, horizontal bar, or text for counts

Horizontal bar.

Vertical bar.

Text only.

Timeline visualization

You can visualize the timeline of brush interactions in Falcon.

Falcon with 1.7 Billion Stars from the GAIA Dataset

The GAIA spacecraft measured the positions and distances of stars with unprecedented precision. It collected data on about 1.7 billion objects, mainly stars but also planets, comets, asteroids, and quasars, among others. Below, we show the dataset loaded in Falcon (with OmniSci Core). There is also a video of me interacting with the dataset through Falcon.

Developers

Install the dependencies with yarn. Then run yarn start to start the flight demo with in-memory data. Have a look at the other script commands in package.json.

Experiments

The first version, which turned out to be too complicated, is at https://github.com/vega/falcon/tree/complex, and the client-server version is at https://github.com/vega/falcon/tree/client-server.

Owner

Vega
