Focus on Algorithm Design, Not on Data Wrangling

Overview

The visual data management platform from Zensors.


Join for free at app.datatap.dev.

The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.


Documentation

Full documentation is available at docs.datatap.dev.

Features

  • Begin training instantly
  • 🔥 Works with all major ML frameworks (Pytorch, TensorFlow, etc.)
  • 🛰️ Real-time streaming to avoid large dataset downloads
  • 🌐 Universal data format for simple data exchange
  • 🎨 Combine data from multiples sources into a single dataset easily
  • 🧮 Rich ML utilities to compute PR-curves, confusion matrices, and accuracy metrics.
  • 💽 Free access to a variety of open datasets.

Getting Started (Platform)

To begin, select a dataset from the dataTap repository.

Then copy the starter code based on your library preference.

Paste the starter code and start training.

Getting Started (API)

Install the client library.

pip install datatap

Register at app.datatap.dev. Then, go to Settings > Api Keys to find your personal API key.

export DATATAP_API_KEY="XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXX"

Start using open datasets instantly.

from datatap import Api

api = Api()
coco = api.get_default_database().get_repository("_/coco")
dataset = coco.get_dataset("latest")
print("COCO: ", dataset)

Data Streaming Example

import itertools
from datatap import Api

api = Api()
dataset = (api
    .get_default_database()
    .get_repository("_/wider-person")
    .get_dataset("latest")
)

training_stream = dataset_version.stream_split("training")
for annotation in itertools.islice(training_stream, 5):
    print("Received annotation:", annotation)

More Examples

Support and FAQ

Q. How do I resolve a missing API Key?

If you see the error Exception: No API key available. Either provide it or use the [DATATAP_API_KEY] environment variable, then the dataTap library was not able to find your API key. You can find your API key on app.datatap.dev under settings. You can either set it as an environment variable or as the first argument to the Api constructor.

Q. Can dataTap be used offline?

Some functionality can be used offline, such as the droplet utilities and metrics. However, repository access and dataset streaming require internet access, even for local databases.

Q. Is dataTap accepting contributions?

dataTap currently uses a separate code review system for managing contributions. The team is looking into switching that system to GitHub to allow public contributions. Until then, we will actively monitor the GitHub issue tracker to help accomodate the community's needs.

Q. How can I get help using dataTap?

You can post a question in the issue tracker. The dataTap team actively monitors the repository, and will try to get back to you as soon as possible.

matplotlib: plotting with Python

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Check out our home page for more inform

Matplotlib Developers 16.7k Jan 08, 2023
Visualizations for machine learning datasets

Introduction The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive

PAIR code 7.1k Jan 07, 2023
CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

Alec Thomas 1.8k Dec 31, 2022
Generate a 3D Skyline in STL format and a OpenSCAD file from Gitlab contributions

Your Gitlab's contributions in a 3D Skyline gitlab-skyline is a Python command to generate a skyline figure from Gitlab contributions as Github did at

Félix Gómez 70 Dec 22, 2022
A minimalistic wrapper around PyOpenGL to save development time

glpy glpy is pyOpenGl wrapper which lets you work with pyOpenGl easily.It is not meant to be a replacement for pyOpenGl but runs on top of pyOpenGl to

Abhinav 9 Apr 02, 2022
Smarthome Dashboard with Grafana & InfluxDB

Smarthome Dashboard with Grafana & InfluxDB This is a complete overhaul of my Raspberry Dashboard done with Flask. I switched from sqlite to InfluxDB

6 Oct 20, 2022
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes

{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew

Peter Stephan 0 Jan 12, 2022
✅ Today I Learn

Today I Learn EDA numpy_100ex numpy_0~10 airline_satisfaction_prediction BERT_naver_movie_classification NLP_prepare NLP_Tweet_Emotion_Recognition tex

Yeonghoo_Ahn 3 Dec 15, 2022
Python package that generates hardware pinout diagrams as SVG images

PinOut A Python package that generates hardware pinout diagrams as SVG images. The package is designed to be quite flexible and works well for general

336 Dec 20, 2022
The Timescale NFT Starter Kit is a step-by-step guide to get up and running with collecting, storing, analyzing and visualizing NFT data from OpenSea, using PostgreSQL and TimescaleDB.

Timescale NFT Starter Kit The Timescale NFT Starter Kit is a step-by-step guide to get up and running with collecting, storing, analyzing and visualiz

Timescale 102 Dec 24, 2022
flask extension for integration with the awesome pydantic package

Flask-Pydantic Flask extension for integration of the awesome pydantic package with Flask. Installation python3 -m pip install Flask-Pydantic Basics v

249 Jan 06, 2023
Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns.

Make Complex Heatmaps Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns. H

Zuguang Gu 973 Jan 09, 2023
A simple agent-based model used to teach the basics of OOP in my lectures

Pydemic A simple agent-based model of a pandemic. This is used to teach basic principles of object-oriented programming to master students. It is not

Fabien Maussion 2 Jun 08, 2022
PanGraphViewer -- show panenome graph in an easy way

PanGraphViewer -- show panenome graph in an easy way Table of Contents Versions and dependences Desktop-based panGraphViewer Library installation for

16 Dec 17, 2022
HW 2: Visualizing interesting datasets

HW 2: Visualizing interesting datasets Check out the project instructions here! Mean Earnings per Hour for Males and Females My first graph uses data

7 Oct 27, 2021
kyle's vision of how datadog's python client should look

kyle's datadog python vision/proposal not for production use See examples/comprehensive.py for a mostly working example of the proposed API. 📈 🐶 ❤️

Kyle Verhoog 2 Nov 21, 2021
LinkedIn connections analyzer

LinkedIn Connections Analyzer 🔗 https://linkedin-analzyer.herokuapp.com Hey hey 👋 , welcome to my LinkedIn connections analyzer. I recently found ou

Okkar Min 5 Sep 13, 2022
A central task in drug discovery is searching, screening, and organizing large chemical databases

A central task in drug discovery is searching, screening, and organizing large chemical databases. Here, we implement clustering on molecular similarity. We support multiple methods to provide a inte

NVIDIA Corporation 124 Jan 07, 2023
A data visualization curriculum of interactive notebooks.

A data visualization curriculum of interactive notebooks, using Vega-Lite and Altair. This repository contains a series of Python-based Jupyter notebooks.

UW Interactive Data Lab 1.2k Dec 30, 2022
Decision Border Visualizer for Classification Algorithms

dbv Decision Border Visualizer for Classification Algorithms Project description A python package for Machine Learning Engineers who want to visualize

Sven Eschlbeck 1 Nov 01, 2021