A framework for feature exploration in Data Science

Related tags

ScienceBeehive
Overview

Beehive

A framework for feature exploration in Data Science

Background

What do we do when we finish one episode of feature exploration in a jupyter notebook?
We overwrite the notebook or clone the notebook for the next episode of feature exploration.

I say this is inefficient, thus this project.

Purpose

To help documentation for grouping all episode of feature exploration in a single notebook.

Project Elaboration

terms

episode: one set of feature trained on a model.
e.g.
episode_1 = (sepal_length, sepal_width) | (target_class)
episode_2 = (sepal_length, sepal_width, petal_length, petal_width) | (target_class)
episode_3 = (sepal_length_normalized, sepal_width_normalized, petal_length_normalized, petal_width_normalized) | (target_class)

get to know the classes

Combee

One Combee should represent one model and its features includes: (metrics, confusion matrix, feature importances)
(sklearn-based)

CombeKeras

One Combee should represent one model and its features includes: (metrics, confusion matrix)
(keras-based)
needed because Keras has different features and functions compared to sklearn

Vespiqueen

One Vespiqueen should represent one episode.
Vespiqueen memorizes the tranformation of data.
Vespiqueen governs all the Combees.
Can be used to pick which model is best in a Vespiqueen.

Beehive

Beehive governs the whole Vespiqueen.
Beehive is used to print all the metrics across Vespiqueen (or episodes)
Can also be used to pick new features from the last or best vespiqueen for the next episode.

How to Use

Owner
Steven IJ
Steven IJ
A framework for feature exploration in Data Science

Beehive A framework for feature exploration in Data Science Background What do we do when we finish one episode of feature exploration in a jupyter no

Steven IJ 1 Jan 03, 2022
Discontinuous Galerkin finite element method (DGFEM) for Maxwell Equations

DGFEM Maxwell Equations Discontinuous Galerkin finite element method (DGFEM) for Maxwell Equations. Work in progress. Currently, the 1D Maxwell equati

Rafael de la Fuente 9 Aug 16, 2022
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

Kenji Hiranabe 3.2k Jan 08, 2023
3D medical imaging reconstruction software

InVesalius InVesalius generates 3D medical imaging reconstructions based on a sequence of 2D DICOM files acquired with CT or MRI equipments. InVesaliu

443 Jan 01, 2023
Program that estimates antiderivatives utilising Maclaurin series.

AntiderivativeEstimator Program that estimates antiderivatives utilising Maclaurin series. Setup: Needs Python 3 and Git installed and added to PATH.

James Watson 3 Aug 04, 2021
A mathematica expression evaluator with PokemonTypes

A simple mathematical expression evaluator that uses Pokemon types to replace symbols.

Arnav Jindal 2 Nov 14, 2021
Doing bayesian data analysis - Python/PyMC3 versions of the programs described in Doing bayesian data analysis by John K. Kruschke

Doing_bayesian_data_analysis This repository contains the Python version of the R programs described in the great book Doing bayesian data analysis (f

Osvaldo Martin 851 Dec 27, 2022
ReproZip is a tool that simplifies the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science.

ReproZip ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used comm

267 Jan 01, 2023
OPEM (Open Source PEM Fuel Cell Simulation Tool)

Table of contents What is PEM? Overview Installation Usage Executable Library Telegram Bot Try OPEM in Your Browser! MATLAB Issues & Bug Reports Contr

ECSIM 133 Jan 04, 2023
3D visualization of scientific data in Python

Mayavi: 3D visualization of scientific data in Python Mayavi docs: http://docs.enthought.com/mayavi/mayavi/ TVTK docs: http://docs.enthought.com/mayav

Enthought, Inc. 1.1k Jan 06, 2023
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Cookiecutter Data Science A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. Project homepage

6.4k Jan 02, 2023
collection of interesting Computer Science resources

collection of interesting Computer Science resources

Kirill Bobyrev 137 Dec 22, 2022
PennyLane is a cross-platform Python library for differentiable programming of quantum computers.

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.

PennyLaneAI 1.6k Jan 04, 2023
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

Benedek Rozemberczki 1.8k Dec 31, 2022
Python Data Science Handbook: full text in Jupyter Notebooks

Python Data Science Handbook This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks. How to Use th

Jake Vanderplas 36.9k Dec 28, 2022
Float2Binary - A simple python class which finds the binary representation of a floating-point number.

Float2Binary A simple python class which finds the binary representation of a floating-point number. You can find a class in IEEE754.py file with the

Bora Canbula 3 Dec 14, 2021
An interactive explorer for single-cell transcriptomics data

an interactive explorer for single-cell transcriptomics data cellxgene (pronounced "cell-by-gene") is an interactive data explorer for single-cell tra

Chan Zuckerberg Initiative 424 Dec 15, 2022
artisan: visual scope for coffee roasters

Artisan Visual scope for coffee roasters WARNING: pre-release builds may not work. Use at your own risk. Summary Artisan is a software that helps coff

Artisan – Visual Scope for Coffee Roasters 705 Jan 05, 2023
Animation engine for explanatory math videos

Manim is an engine for precise programatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This repo

Grant Sanderson 48.9k Jan 03, 2023
PsychoPy is an open-source package for creating experiments in behavioral science.

PsychoPy is an open-source package for creating experiments in behavioral science. It aims to provide a single package that is: precise enoug

PsychoPy 1.3k Dec 31, 2022