Accelerating model creation and evaluation.

Last update: Dec 06, 2021

Overview

EmeraldML

EmeraldML is a machine learning library that streamlines three steps of exploratory modeling:
(1) cleaning and splitting data,
(2) training, optimizing, and testing various models suited to the task, and
(3) scoring and ranking those models.
The result is a quick, first-pass analysis of which models perform better on a specific dataset.
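
To make the three steps concrete, here is a minimal sketch of the same kind of workflow written with plain scikit-learn. It is not EmeraldML's implementation: the synthetic dataset, the three candidate models, and the small hyperparameter grids are all assumptions chosen for illustration.

```python
# Illustrative sketch only: plain scikit-learn, not EmeraldML's internals.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# (1) Build a synthetic dataset and split it (a stand-in for cleaning and splitting).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=3
)

# (2) Candidate models, each with a small hypothetical hyperparameter grid.
candidates = {
    "LinearRegression": (LinearRegression(), {"fit_intercept": [True, False]}),
    "DecisionTreeRegressor": (
        DecisionTreeRegressor(random_state=3),
        {"max_depth": [3, 5, 10]},
    ),
    "KNeighborsRegressor": (KNeighborsRegressor(), {"n_neighbors": [3, 5, 7]}),
}

ladder = []
for name, (estimator, grid) in candidates.items():
    # Tune each candidate with 3-fold cross-validation on the training split.
    search = GridSearchCV(estimator, grid, cv=3)
    search.fit(X_train, y_train)
    # (3) Score the tuned model on the held-out split (R^2 for regressors).
    ladder.append((name, search.best_estimator_.score(X_test, y_test)))

# Rank the candidates from best to worst score.
ladder.sort(key=lambda pair: pair[1], reverse=True)
for name, score in ladder:
    print(f"{name}: {score:.3f}")
```

The sorted list of (name, score) pairs plays the same role as the ranking a library like this produces: a rough ordering to guide which models deserve a closer look.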

Installation

Dependencies

Python (>= 3.7)
NumPy (>= 1.21.2)
pandas (>= 1.3.3)
scikit-learn (>= 0.24.2)
statsmodels (>= 0.12.2)
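
To see whether an existing environment already meets these minimums, a small check along the following lines may help. The helper names `as_tuple` and `check_dependencies` are hypothetical and not part of EmeraldML, and the version parsing is deliberately simple (it looks only at the leading numeric parts).

```python
# Hypothetical helper, not part of EmeraldML: compare installed package
# versions against the minimums listed above.
import sys

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7 needs the importlib-metadata backport
    from importlib_metadata import PackageNotFoundError, version

MINIMUMS = {
    "numpy": "1.21.2",
    "pandas": "1.3.3",
    "scikit-learn": "0.24.2",
    "statsmodels": "0.12.2",
}


def as_tuple(version_string):
    """Turn '1.21.2' into (1, 21, 2), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad '1.3' to (1, 3, 0)
        parts.append(0)
    return tuple(parts)


def check_dependencies(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            report[package] = (None, False)
            continue
        report[package] = (installed, as_tuple(installed) >= as_tuple(minimum))
    return report


print("Python >= 3.7:", sys.version_info >= (3, 7))
for package, (installed, ok) in check_dependencies().items():
    status = "OK" if ok else "needs attention"
    print(f"{package}: {installed or 'not installed'} ({status})")
```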

User installation

pip install emeraldml

Development

Source code

You can check out the latest sources with the command:

git clone https://github.com/yu3ufff/emeraldml.git

Demo

Getting the data:

import pandas as pd
audi = pd.read_csv('audi.csv')
audi.head()

|    | model   |   year |   price | transmission   |   mileage | fuelType   |   tax |   mpg |   engineSize |
|---:|:--------|-------:|--------:|:---------------|----------:|:-----------|------:|------:|-------------:|
|  0 | A1      |   2017 |   12500 | Manual         |     15735 | Petrol     |   150 |  55.4 |          1.4 |
|  1 | A6      |   2016 |   16500 | Automatic      |     36203 | Diesel     |    20 |  64.2 |          2   |
|  2 | A1      |   2016 |   11000 | Manual         |     29946 | Petrol     |    30 |  55.4 |          1.4 |
|  3 | A4      |   2017 |   16800 | Automatic      |     25952 | Diesel     |   145 |  67.3 |          2   |
|  4 | A3      |   2019 |   17300 | Manual         |      1998 | Petrol     |   145 |  49.6 |          1   |
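
The preview mixes text columns (model, transmission, fuelType) with numeric ones, so some encoding is needed before most scikit-learn models can be fitted. How EmeraldML handles this internally is not documented here; the sketch below shows one common approach, one-hot encoding with `pd.get_dummies`, applied to the five preview rows typed in by hand.

```python
import pandas as pd

# The five preview rows shown above, entered by hand so the sketch runs
# without audi.csv.
preview = pd.DataFrame({
    "model": ["A1", "A6", "A1", "A4", "A3"],
    "year": [2017, 2016, 2016, 2017, 2019],
    "price": [12500, 16500, 11000, 16800, 17300],
    "transmission": ["Manual", "Automatic", "Manual", "Automatic", "Manual"],
    "mileage": [15735, 36203, 29946, 25952, 1998],
    "fuelType": ["Petrol", "Diesel", "Petrol", "Diesel", "Petrol"],
    "tax": [150, 20, 30, 145, 145],
    "mpg": [55.4, 64.2, 55.4, 67.3, 49.6],
    "engineSize": [1.4, 2.0, 1.4, 2.0, 1.0],
})

# Separate the target from the features.
y = preview["price"]
X = preview.drop(columns="price")

# One-hot encode the non-numeric columns; numeric columns pass through unchanged.
categorical = X.select_dtypes(exclude="number").columns.tolist()
X_encoded = pd.get_dummies(X, columns=categorical)

print(categorical)
print(X_encoded.columns.tolist())
```

Each distinct category becomes its own indicator column, so the encoded frame is wider than the original but entirely numeric or boolean.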

Using EmeraldML:

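import emerald
from emerald.boa import RegressionBoa

# Fix the random seed so repeated runs give the same results.
rboa = RegressionBoa(random_state=3)
# Train, tune, and score the candidate models with 'price' as the target.
rboa.hunt(data=audi, target='price')
# Show the models ranked from highest to lowest score.
rboa.ladder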

[(OptimalRFRegressor, 0.9624889664024406),
 (OptimalDTRegressor, 0.9514992411732952),
 (OptimalKNRegressor, 0.9511411883559433),
 (OptimalLinearRegression, 0.8876961846248467),
 (OptimalABRegressor, 0.8491539140007975)]

# Print the tuned estimator behind each ladder entry, best first.
for i in range(len(rboa)):
    print(rboa.model(i))

RandomForestRegressor(min_samples_split=5, n_estimators=500, random_state=3)
DecisionTreeRegressor(max_depth=15, min_samples_split=10, random_state=3)
KNeighborsRegressor(n_neighbors=3, p=1)
LinearRegression()
AdaBoostRegressor(learning_rate=0.1, n_estimators=100, random_state=3)
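
The printed lines look like ordinary scikit-learn estimator representations, so the tuned configurations can presumably be rebuilt without EmeraldML, for example to refit them on new data. The sketch below assumes exactly that: it constructs the five estimators by hand from the output above and inspects a few of their hyperparameters.

```python
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Estimators rebuilt by hand from the printed output above, in ladder order.
tuned_models = [
    RandomForestRegressor(min_samples_split=5, n_estimators=500, random_state=3),
    DecisionTreeRegressor(max_depth=15, min_samples_split=10, random_state=3),
    KNeighborsRegressor(n_neighbors=3, p=1),
    LinearRegression(),
    AdaBoostRegressor(learning_rate=0.1, n_estimators=100, random_state=3),
]

# Hyperparameters worth a glance; not every estimator has every key.
keys_of_interest = ("n_estimators", "max_depth", "min_samples_split", "n_neighbors")

for model in tuned_models:
    params = model.get_params()
    shown = {key: params[key] for key in keys_of_interest if key in params}
    print(type(model).__name__, shown)
```

These objects are unfitted; calling `fit` on suitably encoded data would be the next step.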

Owner

Yusuf
