Automated machine learning: Review of the state-of-the-art and opportunities for healthcare

Selected highlights from the 2020 AutoML review (https://doi.org/10.1016/j.artmed.2020.101822), which surveyed over 2,160 works related to the field of automated machine learning.

A curated list of automated feature engineering tools for automated machine learning

Full details in https://www.sciencedirect.com/science/article/pii/S0933365719310437?via%3Dihub#tbl0005

| Method | Work | Feature Engineering Technique | Used by (number of works) |
|---|---|---|---|
| Deep Feature Synthesis | LINK | Expand-Reduce | 151 |
| Explore Kit | LINK | Expand-Reduce | 53 |
| One Button Machine | LINK | Expand-Reduce | 32 |
| AutoLearn | LINK | Expand-Reduce | 16 |
| GP Feature Construction | LINK | Genetic Programming | 68 |
| Cognito | LINK | Hierarchical Greedy Search | 38 |
| RLFE | LINK | Reinforcement Learning | 21 |
| LFE | LINK | Meta-Learning | 34 |
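
To make the "Expand-Reduce" label in the table more concrete, here is a minimal sketch of the general idea: first expand the feature set with candidate transformations, then reduce it by keeping only the candidates that score best against the target. This is an illustration under stated assumptions, not the implementation of Deep Feature Synthesis or any other tool listed. The function names (`expand`, `reduce_features`), the choice of pairwise products as the only transformation, and the use of absolute Pearson correlation as the selection score are all hypothetical choices made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def expand(columns):
    """Expand step (assumed transformation): add the product of every pair of base columns."""
    expanded = dict(columns)
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        expanded[f"{name_a}*{name_b}"] = [u * v for u, v in zip(col_a, col_b)]
    return expanded


def reduce_features(columns, target, k):
    """Reduce step (assumed score): keep the k columns most correlated with the target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Toy data, invented for illustration: the target is exactly the product a*b.
columns = {"a": [1, 2, 3, 4], "b": [4, 1, 3, 2]}
target = [4, 2, 9, 8]

selected = reduce_features(expand(columns), target, k=2)
print(selected)
```

On this toy data the constructed feature `a*b` ranks first, because it matches the target exactly. Real tools differ in which transformations they generate and how they prune them.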

Automated machine learning pipeline optimizers

Full details in https://www.sciencedirect.com/science/article/pii/S0933365719310437?via%3Dihub#tbl0010

| Method | Work | Optimization Algorithm | Data Pre-Processing / Feature Engineering / Model Selection / Hyperparameter Optimization / Ensemble Learning / Meta-Learning | Used by (number of works) |
|---|---|---|---|---|
| Auto-Weka | LINK | Bayesian Optimization (SMAC) | ✔️ ✔️ ✔️ | 703 |
| Auto-Sklearn | LINK | Joint Bayesian Optimization and Bandit Search (BOHB) | ✔️ ✔️ ✔️ ✔️ ✔️ | 542 |
| TPOT | LINK | Evolutionary Algorithm | ✔️ ✔️ ✔️ ✔️ | 84 |
| TuPAQ | LINK | Bandit Search | ✔️ ✔️ | 94 |
| ATM | LINK | Joint Bayesian Optimization and Bandit Search | ✔️ ✔️ ✔️ | 29 |
| Automatic Frankensteining | LINK | Bayesian Optimization | ✔️ ✔️ ✔️ | 12 |
| ML-Plan | LINK | Hierarchical Task Networks (HTN) | ✔️ ✔️ ✔️ | 24 |
| Autostacker | LINK | Evolutionary Algorithm | ✔️ ✔️ ✔️ | 18 |
| AlphaD3M | LINK | Reinforcement Learning/Monte Carlo Tree Search | ✔️ ✔️ ✔️ | 8 |
| Collaborative Filtering | LINK | Probabilistic Matrix Factorization | ✔️ ✔️ ✔️ ✔️ | 29 |

Neural Architecture Search algorithms, based on performance on the CIFAR-10 dataset

Full details in https://www.sciencedirect.com/science/article/pii/S0933365719310437?via%3Dihub#tbl0015

| NAS Algorithm | Work | Search Space | Search Strategy | Performance Estimation Strategy | Number of Parameters | Search Time (GPU-days) | Test Error (%) |
|---|---|---|---|---|---|---|---|
| Large-scale Evolution | LINK | Feed-Forward Networks | Evolutionary Algorithm | Naive Training and Validation | 5.4M | 2600 | 5.4 |
| EAS | LINK | Feed-Forward Networks | Reinforcement Learning and Network Morphism | Short Training and Validation | 23.4M | 10 | 4.23 |
| Hierarchical Evolution | LINK | Cell Motifs | Evolutionary Algorithm | Training and Validation on proposed CNN Cell | 15.7M | 300 | 3.75 |
| NAS v3 | LINK | Multi-branched Networks | Reinforcement Learning | Naive Training and Validation | 37.4M | 22400 | 3.65 |
| PNAS | LINK | Cell Motifs | Sequential Model-Based Optimization (SMBO) | Performance Prediction | 3.2M | 225 | 3.41 |
| ENAS | LINK | Cell Motifs | Reinforcement Learning | One Shot | 4.6M | 0.45 | 2.89 |
| ResNet + Regularization | LINK | HUMAN BASELINE | HUMAN BASELINE | HUMAN BASELINE | 26.2M | - | 2.86 |
| DARTS | LINK | Cell Motifs | Gradient-Based Optimization | Training and Validation on proposed CNN Cell | 3.4M | 4 | 2.83 |
| NASNet-A | LINK | Cell Motifs | Reinforcement Learning | Naive Training and Validation | 3.3M | 2000 | 2.65 |
| EENA | LINK | Cell Motifs | Evolutionary Algorithm | Performance Prediction | 8.5M | 0.65 | 2.56 |
| Path-Level EAS | LINK | Cell Motifs | Reinforcement Learning | Short Training and Validation | 14.3M | 200 | 2.30 |
| NAO | LINK | Cell Motifs | Gradient-Based Optimization | Performance Prediction | 128M | 200 | 2.11 |
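
The table trades off two quantities: search cost (GPU-days) and CIFAR-10 test error. One way to read it is to ask which algorithms are not beaten on both axes by any other entry. The sketch below computes that Pareto front from the search-time and test-error columns above. The helper name `pareto_front` is hypothetical, and the human baseline (ResNet + Regularization) is left out because the table reports no search time for it.

```python
# (name, search time in GPU-days, test error in %), copied from the table above.
# ResNet + Regularization is omitted: it has no search time to compare.
NAS_RESULTS = [
    ("Large-scale Evolution", 2600, 5.4),
    ("EAS", 10, 4.23),
    ("Hierarchical Evolution", 300, 3.75),
    ("NAS v3", 22400, 3.65),
    ("PNAS", 225, 3.41),
    ("ENAS", 0.45, 2.89),
    ("DARTS", 4, 2.83),
    ("NASNet-A", 2000, 2.65),
    ("EENA", 0.65, 2.56),
    ("Path-Level EAS", 200, 2.30),
    ("NAO", 200, 2.11),
]


def pareto_front(results):
    """Return entries that no other entry matches or beats on both time and error."""
    front = []
    for name, time, error in results:
        dominated = any(
            other_time <= time
            and other_error <= error
            and (other_time < time or other_error < error)
            for _, other_time, other_error in results
        )
        if not dominated:
            front.append((name, time, error))
    # Sort by search time so the cheapest search comes first.
    return sorted(front, key=lambda row: row[1])


for name, time, error in pareto_front(NAS_RESULTS):
    print(f"{name}: {time} GPU-days, {error}% test error")
```

On these numbers the front consists of ENAS, EENA, and NAO. This comparison ignores the parameter-count column, so it is only one possible reading of the table.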