A simple and lightweight genetic algorithm for optimization of any machine learning model

Overview

geneticml

Actions Status PyPI License

This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model.

Installation

Use pip to install the package from PyPI:

pip install geneticml

Usage

This package provides a easy way to create estimators and perform the optimization with genetic algorithms. The example below describe in details how to create a simulation with genetic algorithms using evolutionary approach to train a sklearn.neural_network.MLPClassifier. A full list of examples could be found here.

from geneticml.optimizers import GeneticOptimizer
from geneticml.strategy import EvolutionaryStrategy
from geneticml.algorithms import EstimatorBuilder
from metrics import metric_accuracy
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris

# Creates a custom fit method
def fit(model, x, y):
    return model.fit(x, y)

# Creates a custom predict method
def predict(model, x):
    return model.predict(x)

if __name__ == "__main__":

    seed = 11412

    # Creates an estimator
    estimator = EstimatorBuilder()\
        .of(model_type=MLPClassifier)\
        .fit_with(func=fit)\
        .predict_with(func=predict)\
        .build()

    # Defines a strategy for the optimization
    strategy = EvolutionaryStrategy(
        estimator_type=estimator,
        parameters=parameters,
        retain=0.4,
        random_select=0.1,
        mutate_chance=0.2,
        max_children=2,
        random_state=seed
    )

    # Creates the optimizer
    optimizer = GeneticOptimizer(strategy=strategy)

    # Loads the data
    data = load_iris()

    # Defines the metric
    metric = metric_accuracy
    greater_is_better = True

    # Create the simulation using the optimizer and the strategy
    models = optimizer.simulate(
        data=data.data, 
        target=data.target,
        generations=generations,
        population=population,
        evaluation_function=metric,
        greater_is_better=greater_is_better,
        verbose=True
    )

The estimator is the way you define an algorithm or a class that will be used for model instantiation

estimator = EstimatorBuilder().of(model_type=MLPClassifier).fit_with(func=fit).predict_with(func=predict).build()

You need to speficy a custom fit and predict functions. These functions need to use the same signature than the below ones. This happens because the algorithm is generic and needs to know how to perform the fit and predict functions for the models.

# Creates a custom fit method
def fit(model, x, y):
    return model.fit(x, y)

# Creates a custom predict method
def predict(model, x):
    return model.predict(x)

Custom strategy

You can create custom strategies for the optimizers by extending the geneticml.strategy.BaseStrategy and implementing the execute(...) function.

class MyCustomStrategy(BaseStrategy):
    def __init__(self, estimator_type: Type[BaseEstimator]) -> None:
        super().__init__(estimator_type)

    def execute(self, population: List[Type[T]]) -> List[T]:
        return population

The custom strategies will allow you to create optimization strategies to archive your goals. We currently have the evolutionary strategy but you can define your own :)

Custom optimizer

You can create custom optimizers by extending the geneticml.optimizers.BaseOptimizer and implementing the simulate(...) function.

class MyCustomOptimizer(BaseOptimizer):
    def __init__(self, strategy: Type[BaseStrategy]) -> None:
        super().__init__(strategy)

    def simulate(self, data, target, verbose: bool = True) -> List[T]:
        """
        Generate a network with the genetic algorithm.

        Parameters:
            data (?): The data used to train the algorithm
            target (?): The targets used to train the algorithm
            verbose (bool): True if should verbose or False if not

        Returns:
            (List[BaseEstimator]): A list with the final population sorted by their loss
        """
        estimators = self._strategy.create_population()
        for x in estimators:
            x.fit(data, target)
            y_pred = x.predict(target)
        pass 

Custom optimizers will let you define how you want your algorithm to optimize the selected strategy. You can also combine custom strategies and optimizers to archive your desire objective.

Testing

The following are the steps to create a virtual environment into a folder named "venv" and install the requirements.

# Create virtualenv
python3 -m venv venv
# activate virtualenv
source venv/bin/activate
# update packages
pip install --upgrade pip setuptools wheel
# install requirements
python setup.py install

Tests can be run with python setup.py test when the virtualenv is active.

Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

A detailed overview on how to contribute can be found in the contributing guide. There is also an overview on GitHub.

If you are simply looking to start working with the geneticml codebase, navigate to the GitHub "issues" tab and start looking through interesting issues. Or maybe through using geneticml you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’...you can do something about it!

Feel free to ask questions on the mailing the contributors.

Changelog

1.0.3 - Included pytorch example

1.0.2 - Minor fixes on naming

1.0.1 - README fixes

1.0.0 - First release

You might also like...
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

CD) in machine learning projectsImplementing continuous integration & delivery (CI/CD) in machine learning projects

CML with cloud compute This repository contains a sample project using CML with Terraform (via the cml-runner function) to launch an AWS EC2 instance

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.

Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.

Iris-Heroku - Putting a Machine Learning Model into Production with Flask and Heroku
Iris-Heroku - Putting a Machine Learning Model into Production with Flask and Heroku

Puesta en Producción de un modelo de aprendizaje automático con Flask y Heroku L

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. 10x Larger Models 10x Faster Trainin

Machine Learning Model to predict the payment date of an invoice when it gets created in the system.

Payment-Date-Prediction Machine Learning Model to predict the payment date of an invoice when it gets created in the system.

Python package for machine learning for healthcare using a OMOP common data model

This library was developed in order to facilitate rapid prototyping in Python of predictive machine-learning models using longitudinal medical data from an OMOP CDM-standard database.

Comments
  • feature/data_sampling

    feature/data_sampling

    We added support to run your own data sampling (e.g., imblearn.SMOTE) and use the genetic algorithms to find the best set parameters for them. Also, you can find the best set of parameters for your machine learning model at same time that find the best minority class size that maximizes the model score

    opened by albarsil 0
Releases(1.0.8)
Owner
Allan Barcelos
Allan Barcelos
scikit-multimodallearn is a Python package implementing algorithms multimodal data.

scikit-multimodallearn is a Python package implementing algorithms multimodal data. It is compatible with scikit-learn, a popul

12 Jun 29, 2022
#30DaysOfStreamlit is a 30-day social challenge for you to build and deploy Streamlit apps.

30 Days Of Streamlit 🎈 This is the official repo of #30DaysOfStreamlit — a 30-day social challenge for you to learn, build and deploy Streamlit apps.

Streamlit 53 Jan 02, 2023
决策树分类与回归模型的实现和可视化

DecisionTree 决策树分类与回归模型,以及可视化 DecisionTree ID3 C4.5 CART 分类 回归 决策树绘制 分类树 回归树 调参 剪枝 ID3 ID3决策树是最朴素的决策树分类器: 无剪枝 只支持离散属性 采用信息增益准则 在data.py中,我们记录了一个小的西瓜数据

Welt Xing 10 Oct 22, 2022
A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

Davis E. King 11.6k Jan 02, 2023
Scikit-learn compatible wrapper of the Random Bits Forest program written by (Wang et al., 2016)

sklearn-compatible Random Bits Forest Scikit-learn compatible wrapper of the Random Bits Forest program written by Wang et al., 2016, available as a b

Tamas Madl 8 Jul 24, 2021
A Python step-by-step primer for Machine Learning and Optimization

early-ML Presentation General Machine Learning tutorials A Python step-by-step primer for Machine Learning and Optimization This github repository gat

Dimitri Bettebghor 8 Dec 01, 2022
Pydantic based mock data generation

This library offers powerful mock data generation capabilities for pydantic based models. It can also be used with other libraries that use pydantic as a foundation, for example SQLModel, Beanie and

Na'aman Hirschfeld 396 Dec 28, 2022
Timeseries analysis for neuroscience data

=================================================== Nitime: timeseries analysis for neuroscience data ===============================================

NIPY developers 212 Dec 09, 2022
Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining

**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.** S

Sebastian Raschka 4k Dec 30, 2022
Extreme Learning Machine implementation in Python

Python-ELM v0.3 --- ARCHIVED March 2021 --- This is an implementation of the Extreme Learning Machine [1][2] in Python, based on scikit-learn. From

David C. Lambert 511 Dec 20, 2022
A linear regression model for house price prediction

Linear_Regression_Model A linear regression model for house price prediction. This code is using these packages, so please make sure your have install

ShawnWang 1 Nov 29, 2021
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4.2k Jan 01, 2023
Forecast dynamically at scale with this unique package. pip install scalecast

🌄 Scalecast: Dynamic Forecasting at Scale About This package uses a scaleable forecasting approach in Python with common scikit-learn and statsmodels

Michael Keith 158 Jan 03, 2023
Decision Tree Regression algorithm implemented on Python from scratch.

Decision_Tree_Regression I implemented the decision tree regression algorithm on Python. Unlike regular linear regression, this algorithm is used when

1 Dec 22, 2021
TensorFlow implementation of an arbitrary order Factorization Machine

This is a TensorFlow implementation of an arbitrary order (=2) Factorization Machine based on paper Factorization Machines with libFM. It supports: d

Mikhail Trofimov 785 Dec 21, 2022
hgboost - Hyperoptimized Gradient Boosting

hgboost is short for Hyperoptimized Gradient Boosting and is a python package for hyperparameter optimization for xgboost, catboost and lightboost using cross-validation, and evaluating the results o

Erdogan Taskesen 34 Jan 03, 2023
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio

learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your

BDFD 6 Nov 05, 2022
The Emergence of Individuality

The Emergence of Individuality

16 Jul 20, 2022
A collection of Machine Learning Models To Web Api which are built on open source technologies/frameworks like Django, Flask.

Author Ibrahim Koné From-Machine-Learning-Models-To-WebAPI A collection of Machine Learning Models To Web Api which are built on open source technolog

Ibrahim Koné 2 May 24, 2022
Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning

Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning

Microsoft 43.4k Jan 04, 2023