Tools for Optuna, MLflow and the integration of both.

Last update: Nov 20, 2022

Overview

HPOflow - Sphinx DOC

Detailed documentation with examples can be found here: Sphinx DOC

Maintainers
Installation
Support and Feedback
Reporting Security Vulnerabilities
Contribution
Code of Conduct
Licensing

Maintainers

This project is maintained by the One Conversation team of Deutsche Telekom AG.

The main components are:

hpoflow.OptunaMLflow:
A wrapper to use Optuna and log to MLflow at the same time.
hpoflow.OptunaMLflowCallback:
Class inheriting from transformers.TrainerCallback that integrates with OptunaMLflow to send the logs to MLflow and Optuna during model training.
hpoflow.SignificanceRepeatedTrainingPruner:
An Optuna pruner that uses statistical significance (a t-test serving as a heuristic) to stop unpromising trials early, avoiding unnecessary repeated training during cross-validation.
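
The idea behind a significance-based pruner can be illustrated without Optuna. The sketch below is not hpoflow's implementation. It is a stand-alone illustration that assumes per-fold validation scores are available, that higher scores are better, and that a fixed critical t-value (the hypothetical parameter t_critical) is an acceptable stand-in for a proper p-value. The names welch_t and should_prune are hypothetical as well.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples (positive when mean(a) > mean(b))."""
    diff = statistics.mean(a) - statistics.mean(b)
    # Standard error of the difference, allowing unequal variances.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0.0:
        # Both samples are constant: no spread to compare against.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def should_prune(best_scores, trial_scores, t_critical=2.0, min_folds=3):
    """Return True if the running trial looks significantly worse than the best one.

    best_scores:  fold scores of the best trial so far
    trial_scores: fold scores collected so far for the running trial
    t_critical:   assumed threshold; a real pruner would derive it from a p-value
    """
    if len(trial_scores) < min_folds or len(best_scores) < 2:
        return False  # too little evidence to decide
    return welch_t(best_scores, trial_scores) > t_critical


best = [0.90, 0.91, 0.89, 0.92, 0.90]
print(should_prune(best, [0.70, 0.72, 0.71]))  # clearly worse after three folds
print(should_prune(best, [0.90, 0.89, 0.91]))  # comparable, keep training
```

The point of the heuristic is that a trial which is already clearly worse after a few folds does not need the remaining folds to be trained.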

Installation

HPOflow is available on the Python Package Index (PyPI). It can be installed with pip:

$ pip install hpoflow

Some additional dependencies might be necessary.

To use hpoflow.optuna_mlflow.OptunaMLflow:

$ pip install mlflow GitPython

To use hpoflow.optuna_transformers.OptunaMLflowCallback:

$ pip install mlflow GitPython transformers

To install all optional dependencies use:

$ pip install "hpoflow[optional]"
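
To see which of the optional packages named above are already present in an environment, a small check like the following can help. It is a sketch using only the standard library. The function name missing_optional_packages is hypothetical and not part of hpoflow. Note that GitPython is imported under the module name git.

```python
import importlib.util

# pip package name -> import name (GitPython is imported as "git").
OPTIONAL_MODULES = {
    "mlflow": "mlflow",
    "GitPython": "git",
    "transformers": "transformers",
}


def missing_optional_packages(modules=OPTIONAL_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_optional_packages()
    if missing:
        print("Missing optional packages:", ", ".join(missing))
    else:
        print("All optional packages are installed.")
```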

Support and Feedback

The following channels are available for discussions, feedback, and support requests:

Reporting Security Vulnerabilities

This project is built with security and data privacy in mind to ensure your data is safe. We are grateful to security researchers and users who report a vulnerability to us first. To ensure that your request is handled in a timely manner and that non-disclosure of vulnerabilities can be assured, please follow the guideline below.

Please do not report security vulnerabilities directly on GitHub. GitHub Issues can be publicly seen and therefore would result in a direct disclosure.

Please address questions about data privacy, security concepts, and other media requests to the [email protected] mailbox.

Contribution

Our commitment to open source means that we are enabling - in fact encouraging - all interested parties to contribute and become part of our developer community.

Contributions and feedback are encouraged and always welcome. For more information about how to contribute, see our Contribution Guidelines.

Code of Conduct

This project has adopted the Contributor Covenant as our code of conduct. Please see the details in our Contributor Covenant Code of Conduct. All contributors must abide by the code of conduct.

Licensing

Licensed under the MIT License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License by reviewing the file LICENSE in the repository.

Comments

review README.md and CONTRIBUTING.md

is there something missing? maybe compare with optuna and transformers

spelling

idiomatic english

consistency

correctness

links ok?

...

PS: The real documentation is still missing; this is a known issue.
opened by PhilipMay 12
add typing in optuna_transformers

@twolffpiggott can you please tell me the type of this?

https://github.com/telekom/HPOflow/blob/e2b0943218af419a79ce95e60b67c9a4c2477349/hpoflow/optuna_transformers.py#L47

opened by PhilipMay 6
add `transformers.py`

@twolffpiggott should we add this here or to another project we open source?

https://github.com/PhilipMay/mltb/blob/master/mltb/integration/transformers.py
enhancement

opened by PhilipMay 6
Create Sphinx documentation page
[x] setup

[x] make GH action

[x] setup page

[x] change styling to telekom style

~~switch to MD~~

[x] add more content

[x] link from README to page

[x] link from pypi to GH page

[x] add imprint (legal notice)

[x] remove strange mouse over image effect

~~add version info~~

documentation
opened by PhilipMay 4
Problems with direct `_imports.check()` call

When the __init__.py imports OMLflowCallback, the optuna_transformers.py module is executed. This runs the _imports.check() call, which raises an exception if transformers or mlflow is not installed. That should be avoided.

See here: https://github.com/telekom/HPOflow/blob/d1cce5cbc2a84634d1484a053286000dda05b681/hpoflow/optuna_transformers.py#L11-L17

The solution would be to put the _imports.check() call into the constructor. But that is not possible because OMLflowCallback inherits from a transformers class, which must already be importable when the class is defined.

The only solution I have is to put OMLflowCallback into a factory function that does the _imports.check() and then creates an OMLflowCallback.
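
A minimal sketch of that factory idea follows. It is not the project's actual code: require and create_callback are hypothetical names, and the base class is passed in by module and class name only so that the sketch can be run without transformers installed. The dependency check and the class definition both happen when the factory is called, not when the package is imported.

```python
import importlib


def require(*module_names):
    """Raise ImportError listing every module that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    if missing:
        raise ImportError("missing optional dependencies: " + ", ".join(missing))


def create_callback(trial, base_module="transformers", base_name="TrainerCallback"):
    """Hypothetical factory: check dependencies on call, then build the subclass."""
    require(base_module)  # deferred check; nothing runs at package import time
    base = getattr(importlib.import_module(base_module), base_name)

    class Callback(base):  # defined only once the base class is importable
        def __init__(self, trial):
            super().__init__()
            self.trial = trial

    return Callback(trial)
```

One trade-off of this pattern is that every call defines a new class object, so isinstance checks against a single shared class no longer work unless the class is cached.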

@twolffpiggott what do you think?
bug

opened by PhilipMay 3
Flake8 ignore list for Black compatibility
Flake8 raises a warning for "E203" when it encounters a Black decision to insert whitespace before : in slicing syntax.

Black's behaviour is more correct here, so my suggestion is to add "E203" to the flake8 config ignore list.

i.e. in setup.cfg:

[flake8]
...
extend-ignore = E203
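
For illustration, this is the kind of slice that triggers the warning. The variable names are made up for the example. Both spellings select the same elements; only the whitespace differs.

```python
ham = list(range(10))
lower, upper, offset = 1, 4, 2

# Black puts spaces around the colon when the slice bounds are expressions.
# flake8 reports the space before the colon as E203 (whitespace before ':').
black_style = ham[lower + offset : upper + offset]

# The same slice written without the spaces; the behaviour is identical.
compact_style = ham[lower + offset:upper + offset]

print(black_style == compact_style)
```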
opened by twolffpiggott 3
Simple Example?

I don't understand how to use this package. Could you provide a basic example? I don't understand the import_structure and how it relates to importing the modules? Thanks

opened by jmrichardson 2
WIP prefix in contrib file

Should this

Create Work In Progress [WIP] pull requests only if you need clarification or an explicit review before you can continue your work item.

be more like this

Add a [WIP] prefix on your pull request name if you need clarification or an explicit review before you can continue your work item.

documentation

opened by PhilipMay 2

Releases(0.1.4)

0.1.4(Aug 14, 2022)

0.1.3(Jul 17, 2022)
add Python 3.10

0.1.2(Aug 6, 2021)
add all Optuna versions - see https://github.com/telekom/HPOflow/issues/87

0.1.1(Aug 3, 2021)

hotfix for "fix Optuna issue since new release" #87
0.1.0(Jul 20, 2021)
First real Release: 0.1.0

Main components are:

OptunaMLflow - Wrapper to log to Optuna and MLflow at the same time.

OptunaMLflowCallback - Integration of Optuna and MLflow for Transformers.

SignificanceRepeatedTrainingPruner - Optuna pruner which uses statistical significance as a heuristic for decision-making.

0.1.0rc3(Jul 15, 2021)


Owner

Telekom Open Source Software

published by Deutsche Telekom AG and partner companies

GitHub Repository https://telekom.github.io/HPOflow/
