Traingenerator 🧙 A web app to generate template code for machine learning ✨

Overview

Traingenerator

🧙   A web app to generate template code for machine learning

Gitter Heroku Code style: black



🎉 Traingenerator is now live! 🎉

Try it out:
https://traingenerator.jrieke.com


Generate custom template code for PyTorch & sklearn, using a simple web UI built with streamlit. Traingenerator offers multiple options for preprocessing, model setup, training, and visualization (using Tensorboard or comet.ml). It exports to .py, Jupyter Notebook, or Google Colab. The perfect tool to jumpstart your next machine learning project!


For updates, follow me on Twitter, and if you like this project, please consider sponsoring




Adding new templates

You can add your own template in 4 easy steps (see below), without changing any code in the app itself. Your new template will be automatically discovered by Traingenerator and shown in the sidebar. That's it! 🎈

Want to share your magic? 🧙 PRs are welcome! Please have a look at CONTRIBUTING.md and write on Gitter.

Some ideas for new templates: Keras/Tensorflow, Pytorch Lightning, object detection, segmentation, text classification, ...

  1. Create a folder under ./templates. The folder name should be the task that your template solves (e.g. Image classification). Optionally, you can add a framework name (e.g. Image classification_PyTorch). Both names are automatically shown in the first two dropdowns in the sidebar (see image). Tip: Copy the example template to get started more quickly.
  2. Add a file sidebar.py to the folder (see example). It needs to contain a method show(), which displays all template-specific streamlit components in the sidebar (i.e. everything below Task) and returns a dictionary of user inputs.
  3. Add a file code-template.py.jinja to the folder (see example). This Jinja2 template is used to generate the code. You can write normal Python code in it and modify it (through Jinja) based on the user inputs in the sidebar (e.g. insert a parameter value from the sidebar or show different code parts based on the user's selection).
  4. Optional: Add a file test-inputs.yml to the folder (see example). This simple YAML file should define a few possible user inputs that can be used for testing. If you run pytest (see below), it will automatically pick up this file, render the code template with its values, and check that the generated code runs without errors. This file is optional – but it's required if you want to contribute your template to this repo.

Installation

Note: You only need to install Traingenerator if you want to contribute or run it locally. If you just want to use it, go here.

git clone https://github.com/jrieke/traingenerator.git
cd traingenerator
pip install -r requirements.txt

Optional: For the "Open in Colab" button to work you need to set up a Github repo where the notebook files can be stored (Colab can only open public files if they are on Github). After setting up the repo, create a file .env with content:

GITHUB_TOKEN=<your-github-access-token>
REPO_NAME=<user/notebooks-repo>

If you don't set this up, the app will still work but the "Open in Colab" button will only show an error message.

Running locally

streamlit run app/main.py

Make sure to run always from the traingenerator dir (not from the app dir), otherwise the app will not be able to find the templates.

Deploying to Heroku

First, install heroku and login. To create a new deployment, run inside traingenerator:

heroku create
git push heroku main
heroku open

To update the deployed app, commit your changes and run:

git push heroku main

Optional: If you set up a Github repo to enable the "Open in Colab" button (see above), you also need to run:

heroku config:set GITHUB_TOKEN=
   
    
heroku config:set REPO_NAME=
    

    
   

Testing

First, install pytest and required plugins via:

pip install -r requirements-dev.txt

To run all tests:

pytest ./tests

Note that this only tests the code templates (i.e. it renders them with different input values and makes sure that the code executes without error). The streamlit app itself is not tested at the moment.

You can also test an individual template by passing the name of the template dir to --template, e.g.:

pytest ./tests --template "Image classification_scikit-learn"

The mage image used in Traingenerator is from Twitter's Twemoji library and released under Creative Commons Attribution 4.0 International Public License.

Owner
Johannes Rieke
Product manager dev experience @streamlit
Johannes Rieke
Test symmetries with sklearn decision tree models

Test symmetries with sklearn decision tree models Setup Begin from an environment with a recent version of python 3. source setup.sh Leave the enviro

Rupert Tombs 2 Jul 19, 2022
This jupyter notebook project was completed by me and my friend using the dataset from Kaggle

ARM This jupyter notebook project was completed by me and my friend using the dataset from Kaggle. The world Happiness 2017, which ranks 155 countries

1 Jan 23, 2022
A collection of neat and practical data science and machine learning projects

Data Science A collection of neat and practical data science and machine learning projects Explore the docs » Report Bug · Request Feature Table of Co

Will Fong 2 Dec 10, 2021
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and many other libraries. Documenta

2.5k Jan 07, 2023
A Multipurpose Library for Synthetic Time Series Generation in Python

TimeSynth Multipurpose Library for Synthetic Time Series Please cite as: J. R. Maat, A. Malali, and P. Protopapas, “TimeSynth: A Multipurpose Library

278 Dec 26, 2022
Time series changepoint detection

changepy Changepoint detection in time series in pure python Install pip install changepy Examples from changepy import pelt from cha

Rui Gil 92 Nov 08, 2022
Case studies with Bayesian methods

Case studies with Bayesian methods

Baze Petrushev 8 Nov 26, 2022
Transform ML models into a native code with zero dependencies

m2cgen (Model 2 Code Generator) - is a lightweight library which provides an easy way to transpile trained statistical models into a native code

Bayes' Witnesses 2.3k Jan 03, 2023
Regularization and Feature Selection in Least Squares Temporal Difference Learning

Regularization and Feature Selection in Least Squares Temporal Difference Learning Description This is Python implementations of Least Angle Regressio

Mina Parham 0 Jan 18, 2022
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.

A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.

Daniel Formoso 5.7k Dec 30, 2022
50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster

[Due to the time taken @ uni, work + hell breaking loose in my life, since things have calmed down a bit, will continue commiting!!!] [By the way, I'm

Daniel Han-Chen 1.4k Jan 01, 2023
This is a Machine Learning model which predicts the presence of Diabetes in Patients

Diabetes Disease Prediction This is a machine Learning mode which tries to determine if a person has a diabetes or not. Data The dataset is in comma s

Edem Gold 4 Mar 16, 2022
Machine learning that just works, for effortless production applications

Machine learning that just works, for effortless production applications

Elisha Yadgaran 16 Sep 02, 2022
inding a method to objectively quantify skill versus chance in games, using reinforcement learning

Skill-vs-chance-games-analysis - Finding a method to objectively quantify skill versus chance in games, using reinforcement learning

Marcus Chiam 4 Nov 19, 2022
Extreme Learning Machine implementation in Python

Python-ELM v0.3 --- ARCHIVED March 2021 --- This is an implementation of the Extreme Learning Machine [1][2] in Python, based on scikit-learn. From

David C. Lambert 511 Dec 20, 2022
onelearn: Online learning in Python

onelearn: Online learning in Python Documentation | Reproduce experiments | onelearn stands for ONE-shot LEARNning. It is a small python package for o

15 Nov 06, 2022
Module for statistical learning, with a particular emphasis on time-dependent modelling

Operating system Build Status Linux/Mac Windows tick tick is a Python 3 module for statistical learning, with a particular emphasis on time-dependent

X - Data Science Initiative 410 Dec 14, 2022
A collection of interactive machine-learning experiments: 🏋️models training + 🎨models demo

🤖 Interactive Machine Learning experiments: 🏋️models training + 🎨models demo

Oleksii Trekhleb 1.4k Jan 06, 2023
An easier way to build neural search on the cloud

Jina is geared towards building search systems for any kind of data, including text, images, audio, video and many more. With the modular design & multi-layer abstraction, you can leverage the effici

Jina AI 17k Jan 01, 2023
The Simpsons and Machine Learning: What makes an Episode Great?

The Simpsons and Machine Learning: What makes an Episode Great? Check out my Medium article on this! PROBLEM: The Simpsons has had a decline in qualit

1 Nov 02, 2021