Relevance Vector Machine implementation using the scikit-learn API.

Overview

scikit-rvm

https://travis-ci.org/JamesRitchie/scikit-rvm.svg?branch=master https://coveralls.io/repos/JamesRitchie/scikit-rvm/badge.svg?branch=master&service=github

scikit-rvm is a Python module implementing the Relevance Vector Machine (RVM) machine learning technique using the scikit-learn API.

Quickstart

With NumPy, SciPy and scikit-learn available in your environment, install with:

pip install https://github.com/JamesRitchie/scikit-rvm/archive/master.zip

Regression is done with the RVR class:

>>> from skrvm import RVR
>>> X = [[0, 0], [2, 2]]
>>> y = [0.5, 2.5 ]
>>> clf = RVR(kernel='linear')
>>> clf.fit(X, y)
RVR(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='linear', n_iter=3000,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.49995187])

Classification is done with the RVC class:

>>> from skrvm import RVC
>>> from sklearn.datasets import load_iris
>>> clf = RVC()
>>> clf.fit(iris.data, iris.target)
RVC(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='rbf', n_iter=3000, n_iter_posterior=50,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.score(iris.data, iris.target)
0.97999999999999998

Theory

The RVM is a sparse Bayesian analogue to the Support Vector Machine, with a number of advantages:

  • It provides probabilistic estimates, as opposed to the SVM's point estimates.
  • Typically provides a sparser solution than the SVM, which tends to have the number of support vectors grow linearly with the size of the training set.
  • Does not need a complexity parameter to be selected in order to avoid overfitting.

However it is more expensive to train than the SVM, although prediction is faster and no cross-validation runs are required.

The RVM's original creator Mike Tipping provides a selection of papers offering detailed insight into the formulation of the RVM (and sparse Bayesian learning in general) on a dedicated page, along with a Matlab implementation.

Most of this implementation was written working from Section 7.2 of Christopher M. Bishops's Pattern Recognition and Machine Learning.

Contributors

Future Improvements

  • Implement the fast Sequential Sparse Bayesian Learning Algorithm outlined in Section 7.2.3 of Pattern Recognition and Machine Learning
  • Handle ill-conditioning errors more gracefully.
  • Implement more kernel choices.
  • Create more detailed examples with IPython notebooks.
Owner
James Ritchie
Postgraduate research student in machine learning
James Ritchie
database for artificial intelligence/machine learning data

AIDB v0.0.1 database for artificial intelligence/machine learning data Overview aidb is a database designed for large dataset for machine learning pro

Aarush Gupta 1 Oct 24, 2021
MLR - Machine Learning Research

Machine Learning Research 1. Project Topic 1.1. Exsiting research Benmark: https://paperswithcode.com/sota ACL anthology for NLP papers: http://www.ac

Charles 69 Oct 20, 2022
A Multipurpose Library for Synthetic Time Series Generation in Python

TimeSynth Multipurpose Library for Synthetic Time Series Please cite as: J. R. Maat, A. Malali, and P. Protopapas, “TimeSynth: A Multipurpose Library

278 Dec 26, 2022
pymc-learn: Practical Probabilistic Machine Learning in Python

pymc-learn: Practical Probabilistic Machine Learning in Python Contents: Github repo What is pymc-learn? Quick Install Quick Start Index What is pymc-

pymc-learn 196 Dec 07, 2022
ML Optimizers from scratch using JAX

Toy implementations of some popular ML optimizers using Python/JAX

Shreyansh Singh 38 Jul 29, 2022
MaD GUI is a basis for graphical annotation and computational analysis of time series data.

MaD GUI Machine Learning and Data Analytics Graphical User Interface MaD GUI is a basis for graphical annotation and computational analysis of time se

Machine Learning and Data Analytics Lab FAU 10 Dec 19, 2022
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

CatBoost 6.9k Jan 05, 2023
QML: A Python Toolkit for Quantum Machine Learning

QML is a Python2/3-compatible toolkit for representation learning of properties of molecules and solids.

176 Dec 09, 2022
Solve automatic numerical differentiation problems in one or more variables.

numdifftools The numdifftools library is a suite of tools written in _Python to solve automatic numerical differentiation problems in one or more vari

Per A. Brodtkorb 181 Dec 16, 2022
Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines

Federal University of Rio Grande do Norte Technology Center Department of Computer Engineering and Automation Machine Learning Based Systems Design Re

Ivanovitch Silva 81 Oct 18, 2022
SPCL 48 Dec 12, 2022
MLFlow in a Dockercontainer based on Azurite and Postgres

mlflow-azurite-postgres docker This is a MLFLow image which works with a postgres DB and a local Azure Blob Storage Instance (Azurite). This image is

2 May 29, 2022
Simple structured learning framework for python

PyStruct PyStruct aims at being an easy-to-use structured learning and prediction library. Currently it implements only max-margin methods and a perce

pystruct 666 Jan 03, 2023
Azure MLOps (v2) solution accelerators.

Azure MLOps (v2) solution accelerator Welcome to the MLOps (v2) solution accelerator repository! This project is intended to serve as the starting poi

Microsoft Azure 233 Jan 01, 2023
The project's goal is to show a real world application of image segmentation using k means algorithm

The project's goal is to show a real world application of image segmentation using k means algorithm

2 Jan 22, 2022
Kaggle Tweet Sentiment Extraction Competition: 1st place solution (Dark of the Moon team)

Kaggle Tweet Sentiment Extraction Competition: 1st place solution (Dark of the Moon team)

Artsem Zhyvalkouski 64 Nov 30, 2022
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

Benedek Rozemberczki 1.8k Jan 03, 2023
Machine Learning for Time-Series with Python.Published by Packt

Machine-Learning-for-Time-Series-with-Python Become proficient in deriving insights from time-series data and analyzing a model’s performance Links Am

Packt 124 Dec 28, 2022
Graphsignal is a machine learning model monitoring platform.

Graphsignal is a machine learning model monitoring platform. It helps ML engineers, MLOps teams and data scientists to quickly address issues with data and models as well as proactively analyze model

Graphsignal 143 Dec 05, 2022
Warren - Stock Price Predictor

Web app to predict closing stock prices in real time using Facebook's Prophet time series algorithm with a multi-variate, single-step time series forecasting strategy.

Kumar Nityan Suman 153 Jan 03, 2023