Supervised domain-agnostic prediction framework for probabilistic modelling

Overview

skpro

PyPI version Build Status License

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data points.

The package offers a variety of features and specifically allows for

  • the implementation of probabilistic prediction strategies in the supervised contexts
  • comparison of frequentist and Bayesian prediction methods
  • strategy optimization through hyperparamter tuning and ensemble methods (e.g. bagging)
  • workflow automation

List of developers and contributors

Documentation

The full documentation is available here.

Installation

Installation is easy using Python's package manager

$ pip install skpro

Contributing & Citation

We welcome contributions to the skpro project. Please read our contribution guide.

If you use skpro in a scientific publication, we would appreciate citations.

Comments
  • Distributions as return objects

    Distributions as return objects

    Re-opening the sub-issue opened in #3 and commented upon by @murphyk

    Question: should skpro's predict methods return a vector of distribution objects? For example, using the distributions from scipy.stats which implement methods pdf, cdf, mean, var, etc.

    Pro:

    • this would be using an existing, consolidated, and well-supported interface
    • it might be easier to use
    • it might be easier to understand

    Contra:

    • mixture types are not supported
    • l2 norm is not supported (as would be needed for squared/Gneiting loss)
    • mixed distributions on the reals, especially empirical distributions (weighted sum of deltas) which are returned by Bayesian packages are not supported
    • vectors of distributions are not supported, alternatively Cartesian products of distributions
    • this is not the status quo
    help wanted 
    opened by fkiraly 11
  • documentation: np.mean(y_pred) does not work

    documentation: np.mean(y_pred) does not work

    I'm following along with this intro example.. However this line fails

    (numpy.mean(y_pred) * 2).shape
    

    Error below (seems to be because Distribution objects don't support the mean() function but instead insist on obscurely calling it point!)

    np.mean(y_pred)
    Traceback (most recent call last):
    
      File "<ipython-input-38-19819be87ab5>", line 1, in <module>
        np.mean(y_pred)
    
      File "/home/kpmurphy/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2920, in mean
        out=out, **kwargs)
    
      File "/home/kpmurphy/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py", line 75, in _mean
        ret = umr_sum(arr, axis, dtype, out, keepdims)
    
    TypeError: unsupported operand type(s) for +: 'Distribution' and 'Distribution'
    
    opened by murphyk 3
  • First example: 'utils' not found

    First example: 'utils' not found

    The first example in your documentation (DensityBaseline) does not run right on my machine: it throws a 'module not found' exception at the call to 'utils'.

    This might be a python version problem (I am using 3.6), so perhaps it's not an error in the normal sense - though I don't see any specification that the package required a particular python version. Apologies if I missed it: in any case, I fixed it by importing matplotlib instead: i.e.

    import matplotlib.pyplot as plt plt.scatter(y_test, y_pred)

    instead of:

    import utils utils.plot_performance(y_test, y_pred)

    opened by Thomas-M-H-Hope 2
  • problem in loading the skpro

    problem in loading the skpro

    It has been 2 days that I am trying to import skpro. But I can not I keep getting this error:

    cannot import name 'six' from 'sklearn.externals' (C:\Users\My Book\anaconda3\lib\site-packages\sklearn\externals_init_.py)

    opened by honestee 1
  • (wish)list of probabilistic regressors to implement or to interface

    (wish)list of probabilistic regressors to implement or to interface

    A wishlist for probabilistic regression methods to implement or interface. This is partly copied from the R counterpart https://github.com/mlr-org/mlr3proba/issues/32 . Number of stars at the end is estimated difficulty or time investment.

    GLM

    • [ ] generalized linear model(s) with regression link, e.g., Gaussian *
    • [ ] generalized linear model(s) with count link, e.g., Poisson *
    • [ ] heteroscedastic linear regression ***
    • [ ] Bayesian GLM where conjugate priors are available, e.g., GLM with Gaussian link ***

    KRR aka Gaussian process regression

    • [ ] vanilla kernel ridge regression with fixed kernel parameters and variance *
    • [ ] kernel ridge regression with MLE for kernel parameters and regularization parameter **
    • [ ] heteroscedastic KRR or Gaussian processes ***

    CDE

    • [ ] variants of conditional density estimation (Nadaraya-Watson type) **
    • [ ] reduction to density estimation by binning of input variables, then apply unconditional density estimation **

    Tree-based

    • [ ] probabilistic regression trees **

    Neural networks

    • [ ] interface tensorflow probability - some hard-coded NN architectures **
    • [ ] generic tensorflow probability interface - some hard-coded NN architectures ***

    Bayesian toolboxes

    • [ ] generic pymc3 interface ***
    • [ ] generic pyro interface ****
    • [ ] generic Stan interface ****
    • [ ] generic JAGS interface ****
    • [ ] generic BUGS interface ****
    • [ ] generic Bayesian interface - prior-valued hyperparameters *****

    Pipeline elements for target transformation

    • [ ] distr fixed target transformation **
    • [ ] distr predictive target calibration **

    Composite techniques, reduction to deterministic regression

    • [ ] stick mean, sd, from a deterministic regressor which already has these as return types into some location/scale distr family (Gaussian, Laplace) *
    • [ ] use model 1 for the mean, model 2 fit to residuals (squared, absolute, or log), put this in some location/scale distr family (Gaussian, Laplace) **
    • [ ] upper/lower thresholder for a regression prediction, to use as a pipeline element for a forced lower variance bound **
    • [ ] generic parameter prediction by elicitation, output being plugged into parameters of a distr object not necessarily scale/location ****
    • [ ] reduction via bootstrapped sampling of a determinstic regressor **

    Ensembling type pipeline elements and compositors

    • [ ] simple bagging, averaging of pdf/cdf **
    • [ ] probabilistic boosting ***
    • [ ] probabilistic stacking ***

    baselines

    • [ ] always predict a Gaussian with mean = training mean, var = training var *
    • [ ] IMPORTANT as featureless baseline: reduction to distr/density estimation to produce an unconditional probabilistic regressor **
    • [ ] IMPORTANT as deterministic style baseline: reduction to deterministic regression, mean = prediction by det.regressor, var = training sample var, distr type = Gaussian (or Laplace) **

    Other reduction from/to probabilistic regression

    • [ ] reducing deterministic regression to probabilistic regression - take mean, median or mode **
    • [ ] reduction(s) to quantile regression, use predictive quantiles to make a distr ***
    • [ ] reducing deterministic (quantile) regression to probabilistic regression - take quantile(s) **
    • [ ] reducing interval regression to probabilistic regression - take mean/sd, or take quantile(s) **
    • [ ] reduction to survival, as the sub-case of no censoring **
    • [ ] reduction to classification, by binning ***
    good first issue 
    opened by fkiraly 0
  • skpro-refactoring (version-2)

    skpro-refactoring (version-2)

    See below some comments/description of the coming refactoring contents :

    • Distribution classes refactoring in a more OOD way (see. skpro->distribution)
    • Losse functions (see. metrics->distribution)
    • Estimators (see. metrics->distribution)

    Some descriptive notebooks (in docs->notebooks) and a full set of unit test (in tests) are also available.

    opened by jesellier 24
Releases(v1.0.1-beta)
Owner
The Alan Turing Institute
The UK's national institute for data science and artificial intelligence.
The Alan Turing Institute
A curated list of awesome Machine Learning frameworks, libraries and software.

Awesome Machine Learning A curated list of awesome machine learning frameworks, libraries and software (by language). Inspired by awesome-php. If you

Joseph Misiti 57.1k Jan 03, 2023
A mini library for Policy Gradients with Parameter-based Exploration, with reference implementation of the ClipUp optimizer from NNAISENSE.

PGPElib A mini library for Policy Gradients with Parameter-based Exploration [1] and friends. This library serves as a clean re-implementation of the

NNAISENSE 56 Jan 01, 2023
AlphaNet Improved Training of Supernet with Alpha-Divergence

AlphaNet: Improved Training of Supernet with Alpha-Divergence This repository contains our PyTorch training code, evaluation code and pretrained model

Facebook Research 87 Oct 10, 2022
Style transfer, deep learning, feature transform

FastPhotoStyle License Copyright (C) 2018 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons

NVIDIA Corporation 10.9k Jan 02, 2023
Pytorch implementation for M^3L

Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification (CVPR 2021) Introduction This is the Py

Yuyang Zhao 45 Dec 26, 2022
Official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.

AimCLR This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Reco

Gty 44 Dec 17, 2022
OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

Tadas Baltrusaitis 5.8k Dec 31, 2022
A simple Rock-Paper-Scissors game using CV in python

ML18_Rock-Paper-Scissors-using-CV A simple Rock-Paper-Scissors game using CV in python For IITISOC-21 Rules and procedure to play the interactive game

Anirudha Bhagwat 3 Aug 08, 2021
Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug · Request Feature Try the Demo Here Table

Suyash More 110 Dec 03, 2022
State-Relabeling Adversarial Active Learning

State-Relabeling Adversarial Active Learning Code for SRAAL [2020 CVPR Oral] Requirements torch = 1.6.0 numpy = 1.19.1 tqdm = 4.31.1 AL Results The

10 Jul 14, 2022
A python module for scientific analysis of 3D objects based on VTK and Numpy

A lightweight and powerful python module for scientific analysis and visualization of 3d objects.

Marco Musy 1.5k Jan 06, 2023
ALL Snow Removed: Single Image Desnowing Algorithm Using Hierarchical Dual-tree Complex Wavelet Representation and Contradict Channel Loss (HDCWNet)

ALL Snow Removed: Single Image Desnowing Algorithm Using Hierarchical Dual-tree Complex Wavelet Representation and Contradict Channel Loss (HDCWNet) (

Wei-Ting Chen 49 Dec 27, 2022
Face Mask Detection system based on computer vision and deep learning using OpenCV and Tensorflow/Keras

Face Mask Detection Face Mask Detection System built with OpenCV, Keras/TensorFlow using Deep Learning and Computer Vision concepts in order to detect

Chandrika Deb 1.4k Jan 03, 2023
DimReductionClustering - Dimensionality Reduction + Clustering + Unsupervised Score Metrics

Dimensionality Reduction + Clustering + Unsupervised Score Metrics Introduction

11 Nov 15, 2022
Improving Contrastive Learning by Visualizing Feature Transformation, ICCV 2021 Oral

Improving Contrastive Learning by Visualizing Feature Transformation This project hosts the codes, models and visualization tools for the paper: Impro

Bingchen Zhao 83 Dec 15, 2022
Benchmark tools for Compressive LiDAR-to-map registration

Benchmark tools for Compressive LiDAR-to-map registration This repo contains the released version of code and datasets used for our IROS 2021 paper: "

Allie 9 Nov 24, 2022
EASY - Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients.

EASY - Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients. This repository is the official im

Yassir BENDOU 57 Dec 26, 2022
Pixel-level Crack Detection From Images Of Levee Systems : A Comparative Study

PIXEL-LEVEL CRACK DETECTION FROM IMAGES OF LEVEE SYSTEMS : A COMPARATIVE STUDY G

Manisha Panta 2 Jul 23, 2022
D-NeRF: Neural Radiance Fields for Dynamic Scenes

D-NeRF: Neural Radiance Fields for Dynamic Scenes [Project] [Paper] D-NeRF is a method for synthesizing novel views, at an arbitrary point in time, of

Albert Pumarola 291 Jan 02, 2023
PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identification in Symbolic Scores.

Symbolic Melody Identification This repository is an unofficial PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identifica

Sophia Y. Chou 3 Feb 21, 2022