Supervised domain-agnostic prediction framework for probabilistic modelling

Overview

skpro

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data points.

The package offers a variety of features and specifically allows for:

  • the implementation of probabilistic prediction strategies in supervised contexts
  • comparison of frequentist and Bayesian prediction methods
  • strategy optimization through hyperparameter tuning and ensemble methods (e.g. bagging)
  • workflow automation

List of developers and contributors

Documentation

The full documentation is available here.

Installation

Installation is easy using pip, Python's package manager:

$ pip install skpro

Contributing & Citation

We welcome contributions to the skpro project. Please read our contribution guide.

If you use skpro in a scientific publication, we would appreciate citations.

Comments
  • Distributions as return objects

    Re-opening the sub-issue opened in #3 and commented upon by @murphyk

    Question: should skpro's predict methods return a vector of distribution objects? For example, using the distributions from scipy.stats which implement methods pdf, cdf, mean, var, etc.

    Pro:

    • this would be using an existing, consolidated, and well-supported interface
    • it might be easier to use
    • it might be easier to understand

    Contra:

    • mixture types are not supported
    • the L2 norm is not supported (as would be needed for squared/Gneiting loss)
    • mixed distributions on the reals are not supported, especially empirical distributions (weighted sums of deltas), which are returned by Bayesian packages
    • vectors of distributions (alternatively, Cartesian products of distributions) are not supported
    • this is not the status quo
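
To make the question concrete, here is a minimal sketch of what a vector of distribution objects could look like if predictions were frozen scipy.stats distributions. The helper `predict_distributions` and its inputs are hypothetical and not part of skpro; only `scipy.stats.norm` and its `pdf`, `cdf`, `mean`, and `var` methods are taken from scipy.

```python
import numpy as np
from scipy.stats import norm


def predict_distributions(means, stds):
    """Hypothetical helper (not skpro API): one frozen Gaussian per data point."""
    return [norm(loc=m, scale=s) for m, s in zip(means, stds)]


# Illustrative predicted locations and scales for three test points.
y_pred = predict_distributions([0.0, 1.0, 2.0], [1.0, 0.5, 2.0])

# Each element exposes the scipy.stats interface mentioned above.
point_predictions = np.array([d.mean() for d in y_pred])
variances = np.array([d.var() for d in y_pred])
density_at_one = np.array([d.pdf(1.0) for d in y_pred])
cdf_at_mean = np.array([d.cdf(d.mean()) for d in y_pred])
```

In this sketch the "vector" is a plain Python list, so every summary has to be collected element by element.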
    help wanted 
    opened by fkiraly 11
  • documentation: np.mean(y_pred) does not work

    I'm following along with this intro example. However, this line fails:

    (numpy.mean(y_pred) * 2).shape
    

    Error below (seems to be because Distribution objects don't support the mean() function but instead insist on obscurely calling it point!)

    np.mean(y_pred)
    Traceback (most recent call last):
    
      File "<ipython-input-38-19819be87ab5>", line 1, in <module>
        np.mean(y_pred)
    
      File "/home/kpmurphy/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2920, in mean
        out=out, **kwargs)
    
      File "/home/kpmurphy/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py", line 75, in _mean
        ret = umr_sum(arr, axis, dtype, out, keepdims)
    
    TypeError: unsupported operand type(s) for +: 'Distribution' and 'Distribution'
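
The failure can be reproduced without skpro. The sketch below uses a toy `Distribution` class, which is a hypothetical stand-in and not skpro's actual class; the method name `point` is taken from the report above. It shows why `np.mean` fails on objects that do not define addition, and one possible workaround, assuming each prediction exposes a point estimate.

```python
import numpy as np


class Distribution:
    """Toy stand-in for a predicted distribution (hypothetical, not skpro's class)."""

    def __init__(self, mean, std):
        self._mean = mean
        self._std = std

    def point(self):
        # Point prediction, named after the method mentioned in the report.
        return self._mean


y_pred = [Distribution(1.0, 0.5), Distribution(3.0, 1.0)]

try:
    np.mean(y_pred)  # NumPy tries to add the objects together
    failed = False
except TypeError:
    failed = True

# Workaround: extract the point predictions first, then average the numbers.
mean_of_points = np.mean([d.point() for d in y_pred])
```

Real skpro prediction objects may be structured differently (for example, a single object wrapping all test points), so this only illustrates the mechanism behind the error.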
    
    opened by murphyk 3
  • First example: 'utils' not found

    The first example in your documentation (DensityBaseline) does not run right on my machine: it throws a 'module not found' exception at the call to 'utils'.

    This might be a Python version problem (I am using 3.6), so perhaps it's not an error in the normal sense, though I don't see any specification that the package requires a particular Python version. Apologies if I missed it. In any case, I fixed it by importing matplotlib instead, i.e.

    import matplotlib.pyplot as plt
    plt.scatter(y_test, y_pred)

    instead of:

    import utils
    utils.plot_performance(y_test, y_pred)

    opened by Thomas-M-H-Hope 2
  • problem in loading the skpro

    I have been trying to import skpro for 2 days, but I cannot. I keep getting this error:

    cannot import name 'six' from 'sklearn.externals' (C:\Users\My Book\anaconda3\lib\site-packages\sklearn\externals\__init__.py)

    opened by honestee 1
  • (wish)list of probabilistic regressors to implement or to interface

    A wishlist of probabilistic regression methods to implement or interface. This is partly copied from the R counterpart, https://github.com/mlr-org/mlr3proba/issues/32. The number of stars at the end of each item is the estimated difficulty or time investment.

    GLM

    • [ ] generalized linear model(s) with regression link, e.g., Gaussian *
    • [ ] generalized linear model(s) with count link, e.g., Poisson *
    • [ ] heteroscedastic linear regression ***
    • [ ] Bayesian GLM where conjugate priors are available, e.g., GLM with Gaussian link ***

    KRR aka Gaussian process regression

    • [ ] vanilla kernel ridge regression with fixed kernel parameters and variance *
    • [ ] kernel ridge regression with MLE for kernel parameters and regularization parameter **
    • [ ] heteroscedastic KRR or Gaussian processes ***

    CDE

    • [ ] variants of conditional density estimation (Nadaraya-Watson type) **
    • [ ] reduction to density estimation by binning of input variables, then apply unconditional density estimation **

    Tree-based

    • [ ] probabilistic regression trees **

    Neural networks

    • [ ] interface tensorflow probability - some hard-coded NN architectures **
    • [ ] generic tensorflow probability interface - some hard-coded NN architectures ***

    Bayesian toolboxes

    • [ ] generic pymc3 interface ***
    • [ ] generic pyro interface ****
    • [ ] generic Stan interface ****
    • [ ] generic JAGS interface ****
    • [ ] generic BUGS interface ****
    • [ ] generic Bayesian interface - prior-valued hyperparameters *****

    Pipeline elements for target transformation

    • [ ] distr fixed target transformation **
    • [ ] distr predictive target calibration **

    Composite techniques, reduction to deterministic regression

    • [ ] stick mean, sd, from a deterministic regressor which already has these as return types into some location/scale distr family (Gaussian, Laplace) *
    • [ ] use model 1 for the mean, model 2 fit to residuals (squared, absolute, or log), put this in some location/scale distr family (Gaussian, Laplace) **
    • [ ] upper/lower thresholder for a regression prediction, to use as a pipeline element for a forced lower variance bound **
    • [ ] generic parameter prediction by elicitation, output being plugged into parameters of a distr object not necessarily scale/location ****
    • [ ] reduction via bootstrapped sampling of a deterministic regressor **
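
As an illustration of the "model 1 for the mean, model 2 fit to residuals" item, here is a minimal sketch on synthetic data. It is not an skpro implementation: both models are plain least-squares fits, the function `predict` is hypothetical, and `statistics.NormalDist` from the Python standard library stands in for a distribution object. The small floor on the scale plays the role of the forced lower variance bound mentioned in the list.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
# Synthetic heteroscedastic data: the noise scale grows with x.
y = 2.0 * x + rng.normal(scale=0.1 + 0.5 * x)

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept

# Model 1: least-squares fit for the mean.
beta_mean, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_mean

# Model 2: least-squares fit to the absolute residuals.
beta_scale, *_ = np.linalg.lstsq(X, np.abs(residuals), rcond=None)


def predict(x_new):
    """Return one Gaussian per input: model 1 gives loc, model 2 gives scale."""
    X_new = np.column_stack([np.ones_like(x_new), x_new])
    loc = X_new @ beta_mean
    # For a Gaussian, E|residual| = sigma * sqrt(2 / pi); invert to get sigma.
    # The floor of 1e-6 keeps the scale strictly positive.
    sigma = np.maximum(X_new @ beta_scale, 1e-6) * np.sqrt(np.pi / 2.0)
    return [NormalDist(mu=m, sigma=s) for m, s in zip(loc, sigma)]


y_dist = predict(np.array([0.1, 0.9]))
```

Swapping the Gaussian for a Laplace family would only change the conversion from mean absolute residual to scale parameter.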

    Ensembling type pipeline elements and compositors

    • [ ] simple bagging, averaging of pdf/cdf **
    • [ ] probabilistic boosting ***
    • [ ] probabilistic stacking ***
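
A minimal sketch of the "simple bagging, averaging of pdf/cdf" item follows. The class `AveragedDistribution` is hypothetical and not part of skpro; it treats the predictions of several bagged models for one test point as an equal-weight mixture, with `statistics.NormalDist` from the standard library standing in for each model's predicted distribution.

```python
from statistics import NormalDist


class AveragedDistribution:
    """Hypothetical equal-weight mixture of component distributions."""

    def __init__(self, components):
        self.components = list(components)

    def pdf(self, x):
        # Average of the component densities at x.
        return sum(c.pdf(x) for c in self.components) / len(self.components)

    def cdf(self, x):
        # Average of the component cumulative probabilities at x.
        return sum(c.cdf(x) for c in self.components) / len(self.components)


# Predictions for one test point from three bagged models (illustrative values).
bagged = AveragedDistribution(
    [NormalDist(0.0, 1.0), NormalDist(0.5, 1.0), NormalDist(1.0, 2.0)]
)
```

The averaged object is a mixture, which is one of the types the scipy.stats interface discussed earlier does not cover directly.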

    Baselines

    • [ ] always predict a Gaussian with mean = training mean, var = training var *
    • [ ] IMPORTANT as featureless baseline: reduction to distr/density estimation to produce an unconditional probabilistic regressor **
    • [ ] IMPORTANT as deterministic style baseline: reduction to deterministic regression, mean = prediction by det.regressor, var = training sample var, distr type = Gaussian (or Laplace) **
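
The first baseline in this list can be sketched in a few lines. The class `GaussianBaseline` and its `fit`/`predict` methods are hypothetical and only mimic a scikit-learn-style interface; `statistics.NormalDist` from the standard library stands in for the returned distribution objects.

```python
from statistics import NormalDist, fmean, pvariance


class GaussianBaseline:
    """Hypothetical featureless baseline: ignores X, predicts the training Gaussian."""

    def fit(self, X, y):
        self.mean_ = fmean(y)      # training mean
        self.var_ = pvariance(y)   # training (population) variance
        return self

    def predict(self, X):
        # The same Gaussian is returned for every row of X.
        dist = NormalDist(mu=self.mean_, sigma=self.var_ ** 0.5)
        return [dist for _ in X]


baseline = GaussianBaseline().fit(X=[[0], [1], [2], [3]], y=[1.0, 2.0, 3.0, 4.0])
y_pred = baseline.predict([[10], [20]])
```

The deterministic-style baseline differs only in that the mean comes from a fitted regressor's prediction instead of the training mean.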

    Other reduction from/to probabilistic regression

    • [ ] reducing deterministic regression to probabilistic regression - take mean, median or mode **
    • [ ] reduction(s) to quantile regression, use predictive quantiles to make a distr ***
    • [ ] reducing deterministic (quantile) regression to probabilistic regression - take quantile(s) **
    • [ ] reducing interval regression to probabilistic regression - take mean/sd, or take quantile(s) **
    • [ ] reduction to survival, as the sub-case of no censoring **
    • [ ] reduction to classification, by binning ***
    good first issue 
    opened by fkiraly 0
  • skpro-refactoring (version-2)

    Below is a short description of the contents of the coming refactoring:

    • Distribution classes refactored in a more object-oriented way (see skpro->distribution)
    • Loss functions (see metrics->distribution)
    • Estimators (see metrics->distribution)

    Some descriptive notebooks (in docs->notebooks) and a full set of unit tests (in tests) are also available.

    opened by jesellier 24
Releases(v1.0.1-beta)
Owner
The Alan Turing Institute
The UK's national institute for data science and artificial intelligence.