Machine learning, in numpy

Overview

numpy-ml

Ever wish you had an inefficient but somewhat legible collection of machine learning algorithms implemented exclusively in NumPy? No?

Installation

For rapid experimentation

To use this code as a starting point for ML prototyping / experimentation, just clone the repository, create a new virtualenv, and start hacking:

$ git clone https://github.com/ddbourgin/numpy-ml.git
$ cd numpy-ml && virtualenv npml && source npml/bin/activate
$ pip3 install -r requirements-dev.txt

As a package

If you don't plan to modify the source, you can also install numpy-ml as a Python package: pip3 install -u numpy_ml.

The reinforcement learning agents train on environments defined in the OpenAI gym. To install these alongside numpy-ml, you can use pip3 install -u 'numpy_ml[rl]'.

Documentation

For more details on the available models, see the project documentation.

Available models

  1. Gaussian mixture model

    • EM training
  2. Hidden Markov model

    • Viterbi decoding
    • Likelihood computation
    • MLE parameter estimation via Baum-Welch/forward-backward algorithm
  3. Latent Dirichlet allocation (topic model)

    • Standard model with MLE parameter estimation via variational EM
    • Smoothed model with MAP parameter estimation via MCMC
  4. Neural networks

    • Layers / Layer-wise ops
      • Add
      • Flatten
      • Multiply
      • Softmax
      • Fully-connected/Dense
      • Sparse evolutionary connections
      • LSTM
      • Elman-style RNN
      • Max + average pooling
      • Dot-product attention
      • Embedding layer
      • Restricted Boltzmann machine (w. CD-n training)
      • 2D deconvolution (w. padding and stride)
      • 2D convolution (w. padding, dilation, and stride)
      • 1D convolution (w. padding, dilation, stride, and causality)
    • Modules
      • Bidirectional LSTM
      • ResNet-style residual blocks (identity and convolution)
      • WaveNet-style residual blocks with dilated causal convolutions
      • Transformer-style multi-headed scaled dot product attention
    • Regularizers
      • Dropout
    • Normalization
      • Batch normalization (spatial and temporal)
      • Layer normalization (spatial and temporal)
    • Optimizers
      • SGD w/ momentum
      • AdaGrad
      • RMSProp
      • Adam
    • Learning Rate Schedulers
      • Constant
      • Exponential
      • Noam/Transformer
      • Dlib scheduler
    • Weight Initializers
      • Glorot/Xavier uniform and normal
      • He/Kaiming uniform and normal
      • Standard and truncated normal
    • Losses
      • Cross entropy
      • Squared error
      • Bernoulli VAE loss
      • Wasserstein loss with gradient penalty
      • Noise contrastive estimation loss
    • Activations
      • ReLU
      • Tanh
      • Affine
      • Sigmoid
      • Leaky ReLU
      • ELU
      • SELU
      • Exponential
      • Hard Sigmoid
      • Softplus
    • Models
      • Bernoulli variational autoencoder
      • Wasserstein GAN with gradient penalty
      • word2vec encoder with skip-gram and CBOW architectures
    • Utilities
      • col2im (MATLAB port)
      • im2col (MATLAB port)
      • conv1D
      • conv2D
      • deconv2D
      • minibatch
  5. Tree-based models

    • Decision trees (CART)
    • [Bagging] Random forests
    • [Boosting] Gradient-boosted decision trees
  6. Linear models

    • Ridge regression
    • Logistic regression
    • Ordinary least squares
    • Bayesian linear regression w/ conjugate priors
      • Unknown mean, known variance (Gaussian prior)
      • Unknown mean, unknown variance (Normal-Gamma / Normal-Inverse-Wishart prior)
  7. n-Gram sequence models

    • Maximum likelihood scores
    • Additive/Lidstone smoothing
    • Simple Good-Turing smoothing
  8. Multi-armed bandit models

    • UCB1
    • LinUCB
    • Epsilon-greedy
    • Thompson sampling w/ conjugate priors
      • Beta-Bernoulli sampler
    • LinUCB
  9. Reinforcement learning models

    • Cross-entropy method agent
    • First visit on-policy Monte Carlo agent
    • Weighted incremental importance sampling Monte Carlo agent
    • Expected SARSA agent
    • TD-0 Q-learning agent
    • Dyna-Q / Dyna-Q+ with prioritized sweeping
  10. Nonparameteric models

    • Nadaraya-Watson kernel regression
    • k-Nearest neighbors classification and regression
    • Gaussian process regression
  11. Matrix factorization

    • Regularized alternating least-squares
    • Non-negative matrix factorization
  12. Preprocessing

    • Discrete Fourier transform (1D signals)
    • Discrete cosine transform (type-II) (1D signals)
    • Bilinear interpolation (2D signals)
    • Nearest neighbor interpolation (1D and 2D signals)
    • Autocorrelation (1D signals)
    • Signal windowing
    • Text tokenization
    • Feature hashing
    • Feature standardization
    • One-hot encoding / decoding
    • Huffman coding / decoding
    • Term frequency-inverse document frequency (TF-IDF) encoding
    • MFCC encoding
  13. Utilities

    • Similarity kernels
    • Distance metrics
    • Priority queue
    • Ball tree
    • Discrete sampler
    • Graph processing and generators

Contributing

Am I missing your favorite model? Is there something that could be cleaner / less confusing? Did I mess something up? Submit a PR! The only requirement is that your models are written with just the Python standard library and NumPy. The SciPy library is also permitted under special circumstances ;)

See full contributing guidelines here.

This project provides an unsupervised framework for mining and tagging quality phrases on text corpora with pretrained language models (KDD'21).

UCPhrase: Unsupervised Context-aware Quality Phrase Tagging To appear on KDD'21...[pdf] This project provides an unsupervised framework for mining and

Xiaotao Gu 146 Dec 22, 2022
This repository contains the code for: RerrFact model for SciVer shared task

RerrFact This repository contains the code for: RerrFact model for SciVer shared task. Setup for Inference 1. Download SciFact database Download the S

Ashish Rana 1 May 22, 2022
Optimizing synthesizer parameters using gradient approximation

Optimizing synthesizer parameters using gradient approximation NASH 2021 Hackathon! These are some experiments I conducted during NASH 2021, the Neura

Jordie Shier 10 Feb 10, 2022
[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.

CoCLR: Self-supervised Co-Training for Video Representation Learning This repository contains the implementation of: InfoNCE (MoCo on videos) UberNCE

Tengda Han 271 Jan 02, 2023
A Player for Kanye West's Stem Player. Sort of an emulator.

Stem Player Player Stem Player Player Usage Download the latest release here Optional: install ffmpeg, instructions here NOTE: DOES NOT ENABLE DOWNLOA

119 Dec 28, 2022
Implementation of Bottleneck Transformer in Pytorch

Bottleneck Transformer - Pytorch Implementation of Bottleneck Transformer, SotA visual recognition model with convolution + attention that outperforms

Phil Wang 621 Jan 06, 2023
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Lowest memory consumption and second shortest runtime in NTIRE 2022 challenge on Efficient Super-Resolution

FMEN Lowest memory consumption and second shortest runtime in NTIRE 2022 on Efficient Super-Resolution. Our paper: Fast and Memory-Efficient Network T

33 Dec 01, 2022
Industrial knn-based anomaly detection for images. Visit streamlit link to check out the demo.

Industrial KNN-based Anomaly Detection ⭐ Now has streamlit support! ⭐ Run $ streamlit run streamlit_app.py This repo aims to reproduce the results of

aventau 102 Dec 26, 2022
[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021

Pedestron Pedestron is a MMdetection based repository, that focuses on the advancement of research on pedestrian detection. We provide a list of detec

Irtiza Hasan 594 Jan 05, 2023
PHOTONAI is a high level python API for designing and optimizing machine learning pipelines.

PHOTONAI is a high level python API for designing and optimizing machine learning pipelines. We've created a system in which you can easily select and

Medical Machine Learning Lab - University of Münster 57 Nov 12, 2022
(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

RDPNet IEEE TIP 2021: Regularized Densely-connected Pyramid Network for Salient Instance Segmentation PyTorch training and testing code are available.

Yu-Huan Wu 41 Oct 21, 2022
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

Language: 简体中文 | English Introduction This is the code for Multiple Instance Active Learning for Object Detection, CVPR 2021. Installation A Linux pla

Tianning Yuan 269 Dec 21, 2022
StyleSwin: Transformer-based GAN for High-resolution Image Generation

StyleSwin This repo is the official implementation of "StyleSwin: Transformer-based GAN for High-resolution Image Generation". By Bowen Zhang, Shuyang

Microsoft 349 Dec 28, 2022
[3DV 2020] PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction

PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction International Conference on 3D Vision, 2020 Sai Sagar Jinka1, Rohan

Rohan Chacko 39 Oct 12, 2022
PyTorch implementations of deep reinforcement learning algorithms and environments

Deep Reinforcement Learning Algorithms with PyTorch This repository contains PyTorch implementations of deep reinforcement learning algorithms and env

Petros Christodoulou 4.7k Jan 04, 2023
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li

DamoCV 25 Dec 16, 2022
Graph Representation Learning via Graphical Mutual Information Maximization

GMI (Graphical Mutual Information) Graph Representation Learning via Graphical Mutual Information Maximization (Peng Z, Huang W, Luo M, et al., WWW 20

93 Dec 29, 2022
「PyTorch Implementation of AnimeGANv2」を用いて、生成した顔画像を元の画像に上書きするデモ

AnimeGANv2-Face-Overlay-Demo PyTorch Implementation of AnimeGANv2を用いて、生成した顔画像を元の画像に上書きするデモです。

KazuhitoTakahashi 21 Oct 18, 2022
RealTime Emotion Recognizer for Machine Learning Study Jam's demo

Emotion recognizer Table of contents Clone project Dataset Install dependencies Main program Demo 1. Clone project git clone https://github.com/GDSC20

Google Developer Student Club - UIT 1 Oct 05, 2021