Minimal deep learning library written from scratch in Python, using NumPy/CuPy.

Overview

SmallPebble

Project status: experimental, unstable.



SmallPebble is a minimal/toy automatic differentiation/deep learning library written from scratch in Python, using NumPy/CuPy.

The implementation is in smallpebble.py.

Features:

  • Relatively simple implementation.
  • Powerful API for creating models.
  • Various operations, such as matmul, conv2d, maxpool2d.
  • Broadcasting support.
  • Eager or lazy execution.
  • It's easy to add new SmallPebble functions.
  • GPU, if use CuPy.

Graphs are built implicitly via Python objects referencing Python objects. The only real step taken towards improving performance is to use NumPy/CuPy.

Should I use this?

You probably want a more efficient and featureful framework, such as JAX, PyTorch, TensorFlow, etc.

Read on to see:

  • Examples of deep learning models created and trained using SmallPebble.
  • A brief guide to using SmallPebble.

For an introduction to autodiff and an even more minimal autodiff implementation, look here.


import matplotlib.pyplot as plt
import numpy as np
import smallpebble as sp
from smallpebble.misc import load_data
from tqdm import tqdm

Training a neural network on MNIST

Load the dataset, and create a validation set.

X_train, y_train, _, _ = load_data('mnist')  # load / download from openml.org
X_train = X_train/255

# Separate out data for validation.
X = X_train[:50_000, ...]
y = y_train[:50_000]
X_eval = X_train[50_000:60_000, ...]
y_eval = y_train[50_000:60_000]

Build a model.

X_in = sp.Placeholder()
y_true = sp.Placeholder()

h = sp.linearlayer(28*28, 100)(X_in)
h = sp.Lazy(sp.leaky_relu)(h)
h = sp.linearlayer(100, 100)(h)
h = sp.Lazy(sp.leaky_relu)(h)
h = sp.linearlayer(100, 10)(h)
y_pred = sp.Lazy(sp.softmax)(h)
loss = sp.Lazy(sp.cross_entropy)(y_pred, y_true)

learnables = sp.get_learnables(y_pred)

loss_vals = []
validation_acc = []

Train model, and measure performance on validation dataset.

NUM_EPOCHS = 300
BATCH_SIZE = 200

eval_batch = sp.batch(X_eval, y_eval, BATCH_SIZE)

for i, (xbatch, ybatch) in tqdm(enumerate(sp.batch(X, y, BATCH_SIZE)), total=NUM_EPOCHS):
    if i > NUM_EPOCHS: break
    
    X_in.assign_value(sp.Variable(xbatch))
    y_true.assign_value(ybatch)
    
    loss_val = loss.run()  # run the graph
    if np.isnan(loss_val.array):
        print("loss is nan, aborting.")
        break
    loss_vals.append(loss_val.array)
        
    # Compute gradients, and carry out learning step.
    gradients = sp.get_gradients(loss_val)
    sp.sgd_step(learnables, gradients, 3e-4)
        
    # Compute validation accuracy:
    x_eval_batch, y_eval_batch = next(eval_batch)
    X_in.assign_value(sp.Variable(x_eval_batch))
    predictions = y_pred.run()
    predictions = np.argmax(predictions.array, axis=1)
    accuracy = (y_eval_batch == predictions).mean()
    validation_acc.append(accuracy)

plt.figure(figsize=(14, 4))
plt.subplot(1, 2, 1)
plt.title('Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.plot(loss_vals)
plt.subplot(1, 2, 2)
plt.title('Validation accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.suptitle('Neural network trained on MNIST, using SmallPebble.')
plt.ylim([0, 1])
plt.plot(validation_acc)
plt.show()
301it [00:03, 94.26it/s]                         

png

Training a convolutional neural network on MNIST

Make a function that creates trainable convolutional layers:

def convlayer(height, width, depth, n_kernels, strides=[1,1]):
    # Initialise kernels:
    sigma = np.sqrt(6 / (height*width*depth+height*width*n_kernels))
    kernels_init = sigma*(np.random.random([height, width, depth, n_kernels]) - .5)
    # Wrap with sp.Variable, so we can compute gradients:
    kernels = sp.Variable(kernels_init)
    # Flag as learnable, so we can extract from the model to train:
    kernels = sp.learnable(kernels)
    # Curry, to set `strides`:
    func = lambda images, kernels: sp.conv2d(images, kernels, strides=strides, padding='SAME')
    # Curry, to use the kernels created here:
    return lambda images: sp.Lazy(func)(images, kernels)

Define a model.

X_in = sp.Placeholder()
y_true = sp.Placeholder()

h = convlayer(height=3, width=3, depth=1, n_kernels=16)(X_in)
h = sp.Lazy(sp.leaky_relu)(h)
h = sp.Lazy(lambda a: sp.maxpool2d(a, 2, 2, strides=[2, 2]))(h)

h = sp.Lazy(lambda x: sp.reshape(x, [-1, 14*14*16]))(h)
h = sp.linearlayer(14*14*16, 64)(h)
h = sp.Lazy(sp.leaky_relu)(h)

h = sp.linearlayer(64, 10)(h)
y_pred = sp.Lazy(sp.softmax)(h)
loss = sp.Lazy(sp.cross_entropy)(y_pred, y_true)

learnables = sp.get_learnables(y_pred)

loss_vals = []
validation_acc = []

# Check we get the dimensions we expected.
X_in.assign_value(sp.Variable(X_train[0:3,:].reshape([-1,28,28,1])))
y_true.assign_value(y_train[0])
h.run().array.shape
(3, 10)
NUM_EPOCHS = 300
BATCH_SIZE = 200

eval_batch = sp.batch(X_eval.reshape([-1,28,28,1]), y_eval, BATCH_SIZE)

for i, (xbatch, ybatch) in tqdm(
    enumerate(sp.batch(X.reshape([-1,28,28,1]), y, BATCH_SIZE)), total=NUM_EPOCHS):
    if i > NUM_EPOCHS: break
    
    X_in.assign_value(sp.Variable(xbatch))
    y_true.assign_value(ybatch)
    
    loss_val = loss.run()
    if np.isnan(loss_val.array):
        print("Aborting, loss is nan.")
        break
    loss_vals.append(loss_val.array)
        
    # Compute gradients, and carry out learning step.
    gradients = sp.get_gradients(loss_val)
    sp.sgd_step(learnables, gradients, 3e-4)
        
    # Compute validation accuracy:
    x_eval_batch, y_eval_batch = next(eval_batch)
    X_in.assign_value(sp.Variable(x_eval_batch))
    predictions = y_pred.run()
    predictions = np.argmax(predictions.array, axis=1)
    accuracy = (y_eval_batch == predictions).mean()
    validation_acc.append(accuracy)

plt.figure(figsize=(14, 4))
plt.subplot(1, 2, 1)
plt.title('Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.plot(loss_vals)
plt.subplot(1, 2, 2)
plt.title('Validation accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.suptitle('CNN trained on MNIST, using SmallPebble.')
plt.ylim([0, 1])
plt.plot(validation_acc)
plt.show()
301it [03:35,  1.40it/s]                         

png

Training a CNN on CIFAR

Load the dataset.

X_train, y_train, _, _ = load_data('cifar')
X_train = X_train/255

# Separate out some data for validation.
X = X_train[:45_000, ...]
y = y_train[:45_000]
X_eval = X_train[45_000:50_000, ...]
y_eval = y_train[45_000:50_000]

Plot, to check it's the right data.

# This code is from: https://www.tensorflow.org/tutorials/images/cnn

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(8,8))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_train[i,:].reshape(32,32,3), cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])

plt.show()

png

Define the model. Due to my lack of ram, it is kept relatively small.

X_in = sp.Placeholder()
y_true = sp.Placeholder()

h = convlayer(height=3, width=3, depth=3, n_kernels=16)(X_in)
h = sp.Lazy(sp.leaky_relu)(h)
h = sp.Lazy(lambda a: sp.maxpool2d(a, 2, 2, strides=[2, 2]))(h)

h = convlayer(height=3, width=3, depth=16, n_kernels=32)(h)
h = sp.Lazy(sp.leaky_relu)(h)
h = sp.Lazy(lambda a: sp.maxpool2d(a, 2, 2, strides=[2, 2]))(h)

h = sp.Lazy(lambda x: sp.reshape(x, [-1, 8*8*32]))(h)
h = sp.linearlayer(8*8*32, 64)(h)
h = sp.Lazy(sp.leaky_relu)(h)

h = sp.linearlayer(64, 10)(h)
h = sp.Lazy(sp.softmax)(h)

y_pred = h
loss = sp.Lazy(sp.cross_entropy)(y_pred, y_true)

learnables = sp.get_learnables(y_pred)

loss_vals = []
validation_acc = []

# Check we get the expected dimensions
X_in.assign_value(sp.Variable(X[0:3, :].reshape([-1, 32, 32, 3])))
h.run().shape
(3, 10)

Train the model.

NUM_EPOCHS = 3000
BATCH_SIZE = 32

eval_batch = sp.batch(X_eval, y_eval, BATCH_SIZE)

for i, (xbatch, ybatch) in tqdm(enumerate(sp.batch(X, y, BATCH_SIZE)), total=NUM_EPOCHS):
    if i > NUM_EPOCHS: break
       
    xbatch_images = xbatch.reshape([-1, 32, 32, 3])
    X_in.assign_value(sp.Variable(xbatch_images))
    y_true.assign_value(ybatch)
    
    loss_val = loss.run()
    if np.isnan(loss_val.array):
        print("Aborting, loss is nan.")
        break
    loss_vals.append(loss_val.array)
    
    # Compute gradients, and carry out learning step.
    gradients = sp.get_gradients(loss_val)  
    sp.sgd_step(learnables, gradients, 3e-3)
          
    # Compute validation accuracy:
    x_eval_batch, y_eval_batch = next(eval_batch)
    X_in.assign_value(sp.Variable(x_eval_batch.reshape([-1, 32, 32, 3])))
    predictions = y_pred.run()
    predictions = np.argmax(predictions.array, axis=1)
    accuracy = (y_eval_batch == predictions).mean()
    validation_acc.append(accuracy)

plt.figure(figsize=(14, 4))
plt.subplot(1, 2, 1)
plt.title('Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.plot(loss_vals)
plt.subplot(1, 2, 2)
plt.title('Validation accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.plot(validation_acc)
plt.show()
3001it [25:16,  1.98it/s]                            

png

...And we see some improvement, despite the model's small size, the unsophisticated optimisation method and the difficulty of the task.


Brief guide to using SmallPebble

SmallPebble provides the following building blocks to make models with:

  • sp.Variable
  • SmallPebble operations, such as sp.add, sp.mul, etc.
  • sp.get_gradients
  • sp.Lazy
  • sp.Placeholder (this is really just sp.Lazy on the identity function)
  • sp.learnable
  • sp.get_learnables

The following examples show how these are used.

sp.Variable & sp.get_gradients

With SmallPebble, you can:

  • Wrap NumPy arrays in sp.Variable
  • Apply SmallPebble operations (e.g. sp.matmul, sp.add, etc.)
  • Compute gradients with sp.get_gradients
a = sp.Variable(np.random.random([2, 2]))
b = sp.Variable(np.random.random([2, 2]))
c = sp.Variable(np.random.random([2]))
y = sp.mul(a, b) + c
print('y.array:\n', y.array)

gradients = sp.get_gradients(y)
grad_a = gradients[a]
grad_b = gradients[b]
grad_c = gradients[c]
print('grad_a:\n', grad_a)
print('grad_b:\n', grad_b)
print('grad_c:\n', grad_c)
y.array:
 [[0.50222439 0.67745659]
 [0.68666171 0.58330707]]
grad_a:
 [[0.56436821 0.2581522 ]
 [0.89043144 0.25750461]]
grad_b:
 [[0.11665152 0.85303194]
 [0.28106794 0.48955456]]
grad_c:
 [2. 2.]

Note that y is computed straight away, i.e. the (forward) computation happens immediately.

Also note that y is a sp.Variable and we could continue to carry out SmallPebble operations on it.

sp.Lazy & sp.Placeholder

Lazy graphs are constructed using sp.Lazy and sp.Placeholder.

lazy_node = sp.Lazy(lambda a, b: a + b)(1, 2)
print(lazy_node)
print(lazy_node.run())
<smallpebble.smallpebble.Lazy object at 0x7fbc92d58d50>
3
a = sp.Lazy(lambda a: a)(2)
y = sp.Lazy(lambda a, b, c: a * b + c)(a, 3, 4)
print(y)
print(y.run())
<smallpebble.smallpebble.Lazy object at 0x7fbc92d41d50>
10

Forward computation does not happen immediately - only when .run() is called.

a = sp.Placeholder()
b = sp.Variable(np.random.random([2, 2]))
y = sp.Lazy(sp.matmul)(a, b)

a.assign_value(sp.Variable(np.array([[1,2], [3,4]])))

result = y.run()
print('result.array:\n', result.array)
result.array:
 [[1.01817665 2.54693119]
 [2.42244218 5.69810698]]

You can use .run() as many times as you like.

Let's change the placeholder value and re-run the graph:

a.assign_value(sp.Variable(np.array([[10,20], [30,40]])))
result = y.run()
print('result.array:\n', result.array)
result.array:
 [[10.18176654 25.46931189]
 [24.22442177 56.98106985]]

Finally, let's compute gradients:

gradients = sp.get_gradients(result)

Note that sp.get_gradients is called on result, which is a sp.Variable, not on y, which is a sp.Lazy instance.

sp.learnable & sp.get_learnables

Use sp.learnable to flag parameters as learnable, allowing them to be extracted from a lazy graph with sp.get_learnables.

This enables the workflow of building a model, while flagging parameters as learnable, and then extracting all the parameters in one go at the end.

a = sp.Placeholder()
b = sp.learnable(sp.Variable(np.random.random([2, 1])))
y = sp.Lazy(sp.matmul)(a, b)
y = sp.Lazy(sp.add)(y, sp.learnable(sp.Variable(np.array([5]))))

learnables = sp.get_learnables(y)

for learnable in learnables:
    print(learnable)
<smallpebble.smallpebble.Variable object at 0x7fbc60b6ebd0>
<smallpebble.smallpebble.Variable object at 0x7fbc60b6ec50>

Switching between NumPy and CuPy

We can dynamically switch between NumPy and CuPy:

import cupy
import numpy
import smallpebble as sp

# Switch to CuPy.
sp.array_library = cupy

# And back to NumPy again:
sp.array_library = numpy
Owner
Sidney Radcliffe
Sidney Radcliffe
Self-Supervised Contrastive Learning of Music Spectrograms

Self-Supervised Music Analysis Self-Supervised Contrastive Learning of Music Spectrograms Dataset Songs on the Billboard Year End Hot 100 were collect

27 Dec 10, 2022
functorch is a prototype of JAX-like composable function transforms for PyTorch.

functorch is a prototype of JAX-like composable function transforms for PyTorch.

Facebook Research 1.2k Jan 09, 2023
Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020)

Causality In Traffic Accident (Under Construction) Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020) Overview Data Prepa

Tackgeun 21 Nov 20, 2022
The Official Implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [NIPS 2021].

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose Release Notes The offical PyTorch implementation of Neural View Sy

Angtian Wang 20 Oct 09, 2022
ColossalAI-Examples - Examples of training models with hybrid parallelism using ColossalAI

ColossalAI-Examples This repository contains examples of training models with Co

HPC-AI Tech 185 Jan 09, 2023
[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

TorchSemiSeg [CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision by Xiaokang Chen1, Yuhui Yuan2, Gang Zeng1, Jingdong Wang

Chen XiaoKang 387 Jan 08, 2023
labelpix is a graphical image labeling interface for drawing bounding boxes

Welcome to labelpix 👋 labelpix is a graphical image labeling interface for drawing bounding boxes. 🏠 Homepage Install pip install -r requirements.tx

schissmantics 26 May 24, 2022
HiFT: Hierarchical Feature Transformer for Aerial Tracking (ICCV2021)

HiFT: Hierarchical Feature Transformer for Aerial Tracking Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, and Yiming Li Our paper is Accepted by ICCV 2

Intelligent Vision for Robotics in Complex Environment 55 Nov 23, 2022
[CVPR 2021] Exemplar-Based Open-Set Panoptic Segmentation Network (EOPSN)

EOPSN: Exemplar-Based Open-Set Panoptic Segmentation Network (CVPR 2021) PyTorch implementation for EOPSN. We propose open-set panoptic segmentation t

Jaedong Hwang 49 Dec 30, 2022
No Code AI/ML platform

NoCodeAIML No Code AI/ML platform - Community Edition Video credits: Uday Kiran Typical No Code AI/ML Platform will have features like drag and drop,

Bhagvan Kommadi 5 Jan 28, 2022
Face detection using deep learning.

Face Detection Docker Solution Using Faster R-CNN Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe thro

Nataniel Ruiz 181 Dec 19, 2022
Source code for deep symbolic optimization.

Update July 10, 2021: This repository now supports an additional symbolic optimization task: learning symbolic policies for reinforcement learning. Th

Brenden Petersen 290 Dec 25, 2022
BMN: Boundary-Matching Network

BMN: Boundary-Matching Network A pytorch-version implementation codes of paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generatio

qinxin 260 Dec 06, 2022
Overview of architecture and implementation of TEDS-Net, as described in MICCAI 2021: "TEDS-Net: Enforcing Diffeomorphisms in Spatial Transformers to Guarantee TopologyPreservation in Segmentations"

TEDS-Net Overview of architecture and implementation of TEDS-Net, as described in MICCAI 2021: "TEDS-Net: Enforcing Diffeomorphisms in Spatial Transfo

Madeleine K Wyburd 14 Jan 04, 2023
Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021) Hang Zhou, Yasheng Sun, Wayne Wu, Chen Cha

Hang_Zhou 628 Dec 28, 2022
Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)"

Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)" which introduces a new class of deep generative models that gene

Guan-Horng Liu 43 Jan 03, 2023
Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images

Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images In this paper, we present an effective Dynamic Enhancement Anchor

13 Dec 09, 2022
Learning Calibrated-Guidance for Object Detection in Aerial Images

Learning Calibrated-Guidance for Object Detection in Aerial Images arxiv We propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance

51 Sep 22, 2022
PyTorch implementation of MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

MoCo: Momentum Contrast for Unsupervised Visual Representation Learning This is a PyTorch implementation of the MoCo paper: @Article{he2019moco, aut

Meta Research 3.7k Jan 02, 2023
Official PyTorch code for Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021)

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021) This repository is the official P

Jingyun Liang 159 Dec 30, 2022