SPARSEDNN

**If you want to use this repo, please send me an email at [email protected] or raise a GitHub issue.**

Fast sparse deep learning on CPUs. This is the kernel library generator described in the paper: https://arxiv.org/abs/2101.07948

Python API: run python fastsparse.py. It has minimal dependencies and should work anywhere.

C++ API: check out driver_cpu.cpp, or run autotune_cpu_random.sh 128 128 128 0. This requires cnpy to read NumPy files, so make sure that you can link against cnpy.

The Python API has significant per-call overhead because it goes through ctypes. This is noticeable for small matrices but negligible for large ones. All benchmarks in the arXiv paper were run with the C++ API.
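To get a feel for that fixed per-call cost, the sketch below times a trivial foreign call made through ctypes against an equivalent pure-Python call. It uses ctypes.pythonapi.Py_IsInitialized only because that function is available wherever CPython runs; it is not part of this repo, and the helper name per_call_seconds is made up for illustration. Absolute numbers will vary with the machine and Python version.

```python
import ctypes
import timeit

# Py_IsInitialized is a trivial C function exported by CPython itself.
# It stands in for a kernel call here; it is NOT part of this repo.
c_func = ctypes.pythonapi.Py_IsInitialized
c_func.restype = ctypes.c_int
c_func.argtypes = []


def py_func():
    """Pure-Python call with the same trivial result, for comparison."""
    return 1


def per_call_seconds(func, number=200_000):
    """Average wall-clock time of one call to func (hypothetical helper)."""
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    ctypes_cost = per_call_seconds(c_func)
    python_cost = per_call_seconds(py_func)
    print(f"ctypes call:      {ctypes_cost * 1e9:.0f} ns")
    print(f"pure-Python call: {python_cost * 1e9:.0f} ns")
```

This cost is paid once per call regardless of matrix size, so it matters most when the kernel itself finishes in a few microseconds.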

Work that is not yet open-sourced: a kernel generator for sparse convolutions (as described in the arXiv paper) using implicit convolution, a lightweight inference engine for end-to-end results, and sparse int8 kernels. If you are interested in any of this, please email me.

FAQs:

How does this compare to Neural Magic? Last time I checked, the deepsparse library did not allow you to run kernel-level benchmarks. If you care about end-to-end neural network acceleration, you should definitely go with Neural Magic if they support your model.
Future work? This is not exactly along the lines of my PhD thesis, so I work on this sparingly. If you want to contribute to this repo, you could make a PyTorch or TensorFlow custom op with the Python or C++ API. However, it's unclear how gradients would work, and you would have to compile the op with a fixed sparsity pattern, something that the current PyTorch and TensorFlow frameworks might not support well.
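What "compile the op with a fixed sparsity pattern" means can be illustrated with a toy code generator. The sketch below is plain Python and is not how this repo generates its kernels; generate_kernel and its arguments are hypothetical names. It bakes the nonzero positions into generated source code, so the weight values can change between calls, but a new pattern needs a newly generated kernel.

```python
def generate_kernel(pattern, n_rows):
    """Build a function specialized to one fixed sparsity pattern.

    pattern: list of (row, col) positions of the nonzeros of a sparse matrix A.
    The generated function computes y = A @ x for a dense vector x, taking
    the nonzero values in the same order as `pattern`.
    (Hypothetical helper for illustration only.)
    """
    lines = ["def kernel(values, x):", f"    y = [0.0] * {n_rows}"]
    for k, (row, col) in enumerate(pattern):
        # Indices are written into the source as constants, so the kernel
        # performs no index lookups at run time.
        lines.append(f"    y[{row}] += values[{k}] * x[{col}]")
    lines.append("    return y")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["kernel"]


if __name__ == "__main__":
    # A = [[2, 0, 0],
    #      [0, 0, 3]]
    pattern = [(0, 0), (1, 2)]
    kernel = generate_kernel(pattern, n_rows=2)
    print(kernel([2.0, 3.0], [1.0, 10.0, 100.0]))
```

A framework custom op built this way would be tied to one pattern per compiled kernel, which is the constraint described above.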

Owner: Ziheng Wang
