Pytorch port of Google Research's LEAF Audio paper

Last update: Oct 31, 2022

Related tags

Overview

leaf-audio-pytorch

Pytorch port of Google Research's LEAF Audio paper published at ICLR 2021.

This port is not completely finished, but the Leaf() frontend is fully ported over, functional and validated to have similar outputs to the original tensorflow implementation. A few small things are missing, such as the SincNet and SincNet+ implementations, a few different pooling layers, etc.

PLEASE leave issues, pull requests, comments, or anything you find in using this repository that may be of value to others who will try to use this.

Installation

From the root directory of this repo, run:

pip install -e .

Usage

leaf_audio_pytorch mirrors it's original respository; imports and arguments are the same.

import leaf_audio_pytorch.frontend as frontend

leaf = frontend.Leaf()

Installation for Developing

If you are looking to develop on this repo, the requirements.txt contains everything needed to run the torch and tf implementations of leaf audio simultaneously.

NOTE: There is some weird dependency stuff going on with the original leaf-audio repo. Seems like its a dependency issue with lingvo and waymo-open-dataset. These below commands are a workaround.

Install the packages required:

pip install -r requirements.txt --no-deps

Install the leaf-audio repo from Git SSH:

pip install git+ssh://[email protected]/google-research/leaf-audio.git --no-deps

Then add the leaf_audio_pytorch package as well

python setup.py develop

At this point everything should be good to go! The scripts in test/ contains some testing code to validate the torch implementation mirrors tf.

Some Things to Keep in Mind (PLEASE READ)

When writing this port, I ran a debugger of the torch and tf implementations side by side and validated that each layer and operation mirrors the tensorflow implementation (to within a few significant digits, i.e. a tensor's values may variate by 0.001). There is one notable exception: The depthwise convolution within the GaussianLowpass pooling layer has a larger variation in tensor values, but the ported operation still produces similar outputs. I'm not sure why this operation is producing different values, but i'm currently looking into it. Please do your own due diligence in using this port and making sure this works as expected.
As of March 29, I finished the initial version of the port, but I have not tested Leaf() in a traning setting yet. Calling .backward() on Leaf() throws no errors, meaning backprop works as expected. However, I do not yet know how this will function during training.
As PyTorch and Tensorflow follow different tensor ordering conventions, Leaf() does all of its operations and outputs tensors with channels first.

Reference

All credit and attribution goes to Neil Zeghidour and the Google Research team who wrote the paper and created the Tensorflow implementation.

Please visit their GitHub repository and review their ICLR publication.

Pytorch port of Google Research's LEAF Audio paper

Related tags

Overview

leaf-audio-pytorch

Installation

Usage

Installation for Developing

Some Things to Keep in Mind (PLEASE READ)

Reference

Owner

Dennis Fedorishin

Keyword spotting on Arm Cortex-M Microcontrollers

An integration of several popular automatic augmentation methods, including OHL (Online Hyper-Parameter Learning for Auto-Augmentation Strategy) and AWS (Improving Auto Augment via Augmentation Wise Weight Sharing) by Sensetime Research.

VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations

RADIal is available now! Check the download section

A Fast Monotone Rotating Shallow Water model

SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems

This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer Capacitor domain using text similarity indexes: An experimental analysis "

[ICML 2021] “ Self-Damaging Contrastive Learning”, Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Mixup for Supervision, Semi- and Self-Supervision Learning Toolbox and Benchmark

Face Identity Disentanglement via Latent Space Mapping [SIGGRAPH ASIA 2020]

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

A faster pytorch implementation of faster r-cnn

Efficient training of deep recommenders on cloud.

Geometry-Aware Learning of Maps for Camera Localization (CVPR2018)

Unofficial Implementation of MLP-Mixer, Image Classification Model

Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, Daniel Silva, Andrew McCallum, Amr Ahmed. KDD 2019.

GUI for TOAD-GAN, a PCG-ML algorithm for Token-based Super Mario Bros. Levels.

Reproduction of Vision Transformer in Tensorflow2. Train from scratch and Finetune.

Beyond imagenet attack (accepted by ICLR 2022) towards crafting adversarial examples for black-box domains.

Tool which allow you to detect and translate text.