PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"

Last update: Aug 19, 2022

Overview

Implementation of the Sheffield entry for the first Clarity enhancement challenge (CEC1)

This repository contains the PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing", the Sheffield entry for the first Clarity Enhancement Challenge (CEC1). The system consists of two modules: a Conv-TasNet-based denoising module and a finite-impulse-response (FIR) filter-based amplification module. The loss function uses a differentiable approximation to the Cambridge MSBG model released with CEC1.
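To make the amplification stage more concrete, here is a minimal sketch of what applying an FIR filter to a signal means. It is written in plain Python rather than PyTorch and is not taken from this repository: the function name `fir_filter`, the tap values, and the input samples are all illustrative assumptions. In the system described above the filter is trained end to end against a loss that includes the differentiable MSBG approximation; here the taps are simply fixed, made-up numbers.

```python
def fir_filter(signal, taps):
    """Apply a causal FIR filter: y[n] = sum_k taps[k] * signal[n - k].

    Samples before the start of the signal are treated as zero.
    """
    output = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        output.append(acc)
    return output


# Hypothetical values: a 2-tap filter with a gain of 2 on the current
# sample and 0.5 on the previous one.
taps = [2.0, 0.5]
denoised = [1.0, 0.0, 0.0, -1.0]
amplified = fir_filter(denoised, taps)
print(amplified)
```

A PyTorch version would typically express the same operation as a 1-D convolution so that gradients can flow back to the taps; that detail is an assumption about how such a module could be built, not a description of the code in recipe_amp_fir.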

Requirements

Training the amplification module requires the MSBG package and PyTorch STOI.

Training

The system is built in two stages. First, train the Conv-TasNet-based denoising module using the scripts in recipe_den_convtasnet. Second, train the FIR-based amplification module using the scripts in recipe_amp_fir. The MBSTOI folder contains the MBSTOI implementation from the CEC1 project, along with a DBSTOI implementation.

References

[1] Luo Y, Mesgarani N. Conv-tasnet: Surpassing ideal time–frequency magnitude masking for speech separation[J]. IEEE/ACM transactions on audio, speech, and language processing, 2019, 27(8): 1256-1266.
[2] Andersen A H, de Haan J M, Tan Z H, et al. Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions[J]. Speech Communication, 2018, 102: 1-13.
[3] Taal C H, Hendriks R C, Heusdens R, et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech[C]. ICASSP 2010, Dallas, Texas.

Citation

If you use this work, please cite:

@inproceedings{tutwo,
  title={A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing},
  author={Tu, Zehai and Zhang, Jisi and Ma, Ning and Barker, Jon},
  year={2021},
  booktitle={The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021)},
}
