A two-stage U-Net for high-fidelity denoising of historical recordings

Last update: Jan 05, 2023

Overview

A two-stage U-Net for high-fidelity denoising of historical recordings

Official repository of the paper (not submitted yet):

E. Moliner and V. Välimäki,, "A two-stage U-Net for high-fidelity denosing of historical recordinds", in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, May, 2022

Abstract

Enhancing the sound quality of historical music recordings is a long-standing problem. This paper presents a novel denoising method based on a fully-convolutional deep neural network. A two-stage U-Net model architecture is designed to model and suppress the degradations with high fidelity. The method processes the time-frequency representation of audio, and is trained using realistic noisy data to jointly remove hiss, clicks, thumps, and other common additive disturbances from old analog discs. The proposed model outperforms previous methods in both objective and subjective metrics. The results of a formal blind listening test show that the method can denoise real gramophone recordings with an excellent quality. This study shows the importance of realistic training data and the power of deep learning in audio restoration.

Listen to our audio samples

Requirements

You will need at least python 3.7 and CUDA 10.1 if you want to use GPU. See requirements.txt for the required package versions.

To install the environment through anaconda, follow the instructions:

conda env update -f environment.yml
conda activate historical_denoiser

Denoising Recordings

Run the following commands to clone the repository and install the pretrained weights of the two-stage U-Net model:

git clone https://github.com/eloimoliner/denoising-historical-recordings.git
cd denoising-historical-recordings
wget https://github.com/eloimoliner/denoising-historical-recordings/releases/download/v0.0/checkpoint.zip
unzip checkpoint.zip /experiments/trained_model/

If the environment is installed correctly, you can denoise an audio file by running:

bash inference.sh "file name"

A ".wav" file with the denoised version, as well as the residual noise and the original signal in "mono", will be generated in the same directory as the input file.

Training

TODO

Comments

Will it work in Windows without CUDA?

Hello, The readme says: "You will need at least python 3.7 and CUDA 10.1 if you want to use GPU."

Unfortunately, my first attempt to run it in Windows without CUDA-supporting VGA failed. There is really no separate environment file for CPU-only? Is it possible to make it work without massive changes to the code?

opened by vitacon 15
installation without conda

Hi,

could you leave some hints about how to install this without conda? Your readme appears to be very much specified to this one case. Also it seems that you develop under linux so you use bash to execute. Maybe here a hint for win- users would be cool too.

I am just trying to get this to run under windows and so far had no success. I will update if I get further. All the best!

opened by GitHubGeniusOverlord 9
strange tensorflow version in requirements.txt

Hi,

when running python -m pip install tensorflow==2.3.0 as indicated in your requirements file, I get

ERROR: Could not find a version that satisfies the requirement tensorflow==2.3.0 (from versions: 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0, 2.5.1, 2.5.2, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.8.0rc0) ERROR: No matching distribution found for tensorflow==2.3.0

It seems this version isn't even supported by pip anymore. Upgrade to 2.5.0?

The same is true for scipy==1.4.1. Not sure about which version to take there.

opened by GitHubGeniusOverlord 3
Update inference.sh

Small change to allow spaces in file names. Bash expands the variable $1 correctly even if it is in double quotes, python receives a single argument and not (if there are spaces) multiple arguments.

opened by JorenSix 1
How to start training for denoising?

If I would like to do a denoising task, where I've clean signals (in the "clean" folder) and noisy signals (in the "noise" folder).

opened by listener17 1

Releases(v0.0)

v0.0(Aug 31, 2021)

Uploading pretrained model
Source code(tar.gz)
Source code(zip)
checkpoint.zip(251.80 MB)

Owner

Eloi Moliner Juanpere

Doctoral candidate on audio signal processing at Aalto university.

GitHub Repository

Chinese clinical named entity recognition using pre-trained BERT model

Chinese clinical named entity recognition (CNER) using pre-trained BERT model Introduction Code for paper Chinese clinical named entity recognition wi

109 Dec 14, 2022

Replication attempt for the Protein Folding Model

RGN2-Replica (WIP) To eventually become an unofficial working Pytorch implementation of RGN2, an state of the art model for MSA-less Protein Folding f

36 Nov 29, 2022

Pretty Tensor - Fluent Neural Networks in TensorFlow

Pretty Tensor provides a high level builder API for TensorFlow. It provides thin wrappers on Tensors so that you can easily build multi-layer neural networks.

1.2k Dec 29, 2022

An implementation of Deep Forest 2021.2.1.

Deep Forest (DF) 21 DF21 is an implementation of Deep Forest 2021.2.1. It is designed to have the following advantages: Powerful: Better accuracy than

795 Jan 03, 2023

[ICCV 2021] Deep Hough Voting for Robust Global Registration

Deep Hough Voting for Robust Global Registration, ICCV, 2021 Project Page | Paper | Video Deep Hough Voting for Robust Global Registration Junha Lee1,

10 Dec 02, 2022

GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot

GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub.

2.3k Jan 09, 2023

Implementation of ViViT: A Video Vision Transformer

ViViT: A Video Vision Transformer Unofficial implementation of ViViT: A Video Vision Transformer. Notes: This is in WIP. Model 2 is implemented, Model

297 Jan 06, 2023

Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

GSCNN This is the official code for: Gated-SCNN: Gated Shape CNNs for Semantic Segmentation Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler

859 Dec 26, 2022

Keras Image Embeddings using Contrastive Loss

Keras-Image-Embeddings-using-Contrastive-Loss Image to Embedding projection in vector space. Implementation in keras and tensorflow for custom data. B

5 Mar 21, 2022

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Text-AutoAugment (TAA) This repository contains the code for our paper Text AutoAugment: Learning Compositional Augmentation Policy for Text Classific

105 Jan 03, 2023

ElasticFace: Elastic Margin Loss for Deep Face Recognition

This is the official repository of the paper: ElasticFace: Elastic Margin Loss for Deep Face Recognition Paper on arxiv: arxiv Model Log file Pretrain

113 Dec 14, 2022

PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

12 Jul 08, 2022

Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment.

(ACMMM 2021 Oral) SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment This repository shows two tasks: Face landmark detection and Fac

51 Dec 13, 2022

NR-GAN: Noise Robust Generative Adversarial Networks

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

59 Dec 11, 2022

PyTorch Implementation of ECCV 2020 Spotlight TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images

TuiGAN-PyTorch Official PyTorch Implementation of "TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images" (ECCV 2020 Spotligh

181 Dec 09, 2022

A Tensorflow based library for Time Series Modelling with Gaussian Processes

Markovflow Documentation | Tutorials | API reference | Slack What does Markovflow do? Markovflow is a Python library for time-series analysis via prob

24 Dec 12, 2022

Collection of sports betting AI tools.

sports-betting sports-betting is a collection of tools that makes it easy to create machine learning models for sports betting and evaluate their perf

109 Dec 31, 2022

Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU

Isaac ROS DNN Inference Overview This repository provides two NVIDIA GPU-accelerated ROS2 nodes that perform deep learning inference using custom mode

62 Dec 14, 2022

Weakly Supervised Text-to-SQL Parsing through Question Decomposition

Weakly Supervised Text-to-SQL Parsing through Question Decomposition The official repository for the paper "Weakly Supervised Text-to-SQL Parsing thro

14 Dec 19, 2022

Reinforcement Learning for the Blackjack

Reinforcement Learning for Blackjack Author: ZHA Mengyue Math Department of HKUST Problem Statement We study playing Blackjack by reinforcement learni

3 Jan 24, 2022

A two-stage U-Net for high-fidelity denoising of historical recordings

Related tags

Overview

A two-stage U-Net for high-fidelity denoising of historical recordings

Abstract

Requirements

Denoising Recordings

Training

Comments

Will it work in Windows without CUDA?

installation without conda

strange tensorflow version in requirements.txt

Update inference.sh

How to start training for denoising?

Releases(v0.0)

v0.0(Aug 31, 2021)

Owner

Eloi Moliner Juanpere

Chinese clinical named entity recognition using pre-trained BERT model

Replication attempt for the Protein Folding Model

Pretty Tensor - Fluent Neural Networks in TensorFlow

An implementation of Deep Forest 2021.2.1.

[ICCV 2021] Deep Hough Voting for Robust Global Registration

GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot

Implementation of ViViT: A Video Vision Transformer

Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

Keras Image Embeddings using Contrastive Loss

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

ElasticFace: Elastic Margin Loss for Deep Face Recognition

PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment.

NR-GAN: Noise Robust Generative Adversarial Networks

PyTorch Implementation of ECCV 2020 Spotlight TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images

A Tensorflow based library for Time Series Modelling with Gaussian Processes

Collection of sports betting AI tools.

Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU

Weakly Supervised Text-to-SQL Parsing through Question Decomposition

Reinforcement Learning for the Blackjack