Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Last update: Jul 06, 2022

Reference

Abeßer, J., & Müller, M. Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning. Submitted to ICASSP 2022.

Related Work

We use the pre-computed features and the model architecture from three previous papers, all of which propose unsupervised domain adaptation methods:

    Mezza, A. I., Habets, E. A. P., Müller, M., & Sarti, A. (2021).
    Unsupervised domain adaptation for acoustic scene classification
    using band-wise statistics matching. Proceedings of the European
    Signal Processing Conference (EUSIPCO), 11–15.
    https://doi.org/10.23919/Eusipco47968.2020.9287533

    Drossos, K., Magron, P., & Virtanen, T. (2019). Unsupervised Adversarial Domain Adaptation based
    on the Wasserstein Distance for Acoustic Scene Classification. Proceedings of the IEEE Workshop
    on Applications of Signal Processing to Audio and Acoustics (WASPAA), 259–263. New Paltz, NY, USA.

    Gharib, S., Drossos, K., Emre, C., Serdyuk, D., & Virtanen, T. (2018). Unsupervised Adversarial Domain
    Adaptation for Acoustic Scene Classification. Proceedings of the Detection and Classification of
    Acoustic Scenes and Events (DCASE). Surrey, UK.

Files

configs.py - Training configurations (C0 ... C3M)
generator.py - Data generator
losses.py - Loss implementations
model.py - Function to create dual-input / dual-output model
model_kaggle.py - Reference CNN model for acoustic scene classification (ASC) from related work
normalization.py - Normalization methods (see Mezza et al. above)
params.py - General parameters
prediction.py - Prediction script to evaluate models on test data
training.py - Script to run the model training for 6 different configurations (see Fig. 2 in the paper)
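
The normalization referenced for normalization.py follows the band-wise statistics matching idea of Mezza et al. As a rough illustration only (this is not the code from normalization.py), the sketch below standardizes each frequency band of target-domain spectrograms and rescales it to the source-domain mean and standard deviation. The array layout (examples, bands, frames) and the function names band_stats and match_band_stats are assumptions made for this example.

```python
import numpy as np


def band_stats(specs):
    """Per-band mean and std, pooled over all examples and time frames.

    specs: array of shape (n_examples, n_bands, n_frames) -- assumed layout.
    """
    mean = specs.mean(axis=(0, 2), keepdims=True)
    std = specs.std(axis=(0, 2), keepdims=True)
    return mean, std


def match_band_stats(target, source_mean, source_std, eps=1e-8):
    """Align first- and second-order band statistics of target to the source."""
    t_mean, t_std = band_stats(target)
    # Standardize every band with target-domain statistics ...
    standardized = (target - t_mean) / (t_std + eps)
    # ... then rescale and shift to the source-domain statistics.
    return standardized * source_std + source_mean


# Synthetic data standing in for source and target spectrograms.
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, scale=1.0, size=(8, 4, 10))
target = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 10))

s_mean, s_std = band_stats(source)
adapted = match_band_stats(target, s_mean, s_std)
```

After this step, each band of adapted has approximately the same mean and standard deviation as the corresponding source band, and no target labels are needed.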

How to run

Create a Python environment (e.g. with conda). The following versions were used while preparing the paper:
- librosa==0.8.0
- matplotlib==3.3.2
- numpy==1.19.2
- python==3.7.0
- scikit-learn==0.23.2
- tensorflow==2.3.0
- torch==1.9.0
In params.py, set the following variables:
- dir_feat: path to your local copy of the .p files from https://zenodo.org/record/1401995
- dir_target: path to your local output folder
Run python training.py && python prediction.py on a GPU device to train and evaluate the models.
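
Before starting a long training run, it can help to verify the two paths. The sketch below is a hypothetical helper, not part of the repository: the example paths and the function check_setup are assumptions, and only the variable names dir_feat and dir_target come from params.py.

```python
import glob
import os

# Hypothetical example values; set the real ones in params.py.
dir_feat = os.path.join(os.path.expanduser("~"), "data", "asc_features")
dir_target = os.path.join(os.path.expanduser("~"), "results", "asc_da")


def check_setup(dir_feat, dir_target):
    """Return the sorted .p feature files in dir_feat and create dir_target.

    Raises FileNotFoundError if dir_feat contains no .p files.
    """
    feature_files = sorted(glob.glob(os.path.join(dir_feat, "*.p")))
    if not feature_files:
        raise FileNotFoundError(f"No .p feature files found in {dir_feat!r}")
    # Create the output folder if it does not exist yet.
    os.makedirs(dir_target, exist_ok=True)
    return feature_files
```

Calling check_setup(dir_feat, dir_target) before training.py fails early if the feature files have not been downloaded yet.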

Owner

Jakob Abeßer
