Code for the ICASSP-2021 paper: Continuous Speech Separation with Conformer.

Last update: Nov 28, 2022

Continuous Speech Separation with Conformer

Introduction

We examine the use of the Conformer architecture for continuous speech separation. Conformer allows the separation model to efficiently capture both local and global context information, which is helpful for speech separation. Experimental results on the LibriCSS dataset show that the Conformer separation model achieves state-of-the-art results in both single-channel and multi-channel settings.

For a detailed description and experimental results, please refer to our paper: Continuous Speech Separation with Conformer (Accepted by ICASSP 2021).

Environment

Python 3.6.9, PyTorch 1.7.1
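Before running the steps below, it can help to confirm that the local interpreter and torch build match the versions listed above. The following is a minimal sketch, not part of the repository: the function name `check_environment` and the two constants are hypothetical, and treating a version mismatch as a warning rather than an error is an assumption.

```python
import sys

# Versions taken from the Environment section of this README.
EXPECTED_PYTHON = (3, 6, 9)
EXPECTED_TORCH = "1.7.1"


def check_environment():
    """Return a list of warning strings; the list is empty if versions match."""
    warnings = []

    found_python = tuple(sys.version_info[:3])
    if found_python != EXPECTED_PYTHON:
        warnings.append(
            "Python {} found, README lists {}".format(
                ".".join(map(str, found_python)),
                ".".join(map(str, EXPECTED_PYTHON)),
            )
        )

    try:
        import torch  # imported lazily so the check also runs without torch
    except ImportError:
        warnings.append("torch is not installed, README lists " + EXPECTED_TORCH)
    else:
        # Drop local build suffixes such as "+cu110" before comparing.
        found_torch = str(torch.__version__).split("+")[0]
        if found_torch != EXPECTED_TORCH:
            warnings.append(
                "torch {} found, README lists {}".format(found_torch, EXPECTED_TORCH)
            )

    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("warning:", message)
```

Other versions may work as well; the check only reports differences from the versions the authors list.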

Get Started

1. Download the overlapped speech of the LibriCSS dataset.

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1PdloA-V8HGxkRu9MnT35_civpc3YXJsT' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1PdloA-V8HGxkRu9MnT35_civpc3YXJsT" -O overlapped_speech.zip && rm -rf /tmp/cookies.txt && unzip overlapped_speech.zip && rm overlapped_speech.zip

2. Download the Conformer separation models.

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1OlTbEvxYUoqWIHfeAXCftL9srbWUo4I1' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1OlTbEvxYUoqWIHfeAXCftL9srbWUo4I1" -O checkpoints.zip && rm -rf /tmp/cookies.txt && unzip checkpoints.zip && rm checkpoints.zip

3. Run the separation.

3.1 Single-channel separation

export MODEL_NAME=1ch_conformer_base
python3 separate.py \
    --checkpoint checkpoints/$MODEL_NAME \
    --mix-scp utils/overlapped_speech_1ch.scp \
    --dump-dir separated_speech/monaural/utterances_with_$MODEL_NAME \
    --device-id 0 \
    --num_spks 2

The separated speech can be found in the directory 'separated_speech/monaural/utterances_with_$MODEL_NAME'.
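The `--mix-scp` argument above points to a script file that lists the mixtures to separate. The sketch below assumes the file follows the common Kaldi-style layout of one `key path` pair per line, separated by whitespace; this README does not state the format, so check the files under `utils/` before relying on it. The helper `parse_scp` and the example keys and paths are hypothetical.

```python
def parse_scp(text):
    """Parse 'key path' lines into an ordered list of (key, path) pairs.

    Assumes a Kaldi-style layout: the first whitespace-separated token is
    the utterance key and the rest of the line is the audio path.
    """
    entries = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(
                "line {}: expected 'key path', got {!r}".format(line_number, raw)
            )
        key, path = parts
        entries.append((key, path.strip()))
    return entries


# Hypothetical content, for illustration only.
example = (
    "utt_0001 overlapped_speech/utt_0001.wav\n"
    "utt_0002 overlapped_speech/utt_0002.wav\n"
)
for key, path in parse_scp(example):
    print(key, "->", path)
```

A listing like this is a quick way to confirm that every path in the scp file exists after the download step, before starting a long separation run.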

3.2 Seven-channel separation

export MODEL_NAME=conformer_base
python3 separate.py \
    --checkpoint checkpoints/$MODEL_NAME \
    --mix-scp utils/overlapped_speech_7ch.scp \
    --dump-dir separated_speech/7ch/utterances_with_$MODEL_NAME \
    --device-id 0 \
    --num_spks 2 \
    --mvdr True

The separated speech can be found in the directory 'separated_speech/7ch/utterances_with_$MODEL_NAME'.
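The seven-channel command passes `--mvdr True`, which presumably enables MVDR (minimum variance distortionless response) beamforming driven by the estimated masks. The sketch below shows a generic mask-based MVDR in the reference-channel formulation, w = (Φ_n⁻¹ Φ_s) u / trace(Φ_n⁻¹ Φ_s). It is an illustration only and is not taken from `separate.py`; the function names, array layouts, and the use of NumPy are all assumptions.

```python
import numpy as np


def spatial_covariance(stft, mask):
    """Mask-weighted spatial covariance for each frequency bin.

    stft: complex array, shape (channels, freqs, frames)
    mask: real array with values in [0, 1], shape (freqs, frames)
    returns: complex array, shape (freqs, channels, channels)
    """
    weighted = stft * mask[None, :, :]
    # cov[f, c, d] = sum_t mask[f, t] * x[c, f, t] * conj(x[d, f, t])
    cov = np.einsum("cft,dft->fcd", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-8)
    return cov / norm[:, None, None]


def mvdr_weights(target_cov, noise_cov, ref_channel=0, eps=1e-8):
    """Reference-channel MVDR weights, shape (freqs, channels)."""
    channels = target_cov.shape[-1]
    # Diagonal loading keeps the noise covariance invertible.
    loaded = noise_cov + eps * np.eye(channels)
    numerator = np.linalg.solve(loaded, target_cov)  # noise_cov^-1 @ target_cov
    trace = np.trace(numerator, axis1=-2, axis2=-1)
    # Selecting one column applies the one-hot reference vector u.
    return numerator[..., ref_channel] / (trace[:, None] + eps)


def apply_beamformer(weights, stft):
    """Compute y[f, t] = w[f]^H x[:, f, t]; returns shape (freqs, frames)."""
    return np.einsum("fc,cft->ft", weights.conj(), stft)
```

In a mask-based pipeline, the separation network's speaker mask would weight the target covariance and the remaining masks would weight the noise covariance; the beamformed output then replaces plain mask multiplication on a single channel.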

Citation

If you find our work useful, please cite our paper:

@inproceedings{CSS_with_Conformer,
  title={Continuous speech separation with conformer},
  author={Chen, Sanyuan and Wu, Yu and Chen, Zhuo and Wu, Jian and Li, Jinyu and Yoshioka, Takuya and Wang, Chengyi and Liu, Shujie and Zhou, Ming},
  booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={5749--5753},
  year={2021},
  organization={IEEE}
}

Owner

Sanyuan Chen (陈三元)
