This toolkit provides codes to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks.

Overview

slue-toolkit

made-with-python License: MIT

We introduce Spoken Language Understanding Evaluation (SLUE) benchmark. This toolkit provides codes to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks. Refer https://arxiv.org/abs/2111.10367 for more details.

News

  • Nov. 22: We release the SLUE paper on arXiv along with the slue-toolkit repository. The repository contains data processing and evaluation scripts. We will publish the scripts for trainig the baseline models soon.

Installation

  1. git clone this repository and install slue-toolkit (development mode)
git clone https://github.com/asappresearch/slue-toolkit.git
pip install -e .

or install directly from Github

pip install git+https://github.com/asappresearch/slue-toolkit.git
  1. Install additional dependency based on your choice (e.g. you need fairseq and transformers for baselines)

SLUE Tasks

Automatic Speech Recognition (ASR)

Although this is not a SLU task, ASR can help analyze performance of downstream SLU tasks on the same domain. Additionally, pipeline approaches depend on ASR outputs, making ASR relevant to SLU. ASR is evaluated using word error rate (WER).

Named Entity Recognition (NER)

Named entity recognition involves detecting the named entities and their tags (types) in a given sentence. We evaluate performance using micro-averaged F1 and label-F1 scores. The F1 score evaluates an unordered list of named entity phrase and tag pairs predicted for each sentence. Only the tag predictions are considered for label-F1.

Sentiment Analysis (SA)

Sentiment analysis refers to classifying a given speech segment as having negative, neutral, or positive sentiment. We evaluate SA using macro-averaged (unweighted) recall and F1 scores.

Datasets

Corpus Size - utts (hours) Tasks License
Fine-tune Dev Test
SLUE-VoxPopuli 5,000 (14.5) 1,753 (5.0) 1,842 (4.9) ASR, NER CC0 (check complete license here)
SLUE-VoxCeleb 5,777 (12.8) 955 (2.1) 4,052 (9.0) ASR, SA CC-BY 4.0 (check complete license here)

For SLUE, you need VoxCeleb and VoxPopuli dataset. We carefully curated subset of those dataset for fine-tuning and evaluation for SLUE tasks, and we re-distribute the the subsets. Thus, you don't need to download a whole gigantic datasets. In the dataset, we also includes the human annotation and transcription for SLUE tasks. All you need to do is just running the script below and it will download and pre-process the dataset.

Download and pre-process dataset

bash scripts/download_datasets.sh

SLUE score evaluation

The test set data and annotation will be used for the official SLUE score evaluation, however we will not release the test set annotation. Thus, the SLUE score can be evaluated by submitting your prediction result in tsv format. We will prepare the website to accept your submission. Please stay tuned for this.

Model development rule

To train model, You can use fine-tuning and dev sets (audio, transcription and annotation) except the test set of SLUE task. Additionally you can use any kind of external dataset whether it is labeled or unlabeled for any purpose of training (e.g. pre-training and fine-tuning).

For vadidation of your model, you can use official dev set we provide, or you can make your own splits or cross-validation splits by mixing fine-tuning and dev set all together.

Baselines

ASR

Fine-tuning

Assuming that the preprocessed manifest files are in manifest/slue-voxceleb and manifest/slue-voxpopuli for SLUE-VoxCeleb and SLUE-VoxPopuli. This command fine-tune a wav2vec 2.0 base model on these two datasets using one GPU.

bash baselines/asr/ft-w2v2-base.sh manifest/slue-voxceleb save/asr/w2v2-base-vc
bash baselines/asr/ft-w2v2-base.sh manifest/slue-voxpopuli save/asr/w2v2-base-vp

Evaluation

To evaluate the fine-tuned wav2vec 2.0 ASR models on the dev set, please run the following commands.

python slue_toolkit/eval/eval_w2v.py eval_asr save/asr/w2v2-base-vc --data manifest/slue-voxceleb --subset dev
python slue_toolkit/eval/eval_w2v.py eval_asr save/asr/w2v2-base-vp --data manifest/slue-voxpopuli --subset dev

The WER will be printed directly. The predictions are saved in save/asr/w2v2-base-vc/pred-dev.wrd and save/asr/w2v2-base-vp/pred-dev.wrd and can be used for pipeline models.

More detail baseline experiment described here

NER

Fine-tuning End-to-end model

Assuming that the preprocessed manifest files are in manifest/slue-voxpopuli for SLUE-VoxPopuli. This command fine-tune a wav2vec 2.0 base model using one GPU.

bash baselines/ner/e2e_scripts/ft-w2v2-base.sh manifest/slue-voxpopuli/e2e_ner save/e2e_ner/w2v2-base

Evaluating End-to-End model

To evaluate the fine-tuned wav2vec 2.0 E2E NER model on the dev set, please run the following command. (decoding without language model)

bash baselines/ner/e2e_scripts/eval-ner.sh w2v2-base dev combined nolm

More detail baseline experiment described here

Sentiment Analysis

Fine-tuning

This command fine-tune a wav2vec 2.0 base model on the voxceleb dataset

bash baselines/sentiment/ft-w2v2-base-senti.sh manifest/slue-voxceleb save/sentiment/w2v2-base

Evaluation

To evaluate the fine-tuned wav2vec 2.0 sentiment model, run following commands or run baselines/sentiment/e2e_scripts/eval.sh

python3 slue_toolkit/eval/eval_w2v_sentiment.py --save-dir save/sentiment/w2v2-base --data manifest/slue-voxceleb --subset dev

More detail baseline experiment described here

Owner
ASAPP Research
AI for Enterprise
ASAPP Research
A Demo server serving Bert through ONNX with GPU written in Rust with <3

Demo BERT ONNX server written in rust This demo showcase the use of onnxruntime-rs on BERT with a GPU on CUDA 11 served by actix-web and tokenized wit

Xavier Tao 28 Jan 01, 2023
Husein pet projects in here!

project-suka-suka Husein pet projects in here! List of projects mysejahtera-density. Generate resolution points using meshgrid and request each points

HUSEIN ZOLKEPLI 47 Dec 09, 2022
Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Demetri Pananos 9 Oct 04, 2022
Code for the paper "Attention Approximates Sparse Distributed Memory"

Attention Approximates Sparse Distributed Memory - Codebase This is all of the code used to run analyses in the paper "Attention Approximates Sparse D

Trenton Bricken 14 Dec 05, 2022
Official Pytorch implementation of "DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network" (CVPR'21)

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network Pytorch implementation for our DivCo. We propose a simple ye

64 Nov 22, 2022
VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition Usage First, install PyTorch 1.7.1+, torchvision 0.8.2

40 Dec 12, 2022
Code for Paper "Evidential Softmax for Sparse MultimodalDistributions in Deep Generative Models"

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models Abstract Many applications of generative models rely on the marginali

Stanford Intelligent Systems Laboratory 9 Jun 06, 2022
CNN visualization tool in TensorFlow

tf_cnnvis A blog post describing the library: https://medium.com/@falaktheoptimist/want-to-look-inside-your-cnn-we-have-just-the-right-tool-for-you-ad

InFoCusp 778 Jan 02, 2023
Sound Event Detection with FilterAugment

Sound Event Detection with FilterAugment Official implementation of Heavily Augmented Sound Event Detection utilizing Weak Predictions (DCASE2021 Chal

43 Aug 28, 2022
The Environment I built to study Reinforcement Learning + Pokemon Showdown

pokemon-showdown-rl-environment The Environment I built to study Reinforcement Learning + Pokemon Showdown Been a while since I ran this. Think it is

3 Jan 16, 2022
Mmdet benchmark with python

mmdet_benchmark 本项目是为了研究 mmdet 推断性能瓶颈,并且对其进行优化。 配置与环境 机器配置 CPU:Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz GPU:NVIDIA GeForce RTX 3080 10GB 内存:64G 硬盘:1T

杨培文 (Yang Peiwen) 24 May 21, 2022
ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction. NeurIPS 2021.

Gengshan Yang 59 Nov 25, 2022
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022
CondLaneNet: a Top-to-down Lane Detection Framework Based on Conditional Convolution

CondLaneNet: a Top-to-down Lane Detection Framework Based on Conditional Convolution This is the official implementation code of the paper "CondLaneNe

Alibaba Cloud 311 Dec 30, 2022
FuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space OptimizationFuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

XCL 191 Dec 31, 2022
JupyterLite demo deployed to GitHub Pages 🚀

JupyterLite Demo JupyterLite deployed as a static site to GitHub Pages, for demo purposes. ✨ Try it in your browser ✨ ➡️ https://jupyterlite.github.io

JupyterLite 223 Jan 04, 2023
LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image.

This project is based on ultralytics/yolov3. LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image. The related paper is avai

26 Dec 13, 2022
🤗 Push your spaCy pipelines to the Hugging Face Hub

spacy-huggingface-hub: Push your spaCy pipelines to the Hugging Face Hub This package provides a CLI command for uploading any trained spaCy pipeline

Explosion 30 Oct 09, 2022
[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

TorchSemiSeg [CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision by Xiaokang Chen1, Yuhui Yuan2, Gang Zeng1, Jingdong Wang

Chen XiaoKang 387 Jan 08, 2023
基于DouZero定制AI实战欢乐斗地主

DouZero_For_Happy_DouDiZhu: 将DouZero用于欢乐斗地主实战 本项目基于DouZero 环境配置请移步项目DouZero 模型默认为WP,更换模型请修改start.py中的模型路径 运行main.py即可 SL (baselines/sl/): 基于人类数据进行深度学习

1.5k Jan 08, 2023