PyTorch implementation of the paper: Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Last update: Dec 14, 2022

Overview

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

This repository contains the official PyTorch implementation of the paper:

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding. Xiao Xu*, Libo Qin*, Kaiji Chen, Guoxing Wu, Linlin Li, Wanxiang Che. AAAI 2022. [Paper(Arxiv)] [Paper]

If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

...

The following sections explain how to use this repository step by step.

Workflow

Architecture

Results

Preparation

Our code is based on the following packages:

numpy==1.19.5
tqdm==4.50.2
pytorch==1.7.0
python==3.7.3
cudatoolkit==11.0.3
transformers==4.1.1
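
As a rough sanity check, the pins above can be compared with whatever is installed in the active environment. The sketch below is not part of the repository: `parse_pins` and `find_mismatches` are hypothetical helper names, and the installed versions are passed in as a plain dictionary so the comparison does not depend on any particular package manager.

```python
PINNED = """
numpy==1.19.5
tqdm==4.50.2
pytorch==1.7.0
python==3.7.3
cudatoolkit==11.0.3
transformers==4.1.1
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def find_mismatches(pins, installed):
    """Return {name: (expected, found)} for every pin that is not satisfied.

    `installed` maps package names to version strings. Packages missing
    from it are reported with found=None.
    """
    mismatches = {}
    for name, expected in pins.items():
        found = installed.get(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    import platform

    # Only the interpreter version is looked up here (an assumption made to
    # keep the sketch small); fill in the rest from your package manager.
    installed = {"python": platform.python_version()}
    report = find_mismatches(parse_pins(PINNED), installed)
    for name, (expected, found) in report.items():
        print(f"{name}: expected {expected}, found {found}")
```

Packages that are absent from the `installed` dictionary show up with `found` set to `None`, so an incomplete dictionary produces a longer report rather than a silent pass.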

We strongly recommend using Anaconda to manage your Python environment.

We download the Chinese pretrained model checkpoints from the following links:

How to Run it

The script train.py is the main entry point of the project. You can run the experiments with the following commands.

# LSTM w/o Profile on TITAN Xp
python train.py -g -fs -es -uf -bs 8 -lr 0.0006
# LSTM w/ Profile on TITAN Xp
python train.py -g -fs -es -uf -ui -bs 8 -lr 0.0004
# Pretrained model (-mt XLNet) w/o Profile on Tesla V100s PCIE 32GB
python train.py -g -fs -es -uf -up -mt XLNet -bs 8 -lr 0.001 -blr 4e-05
# Pretrained model (-mt ELECTRA) w/ Profile on Tesla V100 PCIE 32GB
python train.py -g -fs -es -uf -up -ui -mt ELECTRA -bs 8 -lr 0.0008 -blr 4e-05
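
To run several of these configurations in sequence, the commands above can be wrapped in a small launcher. This is a minimal sketch and not part of the repository: the `EXPERIMENTS` table, its keys, `build_command`, and `run_experiment` are hypothetical names. The flag strings are copied verbatim from the commands above, and their meanings are defined by the argument parser in train.py.

```python
import subprocess
import sys
from pathlib import Path

# Flag strings copied verbatim from the commands above. The keys are
# illustrative labels, not names used by the repository.
EXPERIMENTS = {
    "lstm_no_profile": "-g -fs -es -uf -bs 8 -lr 0.0006",
    "lstm_profile": "-g -fs -es -uf -ui -bs 8 -lr 0.0004",
    "xlnet_no_profile": "-g -fs -es -uf -up -mt XLNet -bs 8 -lr 0.001 -blr 4e-05",
    "electra_profile": "-g -fs -es -uf -up -ui -mt ELECTRA -bs 8 -lr 0.0008 -blr 4e-05",
}


def build_command(name, script="train.py"):
    """Return the argv list for one experiment, using the current interpreter."""
    return [sys.executable, script] + EXPERIMENTS[name].split()


def run_experiment(name, script="train.py"):
    """Launch one experiment and raise if the script exits with an error."""
    if not Path(script).exists():
        raise FileNotFoundError(f"{script} not found; run this from the repository root")
    subprocess.run(build_command(name, script), check=True)
```

Calling `run_experiment("lstm_profile")` from the repository root would launch the second command above with the current Python interpreter.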

If you have any questions, please open an issue on the project or email me or lbqin, and we will reply soon.

Acknowledgement

We are highly grateful for the public code of Stack-Propagation!

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding. Libo Qin, Wanxiang Che, Yangming Li, Haoyang Wen and Ting Liu. (EMNLP 2019). Long paper. [pdf] [code]

We are highly grateful for the open-source knowledge graphs!
- CN-DBpedia
- OwnThink

Owner

Xiao Xu
