Natural Language Processing Tasks and Examples
Recent advances in AI have allowed natural language processing (NLP) to solve a wide range of problems. While working as an NLP engineer, I encountered many different tasks and thought it would be useful to gather and organize them in one place. Borrowing the format of Kyubyong's project, I have listed each task with references and example code.
Automated Essay Scoring
- WIKI: Automated Essay Scoring
- DATA: The Hewlett Foundation: Automated Essay Scoring
- MODEL: BERT
- MODEL: RoBERTa
- MODEL: ELECTRA
- OFF-THE-SHELF: Pororo's AES
Automatic Speech Recognition
- WIKI: Speech Recognition
- DATA: LibriSpeech
- DATA: AISHELL-1
- DATA: KsponSpeech
- MODEL: Deep Speech 2
- MODEL: Listen, Attend and Spell
- MODEL: Wav2vec 2.0
- OFF-THE-SHELF: Pororo's ASR
- CODE: Example with KsponSpeech
Dialogue Generation
- WIKI: Dialogue System
- DATA: Persona Chat
- DATA: Korean SNS Corpus
- MODEL: Dialogue GPT
- CODE: Example with Korean SNS Corpus
Dialogue Retrieval
- WIKI: Dialogue System
- DATA: Persona Chat
- DATA: Korean SNS Corpus
- MODEL: Poly Encoder
- CODE: Example with Korean SNS Corpus
Fill in the Blank
- WIKI: Cloze Test
- INFO: Masked-Language-Modeling with BERT
- MODEL: BERT
- MODEL: RoBERTa
- OFF-THE-SHELF: Pororo's Fill in the Blank
- CODE: Example with WikiCorpus
Grammatical Error Correction
- WIKI: Autocorrection
- DATA: NUS Non-commercial research/trial corpus license
- DATA: Cornell Movie--Dialogs Corpus
- OFF-THE-SHELF: Pororo's GEC
Grapheme To Phoneme
- WIKI: Grapheme
- WIKI: Phoneme
- REPRESENTATIVE-DATA: Multilingual Pronunciation Data
- OFF-THE-SHELF-MODEL: Pororo's G2P
Language Modeling
- WIKI: Language Model
- INFO: A beginner’s guide to language models
- MODEL: GPT-3
- MODEL: GPT-2
- MODEL: KenLM
- MODEL: RNN-LM
- CODE: Example with OpenWebText
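To make the task concrete, here is a minimal sketch of a count-based bigram language model with add-one smoothing. It is not taken from the linked example code and is far simpler than the models listed above; the function names (`train_bigram`, `bigram_prob`, `sentence_log_prob`), the whitespace tokenizer, and the two-sentence corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]


def train_bigram(sentences):
    """Count unigrams and bigrams over a list of sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def sentence_log_prob(sentence, unigrams, bigrams):
    """Sum of log bigram probabilities over the sentence."""
    tokens = tokenize(sentence)
    return sum(
        math.log(bigram_prob(prev, word, unigrams, bigrams))
        for prev, word in zip(tokens, tokens[1:])
    )


# Toy corpus (hypothetical), just large enough to show the effect of word order.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigram(corpus)

fluent = sentence_log_prob("the cat sat on the rug", unigrams, bigrams)
shuffled = sentence_log_prob("rug the on sat cat the", unigrams, bigrams)
print(fluent, shuffled)
```

The model assigns a higher log-probability to the word order it has seen than to a shuffled version of the same words, which is the basic behavior a language model is expected to show.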
Machine Reading Comprehension
- WIKI: Reading Comprehension
- INFO: Machine Reading Comprehension with BERT
- DATA: SQuAD
- DATA: KorQuAD
- MODEL: BERT
- MODEL: RoBERTa
- MODEL: ELECTRA
- OFF-THE-SHELF: Pororo's MRC
- CODE: Example with SQuAD & KorQuAD
Machine Translation
- WIKI: Translation
- DATA: WMT 2014 English-to-French
- DATA: Korean-English translation corpus
- MODEL: Transformer
- OFF-THE-SHELF: Pororo's Translation
- CODE: Example with Korean-English translation corpus
Math Word Problem Solving
- PAPER-WITH-CODE: Math Word Problem Solving
- DATA: DeepMind Mathematics Dataset
Natural Language Inference
- WIKI: Textual Entailment
- DATA: GLUE-MNLI
- DATA: KorNLI
- MODEL: BERT
- MODEL: RoBERTa
- MODEL: ELECTRA
- OFF-THE-SHELF: Pororo's NLI
- CODE: Example with GLUE-MNLI
Named Entity Recognition
- WIKI: Named Entity Recognition
- DATA: CoNLL-2002 NER corpus
- DATA: CoNLL-2003 NER corpus
- DATA: Naver NER
- MODEL: BERT
- MODEL: RoBERTa
- MODEL: ELECTRA
- OFF-THE-SHELF: Pororo's NER
- CODE: Example with Naver NER
Paraphrase Generation
- WIKI: Paraphrase
- OFF-THE-SHELF: Pororo's Paraphrase Generation
Phoneme To Grapheme
- OFF-THE-SHELF: Pororo's P2G
Sentiment Analysis
- WIKI: Sentiment Analysis
- DATA: GLUE-SST
- DATA: NSMC
- MODEL: BERT
- MODEL: RoBERTa
- MODEL: ELECTRA
- OFF-THE-SHELF: Pororo's Sentiment Analysis
- CODE: Example with NSMC
Semantic Textual Similarity
- WIKI: Semantic Similarity
- DATA: GLUE-STS
- DATA: KorSTS
- MODEL: BERT
- MODEL: RoBERTa
- MODEL: ELECTRA
- OFF-THE-SHELF: Pororo's STS
- CODE: Example with SQuAD
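As a rough illustration of what a similarity score looks like, the sketch below computes cosine similarity over bag-of-words counts. This is a purely lexical baseline swapped in for brevity, not the approach of the BERT-style models or the Pororo module listed above, and the function name `cosine_similarity` and the whitespace tokenizer are assumptions made for this example.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity between bag-of-words count vectors of two sentences."""
    counts_a = Counter(sentence_a.lower().split())
    counts_b = Counter(sentence_b.lower().split())

    # Dot product over the tokens the two sentences share.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty sentence has a zero vector; define its similarity as 0.0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity("the cat sat on the mat", "the cat lay on the mat"))
print(cosine_similarity("the cat sat on the mat", "stock prices fell sharply"))
```

Because it only counts overlapping words, this baseline cannot tell that two differently worded sentences mean the same thing, which is the gap that the learned models in this section are meant to close.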
Speech Synthesis
- WIKI: Speech Synthesis
- DATA: LJ Speech
- DATA: CSS10
- DATA: KSS
- MODEL: Tacotron2
- MODEL: FastSpeech2
- MODEL: WaveNet
- MODEL: HiFi-GAN
- OFF-THE-SHELF: Pororo's TTS
- CODE: Example with LJ-Speech
- CODE: Example with KSS
Summarization
- WIKI: Automatic Summarization
- DATA: XSum
- DATA: Korean Summarization Corpus
- MODEL: BART
- OFF-THE-SHELF: Pororo's Summarization
- CODE: Example with XSum
Author
- Soohwan Kim @sooftware
- Contacts: [email protected]