This is the code for the EMNLP 2021 paper AEDA: An Easier Data Augmentation Technique for Text Classification

Last update: Dec 09, 2022

Overview

AEDA: An Easier Data Augmentation Technique for Text Classification

The baseline code is for EDA: Easy Data Augmentation techniques for boosting performance on text classification tasks

Our augmentation code is in aeda.py, in the code folder. Our train and test data are also available in the data folder.
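
As a rough illustration of the idea in the AEDA paper, the sketch below inserts punctuation marks at random positions in a sentence. It is written from the paper's description and is not the contents of aeda.py. The function name insert_punctuation, the ratio parameter, and the rng argument are assumptions made for this example.

```python
import random

# Punctuation marks used for insertion, following the paper's description.
PUNCTUATIONS = [".", ",", "!", "?", ";", ":"]


def insert_punctuation(sentence, ratio=1 / 3, rng=None):
    """Return a copy of `sentence` with random punctuation marks inserted.

    Hypothetical helper: between 1 and ratio * (number of words) marks are
    placed in front of randomly chosen words. The words and their order
    are left unchanged.
    """
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return sentence

    # Upper bound on insertions: at least 1, never more than the word count.
    max_inserts = min(len(words), max(1, int(ratio * len(words))))
    n_inserts = rng.randint(1, max_inserts)

    # Pick distinct word positions that will be preceded by a mark.
    positions = set(rng.sample(range(len(words)), n_inserts))

    augmented = []
    for index, word in enumerate(words):
        if index in positions:
            augmented.append(rng.choice(PUNCTUATIONS))
        augmented.append(word)
    return " ".join(augmented)


if __name__ == "__main__":
    # Output varies from run to run because positions and marks are random.
    print(insert_punctuation("a sad , superior human comedy played out on the back roads of life"))
```

Passing a seeded random.Random instance as rng makes the augmentation reproducible, which can help when comparing training runs.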

Citation

@misc{karimi2021aeda,
      title={AEDA: An Easier Data Augmentation Technique for Text Classification},
      author={Akbar Karimi and Leonardo Rossi and Andrea Prati},
      year={2021},
      eprint={2108.13230},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Owner

Akbar Karimi

Paper (arXiv) https://arxiv.org/abs/2108.13230

Word Bot for JKLM Bomb Party

Word Bot for JKLM Bomb Party A bot for Bomb Party on https://www.jklm.fun (Only English) Requirements pynput pyperclip pyautogui Usage: Step 1: Run th

7 Oct 30, 2022

Text classification is one of the popular tasks in NLP that allows a program to classify free-text documents based on pre-defined classes.

Deep-Learning-for-Text-Document-Classification Text classification is one of the popular tasks in NLP that allows a program to classify free-text docu

2 Mar 17, 2022

An IVR chatbot that can substantially reduce the burden on companies and improve the consumer/end-user experience.

IVR-Chatbot Achievements 🏆 Team Uhtred won the Maverick 2.0 Bot-a-thon 2021 organized by AbInbev India. ❓ Problem Statement As we all know that, lot

9 Dec 08, 2022

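This plugin is a remake of the pcrjjc plugin and can run independently of a backend API.

pcrjjc2 This plugin is a remake of pcrjjc. It does not need any other backend API, but you must configure the client yourself. This project is open source under the AGPL v3 license; owing to the nature of the project, any commercial activity based on it is prohibited. Configuration Environment requirements: .NET Framework 4.5 or later, jre8. Don't forget to install jre8. Don't forget to install jre8. Don't forget to install jre8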

132 Dec 26, 2022

Topic modeling on unstructured data from space news articles retrieved from the Guardian (UK) newspaper through its API

NLP Space News Topic Modeling Photos by nasa.gov (1, 2, 3, 4, 5) and extremetech.com Table of Contents Project Idea Data acquisition Primary data sour

1 Jan 03, 2022

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering.

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering. Contents Inst

0 Oct 21, 2021

A simple project that separates a mixed voice signal (2 clean voices) into 2 separate voices.

Speech Separation A simple project that separates a mixed voice signal (2 clean voices) into 2 separate voices. Result Example (Click to hear the voices): mix ||

31 Oct 30, 2022

A paper list of pre-trained language models (PLMs).

Large-scale pre-trained language models (PLMs) such as BERT and GPT have achieved great success and become a milestone in NLP.

124 Jan 02, 2023

Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"

Speaker-Embeddings-Correlation-Pooling This is the original implementation of the pooling method introduced in "Speaker embeddings by modeling channel

10 Apr 30, 2022

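Python script for snapping up mobile phones on the Huawei Store

HUAWEI STORE GO 2021 Description: a Python3 + Selenium purchase script for the Huawei Store, modified from the project BUY-HW, which has not been updated for nearly two years, written to snap up a Nova 8 for my goddess (when did Huawei start copying Xiaomi's hunger marketing?). The login and purchase parts of the original project no longer work; this project fixes them to fit the new Huawei Store and adds some features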

111 Dec 22, 2022

Transformers Wav2Vec2 + Parlance's CTCDecode

🤗 Transformers Wav2Vec2 + Parlance's CTCDecode Introduction This repo shows how 🤗 Transformers can be used in combination with Parlance's ctcdecode

9 Jul 21, 2022

Poetry PEP 517 Build Backend & Core Utilities

Poetry Core A PEP 517 build backend implementation developed for Poetry. This project is intended to be a light weight, fully compliant, self-containe

293 Jan 02, 2023

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for computing image-text similarities and reranking captions and pictures. Unlike other versions of the model, we use BERT as the text encoder and a SWIN transformer as the image encoder.

ruCLIP-SB RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and re

5 Apr 13, 2022

Control the classic General Instrument SP0256-AL2 speech chip and AY-3-8910 sound generator with a Raspberry Pi and this Python library.

GI-Pi Control the classic General Instrument SP0256-AL2 speech chip and AY-3-8910 sound generator with a Raspberry Pi and this Python library. The SP0

8 Dec 15, 2021

Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.

CTC Decoding Algorithms Update 2021: installable Python package Python implementation of some common Connectionist Temporal Classification (CTC) decod

736 Jan 03, 2023

Code associated with the Don't Stop Pretraining ACL 2020 paper

📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation

Idea is to build a model which will take keywords as inputs and generate sentences as outputs.

Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks

Sentiment-Analysis and EDA on the IMDB Movie Review Dataset