Code for a paper about leveraging discourse markers for training new models

Last update: Nov 02, 2022

Overview

TSLM-DISCOURSE-MARKERS

Scope

This repository contains:

(1) Code to extract discourse markers from Wikipedia (TSA).

(2) Code to extract significant discourse markers from predictions over a sample.
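To make the first item more concrete, here is a minimal sketch of what spotting a sentence-initial discourse marker can look like. It is not the repository's extraction code: the marker list `MARKERS`, the function name `leading_marker`, and the "marker followed by a comma" rule are all assumptions chosen for illustration.

```python
import re

# Hypothetical, tiny marker list for illustration only; the repository's
# actual marker inventory is not shown in this README.
MARKERS = ("fortunately", "unfortunately", "sadly", "luckily")

# Assumed rule: a sentence "starts with a marker" if the marker is its first
# word and is immediately followed by a comma.
_PATTERN = re.compile(
    r"^\s*(" + "|".join(map(re.escape, MARKERS)) + r")\s*,\s*(.+)$",
    re.IGNORECASE | re.DOTALL,
)


def leading_marker(sentence):
    """Return (marker, remainder) if the sentence opens with a known marker, else None."""
    match = _PATTERN.match(sentence)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).strip()


if __name__ == "__main__":
    print(leading_marker("Fortunately, the model converged."))
    print(leading_marker("The model converged."))
```

A real pipeline would also need sentence splitting and language filtering, which this sketch leaves out.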

Usage

Evaluation code:

Installation

Using pip:

pip install git+ssh://git@github.com/IBM/tslm-discourse-markers.git#egg=tslm-discourse-markers

Alternatively, you can first clone the code, and install the requirements:

1. git clone git@github.com:IBM/tslm-discourse-markers.git
2. cd tslm-discourse-markers
3. pip install -r requirements.txt

You also need to download the fastText language identification model:

curl https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin -o ~/Downloads/lid.176.bin

and the spaCy English model:

python -m spacy download en_core_web_sm
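Before running anything, it can help to confirm that both downloads are in place. The following is a small sketch using only the standard library. The function name `missing_prerequisites` is hypothetical, and the default path simply mirrors the `curl` command above; whether the repository code reads the model from that exact location is an assumption.

```python
import importlib.util
from pathlib import Path

# Location and model name taken from the download commands above.
DEFAULT_FASTTEXT_PATH = Path.home() / "Downloads" / "lid.176.bin"
SPACY_MODEL = "en_core_web_sm"


def missing_prerequisites(fasttext_path=DEFAULT_FASTTEXT_PATH, spacy_model=SPACY_MODEL):
    """Return a list of human-readable problems; an empty list means nothing is missing."""
    problems = []
    if not Path(fasttext_path).is_file():
        problems.append(f"fastText model not found at {fasttext_path}")
    # spaCy models installed via `spacy download` are regular Python packages,
    # so find_spec can detect them without importing spaCy itself.
    if importlib.util.find_spec(spacy_model) is None:
        problems.append(f"spaCy model package '{spacy_model}' is not installed")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```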

Running

Citing tslm-discourse-markers

If you are using tslm-discourse-markers in a publication, please cite the following paper:

Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim. 2022. Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis. AAAI-2022.

Model

The SenDM model can be found at: https://huggingface.co/ibm/tslm-discourse-markers

Loading dataset

import datasets

directory = 'dataset/WIKI_ENGLISH'
# Map each split name to the gzipped CSV shards in its folder.
data_files = {folder: [f'{directory}/{folder}/{folder}_*.csv.gz'] for folder in ['train', 'dev', 'test']}
dataset = datasets.load_dataset('csv', data_files=data_files)
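To check what a shard contains before loading the full dataset, a single file can be inspected with the standard library alone. This is a sketch, not part of the repository: `peek_shard` is a hypothetical helper, and it assumes that each shard is a UTF-8 CSV whose first row is a header.

```python
import csv
import glob
import gzip


def peek_shard(pattern, n_rows=3):
    """Return (header, first n_rows rows) from the first shard matching pattern."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # Open in text mode so csv.reader receives str lines rather than bytes.
    with gzip.open(paths[0], "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # assumed header row; empty list for an empty file
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `peek_shard('dataset/WIKI_ENGLISH/train/train_*.csv.gz')` would return the column names and the first three rows of the first training shard, provided the files exist at that path.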

Contributing

This project welcomes external contributions. If you would like to contribute, please see further instructions here.

Pull requests are very welcome! Make sure your patches are well tested. Ideally create a topic branch for every separate change you make. For example:

1. Fork the repo
2. Create your feature branch (git checkout -b my-new-feature)
3. Commit your changes (git commit -am 'Added some feature')
4. Push to the branch (git push origin my-new-feature)
5. Create a new pull request

Changelog

Major changes are documented here.

Notes

If you have any questions or issues you can create a new issue here.

License

This code is distributed under Apache License 2.0. If you would like to see the detailed LICENSE click here.

Authors

The YASO dataset was collected by Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim.

The code was written by Ilya Shnayderman.

Owner

International Business Machines
