One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Last update: Dec 11, 2022

Overview

Introduction

This repository is an implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".
It parses raw input text from scratch and returns both the EDU segmentation and the parsed tree structure.
The model supports both sentence-level and document-level RST discourse parsing.
This repository and the pre-trained model are for research use only.

Package Requirements

pytorch==1.7.1
transformers==4.8.2
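
The pins above can be checked before running the parser. The following is a minimal sketch, not part of the repository: the function name `check_requirements` is hypothetical, and it assumes PyTorch was installed from the PyPI distribution named `torch` (the name `pytorch` above refers to the same library).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the requirements list; "torch" is the PyPI name of PyTorch.
REQUIRED = {"torch": "1.7.1", "transformers": "4.8.2"}


def check_requirements(required=REQUIRED, get_version=version):
    """Return a list of human-readable mismatches (empty if all pins match)."""
    problems = []
    for name, wanted in required.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            problems.append(f"{name} is not installed (expected {wanted})")
            continue
        # Ignore local build tags such as "+cu110" when comparing versions.
        if installed.split("+")[0] != wanted:
            problems.append(f"{name} is {installed} (expected {wanted})")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["All pinned packages match."]:
        print(line)
```

Other versions may also work; the check only reports differences from the versions listed here.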

Supported Languages

We trained and evaluated the model with the multilingual collection of RST discourse treebanks, and it natively supports 6 languages: English, Portuguese, Spanish, German, Dutch, Basque. Interested users can also try other languages.

Data Format

[Input] InputSentence: The input document or sentence. The raw text is tokenized and encoded by the xlm-roberta-base language backbone. '|| ' denotes the EDU boundary positions.
- Although the report, || which has released || before the stock market opened, || didn't trigger the 190.58 point drop in the Dow Jones Industrial Average, || analysts said || it did play a role in the market's decline. ||
[Output] EDU_Breaks: The indices of the EDU boundary tokens, including the last word of the sentence.
- [2, 5, 10, 22, 24, 33]
[Output] tree_parsing_output: The model outputs of the discourse parsing tree follow this format.
- (1:Satellite=Contrast:4,5:Nucleus=span:6) (1:Nucleus=Same-Unit:3,4:Nucleus=Same-Unit:4) (5:Satellite=Attribution:5,6:Nucleus=span:6) (1:Satellite=span:1,2:Nucleus=Elaboration:3) (2:Nucleus=span:2,3:Satellite=Temporal:3)
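
To work with these outputs programmatically, the two formats can be read as sketched below. This is an illustration under stated assumptions, not code from the repository: the names `Span`, `parse_tree_output` and `split_edus` are hypothetical. Each parenthesised group is read as one binary split, `(start:Nuclearity=Relation:end,start:Nuclearity=Relation:end)`, with 1-based EDU numbers for the left and right child. In the example above the break indices happen to line up with 0-based positions of whitespace-separated words, so the sketch splits on whitespace; the script itself works on the tokenizer's output, so check which tokens the indices refer to before reusing this.

```python
import re
from typing import List, NamedTuple, Sequence, Tuple


class Span(NamedTuple):
    start: int        # first EDU covered (1-based)
    nuclearity: str   # "Nucleus" or "Satellite"
    relation: str     # e.g. "Contrast", "span"
    end: int          # last EDU covered (1-based, inclusive)


# One group such as "(1:Satellite=Contrast:4,5:Nucleus=span:6)".
_SPLIT = re.compile(
    r"\((\d+):(\w+)=([\w-]+):(\d+),(\d+):(\w+)=([\w-]+):(\d+)\)"
)


def parse_tree_output(output: str) -> List[Tuple[Span, Span]]:
    """Turn the bracketed string into (left child, right child) pairs."""
    pairs = []
    for m in _SPLIT.finditer(output):
        left = Span(int(m[1]), m[2], m[3], int(m[4]))
        right = Span(int(m[5]), m[6], m[7], int(m[8]))
        pairs.append((left, right))
    return pairs


def split_edus(tokens: Sequence[str], breaks: Sequence[int]) -> List[str]:
    """Cut a token list into EDUs; each break is the index of an EDU's last token."""
    edus, start = [], 0
    for b in breaks:
        edus.append(" ".join(tokens[start:b + 1]))
        start = b + 1
    return edus


text = ("Although the report, which has released before the stock market "
        "opened, didn't trigger the 190.58 point drop in the Dow Jones "
        "Industrial Average, analysts said it did play a role in the "
        "market's decline.")
edus = split_edus(text.split(), [2, 5, 10, 22, 24, 33])
splits = parse_tree_output(
    "(1:Satellite=Contrast:4,5:Nucleus=span:6) "
    "(5:Satellite=Attribution:5,6:Nucleus=span:6)"
)
for left, right in splits:
    print(left, right)
```

Keeping each split as a pair of spans makes it straightforward to rebuild the tree top-down, since every child span with more than one EDU appears again as the parent of a later split.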

How to use it for parsing

Put the text to be parsed in the file ./data/text_for_inference.txt.
Run the script MUL_main_Infer.py to obtain the RST parsing result. See the script for details of the model output.
We recommend running the parser in a GPU-equipped environment.
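
The two steps above can also be driven from Python. This is a hedged sketch rather than the repository's own tooling: the helper names `write_inference_input` and `run_parser` are hypothetical, and it assumes the script is launched from the repository root and that each paragraph should occupy a single line of the input file (check MUL_main_Infer.py for how the file is actually read).

```python
import subprocess
import sys
from pathlib import Path


def write_inference_input(text: str, repo_root: str = ".") -> Path:
    """Write the text to data/text_for_inference.txt under repo_root."""
    path = Path(repo_root) / "data" / "text_for_inference.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: one paragraph per line, so internal line breaks are collapsed.
    path.write_text(" ".join(text.split()) + "\n", encoding="utf-8")
    return path


def run_parser(repo_root: str = ".") -> subprocess.CompletedProcess:
    """Run MUL_main_Infer.py with the current interpreter from repo_root."""
    return subprocess.run(
        [sys.executable, "MUL_main_Infer.py"], cwd=repo_root, check=True
    )


if __name__ == "__main__":
    write_inference_input("Analysts said it did play a role in the decline.")
    run_parser()
```

Using `check=True` makes a failed run raise an exception instead of passing silently.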

Citation

@article{liu2021dmrst,
  title={DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy F},
  journal={arXiv preprint arXiv:2110.04518},
  year={2021}
}

@inproceedings{liu2020multilingual,
  title={Multilingual Neural RST Discourse Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
  pages={6730--6738},
  year={2020}
}

Owner

seq-to-mind
