Code for coreference-aware machine reading comprehension

Last update: Sep 29, 2022

Overview

Data and code for the paper "Tracing Origins: Coreference-aware Machine Reading Comprehension", published at ACL 2022.

Dataset

The repository contains one folder for each of the three models described in the paper: Coref_additive_spacy (Coref_additive_attention), Coref_dgl_spacy (GNN), and Coref_multiplication_spacy (Coref_multiplication_attention). Each folder holds the training set and the dev set under its quoref folder.

Each sample contains:

context: the paragraph text
context_id: the unique identifier of the context
qas: a group of questions
  question: the question text
  id: the unique identifier of the question
  answers: a group of answers to one question
    text: the answer text
    answer_start: the start position of one answer
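
The nesting of these fields can be made concrete with a short sketch. It assumes that answer_start is a character offset into context, as in SQuAD-style data; the README does not state this explicitly. The example sample, the helper names iter_answers and answer_span_matches, and the identifiers "ctx-0" and "q-0" are made up for illustration and are not taken from the dataset.

```python
from typing import Dict, Iterator, Tuple


def iter_answers(sample: Dict) -> Iterator[Tuple[str, str, str, int]]:
    """Yield (question_id, question, answer_text, answer_start) for one sample."""
    for qa in sample["qas"]:
        for answer in qa["answers"]:
            yield qa["id"], qa["question"], answer["text"], answer["answer_start"]


def answer_span_matches(sample: Dict, answer_text: str, answer_start: int) -> bool:
    """Check that the answer text sits at answer_start in the context.

    Assumption: answer_start is a character offset into the context string.
    """
    end = answer_start + len(answer_text)
    return sample["context"][answer_start:end] == answer_text


# Hypothetical sample that follows the field layout described above.
example_sample = {
    "context": "Anna met Maria at the station. She gave her a book.",
    "context_id": "ctx-0",
    "qas": [
        {
            "question": "Who did Anna meet at the station?",
            "id": "q-0",
            "answers": [{"text": "Maria", "answer_start": 9}],
        }
    ],
}

for qid, question, text, start in iter_answers(example_sample):
    print(qid, question, text, start, answer_span_matches(example_sample, text, start))
```

How the samples are wrapped at the top level of train.json and dev.json is not described here, so the sketch works on a single sample rather than on a whole file.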

Models

If you want to use our trained model, please download it from Google Drive.

Training

python run_quoref.py --train_file "quoref/train.json" --predict_file "quoref/dev.json" --model_type "roberta_multi" --model_name_or_path "roberta-large" --output_dir "out" --do_train --do_eval --eval_all_checkpoints --learning_rate 1e-5 --num_train_epochs 6 --overwrite_output_dir --per_gpu_train_batch_size 4 --save_steps 6000 --coref_weight 0.4
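
When several runs differ only in a few hyperparameters, the command can be assembled from Python instead of being edited by hand. The following is a minimal sketch, not part of the repository: the flag names and default values are copied from the command above, while build_command, DEFAULT_ARGS, DEFAULT_SWITCHES and the override values are hypothetical.

```python
import shlex
import sys
from typing import Dict, List, Optional

# Flags and values copied from the training command above.
DEFAULT_ARGS: Dict[str, str] = {
    "train_file": "quoref/train.json",
    "predict_file": "quoref/dev.json",
    "model_type": "roberta_multi",
    "model_name_or_path": "roberta-large",
    "output_dir": "out",
    "learning_rate": "1e-5",
    "num_train_epochs": "6",
    "per_gpu_train_batch_size": "4",
    "save_steps": "6000",
    "coref_weight": "0.4",
}

# Switches that take no value in the command above.
DEFAULT_SWITCHES: List[str] = [
    "do_train",
    "do_eval",
    "eval_all_checkpoints",
    "overwrite_output_dir",
]


def build_command(overrides: Optional[Dict[str, str]] = None) -> List[str]:
    """Return the argument list for run_quoref.py with optional overrides."""
    args = {**DEFAULT_ARGS, **(overrides or {})}
    cmd = [sys.executable, "run_quoref.py"]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    cmd += [f"--{switch}" for switch in DEFAULT_SWITCHES]
    return cmd


# Illustrative override values; they are not recommendations from the paper.
command = build_command({"output_dir": "out_lr5e-6", "learning_rate": "5e-6"})
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from inside the folder that contains run_quoref.py.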

Note

There is an open issue regarding the compatibility between NeuralCoref and spaCy 3.0. If you intend to use the latest spaCy models, please follow that issue for updates.

Cite

If you extend or use this work, please cite the paper where it was introduced:

@article{Huang2021TracingOC,
  title={Tracing Origins: Coreference-aware Machine Reading Comprehension},
  author={Baorong Huang and Zhuosheng Zhang and Hai Zhao},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.07961}
}
