Code for the modular summarization work published at ACL 2021 by Krishna et al.

Last update: Nov 24, 2022

Overview

This repository contains the code for running the modular summarization pipelines described in the publication:
Krishna K, Khosla K, Bigham J, Lipton ZC. "Generating SOAP Notes from Doctor-Patient Conversations." ACL 2021.

Instructions

Although we cannot release the models trained on the confidential medical data, we have released models trained on the publicly available AMI dataset.
To reproduce the results on the AMI dataset, follow the steps listed below. For convenience, we have also created a Google Colab notebook here that runs these steps on Google's servers (free of charge as of June 2021) and produces the summaries and their ROUGE scores.

Step 1: Set up the environment by using pip to install the required packages listed in requirements.txt.
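A minimal sketch of this step, assuming a POSIX shell and a `python3` executable on the PATH. The `setup_env` function name and the `.venv` virtual environment are illustrative choices, not something the repository prescribes; installing directly with `pip install -r requirements.txt` works as well.

```shell
# Hypothetical helper (not part of the repository): creates a virtual
# environment in ./.venv and installs the packages from a requirements file.
setup_env() {
  req_file="${1:-requirements.txt}"
  # Fail early with a clear message if the requirements file is absent.
  if [ ! -f "$req_file" ]; then
    echo "requirements file not found: $req_file" >&2
    return 1
  fi
  # The virtual environment is an assumption; it keeps the pinned
  # packages isolated from the system Python installation.
  python3 -m venv .venv &&
    . .venv/bin/activate &&
    python -m pip install -r "$req_file"
}

# Usage, from the root of the repository:
#   setup_env requirements.txt
```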

Step 2: Download the ami_models folder from this link and put it at the root of the repository.
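Because the prediction script depends on the downloaded models, it can help to confirm that the folder landed in the right place before continuing. The check below is a sketch only: `check_models_dir` is a hypothetical helper, and it verifies nothing beyond the existence of a folder named `ami_models` under the given root.

```shell
# Hypothetical helper (not part of the repository): reports whether the
# ami_models folder is present directly under the given repository root.
check_models_dir() {
  repo_root="${1:-.}"
  if [ -d "$repo_root/ami_models" ]; then
    echo "ok"
  else
    echo "missing"
    return 1
  fi
}

# Usage, from the root of the repository:
#   check_models_dir . && ./predict_ami.sh
```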

Step 3: Run the following three commands to prepare the data, run the summary generation pipelines, and show the resulting ROUGE scores.

# command 1: downloads and preprocesses the AMI dataset
./prepare_data.sh

# command 2: runs the summarization pipelines on the data and computes ROUGE scores
# (before running this command, you need to download the models as described in Step 2)
./predict_ami.sh

# command 3: prints the results
python show_results.py

Owner

Approximately Correct Machine Intelligence (ACMI) Lab