CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

Overview

CvT2DistilGPT2

Improving Chest X-Ray Report Generation by Leveraging Warm-Starting

  • This repository houses the implementation of CvT2DistilGPT2 from [1].
  • CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.
  • Checkpoints for CvT2DistilGPT2 on MIMIC-CXR and IU X-Ray are available.
  • This implementation could be adapted for any image captioning task by modifying the datamodule.

CvT2DistilGPT2 for MIMIC-CXR. Q, K, and V are the queries, keys, and values, respectively, for multi-head attention. * indicates that the linear layers for Q, K, and V are replaced with the convolutional layers depicted below the multi-head attention module. [BOS] is the beginning-of-sentence special token. N_l is the number of layers for each stage, where N_l=1, N_l=4, and N_l=16 for the first, second, and third stage, respectively. The head for DistilGPT2 is the same used for language modelling. Subwords produced by DistilGPT2 are separated by a vertical bar.

Installation

The required packages are located in requirements.txt. It is recommended that these are installed in a virtualenv:

python3 -m venv --system-site-packages venv
source venv/bin/activate
pip install --upgrade pip
pip install --upgrade -r requirements.txt --no-cache-dir

Datasets

For MIMIC-CXR:

  1. Download MIMIC-CXR-JPG from:

    https://physionet.org/content/mimic-cxr-jpg/2.0.0/
    
  2. Place in dataset/mimic_cxr_jpg such that dataset/mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files.

  3. Download the Chen et al. labels for MIMIC-CXR from:

    https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view?usp=sharing
    
  4. Place annotations.json in dataset/mimic_cxr_chen

For IU X-Ray:

  1. Download the Chen et al. labels and the chest X-rays in png format for IU X-Ray from:
    https://drive.google.com/file/d/1c0BXEuDy8Cmm2jfN0YYGkQxFZd2ZIoLg/view
    
  2. Place files into dataset/iu_x-ray_chen such that dataset/iu_x-ray_chen/annotations.json and dataset/iu_x-ray_chen/images.

#####Note: the dataset directory can be changed for each task with the variable dataset_dir in task/mimic_cxr_jpg_chen/paths.yaml and task/mimic_cxr_jpg_chen/paths.yaml

Checkpoints

The checkpoints for MIMIC-CXR and IU X-Ray can be found at (the download link is located at the top right): https://doi.org/10.25919/hbqx-2p71. Place the checkpoints in the experiment directory for each version of each task, e.g., experiment/mimic_cxr_jpg_chen/cvt_21_to_gpt2_scst/epoch=0-val_chen_cider=0.410965.ckpt #####Note: the experiment directory can be changed for each task with the variable exp_dir in task/mimic_cxr_jpg_chen/paths.yaml and task/mimic_cxr_jpg_chen/paths.yaml

Instructions

  • The model configurations for each task can be found in its config directory, e.g. task/mimic_cxr_jpg_chen/config.

  • A job for a model is described in the tasks jobs.yaml file, e.g. task/mimic_cxr_jpg_chen/jobs.yaml.

  • To test the CvT2DistilGPT2 + SCST checkpoint, set task/mimic_cxr_jpg_chen/jobs.yaml to (default):

    cvt_21_to_distilgpt2_scst:
        train: 0
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
    
  • To train CvT2DistilGPT2 with teacher forcing and then test, set task/mimic_cxr_jpg_chen/jobs.yaml to:

    cvt_21_to_distilgpt2:
        train: 1
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
    

    or with Slurm:

    cvt_21_to_distilgpt2:
        train: 1
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
        resumable: 1
        sbatch: 1
        time_limit: 1-00:00:00
    
  • To run the job:

    python3 main.py --task mimic_cxr_jpg_chen

#####Note: data from the job will be saved in the experiment directory.

Reference

[1] Aaron Nicolson, Jason Dowling, and Aaron Nicolson, Improving Chest X-Ray Report Generation by Leveraging Warm-Starting, Under review (January 2022)

Owner
The Australian e-Health Research Centre
The Australian e-Health Research Centre
Meaningful titles for tabs and PDF downloads! Also supports tab search.

arxiv-utils If you are a researcher that reads a lot on ArXiv, you'll benefit a lot from this web extension. Renames the title of PDF page to the pape

Johnson 174 Dec 20, 2022
Adversarial Learning for Modeling Human Motion

Adversarial Learning for Modeling Human Motion This repository contains the open source code which reproduces the results for the paper: Adversarial l

wangqi 6 Jun 15, 2021
basic tutorial on pytorch

Quick Tutorial on PyTorch PyTorch Basics Linear Regression Logistic Regression Artificial Neural Networks Convolutional Neural Networks Recurrent Neur

7 Sep 15, 2022
Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks]

Neural Architecture Search for Spiking Neural Networks Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks] (https

Intelligent Computing Lab at Yale University 28 Nov 18, 2022
Unified MultiWOZ evaluation scripts for the context-to-response task.

MultiWOZ Context-to-Response Evaluation Standardized and easy to use Inform, Success, BLEU ~ See the paper ~ Easy-to-use scripts for standardized eval

Tomáš Nekvinda 38 Dec 13, 2022
A disassembler for the RP2040 Programmable I/O State-machine!

piodisasm A disassembler for the RP2040 Programmable I/O State-machine! Usage Just run piodisasm.py on a file that contains the PIO code as hex! (Such

Ghidra Ninja 29 Dec 06, 2022
Integrated physics-based and ligand-based modeling.

ComBind ComBind integrates data-driven modeling and physics-based docking for improved binding pose prediction and binding affinity prediction. Given

Dror Lab 44 Oct 26, 2022
Keras documentation, hosted live at keras.io

Keras.io documentation generator This repository hosts the code used to generate the keras.io website. Generating a local copy of the website pip inst

Keras 2k Jan 08, 2023
Making Structure-from-Motion (COLMAP) more robust to symmetries and duplicated structures

SfM disambiguation with COLMAP About Structure-from-Motion generally fails when the scene exhibits symmetries and duplicated structures. In this repos

Computer Vision and Geometry Lab 193 Dec 26, 2022
BASH - Biomechanical Animated Skinned Human

We developed a method animating a statistical 3D human model for biomechanical analysis to increase accessibility for non-experts, like patients, athletes, or designers.

Machine Learning and Data Analytics Lab FAU 66 Nov 19, 2022
Code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”

GATER This repository contains the code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”. Our implementation is

Jiacheng Ye 12 Nov 24, 2022
Constraint-based geometry sketcher for blender

Constraint-based sketcher addon for Blender that allows to create precise 2d shapes by defining a set of geometric constraints like tangent, distance,

1.7k Dec 31, 2022
Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder

ASEGAN: Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder 中文版简介 Readme with English Version 介绍 基于SEGAN模型的改进版本,使用自主设计的非

Nitin 53 Nov 17, 2022
Code for generating a single image pretraining dataset

Single Image Pretraining of Visual Representations As shown in the paper A critical analysis of self-supervision, or what we can learn from a single i

Yuki M. Asano 12 Dec 19, 2022
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Sun Yi 201 Nov 21, 2022
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking

RDA: Robust Domain Adaptation via Fourier Adversarial Attacking Updates 08/2021: check out our domain adaptation for video segmentation paper Domain A

17 Nov 30, 2022
[ICRA2021] Reconstructing Interactive 3D Scene by Panoptic Mapping and CAD Model Alignment

Interactive Scene Reconstruction Project Page | Paper This repository contains the implementation of our ICRA2021 paper Reconstructing Interactive 3D

97 Dec 28, 2022
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs SMORE is a a versatile framework that scales multi-hop query emb

Google Research 135 Dec 27, 2022
Wider-Yolo Kütüphanesi ile Yüz Tespit Uygulamanı Yap

WIDER-YOLO : Yüz Tespit Uygulaması Yap Wider-Yolo Kütüphanesinin Kullanımı 1. Wider Face Veri Setini İndir Train Dataset Val Dataset Test Dataset Not:

Kadir Nar 6 Aug 22, 2022
[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

SADRNet Paper link: SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction Requirements python

Multimedia Computing Group, Nanjing University 99 Dec 30, 2022