Learning hierarchical attention for weakly-supervised chest X-ray abnormality localization and diagnosis

Overview

Hierarchical Attention Mining (HAM) for weakly-supervised abnormality localization

This is the official PyTorch implementation for the HAM method.

Paper | Model | Data

Learning hierarchical attention for weakly-supervised chest X-ray abnormality localization and diagnosis
by Xi Ouyang, Srikrishna Karanam, Ziyan Wu, Terrence Chen, Jiayu Huo, Xiang Sean Zhou, Qian Wang, Jie-Zhi Cheng

Teaser image

Abstract

We propose a new attention-driven weakly supervised algorithm comprising a hierarchical attention mining framework that unifies activation- and gradient-based visual attention in a holistic manner. On two largescale chest X-ray datasets (NIH Chest X-ray14 and CheXpert), it can achieve significant localization performance improvements over the current state of the art while also achieve competitive classification performance.

Release Notes

This repository is a faithful reimplementation of HAM in PyTorch, including all the training and evaluation codes on NIH Chest X-ray14 dataset. We also provide two trained models for this dataset. For CheXpert dataset, you can refer to the data preparation on NIH Chest X-ray14 dataset to implement the code.

Since images in CheXpert are annotated with labels only at image level, we invite a senior radiologist with 10+ years of experience to label the bounding boxes for 9 abnormalities. These annotations are also opensourced here, including 6099 bounding boxes for 2345 images. We hope that it can help the future researchers to better verify their methods.

Installation

Clone this repo.

git clone https://github.com/oyxhust/HAM.git
cd HAM/

We provide the CRF method in this code, which can help to refine the box annotations to close to the mask. It is not activated by default, but pydensecrf should be installed.

pip install cython
pip install git+https://github.com/lucasb-eyer/pydensecrf.git

This code requires PyTorch 1.1+ and python 3+. Please install Pytorch 1.1+ environment, and install dependencies (e.g., torchvision, visdom and PIL) by

pip install -r requirements.txt

Dataset Preparation

For NIH Chest X-ray14 or CheXpert, the datasets must be downloaded beforehand. Please download them on the respective webpages. In the case of NIH Chest X-ray14, we put a few sample images in this code repo.

Preparing NIH Chest X-ray14 Dataset. The dataset can be downloaded here (also could be downloaded from Kaggle Challenge). In particular, you will need to download “images_001.tar.gz", “images_002.tar.gz", ..., and "images_012.tar.gz". All these files will be decompressed into a images folder. The images and labels should be arranged in the same directory structure as datasets/NIHChestXray14/. The folder structure should be as follows:

├── datasets
│   ├── NIHChestXray14
│   │    ├── images
│   │    │    ├── 00000001_000.png
│   │    │    ├── 00000001_001.png
│   │    │    ├── 00000001_002.png
│   │    │    ├── ...
│   │    ├── Annotations
│   │    │    ├── BBoxes.json
│   │    │    └── Tags.json
│   │    ├── ImageSets
│   │    │    ├── bbox
│   │    │    └── Main

BBoxes.json is a dictionary to store the bounding boxes for abnormality localization in this dataset. Tags.json is a dictionary of the image-level labels for all the images. These json files are generated from the original dataset files for better input to our models. ImageSets contains data split files of the training, validation and testing in different settings.

Preparing CheXpert Dataset. The dataset can be downloaded here. You can follow the similar folder structure of NIH Chest X-ray14 dataset to prepare this dataset. Since images in CheXpert are annotated with labels only at image level, we invite a senior radiologist with 10+ years of experience to label the bounding boxes for 9 abnormalities. We release these localization annotations in datasets/CheXpert/Annotations/BBoxes.json. It is a dictionary with the relative image path under CheXpert/images fold as the key, and store the corresponding abnormality box annotations. For each bounding box, the coordinate format is [xmin, ymin, xmax, ymax]. We have labeled 6099 bounding boxes for 2345 images. It is worth noting that the number of our box annotations is significantly larger than the number of annotated boxes in the NIH dataset. Hope to better help future researchers.

Training

Visdom

We use visdom to plot the loss values and the training attention results. Therefore, it is neccesary to open a visdom port during training. Open a terminal in any folder and input:

python -m visdom.server

Click the URL http://localhost:8097 to see the visualization results. Here, it uses the default port 8097.

Also, you can choose to use other ports by

python -m visdom.server -p 10004

10004 is an example, which can be set into any other available port of your machine. The port number "display_port" in the corresponding config file should be also set into the same value, and click the URL http://localhost:10004 to see the visualization results.

Classification Experiments

Train a model on the official split of NIH Chest-Xray14 dataset:

python train.py -cfg configs/NIHChestXray14/classification/official_split.yaml

The training log and visualization results will be stored on output/logs and output/train respectively. Our trained models on offical splits is avaliable on Baidu Yun (code: rhsd) or Google Drive.

Train the models on 5-fold cross-validation (CV) scheme of NIH Chest-Xray14 dataset. In each fold, we use 70% of the annotated and 70% of unannotated images for training, and 10% of the annotated and unannotated images for validation. Then, the rest 20% of the annotated and unannotated images are used for testing. Train on fold1:

python train.py -cfg configs/NIHChestXray14/classification/5fold_validation/fold1.yaml

Config files of other folds can be also found on configs/NIHChestXray14/classification/5fold_validation.

Localization Experiments

Train our model with 100% images (111,240) without any box annotations and test with the 880 images with box annotations on NIH Chest-Xray14 dataset:

python train.py -cfg configs/NIHChestXray14/localization/without_bboxes.yaml

Train the models on 5-fold cross-validation (CV) scheme of NIH Chest-Xray14 dataset. In each fold, we train our model with 50% of the unannotated images and 80% of the annotated images, and tested with the remaining 20% of the annotated images. Train on fold1:

python train.py -cfg configs/NIHChestXray14/localization/5fold_validation/fold1.yaml

Trained models on fold1 is avaliable on Baidu Yun (code: c7np) or Google Drive. Config files of other folds can be also found on configs/NIHChestXray14/localization/5fold_validation.

Testing

Here, we take the test results for the official splits of NIH Chest-Xray14 dataset as an example. It requires the corresponding config file and the trained weights. Please download the trained model on Baidu Yun (code: rhsd) or Google Drive.

Once get the trained model, the test results (classification and localization metrics) can be calculated by:

python test.py -cfg configs/official_split.yaml -w [PATH/TO/OfficialSplit_Model.pt]

Also, the attention maps from our method can be generated by:

python visual.py -cfg configs/official_split.yaml -w [PATH/TO/OfficialSplit_Model.pt]

The attention results will be saved in outputs/visual. You can open the index.html to check all the results in a webpage.

Also, you can download the model trained with box annotations on Baidu Yun (code: c7np) or Google Drive. This model can achieve much better localization results than the model trained on official split.

Code Structure

  • train.py, test.py, visual.py: the entry point for training, testing, and visualization.
  • configs/: config files for training and testing.
  • data/: the data loader.
  • models/: creates the networks.
  • modules/attentions/: the proposed foreground attention module.
  • modules/sync_batchnorm/: the synchronized batchNorm.
  • utils/: define the training, testing and visualization process and the support modules.

Options

Options in config files. "GPUs" can select the gpus. "Means" and "Stds" are used for the normalization. "arch" only supports the resnet structure, you can choose one from ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101','resnet152', 'resnext50_32x4d', 'resnext101_32x8d']. "Downsampling" defines the downsampline rate of the feature map from the backbone, including [8, 16, 32]. "Using_pooling" decides whether to use the first pooling layer in resnet. "Using_dilation" decides whether to use the dilation convolution in resnet. "Using_pretrained_weights" decides whether to use the pretrained weights on ImageNet. "Using_CRF" decides whether to use the CRF to prepocess the input box annotations. It can help to refine the box to more close to the mask annotations. "img_size" and "in_channels" is the size and channel of the input image. "display_port" is the visdom port.

Options in utils/config.py. "model_savepath" is the path to save checkpoint. "outputs_path" is the path to save output results. "cam_w" and "cam_sigma" are the hyperparameters in the soft masking fuction in "refine_cams" function in models/AttentionModel/resnet.py. "cam_loss_sigma" is the hyperparameter for the soft masking in the abnormality attention map. "lse_r" is the hyperparameter for LSE pooling. "loss_ano", "loss_cls", "loss_ex_cls", "loss_bound", and "loss_union" are the weights for the weighting factors for different losses. "cls_thresh" is the threshold for classification prediction to calculate the accuracy. "cam_thresh" is the threshold for localization prediction to get the binary mask. "thresh_TIOU" and "thresh_TIOR" are the thresholds to calculate the TIoU and TIoR. "palette" defines the color for mask visualization.

Citation

If you use this code or our published annotations of CheXpert dataset for your research, please cite our paper.

@article{ouyang2021learning,
  title={Learning hierarchical attention for weakly-supervised chest X-ray abnormality localization and diagnosis},
  author={Ouyang, Xi and Karanam, Srikrishna and Wu, Ziyan and Chen, Terrence and Huo, Jiayu and Zhou, Xiang Sean and Wang, Qian and Cheng, Jie-Zhi},
  journal={IEEE Transactions on Medical Imaging},
  volume={40},
  number={10},
  pages={2698--2710},
  year={2021},
  publisher={IEEE}
}

Acknowledgments

We thank Jiayuan Mao for his Synchronized Batch Normalization code, and Jun-Yan Zhu for his HTML visualization code.

Owner
Xi Ouyang
Xi Ouyang
Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Text-AutoAugment (TAA) This repository contains the code for our paper Text AutoAugment: Learning Compositional Augmentation Policy for Text Classific

LancoPKU 105 Jan 03, 2023
PyTorch implementation of DreamerV2 model-based RL algorithm

PyDreamer Reimplementation of DreamerV2 model-based RL algorithm in PyTorch. The official DreamerV2 implementation can be found here. Features ... Run

118 Dec 15, 2022
Python Implementation of the CoronaWarnApp (CWA) Event Registration

Python implementation of the Corona-Warn-App (CWA) Event Registration This is an implementation of the Protocol used to generate event and location QR

MaZderMind 17 Oct 05, 2022
LSTM model trained on a small dataset of 3000 names written in PyTorch

LSTM model trained on a small dataset of 3000 names. Model generates names from model by selecting one out of top 3 letters suggested by model at a time until an EOS (End Of Sentence) character is no

Sahil Lamba 1 Dec 20, 2021
Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs

Project Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs, https://arxiv.org/pdf/2111.01940.pdf. Authors Truong Son Hy

5 Jun 28, 2022
The implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021

DynamicNeuralGarments Introduction This repository contains the implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021. ./GarmentMoti

42 Dec 27, 2022
A framework that allows people to write their own Rocket League bots.

YOU PROBABLY SHOULDN'T PULL THIS REPO Bot Makers Read This! If you just want to make a bot, you don't need to be here. Instead, start with one of thes

543 Dec 20, 2022
Haze Removal can remove slight to extreme cases of haze affecting an image

Haze Removal can remove slight to extreme cases of haze affecting an image. Its most typical use is for landscape photography where the haze causes low contrast and low saturation, but it can also be

Grace Ugochi Nneji 3 Feb 15, 2022
This repository provides the code for MedViLL(Medical Vision Language Learner).

MedViLL This repository provides the code for MedViLL(Medical Vision Language Learner). Our proposed architecture MedViLL is a single BERT-based model

SuperSuperMoon 39 Jan 05, 2023
Skipgram Negative Sampling in PyTorch

PyTorch SGNS Word2Vec's SkipGramNegativeSampling in Python. Yet another but quite general negative sampling loss implemented in PyTorch. It can be use

Jamie J. Seol 287 Dec 14, 2022
Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter

谷下雨 47 Dec 25, 2022
Generate high quality pictures. GAN. Generative Adversarial Networks

ESRGAN generate high quality pictures. GAN. Generative Adversarial Networks """ Super-resolution of CelebA using Generative Adversarial Networks. The

Lieon 1 Dec 14, 2021
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this

Nervana 3.9k Dec 20, 2022
Teaching end to end workflow of deep learning

Deep-Education This repository is now available for public use for teaching end to end workflow of deep learning. This implies that learners/researche

Data Lab at College of William and Mary 2 Sep 26, 2022
Pretrained Cost Model for Distributed Constraint Optimization Problems

Pretrained Cost Model for Distributed Constraint Optimization Problems Requirements PyTorch 1.9.0 PyTorch Geometric 1.7.1 Directory structure baseline

2 Aug 28, 2022
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy

Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a

ImagePy 1.2k Dec 29, 2022
A mini lib that implements several useful functions binding to PyTorch in C++.

Torch-gather A mini library that implements several useful functions binding to PyTorch in C++. What does gather do? Why do we need it? When dealing w

maxwellzh 8 Sep 07, 2022
FlingBot: The Unreasonable Effectiveness of Dynamic Manipulations for Cloth Unfolding

This repository contains code for training and evaluating FlingBot in both simulation and real-world settings on a dual-UR5 robot arm setup for Ubuntu 18.04

Columbia Artificial Intelligence and Robotics Lab 70 Dec 06, 2022
这是一个yolox-keras的源码,可以用于训练自己的模型。

YOLOX:You Only Look Once目标检测模型在Keras当中的实现 目录 性能情况 Performance 实现的内容 Achievement 所需环境 Environment 小技巧的设置 TricksSet 文件下载 Download 训练步骤 How2train 预测步骤 Ho

Bubbliiiing 64 Nov 10, 2022
OverFeat is a Convolutional Network-based image classifier and feature extractor.

OverFeat OverFeat is a Convolutional Network-based image classifier and feature extractor. OverFeat was trained on the ImageNet dataset and participat

593 Dec 08, 2022