SlideGraph+: Whole Slide Image Level Graphs to Predict HER2 Status in Breast Cancer

Overview

SlideGraph+: Whole Slide Image Level Graphs to Predict HER2 Status in Breast Cancer

A novel graph neural network (GNN) based model (termed SlideGraph+) to predict HER2 status directly from whole-slide images of routine Haematoxylin and Eosin (H&E) slides. This pipeline generates node-level and WSI-level predictions by using a graph representation to capture the biological geometric structure of the cellular architecture at the entire WSI level. A pre-processing function is used to do adaptive spatial agglomerative clustering to group spatially neighbouring regions with high degree of feature similarity and construct a WSI-level graph based on clusters.

Data

The repository can be used for constructing WSI-level graphs, training SlideGraph and predicting HER2 status on WSI-level graphs. The training data used in this study was downloaded from TCGA using https://portal.gdc.cancer.gov/projects/TCGA-BRCA.

Workflow of predicting HER2 status from H&E images

workflow1

GNN network architecture

GCN_architecture5

Environment

Please refer to requirements.txt

Repository Structure

Below are the main executable scripts in the repository:

features_to_graph.py: Construct WSI-level graph

platt.py: Normalise classifier output scores to a probability value

GNN_pr.py: Graph neural network architecture

train.py: Main training and inference script

Training the classification model

Data format

For training, each WSI has to have a WSI-level graph. In order to do that, it is required to generate x,y coordinates, feature vectors for local regions in the WSIs. x,y coordinates can be cental points of patches, centroid of nuclei and so on. Feature varies. It can be nuclear composition features (e.g.,counts of different types of nuclei in the patch), morphological features, receptor expression features, deep features (or neuralfeature embdeddings from a pre-trained neural network) and so on.

Each WSI should be fitted with one npz file which contains three arrays: x_coordinate, y_coordinate and corresponding region-level feature vector. Please refer to feature.npz in the example folder.

Graph construction

After npz files are ready, run features_to_graph.py to group spatially neighbouring regions with high degree of feature similarity and construct a graph based on clusters for each WSI.

  • Set path to the feature directories (feature_path)
  • Set path where graphs will be saved (output_path)
  • Modify hyperparameters, including similarity parameters (lambda_d, lambda_f), hierachical clustering distance threshold (lamda_h) and node connection distance threshold (distance_thres)

Training

After getting graphs of all WSIs,

  • Set path to the graph directories (bdir) in train.py
  • Set path to the clinical data (clin_path) in train.py
  • Modify hyperparameters, including learning_rate, weight_decay in train.py

Train the classification model and do 5-fold stratified cross validation using

python train.py

In each fold, top 10 best models (on validation dataset) and the model from the last epoch are tested on the testing dataset. Averaged classification performance among 5 folds are presented in the end.

Heatmap of node-level prediction scores

heatmap_final

Heatmaps of node-level prediction scores and zoomed-in regions which have different levels of HER2 prediction score. Boundary colour of each zoomed-in region represents its contribution to HER2 positivity (prediction score).

License

The source code SlideGraph as hosted on GitHub is released under the GNU General Public License (Version 3).

The full text of the licence is included in LICENSE.md.

Code for "Unsupervised Source Separation via Bayesian inference in the latent domain"

LQVAE-separation Code for "Unsupervised Source Separation via Bayesian inference in the latent domain" Paper Samples GT Compressed Separated Drums GT

Michele Mancusi 30 Oct 25, 2022
SCALoss: Side and Corner Aligned Loss for Bounding Box Regression (AAAI2022).

SCALoss PyTorch implementation of the paper "SCALoss: Side and Corner Aligned Loss for Bounding Box Regression" (AAAI 2022). Introduction IoU-based lo

TuZheng 20 Sep 07, 2022
An implementation of the paper "A Neural Algorithm of Artistic Style"

A Neural Algorithm of Artistic Style implementation - Neural Style Transfer This is an implementation of the research paper "A Neural Algorithm of Art

Srijarko Roy 27 Sep 20, 2022
Count GitHub Stars ⭐

Count GitHub Stars per Day ⭐ Track GitHub stars per day over a date range to measure the open-source popularity of different repositories. Requirement

Ultralytics 20 Nov 20, 2022
Implementation detail for paper "Multi-level colonoscopy malignant tissue detection with adversarial CAC-UNet"

Multi-level-colonoscopy-malignant-tissue-detection-with-adversarial-CAC-UNet Implementation detail for our paper "Multi-level colonoscopy malignant ti

CVSM Group - email: <a href=[email protected]"> 84 Nov 22, 2022
PyTorch implementation of Rethinking Positional Encoding in Language Pre-training

TUPE PyTorch implementation of Rethinking Positional Encoding in Language Pre-training. Quickstart Clone this repository. git clone https://github.com

Jake Tae 5 Jan 27, 2022
Kalidokit is a blendshape and kinematics solver for Mediapipe/Tensorflow.js face, eyes, pose, and hand tracking models

Blendshape and kinematics solver for Mediapipe/Tensorflow.js face, eyes, pose, and hand tracking models.

Rich 4.5k Jan 07, 2023
Multi-objective gym environments for reinforcement learning.

MO-Gym: Multi-Objective Reinforcement Learning Environments Gym environments for multi-objective reinforcement learning (MORL). The environments follo

Lucas Alegre 74 Jan 03, 2023
Multi-tool reverse engineering collaboration solution.

CollaRE v0.3 Intorduction CollareRE is a tool for collaborative reverse engineering that aims to allow teams that do need to use more then one tool du

105 Nov 27, 2022
Multi-objective constrained optimization for energy applications via tree ensembles

Multi-objective constrained optimization for energy applications via tree ensembles

C⚙G - Imperial College London 1 Nov 19, 2021
PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

Wenwen Yu 498 Dec 24, 2022
Event-forecasting - Event Forecasting Algorithms With Python

event-forecasting Event Forecasting Algorithms Theory Correlating events in comp

Intellia ICT 4 Feb 15, 2022
Kaggleship: Kaggle Notebooks

Kaggleship: Kaggle Notebooks This repository contains my Kaggle notebooks. They are generally about data science, machine learning, and deep learning.

Erfan Sobhaei 1 Jan 25, 2022
A no-BS, dead-simple training visualizer for tf-keras

A no-BS, dead-simple training visualizer for tf-keras TrainingDashboard Plot inter-epoch and intra-epoch loss and metrics within a jupyter notebook wi

Vibhu Agrawal 3 May 28, 2021
'A C2C E-COMMERCE TRUST MODEL BASED ON REPUTATION' Python implementation

Project description A library providing functionalities to calculate reputation and degree of trust on C2C ecommerce platforms. The work is fully base

Davide Bigotti 2 Dec 14, 2022
N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting

N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting Recent progress in neural forecasting instigated significant improvements in the

Cristian Challu 82 Jan 04, 2023
OpenMMLab Detection Toolbox and Benchmark

MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.

OpenMMLab 22.5k Jan 05, 2023
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Dense Passage Retrieval Dense Passage Retrieval (DPR) - is a set of tools and models for state-of-the-art open-domain Q&A research. It is based on the

Meta Research 1.1k Jan 03, 2023
Modular Gaussian Processes

Modular Gaussian Processes for Transfer Learning 🧩 Introduction This repository contains the implementation of our paper Modular Gaussian Processes f

Pablo Moreno-Muñoz 10 Mar 15, 2022
The offcial repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR2022

CharacterBERT-DR The offcial repository for CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos, Sh

ielab 11 Nov 15, 2022