CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

Related tags

Deep LearningCSAW-M
Overview

CSAW-M

This repository contains code for CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer. Source code for training models to estimate the mammographic masking level along with the checkpoints are made available here.
The repo containing the annotation tool developed to annotate CSAW-M could be found here. The dataset could be found here.


Training and evaluation

  • In order to train a model, please refer to scripts/train.sh where we have prepared commands and arguments to train a model. In order to encourage reproducibility, we also provide the cross-validation splits that we used in the project (please refer to the dataset website to access them). scripts/cross_val.sh provides example commands to run cross-validation.
  • In order to evaluate a trained model, please refer to scripts/eval.sh with example commands and arguments to evaluate a model.
  • Checkpoints could be downloaded from here.

Important arguments defined in in the main module

  • --train and --evaluate which should be used in training and evaluating models respectively.
  • --model_name: specifies the model name, which will then be used for saving/loading checkpoints
  • --loss_type: defines which loss type to train the model with. It could be either one_hot which means training the model in a multi-class setup under usual cross entropy loss, or multi_hot which means training the model in a multi-label setup using multi-hot encoding (defined for ordinal labels). Please refer to paper for more details.
  • --img_size: specifies the image size to train the model with.
  • Almost all the params in params.yml could be overridden using the corresponding arguments. Please refer to main.py to see the corresponding args.

Other notes

  • It is assumed that main.py is called from inside the src directory.
  • It is important to note that in the beginning of the main script, after reading/checking arguments, params defined in params.ymlis read and updated according to args, after which a call to the set_globals (defined in main.py) is made. This sets global params needed to run the program (GPU device, loggers etc.) For every new high-level module (like main.py) that accepts running arguments and calls other modules, this function shoud be called, as other modules assume that these global params are set.
  • By default, there is no suggested validation csv files, but in cross-validation (using --cv) the train/validation splits in each fold are extracted from the cv_files paths specified in params.yml.
  • In src/experiments.py you can find the call to the function that preprocesses the raw images. For some images we have defined a special set of parameters to be used to ensure text is successfully removed from the images during preprocessing. We have documented every step of the preprocessing function to make it more udnerstandable - feel free to modify it if you want to have your own preprocessed images!
  • The Dockerfile and packages used in this project could be found in the docker folder.

Citation

If you use this work, please cite our paper:

@article{sorkhei2021csaw,
  title={CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer},
  author={Sorkhei, Moein and Liu, Yue and Azizpour, Hossein and Azavedo, Edward and Dembrower, Karin and Ntoula, Dimitra and Zouzos, Athanasios and Strand, Fredrik and Smith, Kevin},
  year={2021}
}

Questions or suggestions?

Please feel free to contact us in case you have any questions or suggestions!

Owner
Yue Liu
PhD student in deep learning at KTH.
Yue Liu
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

Pranaydeep Singh 22 Dec 08, 2022
Use evolutionary algorithms instead of gridsearch in scikit-learn

sklearn-deap Use evolutionary algorithms instead of gridsearch in scikit-learn. This allows you to reduce the time required to find the best parameter

rsteca 709 Jan 03, 2023
A universal memory dumper using Frida

Fridump Fridump (v0.1) is an open source memory dumping tool, primarily aimed to penetration testers and developers. Fridump is using the Frida framew

551 Jan 07, 2023
MoCoGAN: Decomposing Motion and Content for Video Generation

MoCoGAN: Decomposing Motion and Content for Video Generation This repository contains an implementation and further details of MoCoGAN: Decomposing Mo

Sergey Tulyakov 514 Dec 18, 2022
Pytorch Lightning Distributed Accelerators using Ray

Distributed PyTorch Lightning Training on Ray This library adds new PyTorch Lightning plugins for distributed training using the Ray distributed compu

167 Jan 02, 2023
IntelĀ® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs

IntelĀ® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep l

Intel Corporation 846 Jan 04, 2023
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If

tsc 0 Jan 11, 2022
Improving Object Detection by Label Assignment Distillation

Improving Object Detection by Label Assignment Distillation This is the official implementation of the WACV 2022 paper Improving Object Detection by L

Cybercore Co. Ltd 51 Dec 08, 2022
An end-to-end PyTorch framework for image and video classification

What's New: March 2021: Added RegNetZ models November 2020: Vision Transformers now available, with training recipes! 2020-11-20: Classy Vision v0.5 R

Facebook Research 1.5k Dec 31, 2022
An open-source online reverse dictionary.

An open-source online reverse dictionary.

THUNLP 6.3k Jan 09, 2023
Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch

Lie Transformer - Pytorch (wip) Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch. Only the SE3 version will be present in thi

Phil Wang 78 Oct 26, 2022
UV matrix decompostion using movielens dataset

UV-matrix-decompostion-with-kfold UV matrix decompostion using movielens dataset upload the 'ratings.dat' file install the following python libraries

2 Oct 18, 2022
The Official PyTorch Implementation of DiscoBox.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision Paper | Project page | Demo (Youtube) | Demo (Bilib

NVIDIA Research Projects 89 Jan 09, 2023
CDGAN: Cyclic Discriminative Generative Adversarial Networks for Image-to-Image Transformation

CDGAN CDGAN: Cyclic Discriminative Generative Adversarial Networks for Image-to-Image Transformation CDGAN Implementation in PyTorch This is the imple

Kancharagunta Kishan Babu 6 Apr 19, 2022
Using Hotel Data to predict High Value And Potential VIP Guests

Description Using hotel data and AI to predict high value guests and potential VIP guests. Hotel can leverage on prediction resutls to run more effect

HCG 12 Feb 14, 2022
Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation

UniFuse (RAL+ICRA2021) Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation, arXiv, Demo Preparation I

Alibaba 47 Dec 26, 2022
Simulation-based inference for the Galactic Center Excess

Simulation-based inference for the Galactic Center Excess Siddharth Mishra-Sharma and Kyle Cranmer Abstract The nature of the Fermi gamma-ray Galactic

Siddharth Mishra-Sharma 3 Jan 21, 2022
This package is for running the semantic SLAM algorithm using extracted planar surfaces from the received detection

Semantic SLAM This package can perform optimization of pose estimated from VO/VIO methods which tend to drift over time. It uses planar surfaces extra

Hriday Bavle 125 Dec 02, 2022
On Generating Extended Summaries of Long Documents

ExtendedSumm This repository contains the implementation details and datasets used in On Generating Extended Summaries of Long Documents paper at the

Georgetown Information Retrieval Lab 76 Sep 05, 2022
Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

Realtime Multi-Person Pose Estimation By Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh. Introduction Code repo for winning 2016 MSCOCO Keypoints Cha

Zhe Cao 4.9k Dec 31, 2022