Analysis code and LaTeX source of the manuscript describing the conditional permutation test of confounding bias in predictive modelling.

Last update: Nov 22, 2021

Overview

Git repository of the manuscript entitled

Statistical quantification of confounding bias in predictive modelling

by Tamas Spisak

The manuscript describes and validates the package mlconfound.

Read the docs: https://mlconfound.readthedocs.io

Abstract

The lack of non-parametric statistical tests for confounding bias significantly hampers the development of robust, valid and generalizable predictive models in many fields of research. Here I propose the partial and full confounder tests, which, for a given confounder variable, probe the null hypotheses of unconfounded and fully confounded models, respectively.

The tests provide strict control of Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions, which are often seen in machine learning. Applying the proposed tests to models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset reveals confounders that were previously unreported or found to be hard to correct for with state-of-the-art confound mitigation approaches.

The tests (implemented in the package mlconfound) can aid the assessment and improvement of the generalizability and neurobiological validity of predictive models and, thereby, foster the development of clinically useful machine learning biomarkers.
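
To make the null hypothesis of the partial confounder test concrete (the prediction is independent of the confounder, given the target), here is a minimal sketch in Python. It is not the conditional permutation test from the manuscript and it does not use the mlconfound API. It is a deliberately simplified stand-in that assumes linear relationships: the target is regressed out of both the prediction and the confounder, and the residuals are then permuted. The function names (residualize, naive_partial_test) and the simulated data are hypothetical and exist only for this illustration.

```python
import numpy as np


def residualize(a, b):
    """Residuals of a after least-squares regression on b (with intercept)."""
    design = np.column_stack([np.ones_like(b), b])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef


def naive_partial_test(y, yhat, c, num_perms=1000, seed=0):
    """Permutation p-value for H0: yhat is independent of c, given y.

    Simplifying assumption (not the manuscript's method): all dependencies
    are linear, so conditioning on y is approximated by regressing y out.
    """
    rng = np.random.default_rng(seed)
    r_yhat = residualize(yhat, y)
    r_c = residualize(c, y)
    observed = abs(np.corrcoef(r_yhat, r_c)[0, 1])
    # Null distribution: shuffle the confounder residuals across samples.
    null = np.array([
        abs(np.corrcoef(r_yhat, rng.permutation(r_c))[0, 1])
        for _ in range(num_perms)
    ])
    # The +1 terms keep the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + num_perms)


# Simulated example (hypothetical data).
rng = np.random.default_rng(42)
n = 500
y = rng.normal(size=n)                           # prediction target
c = 0.5 * y + rng.normal(size=n)                 # confounder, correlated with y
yhat_clean = y + rng.normal(size=n)              # prediction driven by y only
yhat_biased = y + 0.8 * c + rng.normal(size=n)   # prediction also driven by c

p_clean = naive_partial_test(y, yhat_clean, c)
p_biased = naive_partial_test(y, yhat_biased, c)
print(f"unconfounded model: p = {p_clean:.3f}")
print(f"confounded model:   p = {p_biased:.3f}")
```

A small p-value for the biased predictions indicates that they carry confounder information beyond what the target explains. For non-linear or non-normal dependencies, which this sketch does not handle, use the tests provided by mlconfound instead.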

This repository contains:

The LaTeX source of the manuscript describing the 'mlconfound' approach: see manuscript.tex and related files.
All source code required to reproduce the results in the manuscript. See the directories: simulated and empirical.
All results. See the directory simulated/results and the analysis notebooks.
All figures. See the directory fig.

To reproduce the whole analysis:

./reproduce.sh

Citation

T. Spisak, Statistical quantification of confounding bias in predictive modelling, preprint on arXiv:2111.00814, 2021.

Licensing

Manuscript source and figures (contents of the root folder and the fig dir): CC BY
Source code (contents of the empirical and simulated folders): GPL3

Acknowledgements

The manuscript builds on an aesthetic and simple LaTeX style suitable for "preprint" publications such as arXiv and bioRxiv. It is based on the nips_2018.sty style.


Releases

revision-1.1.0 (Jul 7, 2022)

Manuscript attached. Related package: https://mlconfound.readthedocs.io
Full Changelog: https://github.com/pni-lab/mlconfound-manuscript/compare/preprint-1.0.1...revision-1.1.0

preprint-1.0.1 (Nov 1, 2021)

Manuscript attached: mlconfound-arxiv.pdf (3.35 MB). Related package: https://mlconfound.readthedocs.io
Full Changelog: https://github.com/pni-lab/mlconfound-manuscript/compare/submit1-1.0.0...preprint-1.0.1

submit1-1.0.0 (Oct 31, 2021)

Manuscript attached: mlconfound-submit.pdf (3.37 MB). Related package: https://mlconfound.readthedocs.io
Full Changelog: https://github.com/pni-lab/mlconfound-manuscript/compare/preprint-1.0.0...submit1-1.0.0

preprint-1.0.0 (Oct 30, 2021)

Manuscript attached: mlconfound-arxiv.pdf (3.35 MB). Related package: https://mlconfound.readthedocs.io

Owner

PNI - Predictive Neuroimaging Lab, University Hospital Essen, Germany

Documentation: https://mlconfound.readthedocs.io
