Benchmarking Pipeline for Prediction of Protein-Protein Interactions

Last update: Jun 27, 2022

Overview

B4PPI

Benchmarking Pipeline for the Prediction of Protein-Protein Interactions

How this benchmarking pipeline has been built, and how to use it, is detailed in our preprint here (please cite it if you find this work useful!).

A minimal example is available here, and the list of requirements there.

How to use the gold standard

All the data files are in the data directory. Most of them are available both as CSV (sep='|') and as pickled pandas DataFrames (the CSV file is sometimes missing due to file size constraints on GitHub).
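Because a CSV file may be absent, a small loader that falls back to the pickle can be convenient. The sketch below is only an illustration: the helper name `load_table` is hypothetical, and it assumes that the CSV and pickle versions of a table share the same file stem and differ only in their extension (`.csv` or `.pkl`), which should be checked against the actual contents of the data directory.

```python
import os

import pandas as pd


def load_table(stem, data_dir='data'):
    """Load `<stem>.csv` (pipe-separated) if it exists, else `<stem>.pkl`.

    Hypothetical helper: assumes both formats share the same file stem.
    """
    csv_path = os.path.join(data_dir, stem + '.csv')
    pkl_path = os.path.join(data_dir, stem + '.pkl')
    if os.path.exists(csv_path):
        # The CSV files in this repository use '|' as the separator.
        return pd.read_csv(csv_path, sep='|')
    if os.path.exists(pkl_path):
        # Only unpickle files that come from a source you trust.
        return pd.read_pickle(pkl_path)
    raise FileNotFoundError(
        'Neither {} nor {} was found'.format(csv_path, pkl_path)
    )
```

For example, `load_table('benchmarkingGS_v1-0')` would read the gold standard CSV when it is present.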

The gold standard, without pre-processed features, can be loaded using:

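import os
import pandas as pd

# Load the gold standard (pipe-separated CSV, no pre-processed features)
goldStandard = pd.read_csv(
    os.path.join('data', 'benchmarkingGS_v1-0.csv'),
    sep='|'
)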

Or with the pre-processed features:

# Load the gold standard with the pre-processed features (pickled DataFrame)
goldStandard_with_featuresSeq = pd.read_pickle(
    os.path.join('data', 'benchmarkingGS_v1-0_similarityMeasure_sequence_v3-1.pkl')
)

UniProtIDs are used for both proteins A and B.
isInteraction is the ground truth from the IntAct database (1 = interacting proteins, 0 = non-interacting proteins).
trainTest is the split between training set (train), first testing set T1 (test1) and second testing set T2 (test2).
Pre-processed features are explained in the manuscript.

Training and evaluation can then be done as usual, for example by fitting a model on the train rows and evaluating it on test1 and test2. The code from the preprint is in the Training section.
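As an illustration of how the trainTest column can drive training and evaluation, here is a minimal sketch that separates the three subsets and reports the fraction of interacting pairs in each. It relies only on the isInteraction and trainTest columns described above; the function names `split_gold_standard` and `positive_rate` are hypothetical and are not part of the repository code.

```python
import pandas as pd


def split_gold_standard(gold_standard):
    """Return the train, test1 and test2 subsets, keyed by split name."""
    splits = {}
    for name in ('train', 'test1', 'test2'):
        subset = gold_standard[gold_standard['trainTest'] == name]
        splits[name] = subset.reset_index(drop=True)
    return splits


def positive_rate(subset):
    """Fraction of pairs labelled as interacting (isInteraction == 1)."""
    return float(subset['isInteraction'].mean())
```

A model would typically be fitted on `splits['train']` only, with `splits['test1']` and `splits['test2']` kept aside for evaluation. Checking `positive_rate` on each subset first is a quick way to see how imbalanced the classes are before choosing metrics.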

How to cite this work

Lannelongue L., Inouye M., Construction of in silico protein-protein interaction networks across different topologies using machine learning, 2022, bioRxiv

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License.

Credits

The code was written in Python 3.7.
Many libraries were used, in particular pandas, NumPy, scikit-learn and PyTorch Lightning (full list in the code and in the requirements file).
Plots were drawn using Matplotlib, Seaborn and the MetBrewer colour palettes.
Logs were saved using Weights & Biases.

Owner

Loïc Lannelongue
