Overview

MuMMI RAS v0.1

Released: Nov 16, 2021

MuMMI RAS is the application component of the MuMMI framework developed to create large-scale ML-driven multiscale simulation ensembles to study the interactions of RAS proteins and RAS-RAF protein complexes with lipid plasma membranes.

The MuMMI framework was developed as part of the Pilot 2 project of the Joint Design of Advanced Computing Solutions for Cancer, funded jointly by the Department of Energy (DOE) and the National Cancer Institute (NCI).

The Pilot 2 project focuses on developing multiscale simulation models for understanding the interactions of the lipid plasma membrane with the RAS and RAF proteins. The broad computational tool development aims of this pilot are:

  • Developing scalable multiscale molecular dynamics code that automatically switches between phase field, coarse-grained, and all-atom simulations.
  • Developing scalable machine learning and predictive models of molecular simulations to:
    • identify and quantify states from simulations
    • identify events from simulations that can automatically signal a change of resolution between phase field, coarse-grained, and all-atom simulations
    • aggregate information from the multi-resolution simulations and feed it back efficiently to and from machine learning tools
  • Integrating sparse information from experiments with simulation data.

MuMMI RAS defines the specific functionalities needed for the various components and scales of a target multiscale simulation. The application components need to define the scales, how to read the corresponding data, how to perform ML-based selection, how to run the simulations, how to perform analysis, and how to perform feedback. This code uses several utilities made available through "MuMMI Core".
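The responsibilities listed above can be pictured as one interface per scale. The sketch below is illustrative only: `ScaleComponent`, `run_cycle`, `ToyScale`, and every method name are hypothetical and are not taken from MuMMI RAS or MuMMI Core. It only shows how reading data, selection, simulation, analysis, and feedback might fit together in a single pass.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class ScaleComponent(ABC):
    """Hypothetical per-scale interface (NOT MuMMI's actual API)."""

    @abstractmethod
    def read_data(self) -> List[Any]:
        """Load candidate configurations available for this scale."""

    @abstractmethod
    def select(self, candidates: List[Any], k: int) -> List[Any]:
        """Pick k candidates to simulate (ML-based in a real application)."""

    @abstractmethod
    def run_simulation(self, candidate: Any) -> Any:
        """Run one simulation and return its raw result."""

    @abstractmethod
    def analyze(self, result: Any) -> Any:
        """Reduce a raw result to the quantities of interest."""

    @abstractmethod
    def feedback(self, analyses: List[Any]) -> None:
        """Pass aggregated analyses back to another scale or model."""


def run_cycle(component: ScaleComponent, k: int) -> List[Any]:
    """One read -> select -> simulate -> analyze -> feedback pass."""
    chosen = component.select(component.read_data(), k)
    analyses = [component.analyze(component.run_simulation(c)) for c in chosen]
    component.feedback(analyses)
    return analyses


class ToyScale(ScaleComponent):
    """Stand-in that uses plain numbers instead of real simulations."""

    def __init__(self) -> None:
        self.received: List[Any] = []

    def read_data(self) -> List[Any]:
        return [3, 1, 4, 1, 5]

    def select(self, candidates: List[Any], k: int) -> List[Any]:
        # Placeholder for an ML ranking: keep the k largest values.
        return sorted(candidates, reverse=True)[:k]

    def run_simulation(self, candidate: Any) -> Any:
        return candidate * 2

    def analyze(self, result: Any) -> Any:
        return result + 1

    def feedback(self, analyses: List[Any]) -> None:
        self.received = list(analyses)
```

Calling `run_cycle(ToyScale(), 2)` walks through the five steps once with toy data. A real application would replace each method with scale-specific logic.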

Publications

The MuMMI framework is described in the following publications.

  1. Bhatia et al. Generalizable Coordination of Large Multiscale Ensembles: Challenges and Learnings at Scale. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC '21, Article No. 10, November 2021. doi:10.1145/3458817.3476210.

  2. Di Natale et al. A Massively Parallel Infrastructure for Adaptive Multiscale Simulations: Modeling RAS Initiation Pathway for Cancer. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC '19, Article No. 57, November 2019. doi:10.1145/3295500.3356197.
    Best Paper at SC 2019.

  3. Ingólfsson et al. Machine Learning-driven Multiscale Modeling Reveals Lipid-Dependent Dynamics of RAS Signaling Protein. Proceedings of the National Academy of Sciences (PNAS), accepted, 2021. preprint.

  4. Reciprocal Coupling of Coarse-Grained and All-Atom scales. In preparation.

Installation

git clone https://github.com/mummi-framework/mummi-ras
cd mummi-ras
pip3 install .

export MUMMI_ROOT=/path/to/outputs
export MUMMI_CORE=/path/to/core/repo
export MUMMI_APP=/path/to/app/repo
export MUMMI_RESOURCES=/path/to/resources

The installation process described above installs only the MuMMI framework. The simulation codes (gridsim2d, ddcMD, AMBER, GROMACS) are not included and must be installed separately.

Spack installation. We are also working towards releasing the option of installing MuMMI and its dependencies through Spack.
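
Before launching anything, it can help to confirm that the four environment variables from the installation steps are set. The following is a minimal sketch: the variable names come from the export lines above, while the helper `missing_mummi_vars` is a hypothetical convenience and not part of MuMMI.

```python
import os

# Names taken from the installation steps; the check itself is an assumption.
REQUIRED_VARS = ("MUMMI_ROOT", "MUMMI_CORE", "MUMMI_APP", "MUMMI_RESOURCES")


def missing_mummi_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_mummi_vars()
    if missing:
        print("Unset MuMMI variables: " + ", ".join(missing))
    else:
        print("All MuMMI environment variables are set.")
```

Passing a dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.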

Authors and Acknowledgements

MuMMI was developed at Lawrence Livermore National Laboratory, in collaboration with Los Alamos National Laboratory, Oak Ridge National Laboratory, and International Business Machines. A list of main contributors is given below.

  • LLNL: Harsh Bhatia, Francesco Di Natale, Helgi I Ingólfsson, Joseph Y Moon, Xiaohua Zhang, Joseph R Chavez, Fikret Aydin, Tomas Oppelstrup, Timothy S Carpenter, Shiv Sundaram (previously LLNL), Gautham Dharuman (previously LLNL), Dong H Ahn, Stephen Herbein, Tom Scogland, Peer-Timo Bremer, and James N Glosli.

  • LANL: Chris Neale and Cesar Lopez

  • ORNL: Chris Stanley

  • IBM: Sara K Schumacher

MuMMI was funded by the Pilot 2 project led by Dr. Fred Streitz (DOE) and Dr. Dwight Nissley (NIH). We acknowledge contributions from the entire Pilot 2 team.

This work was performed under the auspices of the U.S. Department of Energy (DOE) by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory (LANL) under Contract DE-AC52-06NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725.

Contact: Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, CA 94550.

Contributing

Contributions may be made through pull requests and/or issues on GitHub.

License

MuMMI RAS is distributed under the terms of the MIT License.

Livermore Release Number: LLNL-CODE-827655

Comments
  • Are the trajectories in your publications publicly available?

    Hi, Congrats on the success, and huge thanks for making it open source. I wonder whether the trajectories in your publications are publicly available. Or are there any demo trajectories?

    I am a Ph.D. student at KAUST, using computer graphics to build and visualize mesoscale biology models, such as SARS-CoV-2 and bacteriophage T4. If possible, I (and my colleagues) would like to perform (multiscale, multi-representation, multi-granularity) visualization research on the trajectories you generated.

    Many thanks, Roden

    opened by RodenLuo 2
  • `flux` vs `slurm`

    Hi,

    As flux is mentioned in the dependencies, is it possible to reproduce MuMMI RAS on a cluster that only has slurm?

    Workflow dependencies (e.g., python, flux, dynim, keras, etc.)

    Quoted from: https://github.com/mummi-framework/mummi-ras/blob/main/INSTALL.md

    Many thanks, Roden

    opened by RodenLuo 0
  • gridsim2d availability

    Hi, I wonder if the following code is available or not.

    gridsim2d: to be released shortly

    Quoted from: https://github.com/mummi-framework/mummi-ras/blob/main/INSTALL.md

    Thanks, Roden

    opened by RodenLuo 0
  • Patch for gromacs availability

    Hi, I wonder if the following patch is available or not.

    Note that we have a patch for gromacs installation for customization. To be open-sourced soon.

    Quoted from: https://github.com/mummi-framework/mummi-ras/blob/main/INSTALL.md

    Thanks, Roden

    opened by RodenLuo 0
  • Small scale test data for local deployment

    Hi, I'm interested in deploying MuMMI on the KAUST IBEX cluster. It is mentioned in the installation doc that there is a small set of test data. Is it now publicly available? If not, is it possible for me to somehow access it so that I can perform a test run?

    Many thanks, Roden

    Again on lassen and on summit, we have created a small set of test data, which can be used to launch MuMMI at small scales. This (and the larger dataset) will be made public through NCI website. Until then, we can make this data available upon request.

    opened by RodenLuo 1