DockStream: A Docking Wrapper to Enhance De Novo Molecular Design

Overview

DockStream

alt text

Description

DockStream is a docking wrapper providing access to a collection of ligand embedders and docking backends. Docking execution and post hoc analysis can be automated via the benchmarking and analysis workflow. The flexilibity to specifiy a large variety of docking configurations allows tailored protocols for diverse end applications. DockStream can also parallelize docking across CPU cores, increasing throughput. DockStream is integrated with the de novo design platform, REINVENT, allowing one to incorporate docking into the generative process, thus providing the agent with 3D structural information.

Supported Backends

Ligand Embedders

Docking Backends

Note: The CCDC package, the OpenEye toolkit and Schrodinger's tools require you to obtain the respective software from those vendors.

Tutorials and Usage

Detailed Jupyter Notebook tutorials for all DockStream functionalities and workflows are provided in DockStreamCommunity. The DockStream repository here contains input JSON templates located in examples. The templates are organized as follows:

  • target_preparation: Preparing targets for docking
  • ligand_preparation: Generating 3D coordinates for ligands
  • docking: Docking ligands
  • integration: Combining different ligand embedders and docking backends into a single input JSON to run successively

Requirements

Two Conda environments are provided: DockStream via environment.yml and DockStreamFull via environment_full.yml. DockStream suffices for all use cases except when CCDC GOLD software is used, in which case DockStreamFull is required.

git clone <DockStream repository>
cd <DockStream directory>
conda env create -f environment.yml
conda activate DockStream

Enable use of OpenEye software (from REINVENT README)

You will need to set the environmental variable OE_LICENSE to activate the oechem license. One way to do this and keep it conda environment specific is: On the command-line, first:

cd $CONDA_PREFIX
mkdir -p ./etc/conda/activate.d
mkdir -p ./etc/conda/deactivate.d
touch ./etc/conda/activate.d/env_vars.sh
touch ./etc/conda/deactivate.d/env_vars.sh

Then edit ./etc/conda/activate.d/env_vars.sh as follows:

#!/bin/sh
export OE_LICENSE='/opt/scp/software/oelicense/1.0/oe_license.seq1'

and finally, edit ./etc/conda/deactivate.d/env_vars.sh :

#!/bin/sh
unset OE_LICENSE

Unit Tests

After cloning the DockStream repository, enable licenses, if applicable (OpenEye, CCDC, Schrodinger). Then execute the following:

python unit_tests.py

Contributors

Christian Margreitter ([email protected]) Jeff Guo ([email protected]) Alexey Voronov ([email protected])

Comments
  • Glide dockings using local machine

    Glide dockings using local machine

    Hi, I am trying to play with DockStream using Schrodinger. I am wondering if there is the possibility to use it in the local machine specifying $SCHRODINGER/glide instead of the tokens procedure.

    opened by Oulfin 6
  • Bug in Glide backend parallelization

    Bug in Glide backend parallelization

    First, thanks for contributing this nice toolbox.

    This is to report a bug in the following module:

    https://github.com/MolecularAI/DockStream/blob/7bdfd4a67f5c938e3222db59387e5a95e8a59e56/dockstream/core/Schrodinger/Glide_docker.py#L404

    while loop is used to process all sublists in batches. However, the number of processed sublists as recorded in jobs_submitted could be off because this variable is the cumulative sum of len(tmp_output_dirs), which could be smaller than len(cur_slice_sublists) if any of the sublists has no valid molecules to write out.

    The bug may cause some of the sublists get processed repeatedly, and in extreme cases may result in an infinite loop.

    I didn't check if any other backend uses similar logic to parallelize the run and may suffer from the same problem.

    opened by hshany 3
  • Question: Is it possible to feed an sdf file of prepared ligands straight into docking?

    Question: Is it possible to feed an sdf file of prepared ligands straight into docking?

    I'm trying to work out whether it's possible to put an sdf file of prepared ligands straight into a Glide run? i.e. not specifying an input_pool to the docking_runs list? (especially when using docker.py)

    opened by reskyner 2
  • Raise LigandPreparationFailed error

    Raise LigandPreparationFailed error

    For OpenEye Hybrid, it reported LigandPreparationFailed errors for both CORINA and OMEGA backend. One example is shown below: `File "/DockStream/dockstream/core/OpenEyeHybrid/Omega_ligand_preparator.py", line 66, in init raise LigandPreparationFailed("Cannot initialize OMEGA backend - abort.") dockstream.utils.dockstream_exceptions.LigandPreparationFailed: Cannot initialize OMEGA backend - abort.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "/DockStream/docker.py", line 132, in raise LigandPreparationFailed dockstream.utils.dockstream_exceptions.LigandPreparationFailed`

    Could you please help me with this problem? I tried both the provided receptor-ligand data files from DockStreamCommunity and my own dataset. Both reported same LigandPreparation error. Thank you in advance!

    opened by fangffRS 1
  • ADV 1.2.0 support

    ADV 1.2.0 support

    For DockStream to work with the new AutoDock-Vina 1.2.0 (https://pubs.acs.org/doi/10.1021/acs.jcim.1c00203), the "log-file" specification has to go:

    https://github.com/MolecularAI/DockStream/blob/efefbe52d3cecb8b6d1b72ab719aad1e4702833b/dockstream/core/AutodockVina/AutodockVina_docker.py#L275

    Should be backwards-compatible.

    opened by CMargreitter 1
  • Input file of the function

    Input file of the function "parse_maestro"

    First of all, thank you for your wonderful work in drug development area using AI. I am using Glide to get the result through DockStream. I think the the function parse_maestro in Glide_docker.py can be used to extract setting for docking(In DockStream, this setting is written json file). Is this right? If so, could you tell me the input file type for the parse_mastro?! (eg. maegz, mae, sdf, etc.) I tried the function with maegz (output from Glide docking), but I couldn't get the result. I want to use parse_maestro function to reproduce the setting which applied to previous docking simulation. I would be very grateful if you could give the answer to me. Thanks!

    opened by SejeongPark8354 0
  • Openbabel integration failed

    Openbabel integration failed

    I am trying to implement Dockstream with the vina backend, an exception is raised with openBabel executable.

    Traceback (most recent call last): File "DockStream/target_preparator.py", line 130, in prep = AutodockVinaTargetPreparator(conf=config, target=input_pdb_path, run_number=run_number) File "C:\Users\Y-8874903-E.ESTUDIANT\OneDrive - URV\Escritorio\PLIP interaction\DockStream\dockstream\core\AutodockVina\AutodockVina_target_preparator.py", line 56, in init raise TargetPreparationFailed("Cannot initialize OpenBabel external library, which should be part of the environment - abort.") dockstream.utils.dockstream_exceptions.TargetPreparationFailed: Cannot initialize OpenBabel external library, which should be part of the environment - abort.

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last): File "DockStream/target_preparator.py", line 139, in raise TargetPreparationFailed() from e dockstream.utils.dockstream_exceptions.TargetPreparationFailed

    Follow all necessary steps mentioned in docs.

    opened by Crispae 1
  • Parallelization of ADV for docking

    Parallelization of ADV for docking

    Hello,

    I am trying to run first docking experiments together with reinvent. I am observing many ADV jobs getting started with -cpu 1 (hardcoded), but a few (1 or 2) take quite long and leave all other CPUs idle until the batch has finished and a new batch has started.

    This leaves quite some capacity of a e.g. 16-core machine unused - at least that is my impression when observing the run via top or ps. In the dockstream.config, parallelization.number_cores is set to 16.

    Are there better practical settings to better exploit larger machines with 16-64 CPUs ?

    Lars

    opened by LarsAC 3
  • No module named 'ccdc'

    No module named 'ccdc'

    I believe I successfully installed the normal (not Full) DockStream package as per your instructions on the github site, and then tried to run the unit test, but this fails with a complaint regarding the ccdc module missing (see below). But I want to use Glide so wouldn’t need (nor have) ccdc. I am doing this on Ubuntu 18.04.

    Dockstream/python ./unit_tests.py Traceback (most recent call last): File "./unit_tests.py", line 10, in from tests.Gold import * File "/media/data/evehom/Projects/CompChem/DockStream/tests/Gold/init.py", line 1, in from tests.Gold.test_Gold_target_preparation import * File "/media/data/evehom/Projects/CompChem/DockStream/tests/Gold/test_Gold_target_preparation.py", line 11, in from dockstream.core.Gold.Gold_target_preparator import GoldTargetPreparator File "/media/data/evehom/Projects/CompChem/DockStream/dockstream/core/Gold/Gold_target_preparator.py", line 3, in import ccdc ModuleNotFoundError: No module named 'ccdc'

    opened by Evert-Homan 4
Releases(v1.0.0)
Owner
AstraZeneca - Molecular AI
Software from the Molecular AI department at AstraZeneca R&D
AstraZeneca - Molecular AI
Rax is a Learning-to-Rank library written in JAX

🦖 Rax: Composable Learning to Rank using JAX Rax is a Learning-to-Rank library written in JAX. Rax provides off-the-shelf implementations of ranking

Google 247 Dec 27, 2022
This repository contains an implementation of the Permutohedral Attention Module in Pytorch

Permutohedral_attention_module This repository contains an implementation of the Permutohedral Attention Module

Samuel JOUTARD 26 Nov 27, 2022
Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples This repository is the official implementation of paper [Qimera: Data-free Q

Kanghyun Choi 21 Nov 03, 2022
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer

HV-plane reconstruction from a single 360 image Code for our paper in CVPR 2021: Indoor Panorama Planar 3D Reconstruction via Divide and Conquer (pape

sunset 36 Jan 03, 2023
Dahua Camera and Doorbell Home Assistant Integration

Home Assistant Dahua Integration The Dahua Home Assistant integration allows you to integrate your Dahua cameras and doorbells in Home Assistant. It's

Ronnie 216 Dec 26, 2022
Collection of in-progress libraries for entity neural networks.

ENN Incubator Collection of in-progress libraries for entity neural networks: Neural Network Architectures for Structured State Entity Gym: Abstractio

25 Dec 01, 2022
Code implementation from my Medium blog post: [Transformers from Scratch in PyTorch]

transformer-from-scratch Code for my Medium blog post: Transformers from Scratch in PyTorch Note: This Transformer code does not include masked attent

Frank Odom 27 Dec 21, 2022
PyTorch implementation of DARDet: A Dense Anchor-free Rotated Object Detector in Aerial Images

DARDet PyTorch implementation of "DARDet: A Dense Anchor-free Rotated Object Detector in Aerial Images", [pdf]. Highlights: 1. We develop a new dense

41 Oct 23, 2022
Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Kai Zhang 1.2k Dec 29, 2022
In this project we use both Resnet and Self-attention layer for cat, dog and flower classification.

cdf_att_classification classes = {0: 'cat', 1: 'dog', 2: 'flower'} In this project we use both Resnet and Self-attention layer for cdf-Classification.

3 Nov 23, 2022
Taichi Course Homework Template

太极图形课S1-标题部分 这个作业未来或将是你的开源项目,标题的内容可以来自作业中的核心关键词,让读者一眼看出你所完成的工作/做出的好玩demo 如果暂时未想好,起名时可以参考“太极图形课S1-xxx作业” 如下是作业(项目)展开说明的方法,可以帮大家理清思路,并且也对读者非常友好,请小伙伴们多多参

TaichiCourse 30 Nov 19, 2022
Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning

Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning Update (September 18th, 2021) A supporting document de

Taimur Hassan 1 Mar 16, 2022
Code for `BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery`, Neurips 2021

This folder contains the code for 'Scalable Variational Approaches for Bayesian Causal Discovery'. Installation To install, use conda with conda env c

14 Sep 21, 2022
Python Classes: Medical Insurance Project using Object Oriented Programming Concepts

Medical-Insurance-Project-OOP Python Classes: Medical Insurance Project using Object Oriented Programming Concepts Classes are an incredibly useful pr

Hugo B. 0 Feb 04, 2022
Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices Abstract For practical deep neural network design on mobile devices, it is e

11 Dec 30, 2022
Aligning Latent and Image Spaces to Connect the Unconnectable

About This repo contains the official implementation of the Aligning Latent and Image Spaces to Connect the Unconnectable paper. It is a GAN model whi

Ivan Skorokhodov 203 Jan 03, 2023
Code for "Steerable Pyramid Transform Enables Robust Left Ventricle Quantification"

Code for "Steerable Pyramid Transform Enables Robust Left Ventricle Quantification" This is an end-to-end framework for accurate and robust left ventr

2 Jul 09, 2022
Code for ICML 2021 paper: How could Neural Networks understand Programs?

OSCAR This repository contains the source code of our ICML 2021 paper How could Neural Networks understand Programs?. Environment Run following comman

Dinglan Peng 115 Dec 17, 2022
Agile SVG maker for python

Agile SVG Maker Need to draw hundreds of frames for a GIF? Need to change the style of all pictures in a PPT? Need to draw similar images with differe

SemiWaker 4 Sep 25, 2022
Fully convolutional deep neural network to remove transparent overlays from images

Fully convolutional deep neural network to remove transparent overlays from images

Marc Belmont 1.1k Jan 06, 2023