Betafold - AlphaFold with tunings

Related tags

Deep Learningbetafold
Overview

alphafold.hegelab.org

BetaFold

We (hegelab.org) craeted this standalone AlphaFold (AlphaFold-Multimer, v2.1.1) fork with changes that most likely will not be inserted in the main repository, but we found these modifications very useful during our daily work. We plan to try to push these changes gradually to main repo via our alphafold fork.

Warning

  • Currently, this is a no-Docker version. If you really need our functionalities inside a Docker Image, let us know.
  • Earlier opction for the configuration file was -c, now it is -C.

Changes / Features

  • It is called BetaFold, since there might be some minor bugs – we provide this code “as is”.
  • This fork includes the correction of memory issues from our alphafold fork (listed below).
  • The changes mostly affect the workflow logic.
  • BetaFold run can be influence via configuration files.
  • Different steps of AF2 runs (generating features; running models; performing relaxation) can be separated. Thus database searches can run on a CPU node, while model running can be performed on a GPU node. Note: timings.json file is overwritten upon consecutive partial runs – save it if you need it.

Configuration file

  • You can provide the configuration file as: ‘run_alphafold.sh ARGUMENTS -C CONF_FILENAME’ (slightly modified version of the bash script from AlfaFold without docker @ kalininalab; please see below our Requirement section)
  • If no configuration file or no section or no option is provided, everything is expected to run everything with the original default parameters.
[steps]
get_features = true
run_models = true
run_relax = true

[relax]
top

Requirements

Paper/Reference/Citation

Till we publish a methodological paper, please read and cite our preprint "AlphaFold2 transmembrane protein structure prediction shines".

Memory issues you may encounter when running original AlphaFold locally

"Out of Memory"

This is expected to be included in the next AF2 release, see: pull request #296.

Brief, somewhat outdated summary: Some of our AF2 runs with short sequences (~250 a.a.) consumed all of our memory (96GB) and died. Our targets in these cases were highly conserved and produced a very large alignment file, which is read into the memory by a simple .read() in alphafold/data/tools/jackhmmer.py _query_chunk. Importantly, the max_hit limit is applied at a later step to the full set, which resides already in the memory, so this option does not prevent this error.

  • To overcome this issue exhausting the system RAM, we read the .sto file line-by-line, so only max_hit will reach the memory.
  • Since the same data needed line-by-line for a3m conversion, we merged the two step together. We inserted to functions into alphafold/data/parsers.py: get_sto if only sto is needed and get_sto_a3m if also a3m is needed (the code is somewhat redundant but simple and clean).
  • This issue was caused by jackhmmer_uniref90_runner.query and jackhmmer_mgnify_runner.query, so we modified the calls to this function in alphafold/data/pipeline.py.
  • The called query in alphafold/data/tools/jackhmmer.py calls _query_chunk; from here we call our get_sto*; _query_chunk returns the raw_output dictionary, which also includes 'a3m' as a string or None.

License and Disclaimer

Please see the original.

Code for reproducing experiments in "Improved Training of Wasserstein GANs"

Improved Training of Wasserstein GANs Code for reproducing experiments in "Improved Training of Wasserstein GANs". Prerequisites Python, NumPy, Tensor

Ishaan Gulrajani 2.2k Jan 01, 2023
《Rethinking Sptil Dimensions of Vision Trnsformers》(2021)

Rethinking Spatial Dimensions of Vision Transformers Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh | Paper NAVER

NAVER AI 224 Dec 27, 2022
Fully-automated scripts for collecting AI-related papers

AI-Paper-collector Fully-automated scripts for collecting AI-related papers List of Conferences to crawel ACL: 21-19 (including findings) EMNLP: 21-19

Gordon Lee 776 Jan 08, 2023
Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch

Automatic Number Plate Recognition Automatic Number Plate Recognition (ANPR) is the process of reading the characters on the plate with various optica

Meftun AKARSU 52 Dec 22, 2022
SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks (Scientific Reports)

SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks Molecular interaction networks are powerful resources for the discovery. While dee

Kexin Huang 49 Oct 15, 2022
Make your AirPlay devices as TTS speakers

Apple AirPlayer Home Assistant integration component, make your AirPlay devices as TTS speakers. Before Use 2021.6.X or earlier Apple Airplayer compon

George Zhao 117 Dec 15, 2022
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Vision Transformer Pytorch reimplementation of Google's repository for the ViT model that was released with the paper An Image is Worth 16x16 Words: T

Eunkwang Jeon 1.4k Dec 28, 2022
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong, Jaehyeon Kim, Jaekyoung Bae In our paper, we p

Rishikesh (ऋषिकेश) 31 Dec 08, 2022
High-quality implementations of standard and SOTA methods on a variety of tasks.

Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo

Google 1.1k Dec 30, 2022
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te

Minh-Khoi Pham 5 Nov 05, 2022
Lava-DL, but with PyTorch-Lightning flavour

Deep learning project seed Use this seed to start new deep learning / ML projects. Built in setup.py Built in requirements Examples with MNIST Badges

Sami BARCHID 4 Oct 31, 2022
GraphLily: A Graph Linear Algebra Overlay on HBM-Equipped FPGAs

GraphLily: A Graph Linear Algebra Overlay on HBM-Equipped FPGAs GraphLily is the first FPGA overlay for graph processing. GraphLily supports a rich se

Cornell Zhang Research Group 39 Dec 13, 2022
Graph-total-spanning-trees - A Python script to get total number of Spanning Trees in a Graph

Total number of Spanning Trees in a Graph This is a python script just written f

Mehdi I. 0 Jul 18, 2022
TGS Salt Identification Challenge

TGS Salt Identification Challenge This is an open solution to the TGS Salt Identification Challenge. Note Unfortunately, we can no longer provide supp

neptune.ai 123 Nov 04, 2022
Code for the paper: Sketch Your Own GAN

Sketch Your Own GAN Project | Paper | Youtube Our method takes in one or a few hand-drawn sketches and customizes an off-the-shelf GAN to match the in

677 Dec 28, 2022
Additional code for Stable-baselines3 to load and upload models from the Hub.

Hugging Face x Stable-baselines3 A library to load and upload Stable-baselines3 models from the Hub. Installation With pip Examples [Todo: add colab t

Hugging Face 34 Dec 10, 2022
Unofficial PyTorch Implementation of "DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features"

Pytorch Implementation of Deep Orthogonal Fusion of Local and Global Features (DOLG) This is the unofficial PyTorch Implementation of "DOLG: Single-St

DK 96 Jan 06, 2023
Robustness via Cross-Domain Ensembles

Robustness via Cross-Domain Ensembles [ICCV 2021, Oral] This repository contains tools for training and evaluating: Pretrained models Demo code Traini

Visual Intelligence & Learning Lab, Swiss Federal Institute of Technology (EPFL) 27 Dec 23, 2022
An Approach to Explore Logistic Regression Models

User-centered Regression An Approach to Explore Logistic Regression Models This tool applies the potential of Attribute-RadViz in identifying correlat

0 Nov 12, 2021
Image augmentation library in Python for machine learning.

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independe

Marcus D. Bloice 4.8k Jan 07, 2023