The full training script for Enformer (TensorFlow Sonnet) on TPU clusters

Last update: Oct 19, 2022

Overview

Enformer TPU training script (wip)

This repository contains the full training script for Enformer (TensorFlow Sonnet) on TPU clusters, written as part of an effort to migrate the model to PyTorch.

It was pieced together from the DeepMind Enformer repository, the Colab training notebook, and the Basenji sequence augmentation code.

It accounts for:

distributed TPU training
distributed datasets
distributed validation
gradient clipping
cross-replica batch normalization
dataset augmentation
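
As an illustration of the dataset augmentation item, here is a minimal NumPy sketch of the two augmentations commonly associated with Basenji: reverse complementing and small random shifts. It is not taken from the training script. The function names, the A, C, G, T channel order, the `max_shift` default, and the 0.25 pad value are all assumptions made for this example.

```python
import numpy as np


def reverse_complement(seq, targets):
    """Reverse-complement a one-hot sequence and flip its targets.

    seq: (length, 4) one-hot array, channels assumed to be in A, C, G, T order.
    targets: (bins, tracks) array aligned with the sequence.
    """
    # Reversing the channel axis swaps A<->T and C<->G;
    # reversing the length axis reads the opposite strand.
    return seq[::-1, ::-1], targets[::-1]


def shift_sequence(seq, shift, pad_value=0.25):
    """Shift a one-hot sequence along its length, padding the vacated positions.

    Positive shifts move the sequence right, negative shifts move it left.
    pad_value=0.25 (a uniform base distribution) is an assumption.
    """
    seq = np.asarray(seq, dtype=np.float32)
    if shift == 0:
        return seq.copy()
    pad = np.full((abs(shift), seq.shape[1]), pad_value, dtype=np.float32)
    if shift > 0:
        return np.concatenate([pad, seq[:-shift]], axis=0)
    return np.concatenate([seq[-shift:], pad], axis=0)


def augment(seq, targets, rng, max_shift=3):
    """Apply a random shift, then a reverse complement with probability 0.5."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    seq = shift_sequence(seq, shift)
    if rng.random() < 0.5:
        seq, targets = reverse_complement(seq, targets)
    return seq, targets


# Usage: a toy 4 bp sequence "ACGT" with one target track.
rng = np.random.default_rng(0)
toy_seq = np.eye(4, dtype=np.float32)
toy_targets = np.arange(4, dtype=np.float32).reshape(4, 1)
aug_seq, aug_targets = augment(toy_seq, toy_targets, rng)
```

In a real input pipeline these steps would run per example inside the distributed dataset; the sketch only shows the array manipulation.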

Training takes about 3 days on a TPU v3-64.

Todo

Fix the script to handle the difference in sequence length: the Basenji training data uses ~130k bp sequences, whereas the paper uses ~190k bp.
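
To make the size of that gap concrete, the sketch below center-pads a shorter one-hot sequence up to a longer target length. This is only one possible way to reconcile the two lengths, not the fix the Todo has in mind; re-extracting longer windows from the genome is another option. The exact lengths (131,072 and 196,608 bp), the `center_pad` helper, and the zero pad value are assumptions for this example.

```python
import numpy as np

# Assumed exact lengths behind the "~130k" and "~190k" figures above.
BASENJI_SEQ_LEN = 131_072
ENFORMER_SEQ_LEN = 196_608


def center_pad(seq, target_len, pad_value=0.0):
    """Pad a (length, channels) array equally on both sides up to target_len."""
    extra = target_len - seq.shape[0]
    if extra < 0:
        raise ValueError("sequence is longer than target_len")
    left = extra // 2
    right = extra - left
    return np.pad(seq, ((left, right), (0, 0)), constant_values=pad_value)


# Each side needs (196,608 - 131,072) / 2 = 32,768 extra positions.
flank = (ENFORMER_SEQ_LEN - BASENJI_SEQ_LEN) // 2

short_seq = np.zeros((BASENJI_SEQ_LEN, 4), dtype=np.float32)
padded = center_pad(short_seq, ENFORMER_SEQ_LEN)
print(flank, padded.shape)  # 32768 (196608, 4)
```

Padding keeps the original window centered, so the target bins still line up with the middle of the input. The padded flanks carry no sequence information, though, which is why this is only a stopgap.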

Citations

@article {Avsec2021.04.07.438649,
    author  = {Avsec, {\v Z}iga and Agarwal, Vikram and Visentin, Daniel and Ledsam, Joseph R. and Grabska-Barwinska, Agnieszka and Taylor, Kyle R. and Assael, Yannis and Jumper, John and Kohli, Pushmeet and Kelley, David R.},
    title   = {Effective gene expression prediction from sequence by integrating long-range interactions},
    elocation-id = {2021.04.07.438649},
    year    = {2021},
    doi     = {10.1101/2021.04.07.438649},
    publisher = {Cold Spring Harbor Laboratory},
    URL     = {https://www.biorxiv.org/content/early/2021/04/08/2021.04.07.438649},
    eprint  = {https://www.biorxiv.org/content/early/2021/04/08/2021.04.07.438649.full.pdf},
    journal = {bioRxiv}
}

Owner

Phil Wang
