A spatial genome aligner for analyzing multiplexed DNA-FISH imaging data.

jie

jie is a spatial genome aligner. This package parses true chromatin imaging signal from noise by aligning signals to a reference DNA polymer model.

The codename is a tribute to the Chinese homophones:

  • 结 (jié) : a knot, a nod to the mysterious and often entangled structures of DNA
  • 解 (jiě) : to solve, to untie, our bid to uncover these structures amid noise and uncertainty
  • 姐 (jiě) : sister, our ability to resolve tightly paired replicated chromatids

Installation

Step 1 - Clone this repository:

git clone https://github.com/b2jia/jie.git
cd jie

Step 2 - Create a new conda environment and install dependencies:

conda env create --name jie -f environment.yml
conda activate jie

Step 3 - Install jie:

pip install -e .

To test, run:

python -W ignore test/test_jie.py

Usage

jie is an exposition of chromatin tracing using polymer physics. The main function of this package is to illustrate the utility and power of spatial genome alignment.

jie is NOT an all-purpose spatial genome aligner. Chromatin imaging is a nascent field and data collection is still being standardized. This aligner may not be compatible with different imaging protocols and data formats, among other variables.

We provide a vignette under jie/jupyter/, with emphasis on inspectability. This walks through the intuition of our spatial genome alignment and polymer fiber karyotyping routines:

00-spatial-genome-alignment-walk-thru.ipynb

We also provide a series of Jupyter notebooks (jie/jupyter/), with emphasis on reproducibility. These reproduce figures from our accompanying manuscript:

01-seqFISH-plus-mouse-ESC-spatial-genome-alignment.ipynb
02-seqFISH-plus-mouse-ESC-polymer-fiber-karyotyping.ipynb
03-seqFISH-plus-mouse-brain-spatial-genome-alignment.ipynb
04-seqFISH-plus-mouse-brain-polymer-fiber-karyotyping.ipynb
05-bench-mark-spatial-genome-agignment-against-chromatin-tracing-algorithm.ipynb

A command-line tool is forthcoming.

Motivation

Multiplexed DNA-FISH is a powerful imaging technology that enables us to peer directly at the spatial location of genes inside the nucleus. Each gene appears as a tiny dot under imaging.

Pivotally, figuring out which dots are physically linked would trace out the structure of chromosomes. Unfortunately, imaging is noisy, and single-cell biology is extremely variable. The two confound each other, making chromatin tracing prohibitively difficult!

For instance, in a diploid cell line with two copies of a gene we expect to see two spots. But what happens when we see:

  • Extra signals:
    • Is it noise?
      • Off-target labeling: The FISH probes might inadvertently label an off-target gene
    • Or is it biological variation?
      • Aneuploidy: A cell (e.g. a cancerous cell) may have more than two copies of a gene
      • Cell cycle: When a cell gets ready to divide, it duplicates its genes
  • Missing signals:
    • Is it noise?
      • Poor probe labeling: The FISH probes never labeled the intended target gene
    • Or is it biological variation?
      • Copy Number Variation: A cell may have a gene deletion

If true signal and noise are indistinguishable, how do we know we are selecting true signals during chromatin tracing? It is not obvious which spots should be connected as part of a chromatin fiber. This dilemma was first aptly characterized by Ross et al. (https://journals.aps.org/pre/abstract/10.1103/PhysRevE.86.011918), whose framing of the problem is nothing short of prescient!

jie is, conceptually, a spatial genome aligner that disambiguates spot selection by checking each imaged signal against a reference polymer physics model of chromatin. It relies on the key insight that the spatial separation between two genes should be congruent with their genomic separation.
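To make the key insight concrete, here is a minimal sketch that scores a pair of spots under an ideal (Gaussian) chain, where the mean squared distance between two loci grows linearly with their genomic separation. This is an illustration only, not jie's actual reference model: the function name gaussian_chain_log_likelihood and the scaling parameter nm2_per_bp are assumptions made for this example.

```python
import math


def gaussian_chain_log_likelihood(xyz_a, xyz_b, genomic_sep_bp, nm2_per_bp=1.0):
    """Log-density of the 3D displacement between two loci under an ideal chain.

    Assumption (illustrative, not taken from jie): the mean squared spatial
    distance is <R^2> = nm2_per_bp * genomic_sep_bp, split equally over x, y, z.
    """
    if genomic_sep_bp <= 0:
        raise ValueError("genomic_sep_bp must be positive")
    # Per-axis variance of the isotropic 3D Gaussian displacement.
    var = nm2_per_bp * genomic_sep_bp / 3.0
    # Squared Euclidean distance between the two imaged spots.
    sq_dist = sum((a - b) ** 2 for a, b in zip(xyz_a, xyz_b))
    return -1.5 * math.log(2.0 * math.pi * var) - sq_dist / (2.0 * var)


# Two candidate spots for a locus 30 kb downstream of an anchor at the origin.
anchor = (0.0, 0.0, 0.0)
near_spot = (100.0, 0.0, 0.0)   # plausible for 30 kb under this toy scaling
far_spot = (2000.0, 0.0, 0.0)   # implausibly far for 30 kb

score_near = gaussian_chain_log_likelihood(anchor, near_spot, 30_000)
score_far = gaussian_chain_log_likelihood(anchor, far_spot, 30_000)
```

Under these assumptions the nearby spot receives the higher score, so it is the more physically plausible partner for the anchor. A spot that is far away in space but close along the genome is penalized, which is what lets a polymer model flag likely off-target signal.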

It makes no assumptions about the expected copy number of a gene. Instead, it traces chromatin by evaluating the physical likelihood of each candidate chromatin fiber. In doing so, we can uncover copy number variations and even sister chromatids from multiplexed DNA-FISH imaging data.
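The idea of picking spots by fiber likelihood can be sketched with a small dynamic program. The example below is a deliberate simplification and not jie's routine: it traces a single fiber, requires at least one candidate spot per locus, and does not handle missing loci or multiple copies. The names bond_log_likelihood and trace_most_likely_fiber, and the nm2_per_bp scaling, are hypothetical.

```python
import math


def bond_log_likelihood(p, q, genomic_sep_bp, nm2_per_bp=1.0):
    """Ideal-chain log-density for the displacement between spots p and q.

    Assumes <R^2> = nm2_per_bp * genomic_sep_bp (illustrative scaling only).
    """
    var = nm2_per_bp * genomic_sep_bp / 3.0
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return -1.5 * math.log(2.0 * math.pi * var) - sq_dist / (2.0 * var)


def trace_most_likely_fiber(loci):
    """Choose one spot per locus so that the summed bond log-likelihood is maximal.

    loci: list of (genomic_position_bp, [xyz, ...]) sorted by position, with
    at least one candidate spot per locus (a simplifying assumption).
    Returns (chosen candidate index per locus, total log-likelihood).
    """
    if not loci:
        return [], 0.0
    # best[i][j]: best score of any path that ends at candidate j of locus i.
    best = [[0.0] * len(loci[0][1])]
    back = [[None] * len(loci[0][1])]
    for i in range(1, len(loci)):
        prev_pos, prev_cands = loci[i - 1]
        pos, cands = loci[i]
        scores, pointers = [], []
        for q in cands:
            options = [
                best[-1][k] + bond_log_likelihood(p, q, pos - prev_pos)
                for k, p in enumerate(prev_cands)
            ]
            k_best = max(range(len(options)), key=options.__getitem__)
            scores.append(options[k_best])
            pointers.append(k_best)
        best.append(scores)
        back.append(pointers)
    # Backtrack from the best-scoring candidate at the last locus.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for i in range(len(loci) - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path, total


# Three loci spaced 30 kb apart; the second and third each have a decoy spot.
example_loci = [
    (0, [(0.0, 0.0, 0.0)]),
    (30_000, [(5000.0, 0.0, 0.0), (100.0, 0.0, 0.0)]),
    (60_000, [(200.0, 0.0, 0.0), (-4000.0, 0.0, 0.0)]),
]
chosen, log_likelihood = trace_most_likely_fiber(example_loci)
```

In this toy input the trace passes through the spots at 0, 100 and 200 nm and skips both decoys, because any path through a decoy requires a bond that is far too long for a 30 kb genomic step.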

Citation

Contact

Author: Bojing (Blair) Jia
Email: b2jia at eng dot ucsd dot edu
Position: MD-PhD Student, Ren Lab

For other work related to single-cell biology, 3D genome, and chromatin imaging, please visit Prof. Bing Ren's website: http://renlab.sdsc.edu/
