fMRIprep Pipeline To Machine Learning

Overview

fMRIprep Pipeline To Machine Learning(Demo)

所有配置均在config.py文件下定义

前置环境(lilab)

  • 各个节点均安装docker,并有fmripre的镜像
  • 可以使用conda中的base环境(相应的第三份包之后更新)

1. fmriprep script on single machine(docker)

config.py中的fMRI_Prep_Job类中配置相应变量,注意在修改cmd时,不能修改{}中的关键字。在执行此步骤时,将自动在bids同级目录下建立processed文件夹,用来存放后处理数据。其中处理后的fmriprep数据存放在processed/frmriprepprceossed/fressurfer中。

class fMRI_Prep_Job:
    # input data path
    bids_data_path  = "/share/data2/dataset/ds002748/depression"
    # 一个容器中处理多少个被试 
    step = 8
    # fmriprep opm thread
    thread = 9
    # max work contianers
    max_work_nums = 10

    # 在bids同级目录下创建processed文件夹
    bids_output_path = os.path.join("/".join(bids_data_path.split('/')[:-1]),'processed')
    if not os.path.exists(bids_output_path):
        os.mkdir(bids_output_path)
    # fmri work path 
    fmri_work="/share/fmri_work"
    # freesurfer_license
    freesurfer_license = "/share/user_data/public/fanq_ocd/license.txt"
    # contianer id fmriprep
    contianer_id = "d7235efbbd3c"
    # fmriprep cmd 
    cmd ="docker run -it --rm -v {bids_data_path}:/data -v {freesurfer_license}:/opt/freesurfer/license.txt -v {bids_output_path}:/out -v {fmri_work}:/work {contianer_id} /data /out --skip_bids_validation --ignore slicetiming fieldmaps  -w /work --omp-nthreads {thread} --fs-no-reconall --resource-monitor participant --participant-label {subject_ids}"

2. fmriprep post preocess

这一步的操作主要依赖于fmribrant,主要作用是回归掉白质信号、脑脊液信号、全脑信号、头动信息、并进行滤波(可选),将其处理后的文件放存在prcoessed/post-precoss/ fliter/clean_imgs 中, 可选表示是否进行滤波。该配置中不建议修改dataset_path,store_path

class PostProcess:
    """
    fmriprep 后处理数据
    """
    # 类型的名字
    task_type = "rest"

    dataset_path = os.path.join(fMRI_Prep_Job.bids_output_path,'fmriprep')

    store_path = os.path.join(fMRI_Prep_Job.bids_output_path,'post-process')

    t_r = 2.5

    low_pass = 0.08

    high_pass = 0.01

    n_process = 40

    if t_r != None:
        store_path = os.path.join(store_path,'filter','clean_imgs')
    else:
        store_path = os.path.join(store_path,'unfilter','clean_imgs')

    os.makedirs(store_path,exist_ok=True)

3.获取ROI级别的时间序列

atlas由271个roi组成,分别是Schaefer_200(皮上),Tianye_54(皮下),Buckner_17(小脑)。由于在fmribrant中实现提取时间序列的功能,简单封装一下。

class RoiTs:
    """
    ROI 级别时间序列
    处理271个全脑roi
    """
    n_process = 40

    # 如果在第二步fmri post process已经滤波之后,不建议再次使用滤波操作
    t_r = None
    
    low_pass = None

    high_pass = None
    
    flag_gs = False #  回归全脑均值为 True 否则为False
    # 以下内容不建议修改

    if flag_gs:
        file_name = "*with_gs.nii.gz"
        ts_file = "GS"
    else:
        file_name = "*without_gs.nii.gz"
        ts_file = "NO_GS"
    
    reg_path = os.path.join(PostProcess.store_path,"*",PostProcess.task_type,file_name)
    
    subject_id_index = -3

    save_path = os.path.join("/".join(PostProcess.store_path.split('/')[:-1]),'timeseries',ts_file)

    os.makedirs(save_path,exist_ok=True)

4. Machine Learning(Baseline)

这一步是可选的,一般先用来看看FC做性别分类、年龄回归的效果如何。只保留粗略结果,详细结果可以使用baseline这个包。

class ML:
    # 选择的subject id 默认是全部
    sub_ids = [i.split('.')[0] for i in os.listdir(RoiTs.save_path)]
    # 量表位置
    csv = pd.read_csv('/share/data2/dataset/ds002748/depression/participants.tsv',sep='\t')
    #取交集
    csv = pd.DataFrame({"participant_id":sub_ids}).merge(csv)
    # 分类的任务
    classifies = ["gender"]
    # 回归的任务
    regressions = ["age"]
    # 分类模型
    classify_models = [SVC(),SVC(C=100),SVC(kernel='linear'),SVC(kernel='linear',C=100)]
    # 回归模型
    regress_models = [SVR(),SVR(C=100),SVR(kernel='linear'),SVR(kernel='linear',C=100)]
    kfold = 3
    # 多少个roi
    rois = 200

5. run

修改script/run.py

from fmriprep_job import run_fmri_prep
from fmriprep_pprocess import  run as pp_run
from roi2ts import run as roi_ts_run
from fast_fc_ml import run as ml_run


if __name__ =='__main__':
    run_fmri_prep() # fmriprep
    pp_run() # fmriprep post process
    roi_ts_run() # get roi time series
    ml_run() # machine learning

然后执行

python run.py

6. To Do

  • 质量控制
Owner
Alien
A student
Alien
Pragmatic AI Labs 421 Dec 31, 2022
Lseng-iseng eksplor Machine Learning dengan menggunakan library Scikit-Learn

Kalo dengar istilah ML, biasanya rada ambigu. Soalnya punya beberapa kepanjangan, seperti Mobile Legend, Makan Lontong, Ma**ng L*v* dan lain-lain. Tapi pada repo ini membahas Machine Learning :)

Alfiyanto Kondolele 1 Apr 06, 2022
Simple data balancing baselines for worst-group-accuracy benchmarks.

BalancingGroups Code to replicate the experimental results from Simple data balancing baselines achieve competitive worst-group-accuracy. Replicating

Facebook Research 29 Dec 02, 2022
Predict the income for each percentile of the population (Python) - FRENCH

05.income-prediction Predict the income for each percentile of the population (Python) - FRENCH Effectuez une prédiction de revenus Prérequis Pour ce

1 Feb 13, 2022
Model factory is a ML training platform to help engineers to build ML models at scale

Model Factory Machine learning today is powering many businesses today, e.g., search engine, e-commerce, news or feed recommendation. Training high qu

16 Sep 23, 2022
Python library for multilinear algebra and tensor factorizations

scikit-tensor is a Python module for multilinear algebra and tensor factorizations

Maximilian Nickel 394 Dec 09, 2022
QML: A Python Toolkit for Quantum Machine Learning

QML is a Python2/3-compatible toolkit for representation learning of properties of molecules and solids.

176 Dec 09, 2022
Timeseries analysis for neuroscience data

=================================================== Nitime: timeseries analysis for neuroscience data ===============================================

NIPY developers 212 Dec 09, 2022
Management of exclusive GPU access for distributed machine learning workloads

TensorHive is an open source tool for managing computing resources used by multiple users across distributed hosts. It focuses on granting

Paweł Rościszewski 131 Dec 12, 2022
Accelerating model creation and evaluation.

EmeraldML A machine learning library for streamlining the process of (1) cleaning and splitting data, (2) training, optimizing, and testing various mo

Yusuf 0 Dec 06, 2021
scikit-multimodallearn is a Python package implementing algorithms multimodal data.

scikit-multimodallearn is a Python package implementing algorithms multimodal data. It is compatible with scikit-learn, a popul

12 Jun 29, 2022
Machine-care - A simple python script to take care of simple maintenance tasks

Machine care An simple python script to take care of simple maintenance tasks fo

2 Jul 10, 2022
scikit-learn models hyperparameters tuning and feature selection, using evolutionary algorithms.

Sklearn-genetic-opt scikit-learn models hyperparameters tuning and feature selection, using evolutionary algorithms. This is meant to be an alternativ

Rodrigo Arenas 180 Dec 20, 2022
Steganography is the art of hiding the fact that communication is taking place, by hiding information in other information.

Steganography is the art of hiding the fact that communication is taking place, by hiding information in other information.

Priyansh Sharma 7 Nov 09, 2022
NCVX (NonConVeX): A User-Friendly and Scalable Package for Nonconvex Optimization in Machine Learning.

NCVX (NonConVeX): A User-Friendly and Scalable Package for Nonconvex Optimization in Machine Learning.

SUN Group @ UMN 28 Aug 03, 2022
AP1 Transcription Factor Binding Site Prediction

A machine learning project that predicted binding sites of AP1 transcription factor, using ChIP-Seq data and local DNA shape information.

1 Jan 21, 2022
A simple example of ML classification, cross validation, and visualization of feature importances

Simple-Classifier This is a basic example of how to use several different libraries for classification and ensembling, mostly with sklearn. Example as

Rob 2 Aug 25, 2022
Flightfare-Prediction - It is a Flightfare Prediction Web Application Using Machine learning,Python and flask

Flight_fare-Prediction It is a Flight_fare Prediction Web Application Using Machine learning,Python and flask Using Machine leaning i have created a F

1 Dec 06, 2022
PyPOTS - A Python Toolbox for Data Mining on Partially-Observed Time Series

A python toolbox/library for data mining on partially-observed time series, supporting tasks of forecasting/imputation/classification/clustering on incomplete multivariate time series with missing va

Wenjie Du 179 Dec 31, 2022
Time Series Prediction with tf.contrib.timeseries

TensorFlow-Time-Series-Examples Additional examples for TensorFlow Time Series(TFTS). Read a Time Series with TFTS From a Numpy Array: See "test_input

Zhiyuan He 476 Nov 17, 2022