fMRIprep Pipeline To Machine Learning

Overview

fMRIprep Pipeline To Machine Learning (Demo)

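All configuration is defined in config.py.

Prerequisites (lilab)

  • Docker is installed on every node, and the fmriprep image is available on each
  • The conda base environment can be used (the list of required third-party packages will be added later)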

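1. fmriprep script on a single machine (docker)

Set the relevant variables in the fMRI_Prep_Job class in config.py. When editing cmd, do not change the keywords inside {}. Running this step automatically creates a processed folder next to the bids directory to hold the processed data. The fmriprep outputs are stored in processed/fmriprep and processed/freesurfer.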

import os

class fMRI_Prep_Job:
    # input data path (BIDS dataset)
    bids_data_path  = "/share/data2/dataset/ds002748/depression"
    # number of subjects processed per container
    step = 8
    # fmriprep OMP threads (passed to --omp-nthreads)
    thread = 9
    # maximum number of working containers
    max_work_nums = 10

    # create the processed folder next to the bids directory
    bids_output_path = os.path.join("/".join(bids_data_path.split('/')[:-1]),'processed')
    if not os.path.exists(bids_output_path):
        os.mkdir(bids_output_path)
    # fmriprep working directory
    fmri_work="/share/fmri_work"
    # FreeSurfer license file
    freesurfer_license = "/share/user_data/public/fanq_ocd/license.txt"
    # ID of the fmriprep docker image
    contianer_id = "d7235efbbd3c"
    # fmriprep cmd
    cmd ="docker run -it --rm -v {bids_data_path}:/data -v {freesurfer_license}:/opt/freesurfer/license.txt -v {bids_output_path}:/out -v {fmri_work}:/work {contianer_id} /data /out --skip_bids_validation --ignore slicetiming fieldmaps  -w /work --omp-nthreads {thread} --fs-no-reconall --resource-monitor participant --participant-label {subject_ids}"
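The job script itself is not shown here, so the following is only a sketch of how step and the {} keywords in cmd might fit together: subjects are split into batches of step, and each batch fills one copy of the template. The shortened CMD_TEMPLATE, the helper names chunk and build_commands, the made-up subject labels, and the choice to join labels with spaces are all assumptions for illustration.

```python
from typing import List

# Shortened, hypothetical template; the real one is fMRI_Prep_Job.cmd.
CMD_TEMPLATE = (
    "docker run --rm -v {bids_data_path}:/data {contianer_id} /data /out "
    "participant --participant-label {subject_ids}"
)


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_commands(subject_ids: List[str], step: int, **fields: str) -> List[str]:
    """Return one docker command per batch of `step` subjects."""
    return [
        CMD_TEMPLATE.format(subject_ids=" ".join(batch), **fields)
        for batch in chunk(subject_ids, step)
    ]


subjects = [f"{i:02d}" for i in range(1, 21)]  # 20 made-up subject labels
commands = build_commands(
    subjects,
    step=8,
    bids_data_path="/share/data2/dataset/ds002748/depression",
    contianer_id="d7235efbbd3c",
)
print(len(commands))  # 20 subjects in batches of 8 -> 3 commands
```

Because str.format raises KeyError on an unknown keyword, renaming anything inside {} without updating the caller breaks the command, which is why the keywords should be left alone.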

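2. fmriprep post-process

This step relies mainly on fmribrant. It regresses out the white-matter signal, the cerebrospinal-fluid signal, the global signal, and the head-motion parameters, and optionally applies temporal filtering. The cleaned files are stored in processed/post-process/filter/clean_imgs, where the filter component is replaced by unfilter when filtering is skipped. Changing dataset_path or store_path in this configuration is not recommended.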

class PostProcess:
    """
    Post-processing of fmriprep outputs
    """
    # task name
    task_type = "rest"

    dataset_path = os.path.join(fMRI_Prep_Job.bids_output_path,'fmriprep')

    store_path = os.path.join(fMRI_Prep_Job.bids_output_path,'post-process')

    # repetition time in seconds; None selects the 'unfilter' output folder below
    t_r = 2.5

    # filter cutoffs in Hz
    low_pass = 0.08

    high_pass = 0.01

    # number of parallel processes
    n_process = 40

    # filtered and unfiltered outputs go to separate folders
    if t_r is not None:
        store_path = os.path.join(store_path,'filter','clean_imgs')
    else:
        store_path = os.path.join(store_path,'unfilter','clean_imgs')

    os.makedirs(store_path,exist_ok=True)
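How fmribrant performs the regression is not shown here. As a rough illustration of what "regressing out" nuisance signals means, the sketch below removes the least-squares fit of a confound matrix from each voxel's time course using numpy. The function name regress_out and the synthetic data are assumptions, and filtering is left out.

```python
import numpy as np


def regress_out(signal: np.ndarray, confounds: np.ndarray) -> np.ndarray:
    """Remove the least-squares fit of `confounds` from `signal`.

    signal:    (n_timepoints, n_voxels)
    confounds: (n_timepoints, n_confounds), e.g. WM, CSF, global signal, motion
    """
    # Add an intercept so the mean is removed together with the confounds.
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return signal - design @ beta


rng = np.random.default_rng(0)
confounds = rng.standard_normal((100, 3))   # synthetic nuisance regressors
neural = rng.standard_normal((100, 5))      # synthetic signal of interest
observed = neural + confounds @ rng.standard_normal((3, 5))
cleaned = regress_out(observed, confounds)
```

After this step the cleaned time courses are uncorrelated with every confound column, which is the property the with_gs and without_gs outputs differ on for the global signal.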

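3. Extract ROI-level time series

The atlas consists of 271 ROIs: Schaefer_200 (cortical), Tianye_54 (subcortical), and Buckner_17 (cerebellar). Time-series extraction is already implemented in fmribrant, so this step is a thin wrapper around it.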

class RoiTs:
    """
    ROI-level time series
    Covers 271 whole-brain ROIs
    """
    n_process = 40

    # If filtering was already applied in step 2 (fmri post process), filtering again here is not recommended
    t_r = None
    
    low_pass = None

    high_pass = None
    
    flag_gs = False #  True to use the images with the global mean signal regressed out, otherwise False
    # Changing anything below is not recommended

    if flag_gs:
        file_name = "*with_gs.nii.gz"
        ts_file = "GS"
    else:
        file_name = "*without_gs.nii.gz"
        ts_file = "NO_GS"
    
    reg_path = os.path.join(PostProcess.store_path,"*",PostProcess.task_type,file_name)
    
    subject_id_index = -3

    save_path = os.path.join("/".join(PostProcess.store_path.split('/')[:-1]),'timeseries',ts_file)

    os.makedirs(save_path,exist_ok=True)
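The value subject_id_index = -3 follows from the layout encoded in reg_path: store_path/<subject>/<task_type>/<file_name>. Counting path components from the end, the file is at -1, the task folder at -2, and the subject folder at -3. The sketch below shows this on a hypothetical file name; the helper subject_id_from_path and the sub-01 label are assumptions for illustration.

```python
def subject_id_from_path(path: str, subject_id_index: int = -3) -> str:
    """Return the path component at `subject_id_index` (counted from the end)."""
    return path.split("/")[subject_id_index]


# Value PostProcess.store_path takes with the settings shown earlier (t_r set).
store_path = "/share/data2/dataset/ds002748/processed/post-process/filter/clean_imgs"

# Hypothetical file matching <store_path>/<subject>/<task_type>/<file_name>.
example = "/".join([store_path, "sub-01", "rest", "sub-01_without_gs.nii.gz"])
print(subject_id_from_path(example))  # sub-01

# save_path is a sibling of clean_imgs: .../filter/timeseries/NO_GS
save_path = "/".join(store_path.split("/")[:-1] + ["timeseries", "NO_GS"])
print(save_path)
```

If the folder layout under store_path changes, reg_path and subject_id_index have to change together, which is one reason these settings are marked as not recommended to edit.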

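4. Machine Learning (Baseline)

This step is optional. It is usually run first as a quick check of how well FC (functional connectivity) performs on gender classification and age regression. Only coarse results are kept; for detailed results, use the baseline package.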

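import os

import pandas as pd
from sklearn.svm import SVC, SVR

class ML:
    # selected subject ids; defaults to all subjects with saved time series
    sub_ids = [i.split('.')[0] for i in os.listdir(RoiTs.save_path)]
    # location of the phenotype table
    csv = pd.read_csv('/share/data2/dataset/ds002748/depression/participants.tsv',sep='\t')
    # keep only subjects present in both (inner join on participant_id)
    csv = pd.DataFrame({"participant_id":sub_ids}).merge(csv)
    # classification tasks
    classifies = ["gender"]
    # regression tasks
    regressions = ["age"]
    # classification models
    classify_models = [SVC(),SVC(C=100),SVC(kernel='linear'),SVC(kernel='linear',C=100)]
    # regression models
    regress_models = [SVR(),SVR(C=100),SVR(kernel='linear'),SVR(kernel='linear',C=100)]
    # number of cross-validation folds
    kfold = 3
    # number of ROIs
    rois = 200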
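The fast_fc_ml module is not shown here, so the following is only a sketch of one common way to turn a subject's ROI time series into FC features: compute the ROI-by-ROI Pearson correlation matrix and keep its upper triangle as a vector. The function name fc_features, the synthetic data, and the choice to keep the first rois columns of the 271-ROI series are assumptions.

```python
import numpy as np


def fc_features(timeseries: np.ndarray, rois: int = 200) -> np.ndarray:
    """Vectorise the upper triangle of the ROI-by-ROI correlation matrix.

    timeseries: (n_timepoints, n_rois) array for one subject.
    """
    ts = timeseries[:, :rois]             # assumption: keep the first `rois` columns
    fc = np.corrcoef(ts, rowvar=False)    # (rois, rois) Pearson correlations
    upper = np.triu_indices(rois, k=1)    # skip the diagonal and duplicate pairs
    return fc[upper]


rng = np.random.default_rng(0)
fake_ts = rng.standard_normal((120, 271))  # synthetic data: 120 volumes, 271 ROIs
features = fc_features(fake_ts)
print(features.shape)  # (19900,), i.e. 200 * 199 / 2 unique ROI pairs
```

Stacking one such vector per subject gives a feature matrix that the models in classify_models and regress_models could be scored on with kfold-fold cross-validation.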

5. Run

Edit script/run.py:

from fmriprep_job import run_fmri_prep
from fmriprep_pprocess import  run as pp_run
from roi2ts import run as roi_ts_run
from fast_fc_ml import run as ml_run


if __name__ =='__main__':
    run_fmri_prep() # fmriprep
    pp_run() # fmriprep post process
    roi_ts_run() # get roi time series
    ml_run() # machine learning

Then run:

python run.py

6. To Do

  • Quality control