Comments (7)
Hi there,
Thank you for your interest in our work! We didn't include the whole HCP pre-training set in our supplementary materials because of size limits and licensing issues, but you can download it from this link.
Feel free to let me know if there's any issue :)
from mind-vis.
Can you provide more instructions on which data to download and how to organize the downloaded files?
I am trying to download "Resting State fMRI FIX-Denoised (Compact), 1,096 of 1,113 subjects OK – 23,357 files, 4180.38 GB" from ConnectomeDB. However, the code in dataset.py reads HCP_visual_voxel.npz, which has shape [1200, num_voxels]. Is there a script to process the downloaded data into HCP_visual_voxel.npz?
from mind-vis.
Hi,
Sure! Glad to help!
"Resting State fMRI FIX-Denoised (Compact), 1,096 of 1,113 subjects OK – 23,357 files, 4180.38 GB" is exactly the one we used. There are a few preprocessed versions of the data. After unzipping this package, you should see <SubjectID>/MNINonLinear/Results/rfMRI_REST<SessionID>_<Direction>/rfMRI_REST<SessionID>_<Direction>_Atlas_MSMAll_hp2000_clean.dtseries.nii. For example, for Subject 102109, Run 1 in the LR phase-encoding direction is located at 102109/MNINonLinear/Results/rfMRI_REST1_LR/rfMRI_REST1_LR_Atlas_MSMAll_hp2000_clean.dtseries.nii.
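The naming convention above can be turned into a small helper that builds the expected path for any subject, run, and direction. A minimal sketch; the function name and base-directory argument are illustrative, not part of the original code:

```python
import os

def rest_dtseries_path(base_dir, subject, session, direction):
    """Build the path to a FIX-denoised resting-state CIFTI file,
    following the <SubjectID>/MNINonLinear/Results/... layout above."""
    run = f"rfMRI_REST{session}_{direction}"
    return os.path.join(
        base_dir, str(subject), "MNINonLinear", "Results", run,
        f"{run}_Atlas_MSMAll_hp2000_clean.dtseries.nii",
    )

# Example: Subject 102109, Run 1, LR direction
print(rest_dtseries_path(".", 102109, 1, "LR"))
```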
After obtaining the preprocessed data, we used the Python package hcp_utils to extract voxels from these CIFTI files. We extract V1-V4 from the Glasser MMP 1.0 parcellation for decoding.
Here's an example script for processing the fMRI files into HCP_visual_voxel.npz:
import os

import hcp_utils as hcp
import nibabel as nib
import numpy as np

SUBJECT_LIST_DIR = './YourSubList/'
HCP_DATA_DIR = './YourDataPath/HCP/S1200/individuals/'
ROOT_DIR = './SaveHere'
IMG_PATH = 'MNINonLinear/Results/rfMRI_REST1_LR/rfMRI_REST1_LR_Atlas_MSMAll_hp2000_clean.dtseries.nii'

# Grayordinate indices of V1-V4 in the Glasser MMP 1.0 parcellation;
# the region IDs of the two hemispheres differ by 180.
v1_idx = np.where((hcp.mmp.map_all == 1) | (hcp.mmp.map_all == 181))[0]
v2_idx = np.where((hcp.mmp.map_all == 4) | (hcp.mmp.map_all == 184))[0]
v3_idx = np.where((hcp.mmp.map_all == 5) | (hcp.mmp.map_all == 185))[0]
v4_idx = np.where((hcp.mmp.map_all == 6) | (hcp.mmp.map_all == 186))[0]

# Subject IDs are taken from the .txt file names in SUBJECT_LIST_DIR.
sub_list = [sub.split('_')[0] for sub in os.listdir(SUBJECT_LIST_DIR) if '.txt' in sub]

for sub in sub_list:
    img_path = os.path.join(HCP_DATA_DIR, sub, IMG_PATH)
    if not os.path.exists(img_path):
        continue
    img = nib.load(img_path)
    X = img.get_fdata()     # [num_timepoints, num_grayordinates]
    X = hcp.normalize(X)    # z-score each grayordinate over time
    output_dir = os.path.join(ROOT_DIR, 'npz', sub)
    os.makedirs(output_dir, exist_ok=True)
    np.savez(
        os.path.join(output_dir, 'HCP_visual_voxel.npz'),
        V1=X[:, v1_idx],
        V2=X[:, v2_idx],
        V3=X[:, v3_idx],
        V4=X[:, v4_idx],
    )
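Once the script has run, each HCP_visual_voxel.npz holds one [num_timepoints, num_voxels] array per ROI. A quick sanity check, written against a synthetic file so it runs anywhere; the per-ROI voxel counts are made up, and whether the loader concatenates the ROIs exactly this way is an assumption:

```python
import numpy as np

# Create a synthetic stand-in for one subject's output file.
rng = np.random.default_rng(0)
np.savez(
    'HCP_visual_voxel.npz',
    V1=rng.standard_normal((1200, 10)),
    V2=rng.standard_normal((1200, 8)),
    V3=rng.standard_normal((1200, 6)),
    V4=rng.standard_normal((1200, 4)),
)

# Load it back and concatenate the ROIs along the voxel axis,
# giving a [1200, num_voxels] matrix like the one mentioned above.
data = np.load('HCP_visual_voxel.npz')
voxels = np.concatenate([data[roi] for roi in ('V1', 'V2', 'V3', 'V4')], axis=1)
print(voxels.shape)  # (1200, 28)
```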
from mind-vis.
Thank you so much for the detailed instruction!
I am trying to re-run the pre-training. Do you know roughly how many GPU hours it will take?
from mind-vis.
We parallelized training across 6 RTX 3090 Ti GPUs and ran for 3 days. Good luck with the training :)
from mind-vis.
Using the default config, I trained on one 3090 for a day and it reached 100 epochs; the loss is now around 0.36. Is this expected?
{'lr': 0.00025, 'min_lr': 0.0, 'weight_decay': 0.05, 'num_epoch': 500, 'warmup_epochs': 40, 'batch_size': 100, 'clip_grad': 0.8, 'mask_ratio': 0.75, 'patch_size': 16, 'embed_dim': 1024, 'decoder_embed_dim': 512, 'depth': 24, 'num_heads': 16, 'decoder_num_heads': 16, 'mlp_ratio': 1.0, 'root_path': '.', 'output_path': './results/fmri_pretrain/11-01-2023-10-45-27', 'seed': 2022, 'roi': 'VC', 'aug_times': 1, 'num_sub_limit': None, 'include_hcp': True, 'include_kam': True, 'accum_iter': 1, 'use_nature_img_loss': False, 'img_recon_weight': 0.5, 'focus_range': None, 'focus_rate': 0.6, 'local_rank': 0}
Dataset size: 136014
Number of voxels: 4656
AdamW (
Parameter Group 0
amsgrad: False
betas: (0.9, 0.95)
capturable: False
eps: 1e-08
foreach: None
lr: 0.00025
maximize: False
weight_decay: 0.0
Parameter Group 1
amsgrad: False
betas: (0.9, 0.95)
capturable: False
eps: 1e-08
foreach: None
lr: 0.00025
maximize: False
weight_decay: 0.05
)
Start Training the fmri MAE ... ...
[Epoch 0] loss: 0.8791665854390976
.
.
.
[Epoch 94] loss: 0.36076937938857656
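For reference, here is the masking arithmetic implied by this config, assuming the fMRI signal is patchified in 1D over the voxel axis (this derivation is mine, not from the authors):

```python
# Values taken from the config printed above.
num_voxels = 4656
patch_size = 16
mask_ratio = 0.75

num_patches = num_voxels // patch_size      # 4656 / 16 = 291 patches
num_masked = int(num_patches * mask_ratio)  # 218 patches hidden from the encoder
num_visible = num_patches - num_masked      # 73 patches the encoder actually sees

print(num_patches, num_masked, num_visible)  # 291 218 73
```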
from mind-vis.
It looks quite okay. But note that the pre-training loss doesn't directly reflect performance on the downstream tasks.
from mind-vis.