NBS-Predict

NBS-Predict is a prediction-based extension of the Network-based Statistic (Zalesky et al., 2010). NBS-Predict aims to provide a fast way to identify neuroimaging-based biomarkers with high generalizability by combining machine learning with graph theory in a cross-validation structure.

Overview

NBS-Predict operates in a cross-validation structure (nested if hyperparameter optimization is desired). The general algorithm consists of model evaluation, feature selection (suprathreshold edge selection), and, optionally, hyperparameter optimization and machine learning algorithm optimization.
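As an illustration of the suprathreshold edge selection step, the sketch below thresholds a hypothetical matrix of edge-wise p-values and keeps the connected component with the most edges. The variable names, the toy p-values, and the use of MATLAB's graph and conncomp functions are assumptions made for this example. It is not the toolbox's own implementation.

```matlab
% Hypothetical 6-node example: symmetric matrix of edge-wise p-values.
nNodes = 6;
pMat = ones(nNodes);
pMat(1,2) = 0.001; pMat(2,3) = 0.004; pMat(1,3) = 0.008;  % triangle 1-2-3
pMat(5,6) = 0.002;                                         % isolated edge 5-6
pMat = min(pMat, pMat');        % mirror the upper triangle

pThresh = 0.01;                 % assumed edge-wise p-value threshold
supra = pMat < pThresh;         % logical adjacency of suprathreshold edges

G = graph(supra);               % undirected, unweighted graph
bins = conncomp(G);             % connected-component label of each node

% Count the edges in each component and keep the largest one.
[rowIdx, ~] = find(triu(supra));                 % one endpoint per edge
edgeCount = accumarray(reshape(bins(rowIdx), [], 1), 1, [max(bins) 1]);
[~, largest] = max(edgeCount);
inComp = (bins == largest);                      % nodes in that component
component = supra & (inComp' & inComp);          % edges in that component
```

In this toy case the component made of nodes 1, 2 and 3 (three edges) is kept and the single edge between nodes 5 and 6 is dropped.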

Prerequisites

NBS-Predict requires the following software and toolboxes to run properly:

MATLAB (R2016b or newer)
Statistics and Machine Learning Toolbox
Parallel Computing Toolbox (optional)
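A quick way to check these prerequisites from the command window is sketched below. It is not part of NBS-Predict. It assumes that R2016b corresponds to MATLAB version 9.1 and that the toolboxes appear in the output of ver under their standard product names.

```matlab
% Minimal prerequisite check (illustrative sketch, not shipped with the toolbox).
if verLessThan('matlab', '9.1')          % 9.1 is the version number of R2016b
    error('MATLAB R2016b or newer is required.');
end

v = ver;                                 % struct array of installed products
installed = {v.Name};
hasStats    = any(strcmp(installed, 'Statistics and Machine Learning Toolbox'));
hasParallel = any(strcmp(installed, 'Parallel Computing Toolbox'));

if ~hasStats
    warning('Statistics and Machine Learning Toolbox not found.');
end
if ~hasParallel
    disp('Parallel Computing Toolbox not found (optional).');
end
```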

Installing

Download or clone this repository to a directory of your choice.

git clone --recursive git@github.com:eminSerin/NBS-Predict.git

Run MATLAB and navigate to the NBS-Predict directory using either the command window or the Current Folder window. Then, type this command to add NBS-Predict to your MATLAB path:

addpath(genpath(pwd));

Sample Data

You can download sample data using the following link: NITRC. This dataset comprises synthetic network data simulating regression and classification problems. Because the sample data are synthetic, results obtained from analyzing them carry no substantive meaning. Use this dataset only to check that the toolbox works properly and as an example input structure for organizing your own data.

Example

Type this command to start NBS-Predict:

start_NBSPredict();

The welcome window of NBS-Predict will automatically appear on the screen.

Using this window, create a new workspace. To do this, type a name for the workspace and press the return key. A pop-up window asking for the workspace directory will appear. After selecting the directory, hit the "Create" button to create the workspace. The created workspace will appear in the list box below. Then, hit the "Start" button to run the workspace.

The analysis setup window of NBS-Predict will automatically appear on the screen.

Select the directory that contains the subjects' connectivity matrices. Then, select the brain parcellation and design matrix files. Specify a contrast vector for the statistical model used to select suprathreshold edges across folds. Optionally, you may define advanced parameters such as the number of CV folds, the number of CV repetitions, and the performance metric. Once you finish selecting data and optional parameters, hit the RUN button to start the analysis. The analysis can take from minutes to hours depending on the computer, sample size, and brain parcellation atlas used.

Following the analysis, the NBS-Predict Results Viewer window will automatically appear on the screen. Here, you can view the weighted adjacency matrix, the weighted network on a circular graph, the weighted network on a 3D brain surface (BrainNet Viewer, Xia et al., 2013), and the confusion matrix. Weights represent the presence of edges in the selected connected component across outer folds and their prediction performances. This allows us to evaluate the contribution of each edge to the overall model straightforwardly. You can further set a weight threshold to visualize a subnetwork comprising the most relevant edges. Also, by clicking the "Save Figure" button, you can save figures in several formats.

See the MANUAL.pdf file for the detailed user guide.

Additionally, see the Tutorial_HCP.pdf file for example use of NBS-Predict.

Test

You may test the performance of NBS-Predict in predicting a target variable or identifying edges with ground truth on several types of synthetic network data (small-world, scale-free, or random) by typing this command:

test_NBSPredict(parameters);

The parameters are extensively documented in the test_NBSPredict.m function.

You may also use the simulation function, which automatically runs the test function n times using various parameters. To do that, type this command:

sim_testNBSPredict(parameters);

The parameters are extensively documented in the sim_testNBSPredict.m function.

Also, see the Tutorial_Simulation.pdf document for an example use of the simulation function.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Compatibility

NBS-Predict was developed on MATLAB R2017b and tested on MATLAB R2017b and R2018b.

Authors

NBS-Predict was designed by Emin Serin, Andrew Zalesky, Johann D. Kruschwitz and Henrik Walter, and developed by Emin Serin.

Contributing

You may contribute to this project in many ways, such as bringing new features to NBS-Predict, improving documentation, or reporting bugs. See the CONTRIBUTING.md file for details.

Citation

If you use the toolbox, please cite the following paper:

Serin, E., Zalesky, A., Matory, A., Walter, H., & Kruschwitz, J. D. (2021). NBS-Predict: A Prediction-based Extension of the Network-based Statistic. NeuroImage, 118625.

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE.md file for details.

References

References for the functions or toolboxes used in NBS-Predict toolbox:

Glmnet for Matlab (2013) Qian, J., Hastie, T., Friedman, J., Tibshirani, R. and Simon, N. http://www.stanford.edu/~hastie/glmnet_matlab/
Morel, P. (2018). Gramm: grammar of graphics plotting in Matlab. Journal of Open Source Software, 3(23), 568, https://doi.org/10.21105/joss.00568
Paul Kassebaum (2020). circularGraph (https://github.com/paul-kassebaum-mathworks/circularGraph), GitHub. Retrieved September 6, 2020.
Roland Pfister (2019). dlmcell (https://www.mathworks.com/matlabcentral/fileexchange/25387-dlmcell), MATLAB Central File Exchange. Retrieved September 3, 2019.
Xia, M., Wang, J., & He, Y. (2013). BrainNet Viewer: a network visualization tool for human brain connectomics. PloS one, 8(7), e68910.
Zalesky, A., Fornito, A., & Bullmore, E. T. (2010). Network-based statistic: identifying differences in brain networks. Neuroimage, 53(4), 1197-1207.

nbs-predict's Issues

Unrecognized function or variable 'adjMat'.

Hello!

I have run into a new error with your toolbox. It seems that the code hits an error when it reaches the visualization step.
Here is the error message:

Unrecognized function or variable 'adjMat'.

My parameters are slightly different from those in my last issue, so here they are again:

Contrast: [1,-1,0,0]
ML Model: Auto
K-Fold: 10
Repeat CV: 10
P-value: 0.01
Seed: 42
CPU Cores: 10
Method: Grid Search

Please let me know if this is trouble on my end!

Using pre-computed F/p matrices.

Hi Emin,
Thanks for making this toolbox. I was wondering if pre-computed NBS t or p maps can be used for the ML/model selection part of the toolbox. Due to the requirements of the dataset I am using, I've run an NBS using a gamma regression, so I have pre-computed F and p matrices. I'm now trying to assess whether the significant component from the NBS is associated with a set of behavioural variables and thought the ML models and selection procedure proposed in the toolbox might be useful for this.
Thanks,
Sid

One-class classification error

Hello everyone, unfortunately I encounter an error that I can't explain when simply comparing two groups.
I can only see the accuracy, while for all the other metrics NBSPredict gives this warning: "Warning: Only one class present! sensitivity does support one-class classification, and being set to NaN Please use Accuracy instead!"
Thanks for any replies.

Inconsistent Learning Performance with Default Tutorial Data and Parameters

Issue Description:

We are encountering an issue where the NBS-Predict algorithm is not learning effectively when using the default tutorial data and parameters. The output consistently shows average AUC scores around 0.5, which suggests that the model is not performing better than random chance.

Environment:

NBS-Predict Version: 1.0.0-beta.10
MATLAB Version: R2021b
Operating System: SUSE 15.3

Steps to Reproduce our issue:

Cloned the NBS-Predict repository from GitHub.
Used the sample data provided in the NITRC link in the documentation.
Followed the standard procedure as per the Tutorial_HCP.pdf file.
Ran the analysis with the following initial parameters:
- Searching Algorithm: bayesOpt
- METRIC: auc
- Number of Folds: 10
- Number of Repetitions: 10

Results: The algorithm consistently returned AUC scores around 0.5. With the default tutorial data and parameters, I expected the algorithm to learn effectively and provide AUC scores significantly different from 0.5, but the following is the MATLAB command window output:

ESTIMATOR: LogReg
Searching Algorithm: bayesOpt
METRIC: auc
Number of Folds: 10
Number of Repetitions: 10
-------------
|   Score   |
-------------
|   0.490   |
|   0.496   |
|   0.475   |
|   0.498   |
|   0.507   |
|   0.490   |
|   0.506   |
|   0.507   |
|   0.498   |
|   0.500   |
-------------
10x10 repeated-CV: µScore: 0.497, σScore: 0.010
The elapsed time is 61.308250 seconds.
Permutation testing is running! Permutations: 1000

The confusion matrix also showed that the predictions did not reflect correct true positive and true negative values.

To note: We have not modified the default settings or data in any significant way. The issue persists even after repeating the analysis on different computers, with different operating systems, MATLAB versions, and parameters.

Given our results we have the following questions:

Is there a known issue with the current version of NBS-Predict that might be causing this behaviour?
Could there be an issue with the data, the choice of model/hyperparameters, or a potential bug in the software?
Any assistance or guidance you could provide would be greatly appreciated.

Best regards,
Javier

Error using classreg.learning.Linear.prepareDataCR: Empty X or Y not supported.

Hello! I have recently been trying to run your repository using some data of my own. To prepare my data, I made sure that the connectivity matrices were in .mat format, downloaded the respective Schaefer brain regions, and created a design matrix. I have checked that none of the files have missing values. However, I have been running into the following error:

Error using classreg.learning.Linear.prepareDataCR
Empty X or Y not supported.

This appears after starting the program and leaving it to run for approximately 5 minutes. Here are the parameters I used:
Contrast: [1,1,0,0]
ML Model: Auto
K-Fold: 10
Repeat CV: 10
P-value: 0.01
Seed: 42
CPU Cores: 10
Method: Grid Search

I have also attached the history.mat file and a screen recording of the error at the following Google Drive link:
https://drive.google.com/file/d/1D8b2xyoL1UzV01rP58n4XVGZtCJ2mhFQ/view?usp=sharing

Thank you!

CV length error

Hello,
Thank you for this wonderful resource.

I am running into this error:

Error using tabular/length
Undefined function 'LENGTH' for input arguments of type 'table'. Use the height, width, or size functions instead.

Error in gen_cvpartition (line 20)
leny = length(y);

I was wondering where I should start for troubleshooting?

Thanks,
Alex
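The error message indicates that the variable y reaching gen_cvpartition is a MATLAB table, for which length is not defined. A minimal sketch of one possible workaround is to convert the table to a plain array before it is passed on. The table contents and the variable name 'label' below are hypothetical, and where the conversion belongs in your own pipeline is an assumption.

```matlab
% Hypothetical target variable that ended up as a table (e.g., via readtable).
y = table([1; 0; 1; 1], 'VariableNames', {'label'});

nRows = height(y);        % for tables, use height instead of length

if istable(y)
    y = table2array(y);   % convert to a plain numeric array
end
leny = length(y);         % now works, as in gen_cvpartition line 20
```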

Starting up troubles:

I got these errors when starting Matlab (fresh) and following the start-up instructions on your help wiki:

In matlab 2022

In matlab 2018a

Interpretation of the results generated by NBS-predict

Dear Dr. Serin,
I am a psychiatrist from China. Recently, I have been using the functional connectome to predict HAMD depression scores. I set the parameters to 10 folds, 10 repetitions, and a p-value of 0.0005. The results showed that the HAMD scores can be predicted with µScore: 0.155, σScore: 0.012 using the SVM method and with µScore: 0.147, σScore: 0.017 using linear regression. However, I am confused about the results, because when I choose SVM regression with the correlation metric, the R squared value is 0.003 and the correlation r is 0.155, whereas when I choose linear regression with the correlation metric, the R squared value is -0.122 and the correlation r is 0.147. The brain regions presented for the two methods are also quite different. Can you tell me which method I should choose, or can I choose either one?

Second, an R squared of -0.122 for linear regression seems implausible. What's more, I can't tell whether it is a positive or negative prediction of the HAMD scores. Can you tell me how to see whether the correlation is positive or negative?

Looking forward to your reply. All the best,
Hua
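A negative R squared alongside a positive correlation is possible when predictions track the ordering of the true scores but are offset or compressed. The sketch below uses made-up numbers and assumes R squared is computed as 1 - SS_res/SS_tot on held-out data. Whether NBS-Predict computes it exactly this way is not confirmed here. The sign of r gives the direction of the association between predicted and observed scores.

```matlab
% Made-up observed and predicted scores: perfectly ordered, but biased.
yTrue = [10; 12; 14; 16; 18];
yPred = [20; 20.5; 21; 21.5; 22];

c = corrcoef(yTrue, yPred);
r = c(1, 2);                                 % Pearson correlation

ssRes = sum((yTrue - yPred).^2);             % residual sum of squares
ssTot = sum((yTrue - mean(yTrue)).^2);       % total sum of squares
R2 = 1 - ssRes / ssTot;                      % coefficient of determination

fprintf('r = %.3f, R2 = %.3f\n', r, R2);
```

Here r is positive while R2 is below zero, meaning the predictions fit worse than simply predicting the mean of the observed scores.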

Regression analysis

Is it possible to perform regression analysis between two groups, similar to NBS? For example:
1st column: intercept
2nd column: interaction between diagnosis and behavioural measure
3rd column: behavioural measure
4th column: patient or control

Confusion matrix plotted incorrectly

The elements on the leading diagonal of confmat need to be switched for the confusion matrix to plot correctly.

confmat = fliplr(fliplr(confmat)');

in function confMatPush_Callback() works as a hack.
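To see what the expression does, the sketch below applies it to a hypothetical 2x2 confusion matrix (the counts are made up). The two diagonal elements swap places and the off-diagonal elements stay where they are.

```matlab
% Hypothetical 2x2 confusion matrix.
confmat = [50 3;
            7 40];

fixed = fliplr(fliplr(confmat)');   % the workaround from the issue

disp(fixed);                        % diagonal entries 50 and 40 are swapped
```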

help with error message

Estimate weights for confound regression in one group

Hi,

I am using NBS-predict to predict disease status (patients vs healthy controls) based on functional connectome data, with age and sex as nuisance variables.
Regarding confound regression, given the strong correlation between age and disease duration, I was afraid that estimating confound weights on the whole population might lead to regressing-out some disease-related effects from patients' data. Is this concern justified?

Would it be possible to estimate the weights only in the HCs and then regress-out the effect of physiological aging from all subjects?

Thank you in advance for your help and for this great resource!

Best,
Giuseppe
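The idea described above can be sketched outside the toolbox as follows. All data are simulated, the variable names are hypothetical, and nothing here is confirmed to be a feature of NBS-Predict. Confound weights are fitted by least squares in healthy controls only and then applied to every subject. Within cross-validation, the weights would need to be estimated from the training-set controls only to avoid leakage.

```matlab
rng(42);                                   % reproducible simulated data
nSub = 40; nEdges = 5;
isHC = [true(20, 1); false(20, 1)];        % first 20 subjects are controls
age  = 20 + 40 * rand(nSub, 1);
sex  = repmat([0; 1], nSub / 2, 1);
C    = [ones(nSub, 1), age, sex];          % confound matrix with intercept
Y    = 0.02 * age + randn(nSub, nEdges);   % simulated edge values

beta = C(isHC, :) \ Y(isHC, :);            % weights estimated in HCs only
Yres = Y - C * beta;                       % confounds removed from all subjects
```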

Problem with compute_modelMetrics and NaN values in Matlab 2021b

Dear all,

I get a rather peculiar warning that is not reproducible on all machines. When using "balanced_accuracy", the following warning is returned for several test models/designs:

"y_pred contains classes not in y_true."

It originates at line 156 of this function and only happens in MATLAB versions newer than 2020a. I know where the warning is triggered (the division some lines above), but I cannot figure out why it happens, even when tracing it back. Do you have any ideas?
Best,
David

No graph produced when contrast [1 1]

Dear Emin,

If I try and run the toolbox by setting the contrast as [1 1] (which should be an F-test, or ANOVA, right?) I get some accuracy score but no meaningful graph, just a matrix where all connections are weighted 1. Is this an expected behavior?

Thank you,
Ramtin

FC Matrix

Hi Emin,

I was trying to test your software, but when I tried to import the FC matrix (333x333x57), it did not let me select it from the selection bar (there is no option for selecting a CSV or .mat file). Is this normal?