Hello,
thank you very much for this promising tool. I have a few questions before using it for my research.
Do you think that DAFdiscovery would be relevant for data from non-target screening (where we collect MS1 and MS2 data for all the molecules we can ionize in our sample)?
If so, do you think the pipeline can be used with MS data collected in different ionization modes, and perhaps with MS data coming from different instruments (such as LC-MS and GC-MS)? If I am not mistaken, the tutorial examples only use MS data collected in a single ionization mode.
With that in mind, do you think the data should be scaled before calculating the correlation between MS signals and bioactivity? I am thinking that scaling the MS areas to unit variance before running the script would prevent easily ionized molecules from having too much weight in the model compared to molecules with a less intense signal.
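To make the scaling I have in mind concrete, here is a minimal sketch. It assumes a feature table with samples as rows and MS features as columns. The names `areas`, `bioactivity`, and `unit_variance_scale` are hypothetical and do not come from the DAFdiscovery scripts, and the numbers are toy values.

```python
import numpy as np
import pandas as pd


def unit_variance_scale(areas: pd.DataFrame) -> pd.DataFrame:
    """Center each feature column and divide it by its sample standard deviation."""
    std = areas.std(axis=0, ddof=1)
    # A constant feature has zero variance; mark it NaN instead of dividing by zero.
    std = std.replace(0, np.nan)
    return (areas - areas.mean(axis=0)) / std


# Hypothetical toy data: one intense feature and one weak feature.
areas = pd.DataFrame(
    {
        "feat_1": [1.0e6, 2.0e6, 3.0e6, 4.0e6],
        "feat_2": [10.0, 40.0, 20.0, 30.0],
    },
    index=["s1", "s2", "s3", "s4"],
)
bioactivity = pd.Series([0.1, 0.4, 0.5, 0.9], index=areas.index, name="BioAct")

scaled = unit_variance_scale(areas)

# Pearson correlation of each feature with bioactivity, before and after scaling.
corr_raw = areas.corrwith(bioactivity)
corr_scaled = scaled.corrwith(bioactivity)
```

As far as I understand, the per-feature Pearson correlation itself does not change under this kind of scaling, so the effect would mainly show up in steps that combine or compare features by intensity or covariance.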
One last thing: in the script for case II (MS data and bioactivity), I noticed that the last correlation plot of MS features was not loading properly because the column "corr_Bioact" was not renamed in the file corrDF. I fixed it by adding the line corrDF2.columns.values[1] = "corr_BioAct" before exporting MSinfo_corr.
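Writing into `columns.values` in place worked for me, but I am not sure it is guaranteed to be picked up by pandas in every version. A possibly safer variant assigns a new list of column names. This is only a sketch: the `corrDF2` below is a hypothetical stand-in, and I am assuming only that the column to rename sits at position 1.

```python
import pandas as pd

# Hypothetical stand-in for corrDF2; the real column names may differ.
corrDF2 = pd.DataFrame({"feature_id": [1, 2], "corr_Bioact": [0.8, -0.2]})

# Rename the second column by position, leaving the others unchanged.
new_columns = list(corrDF2.columns)
new_columns[1] = "corr_BioAct"
corrDF2.columns = new_columns
```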
Thanks for the help.
Jean