selbouhaddani / omicspls
R package for High dimensional data analysis and integration with O2PLS!
Home Page: https://doi.org/10.1186/s12859-018-2371-3
Have you considered adding a VIP (variable influence on projection) function to calculate VIP scores for OmicsPLS?
Thanks!
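For readers unfamiliar with VIP, here is a minimal sketch of the classical PLS definition in base R. It is not part of OmicsPLS; the function name `vip_scores` and its arguments `W` (weights), `Tt` (scores) and `q` (y-loadings) are hypothetical, and applying this formula to the joint part of an O2PLS fit is an assumption rather than an established method.

```r
# Classical PLS VIP, written with base R only.
# W:  p x A matrix of weights (one column per component)
# Tt: n x A matrix of scores
# q:  length-A vector of y-loadings
# All names are hypothetical and not part of the OmicsPLS API.
vip_scores <- function(W, Tt, q) {
  p <- nrow(W)
  # Variation in y explained by each component: q_a^2 * t_a' t_a
  ss <- as.numeric(q)^2 * colSums(Tt^2)
  # Squared weights, with each column scaled to sum to one
  Wn2 <- sweep(W^2, 2, colSums(W^2), "/")
  sqrt(p * as.numeric(Wn2 %*% ss) / sum(ss))
}

# Toy inputs, for illustration only
set.seed(1)
W  <- matrix(rnorm(20), nrow = 10, ncol = 2)
Tt <- matrix(rnorm(12), nrow = 6, ncol = 2)
q  <- c(0.8, 0.3)
vip <- vip_scores(W, Tt, q)
print(round(vip, 3))
```

By construction, the squared VIP scores average to one, which is where the common "VIP > 1" rule of thumb comes from.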
Hi,
I'm trying to run OmicsPLS on an RNA-Seq and Methyl-array dataset. When I run crossval_o2m_adjR2, I get an MSE of "NA". These results do not look valid, especially since no value of n is reported for the minimum. Do you have any insight into what is going on?
Thanks!
Jen
Command/output:
crossval_o2m_adjR2(methyl.shared.trans, rna.shared.trans, 1:3, 0:3, 0:3, nr_folds = 2, nr_cores = 4)
minimum is at n =
Elapsed time: 570.87 sec
  MSE n nx ny
1  NA 1  0  3
2  NA 2  0  2
3  NA 3  0  3
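The cause of the NA values is not established in this thread. As a generic first diagnostic, and only as a sketch, one could summarise each input block before cross-validating. The helper `check_block` is hypothetical and uses base R only; it looks for missing or non-finite entries and for constant columns, all of which can propagate NA through downstream computations.

```r
# Hypothetical helper (not part of OmicsPLS): summarise potential
# problems in a data block before running cross-validation.
check_block <- function(X) {
  X <- as.matrix(X)
  is_constant <- apply(X, 2, function(col) {
    v <- var(col[is.finite(col)])  # variance over finite entries only
    is.na(v) || v == 0             # NA arises with fewer than 2 finite values
  })
  c(n_rows          = nrow(X),
    n_cols          = ncol(X),
    n_missing       = sum(is.na(X)),       # NA and NaN
    n_nonfinite     = sum(!is.finite(X)),  # NA, NaN, Inf and -Inf
    n_constant_cols = sum(is_constant))
}

# Toy block with one constant column and one missing value
X_toy <- cbind(a = c(1, 2, 3, 4),
               b = c(5, 5, 5, 5),
               c = c(1, NA, 3, 7))
res <- check_block(X_toy)
print(res)
```

Running the same check on both blocks (here, methyl.shared.trans and rna.shared.trans) would at least rule these input problems in or out.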
E and Ff are set to zero in o2m2 and are not changed afterwards.
Line 477 in d32510a
Out of curiosity, is 't' in Tt or 'f' in Ff meaningful, or are those just added due to potentially conflicting symbols in R or something?
Code:
r = o2m2(as.matrix(X), as.matrix(Y), 2, 0, 0) # works fine
print(r) # error
Error:
Error in if (x$flags$stripped) cat("O2PLS fit: Stripped \n") else if (x$flags$highd) cat("O2PLS fit: High dimensional \n") else cat("O2PLS fit \n") :
argument is of length zero
Calls: -> -> withVisible -> print -> print.o2m
But using:
r = o2m(as.matrix(X), as.matrix(Y), 2, 0, 0, p_thresh=0)
print(r)
works fine.
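The error message is what base R produces when an `if` condition evaluates to NULL, which would be consistent with the o2m2 result lacking a `flags` element (an assumption; the thread does not confirm it). A minimal sketch, using a hypothetical stand-in list rather than a real fit:

```r
# Hypothetical stand-in for a fit object without a `flags` element
fit_without_flags <- list(R2X = 0.5)

# `fit_without_flags$flags$stripped` is NULL, so `if` fails
msg <- tryCatch(
  {
    if (fit_without_flags$flags$stripped) "stripped" else "not stripped"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# isTRUE() returns FALSE for NULL, so a guarded condition does not error
guarded <- if (isTRUE(fit_without_flags$flags$stripped)) {
  "stripped"
} else {
  "not stripped"
}
print(guarded)
```

Guarding with isTRUE() is one possible defensive pattern for a print method; whether the package should instead make o2m2 set the flags is a design decision for the maintainer.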
Dear Author:
I am new to machine learning.
When using cross-validation, I don't know how to set its parameters. I checked the PDF accompanying the article, which uses crossval_o2m_adjR2(rna, metab, 1:3, c(0,1,5,10), c(0,1,5,10), nr_folds = 2, nr_cores = 4), but I don't know how to determine the values 1:3, c(0,1,5,10) and c(0,1,5,10). For example, I have two matrices with 6 rows and 3000 columns. Should I select the columns randomly, or is there any requirement?
Here's a partial screenshot of my data
Looking forward to your reply, thanks!
Line 783 in c2e3348
To me, it seems that the dots are the issue here (fit does not have attributes T_Yosc. and U_Xosc., but has T_Yosc and U_Xosc).
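The effect of the trailing dot can be reproduced with a plain list. This is a sketch in base R; `fit` here is a hypothetical stand-in, not an actual OmicsPLS object. The `$` operator only partially matches when the supplied name is a prefix of an element name, so a name with an extra trailing dot silently yields NULL.

```r
# Hypothetical stand-in for a fit with elements named without a trailing dot
fit <- list(T_Yosc = matrix(1, 2, 1), U_Xosc = matrix(2, 2, 1))

with_dot    <- fit$T_Yosc.  # no element has this name: NULL, no error
without_dot <- fit$T_Yosc   # the stored matrix

print(is.null(with_dot))
print(is.null(without_dot))
```

Because the lookup returns NULL instead of raising an error, the mismatch only surfaces later, when the NULL value is used.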
Hi,
First, thanks for this tool! I have been using sO2PLS regularly for integrating omics datasets. I have been working on some visualisations to show the results of a sO2PLS analysis. In particular, I wanted to show the results of the sparsity cross-validation step, by plotting, for each joint component, the covariance mean and SD for the different values of keepx and keepy tested. However these are not currently returned by the function (I am using OmicsPLS version 2.0.2): the mean_covTU and srr_covTU matrices are overwritten by the next component, so that what is returned at the end is only the matrix of covariance mean and SD for the last component.
I think this can easily be fixed by adding the following code to the crossval_sparsity function:
Add on line 299 in Crossval_OmicsPLS.R (just before the if (method == "SO2PLS") { line):
mean_covTU_list <- list()
srr_covTU_list <- list()
Then on line 348 and line 447 (both times before the 1-standard error rule code):
mean_covTU_list[[comp]] <- mean_covTU
srr_covTU_list[[comp]] <- srr_covTU
And then the return on line 491 would be:
return( list(Best = unlist(bestsp), Covs = mean_covTU_list, SEcov = srr_covTU_list))
This allows me to make plots like this (they might need to be improved, but that's the idea):
Do you think it would be possible (and useful) to add this to the function?
Thanks!
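The accumulation pattern proposed above can be illustrated outside the package. This is only a sketch with toy numbers: the keepx and keepy grids, the random matrices, and the assumption that rows index keepy and columns index keepx are all made up for the example and may not match crossval_sparsity.

```r
set.seed(42)
n_comp <- 2
keepx <- c(5, 10)      # hypothetical grid
keepy <- c(5, 10, 20)  # hypothetical grid

mean_covTU_list <- list()
srr_covTU_list <- list()
for (comp in seq_len(n_comp)) {
  # Toy stand-ins for the matrices computed per component
  mean_covTU <- matrix(runif(length(keepy) * length(keepx)),
                       nrow = length(keepy), ncol = length(keepx),
                       dimnames = list(keepy, keepx))
  srr_covTU <- mean_covTU / 10
  # Store a copy per component instead of overwriting
  mean_covTU_list[[comp]] <- mean_covTU
  srr_covTU_list[[comp]] <- srr_covTU
}

# Long format, convenient for plotting; matrices unroll column by column
long <- do.call(rbind, lapply(seq_len(n_comp), function(comp) {
  data.frame(comp     = comp,
             keepx    = rep(keepx, each = length(keepy)),
             keepy    = rep(keepy, times = length(keepx)),
             mean_cov = as.vector(mean_covTU_list[[comp]]),
             se_cov   = as.vector(srr_covTU_list[[comp]]))
}))
print(head(long))
```

A data frame in this shape can be passed to any plotting function to draw the mean covariance with error bars for each component.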
Hi, selbouhaddani,
Could you please provide an example of how to run O2PLS?
For example, what is the difference between the two cross-validation functions, i.e., crossval_o2m_adjR2() and crossval_o2m()?
How can I obtain the resulting common and distinctive matrices for each block?
When I open OmicsPLS_vignette.Rmd in the Firefox browser, the math formulas are not rendered properly.
Thanks.
I've got a couple of cases where I wished to run o2m but could not, as the input checks failed: (1) data with NaN is not accepted; (2) it is impossible to perform O2PLS-DA, because of the strict "less than" check of the number of components against the number of columns in the data (granted, this is a less common thing to do than OPLS-DA); (3) in cross-validation, the sum of requested components is checked against the number of columns, which of course will work for omics but not for many other datasets, etc.
I understand that some limitations may arise from the implementation details (e.g. use of SVD for PCA), but I wonder if it would be possible to relax some of the checks. Do you plan to support the cases I mentioned above in this package?
Or would it be reasonable to provide a "force" argument that ignores the checks and lets the user take the risk of failing miserably (when the algorithm indeed does not support a specific case)?
R2Xcorr is currently computed as:
Lines 712 to 713 in 69086e5
I wonder why it is not R2Xcorr <- ssq(Tt %*% t(W)) / ssq(X_true), as would be suggested by Table 2 of "Evaluation of O2PLS in Omics data integration". I understand that there might be some compensation in the code which would make it equivalent, but it eludes my comprehension of the codebase. I would be very grateful if you could give me a hint on that.
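For concreteness, the ratio proposed above can be evaluated on toy data. This is a sketch only: `ssq` is defined here as the sum of squared entries, which is assumed to mirror the package helper of the same name, and the simulated `Tt`, `W` and `X_true` are made up for illustration, not taken from a fitted model.

```r
# Assumed definition: sum of squared entries
ssq <- function(X) sum(X^2)

set.seed(7)
n <- 20
p <- 5
# Toy joint weights with orthonormal columns, and toy scores
W  <- qr.Q(qr(matrix(rnorm(p * 2), nrow = p, ncol = 2)))
Tt <- matrix(rnorm(n * 2), nrow = n, ncol = 2)
# Toy data: joint part plus a small amount of noise
X_true <- Tt %*% t(W) + matrix(rnorm(n * p, sd = 0.1), nrow = n, ncol = p)

# The ratio as written in the question
R2Xcorr <- ssq(Tt %*% t(W)) / ssq(X_true)
print(R2Xcorr)

# With orthonormal W, the joint part and the scores have the same sum of squares
print(all.equal(ssq(Tt %*% t(W)), ssq(Tt)))
```

The last line holds for any W with orthonormal columns; whether the package code relies on a similar identity is not something this sketch establishes.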
Also, I wanted to thank you for sharing your work and apologize for opening so many issues on GitHub; I can offer help in fixing the minor typos I found if you wish to accept PRs. To my knowledge, this is not only the only open-source package offering O2PLS, but also a well designed and documented one and I hope that I could contribute to make it more bulletproof and be able to use it again in the future!
Edit: I think that some other statistics may require more attention.
I just wanted to let you know that I was able to (roughly) reproduce Figure 12b from the (Trygg and Wold, 2003) paper using a (slightly modified) o2m2. It occurred to me that if I replace the number of components (A in the paper) in the first pass with n rather than n + max(nx, ny), the algorithm better reflects what I read from the paper, and the recreated figure is more similar to the original one. The SVD version works almost as well (+/- a flipped sign).
Relevant code:
Lines 130 to 134 in 913c3e5
Based on the comment ("larger principal subspace") I understand that there might be a reason for this modification, and I would be happy to learn more if you could point me to a reference. If you don't have anything at hand, please feel free to close this issue. I wanted to put this up somewhere so that another curious person (or future me) would not need to go through the debugging process again.
There is still some noise (which may have to do with differences in the cross-validation splits or in the OSC filtering) and the y-axis scales differ (I tried passing the data through autoscaling; it did not help). I could not find anything that could explain the differences, and it seems that not much more can be deduced from the original publication without having access to their code.
Finally, thank you for all the recent improvements!
Best wishes, Michał