Comments (4)

stefpeschel commented on August 14, 2024

You're right, the original data set does not contain any zeros after filtering. The zeros arise during subsampling. SPRING uses StARS for edge selection, a stability-based approach in which random subsamples of your data are drawn. For example, if ASV1 has non-zero values only in samples 8 and 9, and those samples are left out of a subsample, then the total count of ASV1 in that particular subsample is zero.
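
The ASV1 example can be reproduced with a small sketch. The count table below is invented purely for illustration (it is not the data from this issue), and `taxon_totals` is a hypothetical helper, not a NetCoMi or SPRING function:

```python
# Hypothetical count table: 11 samples x 3 ASVs (values invented for illustration).
counts = {
    "ASV1": [0, 0, 0, 0, 0, 0, 0, 5, 3, 0, 0],  # non-zero only in samples 8 and 9
    "ASV2": [4, 1, 0, 2, 7, 3, 0, 1, 2, 6, 1],
    "ASV3": [0, 2, 9, 1, 0, 4, 5, 0, 3, 1, 2],
}


def taxon_totals(counts, sample_ids):
    """Sum each taxon's counts over the given 1-based sample ids."""
    return {
        taxon: sum(values[i - 1] for i in sample_ids)
        for taxon, values in counts.items()
    }


all_samples = list(range(1, 12))

# One possible subsample of 8 samples that happens to omit samples 8 and 9.
subsample = [s for s in all_samples if s not in (8, 9, 11)]

full_totals = taxon_totals(counts, all_samples)
sub_totals = taxon_totals(counts, subsample)

print("full data:", full_totals)
print("subsample:", sub_totals)
```

In the full table ASV1 passes any "non-zero" filter, yet in this subsample its total is zero while the other ASVs keep positive totals.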

In general, though, SPRING can handle subsamples in which certain ASVs are zero. The main problem here is that no edges remain after the stability selection step. The authors of SPRING may have a solution.

from netcomi.

stefpeschel commented on August 14, 2024

Hi,

In general, a sample size of 11 or 12 is very small, so please keep in mind that you won't get reliable association estimates from so few samples.

I assume the error is caused by SPRING's subsampling process. As you might know, SPRING draws random subsamples with a ratio of 80% (SPRING argument subsample.ratio). For your data, that means subsamples of size 11*0.8 are drawn at random and the association estimation is repeated for each of them. These subsamples are probably so sparse that certain taxa contain only zeros, which causes problems. This is likely where the "There are variables in the data that have only zeros or only the same values." errors come from.
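
To get a feel for how often this can happen, here is a rough sketch of the probability that a taxon vanishes from a single subsample. It assumes that samples are drawn without replacement and that 11*0.8 is rounded down to 8; both are assumptions about the subsampling, not something confirmed in this thread, and `prob_all_zero` is a hypothetical helper:

```python
from math import comb, floor


def prob_all_zero(n_samples, n_nonzero, subsample_size):
    """Probability that a taxon with non-zero counts in `n_nonzero` of
    `n_samples` samples is all-zero in one subsample of `subsample_size`
    samples drawn without replacement (hypergeometric argument)."""
    # The taxon is all-zero only if every drawn sample comes from the
    # samples in which its count is zero.
    return comb(n_samples - n_nonzero, subsample_size) / comb(n_samples, subsample_size)


n = 11                # samples in the group
m = floor(n * 0.8)    # assumed subsample size (rounding down is an assumption)

for k in (1, 2, 3, 4):
    print(f"taxon present in {k} sample(s): P(all zero) = {prob_all_zero(n, k, m):.4f}")
```

Under these assumptions, a taxon that is present in only one or two samples has a noticeable chance of being all-zero in any given subsample, and that chance compounds over the repeated subsamples drawn for stability selection.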

I was actually able to reproduce the error with the American Gut data. It occurs when I use subsamples with sample sizes equal to yours. I noticed that the "subscript out of bounds" error occurs because the estimated networks are completely empty. This case used to be caught by a check for whether the edge list is NULL. For some reason, this check no longer works, probably due to a change in the igraph function. I will have to fix this so that you get a meaningful error.

Unfortunately, I was not able to produce non-empty networks with these low sample sizes. One difference I noticed between your single network and the joint one is the number of taxa. In the joint one, the filtering reduces the number of taxa to only 35. You could try setting the jointPrepro argument to TRUE so that the filtering is done on the joint data set. If it is FALSE (the default in your case), each set is filtered separately and the intersection of the taxa is taken at the end. There is a chance that this fixes the error for your data. Alternatively, you could set SPRING's threshold argument to a higher value to get a less sparse network. As a very last option, you could use another association measure that is not based on resampling, for instance proportionality.

Best,
Stefanie

sarahkosh commented on August 14, 2024

Hi Stefanie,

Thank you for your response and insights. It sounds like my sample sizes are too small to get meaningful results anyway, but for my own edification I tried a couple of your suggestions.

When jointPrepro = TRUE, I get the same error message. I've also tried adjusting SPRING's threshold value, with no luck.

Are the zeros in the data from the 50 (or in the example above, 35) most variable nodes/taxa between the two groups?

I guess what I'm asking is, if filtTaxPar = list(highestVar = 50), but only 35 taxa are left after this filtering, where are the zeros coming from? I was assuming these 35 taxa were non-zero in both groups because of the filtering.

Thanks again!
Sarah

sarahkosh commented on August 14, 2024

Thanks for clarifying.

Sarah
