Intuition adding covariates in formula about corncob HOT 6 CLOSED

adriaaula commented on August 27, 2024

Intuition adding covariates in formula

from corncob.

Comments (6)

adriaaula commented on August 27, 2024

I have found that the coefficients are (by default) on the logit relative abundance scale. But when I try to understand log(p / (1-p)), I don't get a proper intuition for the term :)

bryandmartin commented on August 27, 2024

Hi @adriaaula,

"I want to test for differential abundance between the fractions. As there are monthly seasonal changes as covariates, should I include them in the formula? How should I write it down?"

The answer to this depends on what exactly you want to test. I go through several examples in the vignette, so I encourage you to browse that to see more examples and get more clarity, but I'll also answer your question here.

In your first chunk of code, you are testing for differential abundance across fractions, controlling for its effect on the dispersion. This is a perfectly reasonable test.

In your second chunk of code, you are also testing for differential abundance across fractions, controlling for its effect on the dispersion, but you are now also controlling for the effect of month on both the dispersion and the abundance. This is also a perfectly reasonable test, but it is a different test than the first one. It's hard for me to tell you which you should use because it depends on which test you want to perform. Do you want to control for month, or would you prefer to model without month? That's the question you need to answer to choose between the two models.
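Written out, the difference between the two tests might look like the sketch below. The original code chunks are not shown in this thread, so this is only a reading of the description above: the symbols (mu for the expected relative abundance, phi for the dispersion, beta for coefficients, and the variable names) are assumptions, and month is written as a single term for brevity even though a categorical month would contribute several coefficients.

```latex
% Model 1: fraction in both the abundance and the dispersion model
\operatorname{logit}(\mu_i)  = \beta_0 + \beta_1\,\mathrm{fraction}_i, \qquad
\operatorname{logit}(\phi_i) = \beta_0^{*} + \beta_1^{*}\,\mathrm{fraction}_i

% Model 2: fraction and month in both the abundance and the dispersion model
\operatorname{logit}(\mu_i)  = \beta_0 + \beta_1\,\mathrm{fraction}_i + \beta_2\,\mathrm{month}_i, \qquad
\operatorname{logit}(\phi_i) = \beta_0^{*} + \beta_1^{*}\,\mathrm{fraction}_i + \beta_2^{*}\,\mathrm{month}_i

% In both cases the differential abundance test is H_0 : \beta_1 = 0,
% but in Model 2, \beta_1 is the fraction effect at a fixed month.
```

Both tests ask whether the fraction coefficient in the abundance model is zero, but that coefficient does not mean the same thing in the two models, which is why they are different tests.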

In terms of interpretation, you can interpret the regression parameters similarly to those in logistic regression, because of the logit link. I agree with you that log(p/(1-p)) isn't an intuitive concept; it is an unfortunate but necessary consequence of how we model variables on the [0,1] scale (as in logistic regression). This is called the log-odds scale, and because logistic regression is so common, there are many examples online that try to explain it intuitively. I'll point you to one I found by quickly Googling, but I'd encourage you to search for more interpretations of the log-odds if you want more detail: https://stats.idre.ucla.edu/stata/faq/how-do-i-interpret-odds-ratios-in-logistic-regression/
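To make the log-odds scale concrete, here is a minimal sketch in plain Python (corncob itself is an R package, and nothing below calls it). The baseline abundance and the coefficient are made-up numbers, not output from any fitted model. The sketch shows that a coefficient b on the logit scale multiplies the odds p/(1-p) by exp(b), and that for a rare taxon this is close to multiplying the relative abundance itself.

```python
import math


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1 / (1 + math.exp(-x))


# Hypothetical numbers, for illustration only (not from any fitted model).
baseline_abundance = 0.01  # expected relative abundance in the reference group
coefficient = 0.7          # assumed coefficient on the logit scale

# Adding the coefficient on the logit scale, then mapping back to a proportion.
shifted_abundance = inv_logit(logit(baseline_abundance) + coefficient)

# exp(coefficient) is the multiplicative change in the odds p / (1 - p).
odds_ratio = math.exp(coefficient)

print(f"odds ratio: {odds_ratio:.3f}")
print(f"relative abundance: {baseline_abundance:.4f} -> {shifted_abundance:.4f}")
```

With these numbers the odds roughly double, and because the taxon is rare the relative abundance roughly doubles too. For abundant taxa the two ratios drift apart, so the odds-ratio reading is the safe one.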

adriaaula commented on August 27, 2024

When I fit both of the models I specified, the first gives 25 significant taxa and the second gives 45. Only 13 are shared. Checking some of them, the taxa from the first model are consistent with what I would expect, but for the second model I cannot properly understand why some results appear as significant.

I guess this is more a matter of general understanding of regression properties, but I posted it here because I still have many doubts, even with the examples in the vignettes. Maybe you could point me to some references on how to choose the more appropriate model?

I also found a link explaining the log-odds with R Markdown:
https://rstudio-pubs-static.s3.amazonaws.com/182726_aef0a3092d4240f3830c2a7a9546916a.html

bryandmartin commented on August 27, 2024

Hi @adriaaula,

For a single taxon, you could run a test such as the likelihood ratio test implemented in corncob to test whether a covariate such as month explains a significant proportion of the variability in your response. However, when you are testing all the taxa at once, as in differentialTest, I think the question becomes more theoretical and less statistical. In my opinion, the choice between the models depends on whether or not you think you should be controlling for month. One thing I can say is that you definitely do not want to choose the model based on which taxa are identified as significant. That would be highly statistically improper.

I'll ping @adw96 to see if she has any alternative suggestions.

Please let me know if you have any other questions!

adriaaula commented on August 27, 2024

By no means did I mean to choose covariates based on the number of significant taxa. I was expecting a nested result, in which all the taxa identified without controlling for month would also be identified when controlling for it.

Thank you for your time, it really helped me understand the log-odds link!
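On the expectation of a nested result: the two sets of significant taxa need not be nested, because once month is in the model the fraction coefficient describes the difference between fractions at a fixed month rather than overall. The sketch below is a toy illustration in plain Python with made-up read counts for one hypothetical taxon. It is not corncob's beta-binomial model and pools reads rather than modelling samples; it only compares log-odds within each month against log-odds that ignore month.

```python
import math


def log_odds(taxon_reads, total_reads):
    """Log-odds of the observed proportion taxon_reads / total_reads."""
    p = taxon_reads / total_reads
    return math.log(p / (1 - p))


# Made-up (taxon reads, total reads) for one hypothetical taxon.
# Fraction A is sampled mostly in month2, fraction B mostly in month1.
counts = {
    "month1": {"A": (20, 200), "B": (200, 1000)},
    "month2": {"A": (400, 1000), "B": (120, 200)},
}

# Fraction effect (B vs A) on the log-odds scale, within each month.
within_month = {
    month: log_odds(*by_fraction["B"]) - log_odds(*by_fraction["A"])
    for month, by_fraction in counts.items()
}


def pooled(fraction):
    """Sum taxon reads and total reads over months for one fraction."""
    taxon = sum(counts[m][fraction][0] for m in counts)
    total = sum(counts[m][fraction][1] for m in counts)
    return taxon, total


# Fraction effect (B vs A) when month is ignored.
ignoring_month = log_odds(*pooled("B")) - log_odds(*pooled("A"))

print("within month:", {m: round(v, 3) for m, v in within_month.items()})
print("ignoring month:", round(ignoring_month, 3))
```

In this toy table the fraction effect is positive within both months (about 0.81 on the log-odds scale) but negative when month is ignored, because the months are unevenly represented in the two fractions. A taxon can therefore be flagged by one model and not by the other, in either direction.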

bryandmartin commented on August 27, 2024

Great, closing this issue. Feel free to open another later!
