For gene ranking (for GSEA, for example) and visualizations (on volcano plots, for exa

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Allow logFC shrinkage in DESeq about tidybulk HOT 4 OPEN

antass commented on September 21, 2024

Allow logFC shrinkage in DESeq

from tidybulk.

Comments (4)

antass commented on September 21, 2024 1

Given that, maybe a generic effect_size_estimator parameter could be helpful, with lfcShrink being one of the options. Could add more in the future.

Also, looks like lfcShrink can take contrast as parameter (but not for apeglm type), which could be passed down from the deseq2() function.

from tidybulk.

stemangiola commented on September 21, 2024

@mikelove any feedback/idea on this?

from tidybulk.

mikelove commented on September 21, 2024

I can help with this.

I would argue for practical reasons as well as semantic reasons, it may best be implemented as separate function from test_differential_abundance, as typically the effect size is estimated apart from the null hypothesis testing. If you would consider this, I'd suggest something like "compute/estimate" with "posterior LFC" or "effect size".

Motivation: Neither the basic (MLE) LFC nor t-statistic (equivalently p-value) are very good at ranking genes by effect size, which we described in Zhu et al 2019 (with the exception of after having filtered to an FDR bounded set). But for ranking the entire gene list, apart from resorting to null hypothesis testing, you need a good effect size estimate for plotting or some downstream applications.

We have two main implementations within lfcShrink: ashr (which can be used directly on a results table, e.g. pivot_transcripts then applying lfcShrink), and apeglm (which requires the counts again, as it uses the likelihood in performing effect size shrinkage). We could also run edgeR's predFC, which has a free parameter prior.count that one would want users to be able to access, via ellipses should be fine.

Practically, this would mean passing the tidybulk or SE object to a new function which could run various methods to output a posterior effect size and (optionally) posterior SD. Would recommend naming this something other than log2FoldChange to avoid collision.

One more detail is that in our implementation, ashr can operate on any previously calculated LFC which would be present in the columns of the object after test_differential_abundance, while apeglm requires specifying which coef should be shrunk (by number or name for example). The names of the coefficients can be obtained from the fitted DESeqDataSet with resultsNames. Most often users want to shrink the last coef, but this could be exposed with ellipses. So to mention again, ashr could be run even if the counts were dropped.

from tidybulk.

mikelove commented on September 21, 2024

Another good effect size ranking method that could be included is:

http://bioconductor.org/packages/release/bioc/html/topconfects.html
https://github.com/pfh/topconfects

from tidybulk.

Allow logFC shrinkage in DESeq about tidybulk HOT 4 OPEN

Comments (4)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent