Comments (16)
Heya just had a thought here, you have now flipped the deseq2 order of contrasts in tidybulk to match edgeR/limma is that right? i.e like in the workshop we use ~condition + type
instead of ~type + condition
for deseq2, same as the others. But is that documented anywhere in tidybulk so deseq2 users will know?
from tidybulk.
Ok :) So would something like this make sense?
"@param .formula A formula representing the desired linear model. If there is more than one factor, they should be in the order primary factor + additional factors
"
Also if deseq2 input formula is being altered by tidybulk to be similar to edgeR/limma-voom then would it make sense to have the contrasts input in a similar format e.g. also use dextrt-dexuntrt instead of dex,trt,untrt for deseq2 or would that be too tricky to implement?
from tidybulk.
Correct (I think, do not have proven evidence).
Just tested it & get error:
Error in eval(ej, envir = levelsenv) : object 'dextrt' not found
As I realised that this probably wouldn't work:
counts_scal_PCA %>%
test_differential_abundance(.formula = ~ 0 + cell + dex, .contrasts = c("dextrt - dexuntrt"))
as adding 0 only omits the intercept for the first factor with edgeR/limma
So better to omit the bit about contrasts & say
"
@param .formula A formula representing the desired linear model. If there is more than one factor, they should be in the order factor of interest + additional factors
"
from tidybulk.
Just pushed here #128
Also has fixes for a couple of typos I saw
from tidybulk.
Ok can you point me to the code you’re using & I’ll have a look?
from tidybulk.
Even better, would you mind doing an independent edgeR and deseq2 analysis (better if not with tidybulk) of Pasilla with this formula ~ condition + type
I verified if the conversion to SE was causing mess, but the SE object that goes into deseq2 seems ok, and yet the fold changes are totally unrelated.
(FYI the function for deseq2 is called get_differential_transcript_abundance_deseq2 in functions.R)
The key code I use is
tidybulk_to_SummarizedExperiment(!!.sample, !!.transcript, counts) %>%
# DESeq2
DESeq2::DESeqDataSet( design = .formula) %>%
DESeq2::DESeq() %>%
DESeq2::results()
Unless I am missing something big with deseq2
You see, tidybulk is doing what is expected
DESeq2
# see vignette for suggestions on generating
# count tables from RNA-Seq data
cnts <- matrix(rnbinom(n=1000, mu=100, size=1/0.5), ncol=10)
cond <- factor(rep(1:2, each=5))
# object construction
dds <- DESeqDataSetFromMatrix(cnts, DataFrame(cond), ~ cond)
# standard analysis
dds <- DESeq(dds)
results(dds)
log2 fold change (MLE): cond 2 vs 1
Wald test p-value: cond 2 vs 1
DataFrame with 100 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
1 95.9958 -0.664078 0.641516 -1.035170 0.3005895 0.759065
2 133.1349 -1.293486 0.709273 -1.823678 0.0682008 0.398194
3 100.5702 0.485982 0.585052 0.830665 0.4061631 0.834731
4 149.1573 -0.780331 0.550075 -1.418590 0.1560186 0.546065
5 140.8715 -0.850247 0.527924 -1.610549 0.1072781 0.477875
... ... ... ... ... ... ...
96 98.1838 0.633796 0.558242 1.135343 0.2562314 0.749565
97 90.5161 -1.236289 0.578663 -2.136457 0.0326422 0.314269
98 110.1423 -1.025949 0.495703 -2.069684 0.0384819 0.314269
99 58.6476 0.108693 0.661281 0.164368 0.8694416 0.960927
100 93.9168 0.155565 0.540197 0.287978 0.7733636 0.959362
t(cnts) %>% as.data.frame %>% as_tibble(rownames = "sample") %>% mutate(cond = cond) %>% gather(gene, count, -sample, -cond) %>% tidybulk(sample, gene, count) %>% test_differential_abundance(~ cond, method="deseq2", action="get")
tidybulk says: All methods use raw counts,
irrespective of if scale_abundance or adjust_abundance have been calculated,
therefore it is essential to add covariates such as batch effects (if applicable) in the formula.
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
tidybulk says: to access the raw results (fitted GLM) do `attr(..., "internals")$DESeq2`
# A tibble: 100 x 7
gene baseMean log2FoldChange lfcSE stat pvalue padj
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 V1 96.0 -0.664 0.642 -1.04 0.301 0.759
2 V2 133. -1.29 0.709 -1.82 0.0682 0.398
3 V3 101. 0.486 0.585 0.831 0.406 0.835
4 V4 149. -0.780 0.550 -1.42 0.156 0.546
5 V5 141. -0.850 0.528 -1.61 0.107 0.478
6 V6 59.4 0.201 0.669 0.301 0.764 0.959
7 V7 91.8 0.453 0.580 0.781 0.435 0.835
8 V8 97.4 -0.164 0.689 -0.238 0.812 0.961
9 V9 90.2 -1.10 0.618 -1.78 0.0748 0.399
10 V10 86.1 1.33 0.562 2.37 0.0177 0.314
from tidybulk.
Maybe the error is in the contrast for deseq2 maybe I am getting the statistics for the confounder and not the factor of interest
from tidybulk.
Solved, yes was the contract problem. :) Sorry to bother
from tidybulk.
Sorry only got to reply now yep it needs to be e.g. ~type + condition
for deseq2 as shown in the vignette, glad you figured it out 👍
from tidybulk.
But is that documented anywhere in tidybulk so deseq2 users will know?
no :)
from tidybulk.
primary factor + additional factors
What about
"
@param .formula A formula representing the desired linear model. If there is more than one factor (and contrasts are not specified), they should be in the order factor of interest + additional factors
"
contrasts input in a similar format
I though about that, but what about (?)
group_1 - 1/3 * group_2 - 1/3 * group_3 - 1/3 * hroup_4
This is a classic constrast for "one versus everything else"
The parsing of arbitrary contrasts from edgeR into deseq2 is not trivial, and I don't know deseq2 enough to realise if they could be mapped 1 to 1
from tidybulk.
"
@param .formula A formula representing the desired linear model. If there is more than one factor and contrasts are not specified, they should be in the order factor of interest + additional factors
"
sounds good to me
The parsing of arbitrary contrasts from edgeR into deseq2 is not trivial, and I don't know deseq2 enough to realise if they could be mapped 1 to 1
I don't know enough about deseq2 contrasts at this point either to be sure about that so better just keep it as you have it.
from tidybulk.
sounds good to me
cool, I edited with parenthesis. See what you think.
from tidybulk.
ok sounds good, just to check, the bit in parenthesis "(and contrasts are not specified)" means they can put the factors in any order if they specify contrasts and all the methods can handle that? e.g.
test_differential_abundance(
.formula = ~ 0 + cell + dex,
.contrasts = c("dextrt - dexuntrt")
gives the same result as below for all methods
test_differential_abundance(
.formula = ~ 0 + dex + cell,
.contrasts = c("dextrt - dexuntrt")
from tidybulk.
ok sounds good, just to check, the bit in parenthesis "(and contrasts are not specified)" means they can put the factors in any order if they specify contrasts and all the methods can handle that? e.g.
test_differential_abundance( .formula = ~ 0 + cell + dex, .contrasts = c("dextrt - dexuntrt")
gives the same result as below for all methods
test_differential_abundance( .formula = ~ 0 + dex + cell, .contrasts = c("dextrt - dexuntrt")
Correct (I think, do not have proven evidence).
The point is the sentence above makes sense when contrasts are not specified, if they are specified you directly define you factor of interest etc.. If not we assume the first covariate is your factor of interest.
from tidybulk.
Hello @mblue9, should I take action about this issue? I don't want to duplicate changes if you have already. Otherwise, I will do no problem!
from tidybulk.
Related Issues (20)
- Normalization method (scaling) HOT 9
- for SE save internals within the metadata rather than attributes.
- github action mac fail HOT 1
- DEG Analysis Between Single Group with Multiple Groups HOT 6
- "Proportion of variance explained" values on reduce_dimensions() output HOT 1
- github actions give preflight error for the whole tidy stack HOT 1
- ERROR; return code from pthread_create() is 22 HOT 3
- pathway analysis using tidybulk HOT 8
- Design and contrasts for tidybulk test differential abundance function HOT 9
- Remove batch effect in test_differential_abundance
- Warning in CHECK, mistery
- test
- Improve the documentation for `tidybulk` HOT 29
- turn pseudobulk from scRNAseq with logCPM value back to a seurat object with counts value? HOT 2
- `tidybulk` Remove hard dependency for `DESeq2` HOT 12
- in `test_gene_rank` check if any feature is NA, because pivot_feature will fail
- the attribute "internals" $ tt_columns abundance_scaled, carries the environment with it inflating the RDS file size
- Prepare for upcoming Seurat v5 release
- Add the method to simplify complete confounders is they are not factor of interest
- limma:MakeContrasts integration issues HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from tidybulk.