kevinblighe / enhancedvolcano

Publication-ready volcano plots with enhanced colouring and labeling

R 98.56% TeX 1.44%

enhancedvolcano's Issues

Removed 351 rows containing missing values (geom_point)

Hi Kevin,

Your package works well for most of what I am trying to do. But I keep getting the following error when I try to make the volcano plot:
"Warning message:
Removed 351 rows containing missing values (geom_point)"
It doesn't seem to be a problem with the axes limits as I changed both of them to accommodate the max and min of both variables. Any idea what might be causing this problem? Thanks!

Rownames despite below threshold

Hi,
Thank you for this awesome package. I managed to generate a volcano plot; however, most of my genes of interest do not exceed a 2-fold change. Is there any way to still force the plot to show their names? Here is my code and my plot:

EnhancedVolcano(topT,
                lab = rownames(topT),
                x = 'log2FoldChange',
                y = 'pvalue',
                xlim = c(-8, 8),
                title = 'VD versus CSD',
                pCutoff = 10e-3,
                FCcutoff = 1.5,
                transcriptPointSize = 2.5,
                transcriptLabSize = 3.0,
                legend=c('NS','Log2 fold-change','P value',
                         'P value & Log2 fold-change'),
                legendPosition = 'right',
                legendLabSize = 16,
                legendIconSize = 5.0,
                colAlpha = 1)

p-value or adjusted-p-value

Hello,

Thank you for the fantastic package.

I was wondering whether the function uses p-values or adjusted p-values? Is adjusted p-value threshold determined by pLabellingCutoff argument?

Thank you again,
Homa

unused argument (colCustom = keyvals)

Hi there!

I'm using EnhancedVolcano (a really interesting package) and I want to change the colors like this:
EnhancedVolcano(c23_w.tt,lab = as.character(c23_w.tt$gtf_tags.gene_name), x = "logFC", y = "P.Value", ylim = c(0,6), selectLab = c("egln2","gnb3a") ,
title = "Down- and up-regulated genes in Clo2&Clo3 VS WT",
pCutoff = 0.05,
FCcutoff = 1,
colCustom = keyvals)
I'm obtaining the error:
Error in EnhancedVolcano(c23_w.tt, lab = as.character(c23_w.tt$gtf_tags.gene_name), :
unused argument (colCustom = keyvals)

I installed the devel version. This is the output of sessionInfo():
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.8.3 gplots_3.0.1.1 EnhancedVolcano_1.3.1 ggrepel_0.8.1 ggplot2_3.2.0
[6] edgeR_3.24.3 limma_3.38.3 DESeq2_1.22.2 SummarizedExperiment_1.12.0 DelayedArray_0.8.0
[11] BiocParallel_1.16.6 matrixStats_0.54.0 Biobase_2.42.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2
[16] IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0

loaded via a namespace (and not attached):
[1] bit64_0.9-7 splines_3.5.2 gtools_3.8.1 Formula_1.2-3 assertthat_0.2.1 latticeExtra_0.6-28 blob_1.2.0
[8] GenomeInfoDbData_1.2.0 pillar_1.4.2 RSQLite_2.1.1 backports_1.1.4 lattice_0.20-38 glue_1.3.1 digest_0.6.20
[15] RColorBrewer_1.1-2 XVector_0.22.0 checkmate_1.9.4 colorspace_1.4-1 htmltools_0.3.6 Matrix_1.2-17 XML_3.98-1.20
[22] pkgconfig_2.0.2 genefilter_1.64.0 zlibbioc_1.28.0 purrr_0.3.2 xtable_1.8-4 scales_1.0.0 gdata_2.18.0
[29] htmlTable_1.13.1 tibble_2.1.3 annotate_1.60.1 withr_2.1.2 nnet_7.3-12 lazyeval_0.2.2 survival_2.44-1.1
[36] magrittr_1.5 crayon_1.3.4 memoise_1.1.0 foreign_0.8-71 tools_3.5.2 data.table_1.12.2 stringr_1.4.0
[43] locfit_1.5-9.1 munsell_0.5.0 cluster_2.1.0 AnnotationDbi_1.44.0 compiler_3.5.2 caTools_1.17.1.2 rlang_0.4.0
[50] grid_3.5.2 RCurl_1.95-4.12 rstudioapi_0.10 htmlwidgets_1.3 labeling_0.3 bitops_1.0-6 base64enc_0.1-3
[57] gtable_0.3.0 DBI_1.0.0 R6_2.4.0 gridExtra_2.3 knitr_1.23 bit_1.1-14 zeallot_0.1.0
[64] Hmisc_4.2-0 KernSmooth_2.23-15 stringi_1.4.3 Rcpp_1.0.1 vctrs_0.2.0 geneplotter_1.60.0 rpart_4.1-15
[71] acepack_1.4.1 tidyselect_0.2.5 xfun_0.8

Could you tell me what I'm doing wrong?
Many thanks!

Add support for tibbles

Suggest that you replace all references to [,x], [,y] and [,label] with [[x]], etc.
Tibbles don't drop by default, so they return single column tibbles with [,x] construct.
Using [[x]] will support tidyr programming as well as conventional data.frames.
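
The difference described above can be reproduced without any extra packages. This is a minimal sketch in base R: `drop = FALSE` is used to mimic a tibble, which keeps a single-column selection as a table rather than dropping it to a vector. The table `toptable` and its column names are hypothetical.

```r
# Hypothetical results table; column names are illustrative only.
toptable <- data.frame(logFC = c(1.2, -0.8, 2.5),
                       P.Value = c(0.01, 0.20, 0.001))
x <- "logFC"

# On a plain data.frame, [, x] drops to a numeric vector.
dropped <- toptable[, x]

# A tibble never drops; drop = FALSE mimics that behaviour in base R.
kept <- toptable[, x, drop = FALSE]

# [[x]] returns the underlying vector in both cases.
extracted <- toptable[[x]]

is.numeric(dropped)    # TRUE
is.numeric(kept)       # FALSE: still a one-column data.frame
is.numeric(extracted)  # TRUE
```

Any numeric check applied to `kept` would fail even though the column itself is numeric, which is why `[[x]]` is the safer accessor for both input types.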

Hiding custom color legend

legendVisible = FALSE does not hide the legend created for the custom key-value pairs

How to plot the selectLab labels that overlap with the double positives for logFC and P?

About Legend

Hi Kevin,

Thank you for this tool.

I am using the EnhancedVolcano package and noticed that adding "legend=c('A','B','C','D')", as indicated in item 4.5 of the vignette (https://www.bioconductor.org/packages/release/bioc/vignettes/EnhancedVolcano/inst/doc/EnhancedVolcano.html#draw-labels-in-boxes), does not produce the expected legend in the plot.

However, adding "legendLabels=c('A','B','C','D')" creates the plot with the expected information.

Thanks for the attention,

Joao Gouveia

pinpointing the gene of interest

Hi Kevin,

First of all thank you for this very useful script. I'm really new in terms of using R for analysis, so sorry in advance if I don't provide enough information. Basically I wanted to make a volcanoplot, only highlighting my gene of interest:

EnhancedVolcano(topTab, lab=rownames(topTab), x="logFC", y="adj.P.Val", selectLab = c("Hoxa9", "Meis1"), pCutoff = 0.05, FCcutoff = 2.0, gridlines.major=FALSE, gridlines.minor=FALSE)

This worked fine and kind of did what I wanted. But the issue I had is, I couldn't pinpoint the exact dot of my genes (attached)

I tried using the DrawConnectors=TRUE,
EnhancedVolcano(topTab, lab=rownames(topTab), x="logFC", y="adj.P.Val", selectLab = c("Hoxa9", "Meis1"), pCutoff = 0.05, FCcutoff = 2.0, DrawConnectors = TRUE, widthConnectors = 5.0, colConnectors = "grey", gridlines.major=FALSE, gridlines.minor=FALSE)

but no line connector appeared on the plot. Is there any other way around this?

Thanks!

Deprecated pointSize, labSize args in vignette?

Hi Kevin,

Thank you for this R package!

I am running the vignette and I got the following error message:

Error in EnhancedVolcano(res2, lab = rownames(res2), x = "log2FoldChange",  : 
  unused arguments (pointSize = 4, labSize = 3)

Once I remove the arguments pointSize, labSize from the EnhancedVolcano::EnhancedVolcano() function everything works ok.

https://github.com/kevinblighe/EnhancedVolcano/blob/master/vignettes/EnhancedVolcano.Rmd#L152-L153

EDIT:

Also occurs for the following args:

boxedLabels
labCol
labFace

Error in EnhancedVolcano(res1, lab = rownames(res1), x = "logFC", y = "FDR", : unused argument (shape = 1)

I'm trying to create a volcano plot, but I'm getting an error with the shape parameter:

> library(EnhancedVolcano)
> setwd("/home/me/new-arriv_wint_pre-mig_trinity_out_dir/GRAPH/VOLCANO_PLOT_teste/")
> res1=read.csv(file='new-arriv_wint_pre-mig.gene.TMM.EXPR.matrix.newlly_arrived_vs_pre-migration.edgeR.DE_results.txt', sep = ",", header=TRUE)
> EnhancedVolcano(res1,
+ lab = rownames(res1),
+     x = 'logFC',
+     y = 'FDR',
+     selectLab = c('none'),
+     ylab = bquote(~-Log[10]~~italic(FDR)),
+     xlab = bquote(~Log[2]~ 'fold change'),
+     xlim = c(-10, 10),
+     title = 'new_arriv vs pre-mig',
+     pCutoff = 0.01,
+     FCcutoff = 1,
+     transcriptPointSize = 3.0,
+     transcriptLabSize = 3.0,
+     shape = 1,
+     colAlpha = 1,
+     legend=c('NS','Log2 FC','FDR','FDR & Log2 FC'),
+     legendPosition = 'bottom',
+     legendLabSize = 16,
+     legendIconSize = 5.0,
+     gridlines.major = FALSE,
+     gridlines.minor = FALSE)
Error in EnhancedVolcano(res1, lab = rownames(res1), x = "logFC", y = "FDR",  : 
  unused argument (shape = 1)

Attached my data file
new-arriv_wint_pre-mig.gene.TMM.EXPR.matrix.newlly_arrived_vs_pre-migration.edgeR.DE_results.txt

Cannot install "kevinblighe/EnhancedVolcano"

Dear Kevin,
I have just discovered this package and wanted to have a look at it. However, I immediately encountered a hurdle that I cannot figure out. When I run devtools::install_github("kevinblighe/EnhancedVolcano"), I get the following output:
Downloading GitHub repo kevinblighe/EnhancedVolcano@master
√ checking for file 'C:\Users\jada\AppData\Local\Temp\RtmpktRGsc\remotes10b0795d3486\kevinblighe-EnhancedVolcano-b7aab27/DESCRIPTION' ...

preparing 'EnhancedVolcano':
√ checking DESCRIPTION meta-information ...
checking for LF line-endings in source and make files and shell scripts
checking for empty or unneeded directories
building 'EnhancedVolcano_1.3.5.tar.gz'

installing source package 'EnhancedVolcano' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error: (converted from warning) package 'ggplot2' was built under R version 3.6.1
Execution halted
ERROR: lazy loading failed for package 'EnhancedVolcano'
removing 'C:/Program Files/R/R-3.6.0/library/EnhancedVolcano'
restoring previous 'C:/Program Files/R/R-3.6.0/library/EnhancedVolcano'
Error: Failed to install 'EnhancedVolcano' from GitHub:
(converted from warning) installation of package ‘C:/Users/jada/AppData/Local/Temp/RtmpktRGsc/file10b07e1c68c7/EnhancedVolcano_1.3.5.tar.gz’ had non-zero exit status

Hope you are able to advise me.
Thank you.
Jahn

Can natural log (not Log2FC) also be used equally well?

Hi Kevin,

I just came across your package, and am about to try it. Looks very exciting! Before that, I just wanted to double check the following. My DE list is actually from single-cell data, specifically from the Seurat package. Therefore, they are not log2-based FCs, but rather based on the natural log. Does this matter in any way? As far as I can see, it shouldn't, but I wanted to ask you, is the package more optimal for Log2FCs in any way? Hope I can still benefit from the package!

Also, are there any points/tips to keep in mind when using the package for single-cell data?

Many thanks!
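
If base-2 values are wanted, natural-log fold changes can be rescaled with the change-of-base rule. A minimal sketch in base R; the vector `ln_fc` is hypothetical, and whether the plot itself requires log2 input is not something this listing settles.

```r
# Hypothetical natural-log fold changes (e.g. exported from another tool).
ln_fc <- c(-1.5, 0, 0.6931472, 2)

# Change of base: log2(x) = ln(x) / ln(2).
log2_fc <- ln_fc / log(2)

# A cutoff of 1 on the log2 scale (a 2-fold change) equals log(2)
# on the natural-log scale.
ln_cutoff <- 1 * log(2)
```

Either the values or the cutoff can be converted; what matters is that the fold changes and the threshold passed as FCcutoff are on the same scale.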

boxedlabels = TRUE doesn't work

Hi,

First of all: what a great package!

Second, while applying it to my data, I wanted to use boxedlabels = TRUE, but this throws an error Error in EnhancedVolcano(cells, lab = rownames(cells), : unused argument (boxedlabels = TRUE).

How can I fix that?

Sander

package ‘EnhancedVolcano’ is not available (for R version 3.5.1)

Dear Kevin,

First of all, thanks for developing new tools to improve the quality and interpretation of NGS data. I tried to install your package "EnhancedVolcano" first on R version 3.4.4 without success. I then updated to R 3.5.1 and I still keep getting the following error message, regardless of whether I try with "biocLite" or "BiocManager"

###################################

biocLite("EnhancedVolcano")
BioC_mirror: https://bioconductor.org
Using Bioconductor 3.6 (BiocInstaller 1.30.0), R 3.5.1 (2018-07-02).
Installing package(s) ‘EnhancedVolcano’
Warning message:
package ‘EnhancedVolcano’ is not available (for R version 3.5.1)
###################################
BiocManager::install("EnhancedVolcano")
Bioconductor version 3.7 (BiocManager 1.30.3), R 3.5.1 (2018-07-02)
Installing package(s) 'EnhancedVolcano'
Warning message:
package ‘EnhancedVolcano’ is not available (for R version 3.5.1)
###################################

Please let me know if you or some of the users have gone through this.

Best,

Oscar

y-axis

Is it possible to change the y-axis from -log10 to another scale, such as log2 or simply 0 to 1?
This is because in proteomics data, confidence (from 0 to 100%) is in some ways more reliable than the p-value.

Issue when trying to shade a key variable

Not sure if this is the correct location for my question.

Has anyone come across this error before? My key variable is not being shaded and I get this error message:

Computation failed in stat_density2d():
missing value where TRUE/FALSE needed.

New to R but have been able to make some nice plots with this excellent package!

Thank you.

FC cut off issue

How to add a break to the y-axis?

My y-axis values range from 0 to about 300. Is it possible to break the y-axis, maybe from 50 to 175, so that the region from 0 to 50 is enlarged?

importing data

Feature request: custom point size

Color and shape of points can be customized based on key-value pairs which is great, but unless I missed it it appears I can't customize point size. This would be useful for encoding another attribute such as intensity.

Shade (annotate) part of the plot

See relevant post at SO:

https://stackoverflow.com/questions/53272541/how-do-i-shade-plot-subregion-and-use-ggrepel-to-label-a-subset-of-data-points

Feature request: options to have thresholds values

Hello, Kevin!
Sorry for veeeeery looooong delay to answer :)

I mean it would be useful to see the values marked by red question marks. Most useful (for biologists) would be to see something like padj=0.01 (or padj=1.0 x 10^-2) on the ordinate axis instead of a simple "2".
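
The mapping between the plotted tick values and the underlying adjusted p-values is just the inverse of the -log10 transform. A minimal sketch in base R; the tick positions are hypothetical.

```r
# Tick positions on a -log10 axis.
ticks <- 0:4

# Each tick t corresponds to an adjusted p-value of 10^(-t).
padj_at_tick <- 10^(-ticks)

# Text labels such as "padj = 1e-02" for the tick at 2.
tick_labels <- paste0("padj = ",
                      formatC(padj_at_tick, format = "e", digits = 0))
```

Since the function returns a ggplot object (see the class output reported further down this page), labels like these could in principle be supplied through ggplot2's scale_y_continuous(breaks = ..., labels = ...); that route is an assumption here, not a documented feature of the package.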

Feature request: option to have thresholds values drawn.

It would be good to have an option/parameter to mark threshold value(s) explicitly on the axes, at least on the ordinate (Y) axis. Of course I can do it later in Inkscape/Illustrator/CorelDraw/Acrobat, but it would be convenient to have it drawn "automatically" in EnhancedVolcano.
Thanks in advance!

Add gradient color based on padj

Hi Kevin,

Having a lot of fun working with this tool. I'm trying to color the points based on a gradient based on padj values. I tried to use scale_color_gradient but I can't seem to work it into the variables.

Any suggestions on how to do this?
Thanks!
D

Question - How is the number of labels plotted decided?

As titled,

If the plot has, say, 1000 genes that meet the criteria to be labelled:

Which of the 1000 genes are labelled?
I know that some genes are not labelled to avoid overplotting.

I am wondering how it is decided which genes are plotted and which are not.

Also, I cannot yet find a way to label, let's say, the top 20 genes with the highest LFC. Is there a way to do that?

Thanks so much for reviewing and for the development!
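
One way to pick a fixed number of genes is to rank them outside the plotting call. A minimal sketch in base R, assuming a results table with gene names as row names; the table `res` and its columns are hypothetical.

```r
# Hypothetical results table with gene names as row names.
set.seed(1)
res <- data.frame(log2FoldChange = rnorm(100),
                  padj = runif(100),
                  row.names = paste0("gene", 1:100))

# Rank by absolute log2 fold change, largest first, and keep 20 names.
ranked <- order(abs(res$log2FoldChange), decreasing = TRUE)
top20 <- rownames(res)[head(ranked, 20)]
```

The resulting character vector has the same form as the vectors passed to selectLab elsewhere on this page, so `selectLab = top20` would be the natural place to try it.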

Feature request: adjust color and Legend according to freely defined cutoff lines

Hi, Kevin. Thank you for your effort in EnhancedVolcano. It’s very useful in my work dealing with quantitative proteomics data.
I see you have added the arguments 'hline' and 'vline' to display extra cutoff lines in addition to the original lines defined via 'pCutoff' and 'FCcutoff'. Is it possible to color each area segmented by the cutoff lines, regardless of whether 'pCutoff', 'FCcutoff', 'hline' or 'vline' defined the cutoff?
For example, if I use 'pCutoff' and 'FCcutoff' to set the cutoff lines, I want to color the significantly up-regulated genes (p<0.05 & foldchange > logRatio) 'red', color the significantly down-regulated genes (p<0.05 & foldchange < -logRatio) 'green', and leave the other points at the default ('black'). I cannot achieve this using the current version of EnhancedVolcano.
In another specific case, I want a plot that focuses on the up-regulated proteins, with two different thresholds, e.g. 2 > foldchange >= 1 and foldchange >= 2. If I use an 'hline' to define 'p<0.05' and two 'vline' values to define '2 > foldchange >= 1' and 'foldchange >= 2', what can I do to color 'p<0.05 & foldchange >= 2' and 'p<0.05 & 2 > foldchange >= 1' differently?
Moreover, the legend corresponding to each area and its color, as defined by the above cutoff lines, may need to be adjusted.
P.S. Renaming arguments like 'transcriptPointSize' and 'transcriptLabSize' to 'PointSize' and 'LabSize' would, I think, be more concise and helpful, because the plotted entities may be genes, proteins, or other subjects besides transcripts.
Thank you very much!
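
The up/down colouring described above can be expressed as a per-point colour vector. A minimal sketch in base R; the table `res`, the cutoffs, and the group names are hypothetical, and the named-vector layout is an assumption based on how `colCustom = keyvals` is used in other reports on this page.

```r
# Hypothetical table: log2 fold change and p-value per protein.
res <- data.frame(log2FC = c(2.4, -1.8, 0.3, 1.5, -0.2),
                  pvalue = c(0.001, 0.01, 0.5, 0.2, 0.04))
fc_cut <- 1
p_cut <- 0.05

# Start with every point black, then recolour the significant ones.
keyvals <- rep("black", nrow(res))
names(keyvals) <- rep("other", nrow(res))

up <- res$pvalue < p_cut & res$log2FC > fc_cut
down <- res$pvalue < p_cut & res$log2FC < -fc_cut

keyvals[up] <- "red"
names(keyvals)[up] <- "up"
keyvals[down] <- "green"
names(keyvals)[down] <- "down"
```

The two-threshold case follows the same pattern with one extra logical vector per band. Whether the installed version accepts such a vector via colCustom depends on the version, as the "unused argument" reports on this page show.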

Adding label to the cutoff vline

I had been trying to add a label of cutoff by using the command -
vline=c(-6, 6),label = c(-6,6),
but it says unused argument and nothing happens to the plot labels. Any help regarding this?
Thank you.

error message when using multiple shapes

When using my own data, I keep getting stuck when I want to use multiple shapes.

The command I use is this:

EnhancedVolcano(resMA,
    lab = rownames(resMA),
    x = 'log2FoldChange',
    y = 'pvalue',
    selectLab = c('ENSG00000180537','ENSG00000153885'),
    xlim = c(-6,7),
    xlab = bquote(~Log[2]~ 'fold change'),
    pCutoff = 10e-14,
    FCcutoff = 2.0,
    pointSize = 4.0,
    labSize = 5.0,
    shape = c(1,4,23,25), 
    colAlpha = 1,
    legendPosition = 'right',
    legendLabSize = 14,
    legendIconSize = 5.0)

and this is the error I get:

Error in `[[<-.data.frame`(`*tmp*`, i, value = c(NS = 1, FC = 4, P = 23, : replacement has 4 rows, data has 3
17. stop(sprintf(ngettext(N, "replacement has %d row, data has %d", "replacement has %d rows, data has %d"), N, nrows), domain = NA)
16. `[[<-.data.frame`(`*tmp*`, i, value = c(NS = 1, FC = 4, P = 23, FC_P = 25))
15. `[[<-`(`*tmp*`, i, value = c(NS = 1, FC = 4, P = 23, FC_P = 25 ))
14. modify_list(data, guide$override.aes)
13. FUN(X[[i]], ...)
12. lapply(layers, function(layer) { matched <- matched_aes(layer, guide, default_mapping) if (length(matched) > 0) { if (!is.null(names(layer$show.legend))) { ...
11. guide_geom.legend(X[[i]], ...)
10. FUN(X[[i]], ...)
9. lapply(gdefs, guide_geom, layers, default_mapping)
8. vapply(x, is.null, logical(1))
7. compact(lapply(gdefs, guide_geom, layers, default_mapping))
6. guides_geom(gdefs, layers, default_mapping)
5. build_guides(plot$scales, plot$layers, plot$mapping, position, theme, plot$guides, plot$labels)
4. ggplot_gtable.ggplot_built(data)
3. ggplot_gtable(data)
2. print.ggplot(x)
1. (function (x, ...) UseMethod("print"))(x)

When I remove the line shape = c(1,4,23,25), the command works.

What is the meaning of the error?

thanks

Assa

Using another continuous value (not logFC or p) to scale point size/gradient colour points

Hello,

Really loving this package and it is helping me to make some great volcano plots!

I have a question, I wonder if this is possible..

I am producing a volcano plot where I am trying to highlight which DEGs are present in an OpenTargets disease association list. Right now I have just made it so that pointSize = 3 if the DEG is present in the list of disease-associated DEGs and 1 if not. This looks great but I want to add another layer of complexity.

As I have quite a few disease-associated DEGs, I would like to somehow scale the points by the gene's association score (0-1) which is present in another dataframe. Could either do this with size, where higher association score = larger point (with all disease-associated DEGs having a larger point than non-disease associated), or what may possibly look better is using a colour gradient - so that all disease-associated DEGs are larger than non-disease associated, and they also scale from red for low association score to black for high association score for example. Then I would have another legend labelled e.g. "disease association score" which goes from 0-1 | red-black, so one can see overall (a) which DEGs are disease-associated (b) which ones have a stronger association. Does this make sense?

I saw that there is the option colGradient but looks like it is only for p-value, and the custom colours example is for discrete sets of genes rather than a continuous value. I can try to hack the custom colours functionality to get it to do what I need but was wondering if there was already a way to do this that I'm missing.

Thanks! I hope my question made sense :D

EDIT: so far I have created a "keyvals" vector with the colours I want (gradient red - black for disease-associated DEGs based on the association score, blue for p-value, green for fc and grey for ns). Now all I need is a way to have a legend such that grey, green and blue are labelled as such as in the default legend, and a separate gradient bar for disease association score. I will update if I manage it with the required code and minimal reproducible example (actual data is confidential).

Missing arguments

Hi Kevin,

Thanks a lot for making this great package!
I am not so sure, but some of the arguments in the EnhancedVolcano function are missing for me: for example shape, hline, boxedlabels.
Am I wrong?

Control over axis is required.

First of all, thanks for a useful script with VERY good documentation and ready-to-use examples! Unfortunately, modules like EnhancedVolcano are rather the exception (in terms of documentation and examples) in the R repositories and (especially) in Bioconductor.
It would be good to have more control over the axes (tick marks, frequency of main ticks, etc.). I tried passing xaxt="n" to EnhancedVolcano and then using the standard R axis() function, but to no avail.

Title

Hi Kevin,
Thank you very much for this package. Just a quick question: is there any way to remove the title? By default I get "Volcano plot / Bioconductor ....".
Thanks
Hossein

Running EnhancedVolcano in a loop renders blank output

Hi Kevin,

I just started using your tool, lovely piece of work. I'm stumbling on one issue though, and I'm not sure if it's my setup or something else:

I run a piece of code in RStudio which generates a DESeq output object for multiple comparisons in a dataset. For each result object, I return various plots and tables, one of which is an EnhancedVolcano pdf. Every time, I get an empty pdf rendered.
If I plot the pdf for any one result object individually, it plots just fine. All other plots and tables output fine in the loop, just EnhancedVolcano is empty.

The loop looks like this:

for (t in unique(All_groups)){
  seen <- append(seen, t)
  for (t2 in All_groups){
    if(t2 == t) next
    if(t2 %in% seen) next 
    testGroup <- t
    ctrlGroup <- t2 #setdiff(All_groups,t)

...... run DiffExpr for test v ctrl .......

# 1. Dispersion Estimate
message("Plotting Dispersion")
pdf(file=paste(outDir,testGroup,"_v_",ctrlGroup,".dispersionEstimate.pdf", sep=""),width=10, height=8)
plotDispEsts(DE_dds)
dev.off()
# 2. MAplot of 2-way comparison
message("Plotting MA")
pdf(file=paste(outDir,testGroup,"_v_",ctrlGroup,".MA_plot.pdf", sep=""),width=10, height=8)
DESeq2::plotMA(result_out,ylim=c(-5,5), alpha=0.05, main=paste(testGroup, " v ", ctrlGroup))
dev.off()
# 2b. 2-way comparison using EnhancedVolcanoPlot 
message("Plotting Volcano")
pdf(file=paste(outDir,testGroup,"_v_",ctrlGroup,".Volcano_plot.pdf", sep=""),
    width=10,height=8)
EnhancedVolcano(result_out, x="log2FoldChange", y="padj",
                lab=rownames(result_out),
                FCcutoff = 1.5, pCutoff = 1e-4,
                #xlim=c(-5,5), ylim=c(0,-log10(10e-40)),
                colAlpha = 0.8,
                legendPosition = "right",
                legend=c("NS","Log2 FC","Adjusted p-value",
                         "Adj p-value & Log2 FC"),
                title = paste(testGroup, " v ", ctrlGroup),
                gridlines.major = FALSE,
                gridlines.minor = FALSE)
dev.off()
.... continue with loop .....

Any ideas by any chance?
Sean
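
A likely explanation is R's general rule that values are not auto-printed inside loops, and ggplot objects are only drawn when printed. The rule itself can be seen in base R; this is a minimal sketch of the mechanism, not of the pipeline above.

```r
# Inside a for loop, the value of an expression is not auto-printed.
silent <- capture.output(for (i in 1:2) i)

# An explicit print() call produces output on every iteration.
shown <- capture.output(for (i in 1:2) print(i))

length(silent)  # 0
length(shown)   # 2
```

Another report on this page shows that the function returns an object of class "gg" "ggplot", so wrapping the call as print(EnhancedVolcano(...)) between pdf() and dev.off() is the usual thing to try. The base-graphics calls in the loop (plotDispEsts, plotMA) draw as a side effect, which would explain why only the volcano pdf is empty.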

Rowname questions

Hi:

The row names in our table run from 0 to 100. How can I use the values of a column (the transcript label) as the row names instead?

Gene label plotting

Hi Kevin,
thanks for the amazing package!

I recently moved to macOS and updated the package to 1.4.0, but I'm facing some issues when plotting gene labels.
The dot is correctly recognised on the plot, but instead of being labeled with the gene name it is labeled with a 4-digit number. Interestingly, the numbers seem to be specific to each gene and never change (even after an R session restart), but I can't find any reference to those numbers in my dataframe.

EnhancedVolcano(toptable = dataframe, lab = dataframe$gene_name, x = 'log2FoldChange', y = 'padj', selectLab = c("CLEC6A", "IL23A", "SLPI", "WNT9A", "MT1H", "MT1M") )

Interestingly, if I skip the selectLab option, all the gene labels are plotted with the correct names.

EnhancedVolcano(toptable = dataframe, lab = dataframe$gene_name, x = 'log2FoldChange', y = 'padj' )

Any clue of which might be the problem?

Thanks a lot!
Marco

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

attached base packages:
parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
pander_0.6.3 RColorBrewer_1.1-2 EnhancedVolcano_1.4.0 ggrepel_0.8.1 ggplot2_3.2.1 pheatmap_1.0.12 edgeR_3.28.0 DESeq2_1.26.0 SummarizedExperiment_1.16.0 DelayedArray_0.12.0 BiocParallel_1.20.0 matrixStats_0.55.0 Biobase_2.46.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.0 IRanges_2.20.0 S4Vectors_0.24.0 BiocGenerics_0.32.0 Vennerable_3.1.0.9000 limma_3.42.0

Legend Labels not changing

Hi Kevin,

Thank you for this great tool.

I am facing an issue while trying to change the legend names. I receive the error below when I try to change them. Here is my code.

EnhancedVolcano(result, lab = result$Gene, x = "log2Ratio", y = "pvalue", pCutoff = 0.05, FCcutoff = 1.5, labSize = 2.5, legend = c("A","B","C","D"), xlab = bquote(~Log[2]~ 'Ratio'), shape = c(6, 6, 19, 16), title = "Volcano Plot", subtitle = "Changes", caption = "Log2Ratio cutoff - 1.5; p-value cutoff - 0.05", legendPosition = "bottom", legendLabSize = 14, col = c("grey30", "forestgreen", "royalblue", "red2"), colAlpha = 0.9, drawConnectors = TRUE, hline = c(10e-8), widthConnectors = 0.5)

Error is as below.

Error in EnhancedVolcano(result, lab = result$Gene, x = "log2Ratio", y = "pvalue", :
argument 8 matches multiple formal arguments

I am also not able to see the legends changed in your tutorial vignettes.

Thanks,
Athul

incorrect p-val cutoff line render

Hi EnhancedVolcano developers

I noticed that by default the EnhancedVolcano plot renders the p-value cutoff dotted line below the indicated cutoff value. I am just wondering if this behavior is by design or a bug?

Thanks,
Amin

No Labels

Hi Kevin
I would like to know if it is possible to completely remove the transcripts label.
I tried setting transcriptLabSize=0 but I still have labels (even if very small).
Thanks
Lorenzo

[bug?]class double problem

Hello, thanks for such a good tool.
My problem is that I get an error when I pass a tibble with columns of class double; see the code:

> diff_expr_mt <- as.matrix(diff_expr)
> head(diff_expr, n=2L)
# A tibble: 2 x 7
  ENTREZID logFC AveExpr     t       P.Value adj.P.Val     B
     <dbl> <dbl>   <dbl> <dbl>         <dbl>     <dbl> <dbl>
1     1668  13.0    11.6  42.8 0.00000000297 0.0000475  8.74
2     1667  11.7    12.4  33.6 0.0000000142  0.000105   8.27
> head(diff_expr_mt, n=2L)
     ENTREZID    logFC  AveExpr        t      P.Value    adj.P.Val        B
[1,]     1668 12.95978 11.59565 42.78285 2.965640e-09 0.0000475481 8.735292
[2,]     1667 11.65022 12.37980 33.63170 1.421914e-08 0.0001046992 8.267310
> p <- EnhancedVolcano(diff_expr, x="logFC", y="P.Value", lab="ENTREZID", selectLab=NULL)
Error in EnhancedVolcano(diff_expr, x = "logFC", y = "P.Value", lab = "ENTREZID",  : 
  logFC is not numeric!
> p <- EnhancedVolcano(diff_expr_mt, x="logFC", y="P.Value", lab="ENTREZID", selectLab=NULL)

However, double should be a numeric.

It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).
double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".
The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.

I would also like to know how to hide the point labels. I use selectLab=NULL but still get ENTREZID annotations on my plot. Thanks.
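
The quoted passage on double versus numeric can be checked directly. A minimal sketch in base R, using the two logFC values from the output above. The last two lines show one possible explanation for the "is not numeric" error, on the assumption that the check is applied to a one-column table rather than to the column vector itself.

```r
x <- c(12.95978, 11.65022)

typeof(x)       # "double": the storage type
mode(x)         # "numeric": the mode
class(x)        # "numeric": the implicit class
is.numeric(x)   # TRUE
is.numeric(1L)  # TRUE: integers also have mode "numeric"

# A one-column table is a list underneath, so the same test fails on it.
one_col <- data.frame(logFC = x)
is.numeric(one_col)             # FALSE
is.numeric(one_col[["logFC"]])  # TRUE
```

This is consistent with the matrix version of the same data passing the check, since indexing a matrix column yields a plain vector.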

[request]drawConnectors always on or pinpoint dots

Hi Kevin,
Thank you for the enhanced volcanoPlot wrapper.

I am fond of the drawConnectors and selectLab options, but in some plots the genes of interest are quite sparse and cannot be highlighted within the pool of dots.
Is there a possibility to:

1. make drawConnectors always on, even when there is no overlap?
2. create a circle/square around the point of interest?
3. use a specific shape for some of the genes?

I also tried shapeCustom but could not make it work (do you have an example?).

Regarding 3, I tried to do something like that but couldn't.
My code:

EnhancedVolcano(toptable = res.tables[[1]],
                lab = res.tables[[1]][, 'SYMBOL'],
                x = 'log2FoldChange',
                y = 'padj',
selectLab = c("Hdac6", "Mcm10", "Mcm8", "Krt14", "Rbm20", "Kras", "Cdkn2a", "Ccndf", "Chaf1b", "Pola2", "Mrm3", "Rad50", "Cdkn3", "Mettl21e", "Igfbp2", "hopx", "Sftpc", "Cdh1", "apln", "Pecam", "Cdh5","Cdkn3","Ccnf","Foxo3","Ccnd","YWHAE","SIRT1","Rbl1","Ccnb1","bcl6","Gadd45a","Cat","SOD2","Cdkn2a","Cdkn1d","PLK1","TNFSF10"),

                xlab = bquote(~Log[2]~ 'fold change'),
                ylab = bquote(~-Log[10]~adjusted~italic(P)),
                title = 'P21_SD versus P21_HFD, padj= 0.05, FC = 2',
                xlim=c(-6.5,5),
                pCutoff = 0.05,
                FCcutoff = 1.0,
                pointSize = 1.0,
                labSize = 5,
                col=c('grey60', 'green', 'grey70', 'red'),
                colAlpha = 4/5,
                drawConnectors = TRUE,
                widthConnectors = 0.5,
                lengthConnectors = unit(0.01, 'npc'),
                colConnectors = "black",
                )

Error: Aesthetics must be either length 1 or the same as the data (1267): colour

When I use colCustom, the following error occurs.

Error: Aesthetics must be either length 1 or the same as the data (1267): colour
In addition: Warning message:
One or more P values is 0. Converting to minimum possible value...

shape, colCustom and drawConnector arguments don't work

Hello,

When I'm trying to use "shape", "colCustom" or "drawConnectors" arguments, I get an error "unused arguments".

Can you help me with this issue ?

Btw very nice package !

Thank you

Emilie

Numbers of Y-axis padj value

Dear Kevin Blighe,

I would like to know if it's possible to label the Y-axis with the real padj values from DESeq2 data.
I used the following script, and the numbers on the Y-axis (pCcutoffLine = 0.05) do not correspond to the real padj values.
Thank you so much.

EnhancedVolcano(SED.HFD.PGE_PA.HFD.PGE,
lab = rownames(SED.HFD.PGE_PA.HFD.PGE),
x = 'log2FoldChange',
y = 'padj',
xlim = c(-3, 4),
ylim = c(0, 3),
selectLab = c('FALSE'),
title = 'SED.HFD.PGE_vs_PA.HFD.PGE',
xlab = bquote(~ Log[2] ~ 'fold-change'),
ylab = bquote(~ Adjusted ~p-value),
cutoffLineType = 'twodash',
cutoffLineWidth = 0.6,
pCutoff = 0.05,
FCcutoff = 1.0,
transcriptPointSize = 2.5,
transcriptLabSize = 3,
boxedlabels = FALSE,
col=c('grey0', 'green3', 'blue1', 'red2'),
colAlpha = 0.3,
shape = 20,
legend=c('Non-significant','Log2 fold-change','Adjust p-value','Adjust p-value & Log2 fold-change'),
legendPosition = 'right',
legendLabSize = 12,
legendIconSize = 5.0,
drawConnectors = FALSE,
widthConnectors = 1.0,
colConnectors = 'black',
gridlines.major = FALSE,
gridlines.minor = FALSE)

#--------------# Volcano Plot #---------------------------#

Omit arrow (drawConnectors)

Hi,
is it possible to omit the arrow and draw just a line between the label and the actual data point?
Best,
Axel

error with ggplotly

I am trying to use EnhancedVolcano in Shiny but I get this error:

p <- EnhancedVolcano(gene_diff, lab = gene_diff$symbol, x = 'logfc', y = 'pval')
class(p)
[1] "gg" "ggplot"
p %>% ggplotly(tooltip = "tooltip")
Error in unique.default(x) : unique() applies only to vectors
In addition: Warning messages:
1: In if (nchar(axisTitleText) > 0) { :
the condition has length > 1 and only the first element will be used
2: In if (nchar(axisTitleText) > 0) { :
the condition has length > 1 and only the first element will be used

I can see the plot in RStudio but not when trying to visualize it in Shiny.

Thank you.

advice on LFC cutoff

Hi Kevin
thanks for developing the EV.
I was wondering if you recommend a cutoff for LFC for values that are shrunk?
My values are shrunk, and in your vignette you mentioned that LFC |1.5| is too stringent.
That's true, because I can't see any labels appearing with LFC 1.5.
How do we determine the LFC cutoff for shrunk values, and what's your recommendation?
thanks

pointSize unused argument

Dear Kevin,

Firstly, many thanks for a great package. I've been experiencing an issue similar to that noted by others, namely that the pointSize, labSize and boxedLabels options result in an 'unused argument' error:

p1 <- EnhancedVolcano(p,
lab = rownames(p),
x = 'log2FC',
y = 'padj',
xlim = c(-2,2.5),
ylim = c(0,3),
pointSize = 4)

Error in EnhancedVolcano(lab = p$labs, x = "log2FC", y = "padj", xlim = c(-2, :
unused argument (pointSize = 4)

I noticed that previous errors were related to outdated versions of either the package or R, but I don't think that is the issue here (I've installed the latest version from Github, rather than via Bioconductor):

platform x86_64-redhat-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 6.0
year 2019
month 04
day 26
svn rev 76424
language R
version.string R version 3.6.0 (2019-04-26)

packageVersion("EnhancedVolcano")
[1] ‘1.5.4’

I'd be grateful for any suggestions on what might be tripping the options up.

Many thanks again for a great package and best wishes

zero pvalues after lfcshrink

Hi,
Very nice function!
I tried to use res (from DESeq2) to make a volcano plot; however, I found there is a difference between the lfcshrink and unshrunk results. See below:
volcano indeFilter_F_lfcshrink.pdf
volcano indeFilter_T_notlfcshrink.pdf

Any suggestion which one I should represent for a publication-ready plot?
Thanks!

dealing with zero pvalue

Hi @kevinblighe
I am using MAST to calculate scRNA-seq data.
For the top DEGs, there are extremely small p-values (0.000000e+00); see the figure below.
They will not be on the volcanoplot. Any suggestion how to plot them?
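
The reason such points vanish is that -log10(0) is infinite. A minimal sketch in base R of one workaround, replacing exact zeros with the smallest non-zero p-value before plotting; the vector `pvals` is hypothetical, and the choice of replacement value is a judgment call rather than a recommendation from the package.

```r
# Hypothetical p-values, two of which underflowed to exactly zero.
pvals <- c(0, 1e-300, 0.03, 0, 0.5)

# -log10(0) is Inf, which cannot be placed on a finite y-axis.
is.infinite(-log10(0))  # TRUE

# One option: replace zeros with the smallest non-zero p-value observed.
floor_p <- min(pvals[pvals > 0])
pvals_fixed <- ifelse(pvals == 0, floor_p, pvals)
```

A warning quoted in another report on this page ("One or more P values is 0. Converting to minimum possible value...") suggests that some versions of the package apply a similar substitution themselves.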
