hms-dbmi / upsetr

An R implementation of the UpSet set visualization technique published by Lex, Gehlenborg, et al.

Home Page: https://cran.rstudio.com/web/packages/UpSetR

License: Other

R 100.00%

upset upsetr visualization gehlenborglab rstats ggplot2

upsetr's Issues

New intersection function name

It appears that R linked both "intersection" and "intersect" with the intersect function help page. What would be a suitable name?

set up continuous integration with Travis

The package should be built and tested every time we push code to the repo. Integration with Travis seems to be the most popular way to get CI for R packages:

http://www.r-bloggers.com/continuous-integration-for-r-packages/

use lightgray for non-query data points in built-in attribute plots if queries are present

That would make it easier to see those data points highlighted in the color of the query.

set "Set Size" in regular font not bold

The font should match the "Intersection Size" font.

setting size of mainbar.y.max

Hi, thanks so much for the great package!

It seems that mainbar.y.max cannot accept values smaller than the largest intersection size. When I set such a value I get the following error: "Error: Aesthetics must be either length 1 or the same as the data (59): fill In addition: Warning message: Removed 4 rows containing missing values (position_stack)."

Did you implement this with coord_cartesian() in ggplot2? If so, why might I be getting this error?

BW
Philipp

images for README

Make circles in matrix proportional to row height.

It looks like the circles in the set intersection matrix have a fixed pixel size. They should always be set to ~80% of the row height.

Here is an example where it looks awkward:

"Intersection" custom query should be implemented as a custom query function

Said custom query function should be included in the package.

Input Data Format

Dear UpSetR team,

Thank you for a great piece of work.

My input file has 7 columns with each column having genenames like this

head input.txt

Sample1,Sample2,Sample3,Sample4,Sample5,Sample6,Sample7
uc008vaw.1Rnu11,uc008vaw.1Rnu11,uc012aua.1AB339930,uc012ath.2Rn45s,uc008gfm.1AK197973,uc008gfm.1AK197973,uc012bec.1Mir122a
uc008ztz.1AK212710,uc008ztz.1AK212710,uc008ztz.1AK212710,uc008vaw.1Rnu11,uc008gfl.1AK181808,uc008gfl.1AK181808,uc008yaz.2Alb
uc009phb.3Apoa1,uc008gfl.1AK181808,uc033fml.1Mir8114,uc008ztz.1AK212710,uc012bhe.1Neat1,uc012bhe.1Neat1,uc008vxy.1Errfi1
uc008yaz.2Alb,uc008gfm.1AK197973,uc011zoa.1Mir320,uc008gfl.1AK181808,uc008gfk.1AK148054,uc008gfk.1AK148054,uc008eet.2Ttr
uc007puo.1Hist1h4c,uc011zxb.1Rnu12,uc008vaw.1Rnu11,uc012bhe.1Neat1,uc012bec.1Mir122a,uc008odl.1Pck1,uc009phb.3Apoa1
uc012aua.1AB339930,uc012aua.1AB339930,uc012bty.1Mir2861,uc008gfk.1AK148054,uc008yaz.2Alb,uc012bec.1Mir122a,uc008vxw.1Errfi1

But when I read this CSV file with read.csv function in R and issue the upset command, I get an error saying "Error in start_col:end_col : argument of length 0"

Should I transform my input file?

Kindly advise.

Thanks
G
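
One possible way to reshape a file like the one above, sketched in base R. As the other reports in this tracker show, upset() is given a data frame with one row per element and one 0/1 column per set, whereas this file stores the gene names themselves in each column. The tiny `input` data frame and its shortened gene names are made up for illustration.

```r
# Hypothetical miniature of the input: one column of gene names per sample
input <- data.frame(
  Sample1 = c("Rnu11", "Alb", "Apoa1"),
  Sample2 = c("Rnu11", "Neat1", ""),
  stringsAsFactors = FALSE
)

# One vector of unique, non-empty names per sample
sets <- lapply(input, function(x) unique(x[!is.na(x) & x != ""]))

# Every name that occurs in at least one sample
universe <- unique(unlist(sets))

# 0/1 membership matrix: one row per name, one column per sample
membership <- as.data.frame(
  vapply(sets, function(s) as.integer(universe %in% s),
         integer(length(universe)))
)
rownames(membership) <- universe

print(membership)
```

The resulting `membership` data frame has the 0/1 layout. To my knowledge, later UpSetR releases also provide a `fromList()` helper that performs a similar conversion from a named list of vectors.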

bug in specific_intersections when keep == sets

specific_intersections <- function(data, first.col, last.col, intersections, order_mat,
                                   aggregate, decrease, cut, mbar_color){
  sets <- names(data[c(first.col:last.col)])
  keep <- unique(unlist(intersections))
  remove <- sets[which(!sets %in% keep)]
  remove <- which(names(data) %in% remove)
  data <- data[-remove]

When keep is equivalent to sets, remove is set to integer(0) and data gets wiped out:

Browse[2]> remove
integer(0)
Browse[2]> data[-remove]
data frame with 0 columns and 14632 rows
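
A minimal sketch of the pitfall and one possible guard, using a made-up three-column data frame rather than the package's internals. Negating an empty index vector still yields an empty index, so `data[-remove]` selects no columns at all. Skipping the subset when `remove` is empty avoids this. This is an illustration only, not necessarily the fix that was applied in the package.

```r
# Hypothetical stand-in for the data passed to specific_intersections()
data <- data.frame(id = 1:3, A = c(1, 0, 1), B = c(0, 1, 1))
sets <- c("A", "B")
keep <- c("A", "B")          # every set is kept, so nothing should be removed

remove <- which(names(data) %in% sets[!sets %in% keep])  # integer(0)

# Pitfall: -integer(0) is still integer(0), which selects zero columns
wiped <- data[-remove]

# Guard: only drop columns when there is something to drop
safe <- if (length(remove) > 0) data[-remove] else data

print(ncol(wiped))
print(names(safe))
```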

replace journal URLs in documentation

http://www.nature.com/nmeth/journal/v11/n8/full/nmeth.3033.html (only accessible for subscribers of Nature Methods and causing a 401 HTTP response if users are not logged in) should be http://www.nature.com/nmeth/journal/v11/n8/abs/nmeth.3033.html (open to everyone).

grid library not loaded automatically

Using df of values for upset:

require( UpSetR )
upset( df )

returns the following error:

Error in theme(panel.background = element_rect(fill = "white"), plot.margin = unit(c(0.5, :
could not find function "unit"

Fixed by manually loading grid:

require( grid )
upset( df )
** plot produced **

Session info:

R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] reshape2_1.4.1 ggplot2_1.0.1 xlsx_0.5.7 xlsxjars_0.6.1 rJava_0.9-7
[6] UpSetR_1.2.0 stringr_1.0.0 magrittr_1.5 dplyr_0.4.1 plyr_1.8.3

loaded via a namespace (and not attached):
[1] assertthat_0.1 colorspace_1.2-6 DBI_0.3.1 digest_0.6.8
[5] gridExtra_2.0.0 gtable_0.1.2 knitr_1.11 labeling_0.3
[9] lazyeval_0.1.10 MASS_7.3-44 munsell_0.4.2 parallel_3.1.1
[13] proto_0.3-10 Rcpp_0.12.1 scales_0.3.0 stringi_0.5-5
[17] tools_3.1.1

Document how default intersection limit can be deactivated

Is there a smart way to set the number of intersections to be shown to all? The documentation for the nintersect parameter does not describe anything like that, e.g. nintersect=NA or something along those lines.

Return value?

Hi. Thanks for such a great Package!

The problem I'm facing is due to the upset function not returning a value, as does ggplot, for example. I need to store the plot in a variable to print it later in a different context. Is it possible for the upset function to return a "printable" plot object?

Best wishes,

Juanje.

Query legend for plot when attribute plots selected

This option was left out when the way attribute plots are added was reworked.

Axis break?

Hi, loving UpSetR!!

I'm working on an 8 way comparison, and would like to induce an axis break in the y-axis. I can't seem to find a way to make this happen in all of the info I've dug through on UpSetR... is there any way to do so? The count of intersections between all 8 sets is 4,000 and next greatest is 800. This results in some rather small bars after that first big one!

Alternatively, if this isn't possible, can UpSet log transform the counts?

change default color for set size bars to dark gray

allow rotation of bar labels on the intersection bar chart in 45 degree steps

In the plot below the labels at the top of bars are overlapping. There should be an option to rotate them in 45 degree increments.

add htmlwidget functionality

I love your work with UpSet, and I had the JavaScript piece on my list to do as an htmlwidget of the week at BuildingWidgets. I had no idea that you had an R package until I just spotted it on the CRAN feed. I'd love to volunteer to make this into an htmlwidget if you would like to pair the interactivity of the JavaScript with the engine of R.

Before htmlwidgets existed, I had played a little integrating UpSet with rCharts http://timelyportfolio.github.io/upset where I added a couple R datasets to the list. The ugly code is in a fork https://github.com/timelyportfolio/upset.

Applying metadata to the Matrix

Hi,
I've been trying to apply a simple coloring scheme to the matrix background but I cannot seem to get it to work. Using example 5 from the metadata vignette, I identified these two (possibly related) errors:

Reversing the order of the metadata plots list (i.e. "matrix_row" before "hist") results in the following error:
Error in rep_len(1, ncol) : invalid 'length.out' value
Listing only "matrix_row" and no other plot in the plots list results in the following error:
Error in tmp[[i]] : subscript out of bounds.

Happy to provide more details if needed.
Doron Betel

Original matrix set order

Dear Jake,
Congratulations on this very nice and useful package! I have one question, or a request if there is a solution. I would like to order the final intersections by name. If I'm comparing two groups with different members, I would like to see all the intersections within the first group first, then those within the second group, and finally those between members of different groups. UpSetR orders the intersections by set size, and I cannot find any option to order the intersections by column header as it appears in the matrix, for example.
I don't know if I was clear. Thank you in advance for your help.

aggregate.by should be group.by

The parameter name aggregate.by is a misnomer. It should be group.by. Example 3 in the basic usage vignette should be renamed, too.

Center set labels and add ticks to both sides

The set labels are in the middle of the set size bars and the set intersection matrix. As they equally label both of them, they should be symmetric, i.e., centered, and the tick shown for the matrix should also be there for the bars (or not at all).

create a basic shiny server application

Key features:

CSV data upload
selection of sets
creation of queries based on intersections
selection of attributes for box plots

improve input format documentation in R docs

The requirements for formatting input data should appear earlier in the R documentation. They are currently buried at the end of a hard-to-find web page.

(via https://twitter.com/fabiencampagne/status/721073665077616641 / via @fac2003)

support inclusion of empty intersections in matrix and bar chart

To be controlled via a parameter empty.intersections that is FALSE by default.

change reference in upset documentation to UpSet InfoVis 2014 paper

The Nature Methods PoV should still be mentioned.

order of query colors mixed in main bar chart

library(UpSetR)

My_data <- read.csv( system.file("extdata", "movies.csv", package = "UpSetR"), header=T, sep=";" )

upset_base(My_data, first.col = 3, last.col = 19, nsets = 6, att.x = "ReleaseDate",att.y = "AvgRating", point.size = 3, att.color = "black", main.bar.color = "black", show.numbers = "yes", queries = list(c("Drama", "Romance", "red"), c("Horror", "Drama", "Thriller", "green"), c("Drama", "Comedy", "blue")))

The order of the query colors in the main bar chart is mixed up for the bar in commit 8b0d810:

Restructure boxplot summary input

att.x and att.y functionality should be removed

This can be handled with a custom plot. We should include the scatter plot, histogram and boxplot example custom plots as methods in the package so that they can be used out of the box.

Choose a good default color palette.

Users should not have to (and typically shouldn't at all) specify colors for anything, we should instead provide good defaults.

I would suggest to use color only for selections. In this example here:

We could get rid of the blue shading of the background and the blue bars for the sets. The blue bars for the sets might make some sense to distinguish them from the intersections, but on the other hand, they are the same data type.

This is a good starting point for selection colors:

http://colorbrewer2.org/?type=qualitative&scheme=Paired&n=10

intersections not working with more than 5 sets, even when nsets is set

> upset(binarymatrix,nsets=6,intersections=list(list("Heart_Up","Muscle_Up"),
+                                               list("Heart_Down", "Muscle_Up"), 
+                                               list("Heart_Down","Muscle_Unchanged"),
+                                               list("Heart_Down","Muscle_Down"),
+                                               list("Heart_Unchanged","Muscle_Up"),
+                                               list("Heart_Unchanged","Muscle_Unchanged"),
+                                               list("Heart_Unchanged","Muscle_Down"))
+ )
Error in `[.data.frame`(data, keep) : undefined columns selected

5 sets works fine, and 6 sets works if I don't specify intersections

Lower required R version to minimum possible

Quoting @JakeConway:

It could probably work for lower versions. I only put that because that's the version I was using at the time I began working on it. The only limitation on how early the R version can be is whether it can run ggplot2 version 1.0.1, gridExtra version 0.9.1, and plyr version 1.8.3.

Report intersection size data along with plot

Control font size of y label and tick marks

Nice package!

The font size of mainbar.y.label and the y axis tick marks are too small for my liking. Is it possible to increase their size? Could something like the name.size argument be added for these components? The demo figures do not seem to have this problem, but I cannot reproduce the large font size on my local machine.

support for ggplot themes

http://docs.ggplot2.org/0.9.2.1/theme.html

Some discussion about which parts of the plot should not be affected by themes will be required.

Import Upset Values For Chow-Ruskey

Hello UpSetR Team,

Is there a way that we can extract the intersection values from upset command into a text file?

Can upset feed its values to Chow-Ruskey?

Thanks
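
Until the plot function reports these values itself, the intersection sizes can be computed directly from the 0/1 input data. The following is a minimal base R sketch under that assumption; the `membership` data frame, the set names A, B, C and the output file are made up for illustration, and each row is counted in exactly one (exclusive) intersection, which is how the bars in an UpSet plot are defined.

```r
# Hypothetical 0/1 membership data: one row per element, one column per set
membership <- data.frame(
  A = c(1, 1, 0, 1),
  B = c(1, 0, 1, 1),
  C = c(0, 0, 1, 0)
)
set_cols <- c("A", "B", "C")

# Label every row by the exact combination of sets it belongs to
labels <- apply(membership[set_cols] == 1, 1,
                function(r) paste(set_cols[r], collapse = "&"))

# Count rows per combination and sort by decreasing size
sizes <- as.data.frame(table(intersection = labels),
                       responseName = "size", stringsAsFactors = FALSE)
sizes <- sizes[order(-sizes$size), ]

# Write the table to a tab-separated text file
out_file <- tempfile(fileext = ".txt")
write.table(sizes, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

print(sizes)
```

The text file can then be read by any other tool that accepts a table of intersection labels and counts.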

keep.order only moves labels on sets, the bars don't change

Code to replicate:

library(UpSetR)
movies <- read.csv( system.file("extdata", "movies.csv", package = "UpSetR"), header=TRUE, sep=";" )
UpSetR::upset(movies, keep.order = TRUE, sets = c("Drama", "Thriller", "Action"))
UpSetR::upset(movies, keep.order = TRUE, sets = c("Thriller", "Drama", "Action"))

In the first plot, "Drama" is the largest set in the barplot on the left. In the second plot, "Thriller" is the largest set.

intersection matrix background overplots labels on intersection bar chart

In some situations the (white) background of the matrix plot overplots the 0 on the y-axis of the intersection bar chart (see image). Can the background of the matrix be made transparent?

need a way to display specific empty intersections

My matrix looks like this

     ensembl_gene_id Heart_Down Heart_Unchanged Heart_Up Muscle_Down Muscle_Unchanged Muscle_Up
1 ENSMUSG00000000001          0               1        0           1                0         0
2 ENSMUSG00000000028          0               1        0           0                1         0
3 ENSMUSG00000000031          0               1        0           0                1         0
4 ENSMUSG00000000056          0               1        0           0                1         0
5 ENSMUSG00000000058          0               1        0           0                1         0
6 ENSMUSG00000000078          0               1        0           0                1         0
...

There are 9 meaningful intersections but two of them are empty. Then there are a lot of meaningless intersections. I need a way to display all 9.
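
As a stopgap outside the plot, the nine Heart x Muscle combinations can be tabulated by hand, zeros included. This is a rough base R sketch: the data frame `m` copies only the six rows shown above, and `wanted` is a hypothetical name for the table of combinations of interest.

```r
# The six rows shown above (scalars are recycled across rows)
m <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000000001", "ENSMUSG00000000028",
                      "ENSMUSG00000000031", "ENSMUSG00000000056",
                      "ENSMUSG00000000058", "ENSMUSG00000000078"),
  Heart_Down = 0, Heart_Unchanged = 1, Heart_Up = 0,
  Muscle_Down      = c(1, 0, 0, 0, 0, 0),
  Muscle_Unchanged = c(0, 1, 1, 1, 1, 1),
  Muscle_Up = 0
)

# All 3 x 3 meaningful combinations of one Heart set and one Muscle set
wanted <- expand.grid(
  heart  = c("Heart_Down", "Heart_Unchanged", "Heart_Up"),
  muscle = c("Muscle_Down", "Muscle_Unchanged", "Muscle_Up"),
  stringsAsFactors = FALSE
)

# Count the genes in each combination; empty ones are kept with n = 0
wanted$n <- mapply(function(h, mu) sum(m[[h]] == 1 & m[[mu]] == 1),
                   wanted$heart, wanted$muscle, USE.NAMES = FALSE)

print(wanted)
```

Note that the empty.intersections parameter proposed elsewhere in this tracker would show every empty intersection, not only the chosen nine.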

change alternating row background color to light gray

The "empty" circles should be a slightly darker gray.

Modify default behavior to show more than 40 intersections?

Currently the nintersect parameter limits the number of intersections shown by default to 40.

Should we show all intersections by default?

Adjust position of numbers

Add a feature to adjust the position of the numbers above the bars so that they do not overlap. The overlap starts to occur when the plot contains many intersections.

rename custom.plots to attribute.plots

That would better reflect what they are intended to be used for.

error reading data

Hello Jake,
I am having an issue when trying a simple intersection with UpSetR. The data is as follows:

Name;Amelonado;Contamana;Criollo;Curaray;Guianna;Iquitos;Maranon;Nacional;Nanay;Purus
Thecc1EG000181;1;0;0;0;0;0;0;0;0;0
Thecc1EG001933;1;0;0;0;0;0;0;0;0;0
Thecc1EG003999;1;0;0;0;0;0;0;0;0;0
Thecc1EG005677;1;0;0;0;0;0;0;0;0;0
Thecc1EG006000;1;0;0;0;0;0;0;0;0;0
.
.
.

I load the data

genes <- read.csv("mygenes",header=T,sep=";")

and try upset with:

upset(genes2, sets = c("Amelonado", "Contamana", "Criollo", "Curaray", "Guianna", "Iquitos", "Maranon", "Nacional", "Nanay", "Purus"), sets.bar.color = "#56B4E9",order.by = "freq", empty.intersections = "on")

I obtain the error:
Error in start_col:end_col : argument of length 0

but I tried comparing with your example sets and I cannot figure out what the potential problem could be.
Has anyone experienced this problem?

thanks
Omar
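
Note that the snippet above loads the file into `genes` but passes `genes2` to upset(), so it is worth checking which object is actually being plotted. Whether that explains the error is not confirmed here. A quick way to rule out naming and format problems is a small check like the one below; `check_upset_input` is a hypothetical helper written for this illustration, not part of UpSetR, and the two-row sample is cut down from the data shown above.

```r
# Hypothetical helper: report requested sets that are missing from the data
# and requested columns that are not purely 0/1
check_upset_input <- function(data, sets) {
  missing <- setdiff(sets, names(data))
  present <- intersect(sets, names(data))
  is_binary <- vapply(data[present],
                      function(x) is.numeric(x) && all(x %in% c(0, 1)),
                      logical(1))
  list(missing = missing, non_binary = present[!is_binary])
}

# Cut-down version of the semicolon-separated file from the report
genes <- read.csv(text = paste(
  "Name;Amelonado;Contamana",
  "Thecc1EG000181;1;0",
  "Thecc1EG001933;1;0",
  sep = "\n"), header = TRUE, sep = ";")

report <- check_upset_input(genes, c("Amelonado", "Contamana", "Criollo"))
print(report)
```

In this cut-down sample, "Criollo" is reported as missing because the sample only keeps two set columns; with the full file the `missing` entry should be empty if all column names match.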

Fix boxplot.summary manual description

Using existing dataframe with UpSetR

Hello there,

Trying to use a dataframe in R I generated from an abundance matrix t, I keep getting this error when using UpSetR:

Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list?

Here's the beginning of said data frame:

      OTU_IDS Amphibolite Basalt Coal Coal-Upper Dolomite Hematite_Granite High-calcite_clay mica_schist
  1 KC442817.1.1478           1      0    0          0        0                0                 0
  2 JX222276.1.1475           0      1    0          0        0                0                 0
  3 HQ218444.1.1344           0      0    0          0        0                0                 1
  4 DQ517124.1.1380           0      0    0          0        0                0                 0
  5 KF827260.1.1400           0      1    0          0        0                0                 0
  6 EF125930.1.1497           0      0    0          0        0                0                 0

Any ideas?

Thanks in advance.

André

Allow for column set names with spaces

allow querying on set names with spaces for intersection query

create a vignette with examples for the R package

http://r-pkgs.had.co.nz/vignettes.html

We need to decide if we want to use Sweave to keep our compatibility with R 2.5 and later or if we want to switch to a minimum requirement of R 3.0 and use rmarkdown for the vignette.

support label rotation on bars

0, 45 or 90 degrees should be supported. Look at how plot/axis are setting label orientation or how it is typically done in ggplot and follow that pattern.
