bigomics / omicsplayground

Visual self-service analytics platform for big omics data.

Home Page: http://www.bigomics.ch

License: Other

R 97.26% CSS 0.54% HTML 0.07% JavaScript 0.74% Makefile 0.18% Dockerfile 0.15% SCSS 0.97% Shell 0.09%

omics genomics-visualization genomics rna-seq rna-seq-analysis

omicsplayground's Introduction

Omics Playground: Explore Omics Data Freely

Omics Playground is a comprehensive self-service analytics platform for the visualization, analytics and exploration of Big Omics Data. It allows biologists to apply a multitude of state-of-the-art analysis tools to their own data to explore and discover underlying biology without coding.

How to use Omics Playground?

Detailed documentation on how to load data and how to use the functionalities of Omics Playground is available as text-based tutorials and as video tutorials.

Platform components

The platform consists of two main components. The first component is off-line and addresses the data importing and preprocessing, which includes preparing the input data, filtering, normalising and precomputing of statistics for some analyses. The second part is composed of the online interface, which supports the real-time visualisation and interaction with users. The interface is subdivided into Basic and Expert modes to provide a customisable experience suited to each user's background.

The Docker image and the installation script contain some example data sets. To analyze your own data you can use the upload function, or create/modify the scripts in the scripts/ folder. Creating a custom script is much more flexible and allows, if necessary, batch correction, quality filtering and/or translation of probe names.

More detailed information and feature explanation of Omics Playground is available in the online documentation.

Installation

You can either run the platform from the source code, or download the Docker image. Running Omics Playground from the Docker image is the easiest way.

Run using the Docker image

The Docker image of the platform is available on Docker Hub. Follow the steps below to set up a running platform from the Docker image:

Pull the docker image using the command:
```
docker pull bigomics/omicsplayground
```
Warning: the Docker image requires about 5-8 GB of disk space. Note: pull version v1.0 if you want the exact version used in the NAR/GAB publication; otherwise Docker pulls the latest version by default.

Now run the docker with:

docker run --rm -p 4000:3838 bigomics/omicsplayground

Then open http://localhost:4000 in your browser to access the platform.

Run from source code / Start Developing

Omics Playground relies on three basic components that do the work in the background: playbase, bigdash and bigloaders. These must be installed manually within the R environment:

remotes::install_github('bigomics/playbase')
remotes::install_github('bigomics/bigdash')
remotes::install_github('bigomics/bigLoaders')

On top of these, a Python interpreter is also needed for the interactive plots. It can also be installed easily from within R:

install.packages("reticulate")
reticulate::install_miniconda()

Then, everything is ready for installing omicsplayground:

git clone https://github.com/bigomics/omicsplayground.git

Note: check out version v1.0 if you want the exact version used in the NAR/GAB publication; otherwise git clones the latest version by default.

Next, install all necessary R packages and dependencies by running from the omicsplayground folder:

cd omicsplayground
Rscript dev/requirements.R

Finally, you can run the omicsplayground platform. You can do this with the Makefile located in the root omicsplayground folder:

make run

Or you can launch the platform from within an R session:

shiny::runApp('components/app/R', launch.browser=TRUE)

If you have Shiny Server installed you can create a link to the
shiny folder in the system-wide shiny-server apps folder or in your
ShinyApps user folder.

omicsplayground's Issues

"Aw snap. Out of memory" error with Chrome

This has been a common complaint among Chrome users. After a while the platform crashes with the error below.

This is a recent problem that started about 3-4 weeks ago. Previous versions of Chrome did not show this error.

Multi/single-threading

It seems like the app itself is single-threaded. Do you think that is a shiny-server limitation, or something inherent to the app itself? I've got some horsepower to throw at it, so it would be great to speed it up.

[originally posted by email by KT on 13.08.2020]

File Handling/Data control?

I set up a docker container to run Omics Playground (v2.5), and processed one of the public datasets to make a .ptx file. It uploaded ok, but when I clicked on "Compute!" there was one thing that raised some questions. 1) Where does the data get uploaded to when you're running your own docker instance? The docker container itself? Or is it getting uploaded to Omics Playground Cloud? Any info would be appreciated!

[originally posted on https://groups.google.com/d/msgid/omicsplayground/f964ce0a-8191-4ab8-ba50-94d258877dceo%40googlegroups.com]

[JS] Move inline javascript to .js file

All inline javascript should be moved to separate .js files. Improves code readability and makes the JavaScript code easier to maintain.
Might also be useful to create multiple .js files (for example: firebase.js, userAuthentication.js, customInteractions.js ...)

[R] Refactor pgx.initDatasetFolder

pgx.initDatasetFolder in global.R uses verbose = 1, whereas the functions inside use TRUE. Aim to be consistent in the variable classes.

upload fails if CSV has extra empty lines

Upload fails if the CSV has extra empty lines. Some CSV files contain erroneous trailing empty lines. The upload then fails with an error about detecting "duplicated lines".
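
One possible workaround is to drop blank lines before parsing. The sketch below assumes a comma-separated file; `read_csv_skip_blank` is a hypothetical helper, not a function of the Omics Playground code base. Note that `read.csv()` already skips completely empty lines by default (`blank.lines.skip = TRUE`), so the extra filter mainly targets lines that contain only separators or whitespace, which would otherwise become rows with empty identifiers.

```r
# Hypothetical helper (not part of the Omics Playground code base):
# read a comma-separated file while ignoring lines that are empty or
# contain only commas and whitespace.
read_csv_skip_blank <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # A line counts as blank if it consists solely of commas and whitespace.
  is_blank <- grepl("^[,[:space:]]*$", lines)
  read.csv(text = lines[!is_blank], check.names = FALSE,
           stringsAsFactors = FALSE)
}

# Usage with a small temporary file that has trailing junk lines.
tmp <- tempfile(fileext = ".csv")
writeLines(c("gene,s1,s2", "A,1,2", "B,3,4", "", ",,", "  "), tmp)
counts <- read_csv_skip_blank(tmp)
```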

[R] Add busy indicator (spinner) while the app is calculating/plotting

Currently there is no feedback to the user while the app is calculating or plotting something. A simple spinner for example would solve this issue and improve the user experience.

[R] Refactor sidebar IDs

There are 16 instances of the ID sidebar, one for each tab page. I understand that there is only one sidebar visible at any given time, but they all exist on the page (just not visible) at the same time, which is not legal. You cannot have more than one element with the same ID.
If you pass the namespace to tabView then you can set unique IDs for each sidebar.
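
As an illustration of namespaced IDs, the sketch below uses shiny's `NS()` to derive a distinct sidebar ID per board. `boardSidebarUI` and the board names are hypothetical; how the namespace is passed into the app's own `tabView` is not shown in the issue and is not assumed here.

```r
library(shiny)

# Hypothetical module UI: the sidebar ID is derived from the module
# namespace, so each board gets its own unique element ID.
boardSidebarUI <- function(id) {
  ns <- NS(id)
  div(id = ns("sidebar"), class = "board-sidebar", "sidebar inputs go here")
}

# The same local name "sidebar" yields a different ID for every board.
ids <- vapply(c("dataview", "clustering", "expression"),
              function(board) NS(board)("sidebar"), character(1))
```

A shared class such as `board-sidebar` can then carry the common styling, while the IDs stay unique.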

[R] Cleanup initialization of board modules in server.R

In server.R when initializing all the board modules, all of them take the entire env object as a sole parameter. This defeats one of the main purposes of using modules: being explicit about your requirements and limiting the scope of objects you can use/modify. Most of the boards only use one or two values (such as env[["load"]][["inputData"]] or env[["expr"]][["selected_gxmethods"]]), so each module should list the individual parameters it actually needs and take only those values.
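
A minimal sketch of what such an explicit module signature could look like. `ExampleBoardServer` is a hypothetical name, and the two parameters are assumed to be reactives modelled on the values mentioned above; the real boards may need different inputs.

```r
library(shiny)

# Hypothetical board module: it receives only the reactives it needs
# instead of the whole `env` object.
ExampleBoardServer <- function(id, inputData, selected_gxmethods) {
  moduleServer(id, function(input, output, session) {
    output$summary <- renderText({
      paste("methods:", paste(selected_gxmethods(), collapse = ", "),
            "| columns:", ncol(inputData()))
    })
  })
}

# In server.R the call then documents the dependency explicitly, e.g.
# ExampleBoardServer("example",
#                    inputData = env[["load"]][["inputData"]],
#                    selected_gxmethods = env[["expr"]][["selected_gxmethods"]])
```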

Batch corrected (beta)

The "non" option in the method slot uses ComBat batch correction. Wouldn't it be better to change this name?

[R] Refactor navbarPage()

The header parameter of navbarPage() is meant to accept UI that will be shared among all the tabs. It's not meant to take arbitrary “setup” HTML. Instead, the entire navbarPage() can be wrapped in a tagList along with the other setup HTML. This also eliminates the double-tagList of the header and footer variables.

Partial connection network plots

I noticed an issue with the Partial connection networks plots (under the 'Expression' tab). The plots are not converted as pdf or png files, but instead saved as html files.

Add database statistics in drug CMap tab

Would be nice to show the statistics of the selected database: how many drugs, how many perturbation profiles.

Is it possible to choose different normalization methods?

[premsubr in googlegroups 03.05.2021]

Is it possible to choose different normalization methods? TMM, etc....

[R] Fix bugs in the data creation script

There also seem to be a few bugs in the data creation scripts, because even after all the packages have been installed, none of the build scripts run successfully.
Even after installing all packages I initially thought I need to install, when opening certain scripts there are packages which still aren't installed such as rworldmap.

Data files processed with older scripts break startup

I recently started using omicsplayground, and started with the docker version. It seemed to work, then seemed to randomly break. Upon further investigation (and checking with the other install I have set up), it seems that some of the older data scripts keep the whole thing from starting. The main culprit I've seen is GSE98638-scliver.pgx (script to create it is scripts/pgx-GSE98638-scliver.) Removing that file from the data folder allows the entire software to function again.

The log file section when this happens is:
```
'*****************************************
******** parsing OPTIONS file ***********
*****************************************'

TITLE = Omics Playground
USER_MODE = PRO
AUTHENTICATION = none
ENABLE_UPLOAD = TRUE
MAX_SAMPLES = 20
MAX_COMPARISONS = 5
MAX_GENES = 19999
WATERMARK = TRUE
DEV_VERSION = FALSE
MODULES_DISABLED = tcga multi

Error in `rownames<-`(`*tmp*`, value = character(0)) :
  attempt to set 'rownames' on an object with no dimensions
Calls: runApp ... pgx.initDatasetFolder -> pgx.initDatasetFolder1 -> rownames<- -> rownames<-
Execution halted
```

Is there any way to rework things such that the app throws a more visible error for bad files?
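
One way to make a bad file visible without halting start-up is to load each file inside `tryCatch()` and report failures by name. This is a sketch only: `load_datasets_safely` and the `loader` argument are hypothetical and do not reflect how `pgx.initDatasetFolder` is actually implemented.

```r
# Hypothetical loader: try each .pgx file separately and collect failures
# instead of letting one bad file stop the whole application.
load_datasets_safely <- function(files, loader) {
  ok <- list()
  failed <- character(0)
  for (f in files) {
    result <- tryCatch(loader(f), error = function(e) {
      # Name the offending file so it can be found in the log.
      warning(sprintf("Skipping '%s': %s", basename(f), conditionMessage(e)),
              call. = FALSE)
      NULL
    })
    if (is.null(result)) {
      failed <- c(failed, f)
    } else {
      ok[[basename(f)]] <- result
    }
  }
  list(loaded = ok, failed = failed)
}

# Usage with a toy loader that fails for one file.
toy_loader <- function(f) {
  if (grepl("bad", f)) stop("object with no dimensions") else list(name = f)
}
res <- suppressWarnings(load_datasets_safely(c("a.pgx", "bad.pgx"), toy_loader))
```

The `failed` vector could then be shown in the user interface rather than only in the log.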

[R] Cleanup global environment

Currently everything is accessible from everywhere and it's impossible to know where a variable comes from and who should use it.

A quick easy “solution” to add some organization and clarity is to simply not create many global variables from many different files. Variables created within the same file should be grouped together and should be more explicit about where they come from. For example, in global.R, instead of creating variables [OPG, RDIR, WATERMARK] and more, you can create a single .globals <- new.env() and then assign the same variables into .globals. This way it's clearer where a variable came from, they're grouped together, and you limit the number of objects in the global scope.
A better solution, but it would also require more work, is to use a more modular framework (not to be confused with shiny's module feature). It would allow you to explicitly import and export certain objects from each source file, which will improve long term maintainability. There are two options for this: the {modules} package and the {box} package. I personally recommend {box} since it's a much more widely used package and its author is a well established member of the R community with many great contributions over the years.
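
The quick `.globals` approach described above can be sketched as follows. The variable names come from the issue text; the values are placeholders, since the real paths are not given.

```r
# Sketch: group configuration values in one environment instead of
# creating many separate global variables. Values are placeholders.
.globals <- new.env(parent = emptyenv())
.globals$OPG <- "/path/to/omicsplayground"      # placeholder path
.globals$RDIR <- file.path(.globals$OPG, "R")
.globals$WATERMARK <- FALSE

# Access is explicit about where the value comes from.
message("R code directory: ", .globals$RDIR)
```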

[FR] filtering datasets by uploaddate

[email Thorben S. 19.05.2021]

"And lastly, I have some more feedback for you: We think it would be very convenient if the home screen/data view would offer the function of filtering or sorting the datasets by the uploaddate."

isobaric labeling support

Hi, any plans to add support for isobaric labeling analyses?

[R] Check OS before using Sys.setlocale

Sys.setlocale does not work on windows
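
A minimal sketch of such a check. The locale name `"en_US.UTF-8"` is an assumption, because the original call is not shown in the issue, and `set_app_locale` is a hypothetical helper.

```r
# Sketch: only request a POSIX-style locale name on non-Windows systems.
# "en_US.UTF-8" is an assumed target; the app's actual setting is not shown.
set_app_locale <- function(locale = "en_US.UTF-8") {
  if (.Platform$OS.type == "windows") {
    return(invisible(NA_character_))  # leave the Windows default untouched
  }
  # Suppress the warning that is raised when the locale is unavailable.
  invisible(suppressWarnings(Sys.setlocale("LC_ALL", locale)))
}

res <- set_app_locale("C")
```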

plotly Error (?) using latest changes -- program freezes

[17.05.2021 from Premsubr in googlegroups]

Hi,

After using the latest changes from GitHub (listed from 7-8 days or so ago), using the source code version of omicsplayground, the program freezes after displaying the splash screen, just at the data upload window: the window becomes gray and takes no inputs.

Error in R is as follows:

DBG [BiomarkerBoard:<input_pdx_select>]  reacted
Warning: Error in if: argument is of length zero
  113: <reactive:corrected_counts> [modules/UploadModule.R#247]
   97: countsRT
   96: renderUI [modules/MakeContrastModule.R#89]
   95: func
   82: renderFunc
   81: output$load-upload_panel-makecontrast-UI
    1: shiny::runApp
[checkTables] called
[checkTables] 1 :
[checkTables] 2 :
[checkTables] 3 :
[checkTables] 5 :
[checkTables] 10 :
[ComputePgxServer:@enable] enabling compute button
DBG [LoadingBoard::inputData] ---------- reacted ---------------
DBG [LoadingBoard::inputData] AUTHENTICATION= none 
DBG [LoadingBoard::inputData] auth$logged= FALSE 
Warning in mapply(children, flex, FUN = function(el, flexValue) { :
  longer argument not a multiple of length of shorter
Warning: The 'plotly_restyle' event tied a source ID of 'pcoords' is not registered. In order to obtain this event data, please add `event_register(p, 'plotly_restyle')` to the plot (`p`) that you wish to obtain event data from.
DBG [BiomarkerBoard:<input_pdx_select>]  reacted
Warning: Error in <observer>: object 'COLLECTIONS' not found
  44: <observer> [boards/ClusteringBoard.R#1405]
   1: shiny::runApp
Warning in mapply(children, flex, FUN = function(el, flexValue) { :
  longer argument not a multiple of length of shorter

Remove pryr

Get rid of pryr: it is no longer maintained, yet it is a core dependency of the app.

Dataview tab issue

I'm experiencing an inconsistent error message that affects some but not all of my datasets. When I open the "Dataview/Samples" tab, I do not see any phenotype clustering plot and get instead the following error message (see attached image):

Orca-server not starting - Shiny-Server Install

I've been working on setting up a stand-alone instance on shiny-server, and kept getting a "something went wrong" error where the shiny app wouldn't load. Looking at the logs (attached), it boils down to the orca server not starting under shiny. I made sure it was in my path, I made sure it wasn't an issue with X11 forwarding, pretty much everything I could find. Eventually, I got it to work by running orca server in a docker container ("docker pull quay.io/plotly/orca") , and commenting out the lines that create the orca server and specifically check the local instance in shiny/global.R as below:

docker run --name orca_server -p 5151:9091 quay.io/plotly/orca (next time I'll run with the -d flag)
comment out "ORCA <- plotly::orca_serve(port=5151, keep_alive=TRUE, more_args="--enable-webgl")" and "message("local ORCA is alive = ",ORCA$process$is_alive())" lines in shiny/global.R

omicsplayground-kraig-20200812-154813-46793.log

Relevant log section below:

```
*****************************************
***** starting local ORCA server ********
*****************************************
Unable to init server: Could not connect: Connection refused
Unable to init server: Could not connect: Connection refused

(orca:6886): dbind-ERROR **: 15:48:20.326: AT-SPI: Couldn't connect to accessibility bus. Is at-spi-bus-launcher running?
Trace/breakpoint trap (core dumped)
Warning in system("orca --version", intern = TRUE) :
  running command 'orca --version' had status 133
Error in if (orca_version() >= "1.1.1") "--graph-only" :
  argument is of length zero
Calls: runApp ... source -> withVisible -> eval -> eval ->
Execution halted
```

System Info:
Ubuntu 18.04 LTS
Shiny Server v1.5.12.933 (Node.js v10.15.3)
Docker v19.03.12
(Let me know what other info is relevant and I will add it)

Volcano plot gives 'Must supply text to annotation' error

Problem: the volcano plot gives a "Must supply text to annotation" error if the logFC threshold is too high and there are no significant genes.

Temporary solution: lower FC threshold

(on Timo's data)

correlation column not appearing when datatype is counts (Dataview/counts)

Correlation column not appearing when datatype is counts (Dataview/counts) [CS from PLX]

[R] Generate UI elements in the UI rather than in the server

All UI elements should be created from the UI part of a module which improves the code readability and performance.

Test Signatures Overlap plot issue

When I open the "Overlay/Similarity table" under the "Signature/Test Signatures" tab, I noticed that hovering over individual Histogram entries does not show the name of the entry, but shows instead "%{text}". I have included a screenshot below to highlight the bug:

[R] Create R package for non shiny code

Since there seems to be a lot of business logic that isn't part of the shiny app (most of the code in the /R folder), it would be worthwhile to consider converting this project into an R package. All of the non-shiny code would be exported from the package, and then you gain some of the best practices and tools that packages have. It'll naturally force your code to be more robust. The shiny app can be created inside the inst/ folder and will only contain code that is explicitly related to the web application, and this way it will be easier to manage the codebase - it'll be clear what parts of the code can be developed and tested independently of the shiny app. I don't think it would be a huge mission to convert the current project to a package because it seems the files are mostly organized in such a way that files are either fully shiny related or not. A file that doesn't use render*(), input$, output$, observe(), reactive() or UI elements is most likely a good file to move into the package and export its functions.

[CSS] define styling in classes rather than IDs

When adding styling to elements in a module, try to avoid using the element ID to set customised style, and instead create a new class and use that for the custom styling. If the module ID ever changes, the styling linked with the ID will no longer work, whereas the class will still show as expected.

Issue with the generation of PDF/PNG files

The outputs are being downloaded as html files rather than PNG or PDF files in the Gene Correlation Network tab. I noticed the same issue with the Gene Ontology Map plot.

[CSS] Refactor CSS

The vast majority of customised styling is included in .css files, however there are a few instances where tags$style or the style parameter is being used - all CSS should be provided in a CSS file.

create meaningful css files (for example: nav.css, footer.css, sidebar.css ...)
move tags$style /style parameters to css file

Move GSETS object into database (SQLite)

Reading this huge list takes several seconds, is it absolutely necessary? Maybe consider moving this data into eg. a SQLite database where you can reference the required IDs rather than loading the entire list on init.
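
A sketch of the idea using the DBI and RSQLite packages. The gene sets below are toy values, and the long table layout (one row per gene set and gene) is an assumption about how the GSETS list could be stored, not the project's actual schema.

```r
library(DBI)

# Toy stand-in for the GSETS list (gene set name -> genes); not real data.
gsets <- list(SET_A = c("TP53", "MYC"), SET_B = c("EGFR", "KRAS", "MYC"))

# Store the list in long format: one row per (gene set, gene) pair.
long <- data.frame(gset = rep(names(gsets), lengths(gsets)),
                   gene = unlist(gsets, use.names = FALSE),
                   stringsAsFactors = FALSE)

con <- dbConnect(RSQLite::SQLite(), ":memory:")  # use a file path in practice
dbWriteTable(con, "gsets", long)
dbExecute(con, "CREATE INDEX idx_gset ON gsets (gset)")

# Fetch only the requested set instead of loading everything at start-up.
get_gset <- function(con, name) {
  dbGetQuery(con, "SELECT gene FROM gsets WHERE gset = ?",
             params = list(name))$gene
}
genes_b <- get_gset(con, "SET_B")
dbDisconnect(con)
```

The index on `gset` keeps single-set lookups cheap even when the table holds many sets.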

[R] Remove whitespace in pgx-files with trimws()

trimws() is more efficient at removing outer whitespace.
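
For illustration, the snippet below compares a hand-written regular expression with base R's `trimws()`. The input strings are made up, and relative speed is not measured here; the main visible gain is readability.

```r
x <- c("  TP53 ", "\tMYC", "EGFR  ")

# Regex-based trimming, as it is often written by hand.
trimmed_regex <- gsub("^\\s+|\\s+$", "", x)

# Same result with trimws(); `which` can be "both", "left" or "right".
trimmed <- trimws(x, which = "both")
```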

MV Imputation strategies

I would love to see the implementation of MV imputation strategies into the platform.

Kind regards
Thorben

Delete obsolete code

There is a lot of dead code, which makes it more difficult to traverse the code and understand what's important and what's not.

cleanup global.R
refactor if(0) statements - those should use some boolean flag instead because it's not clear if it's meant to ever run or if it's test code. If these are meant for testing replace with {testthat} package
remove unused functions (for example social_buttons() and premium.feature())
refactor functions that are defined multiple times (such as dbg())
javascript code for detecting browser dimensions

BUG: stars not appearing in enrichment module

Star symbols not appearing in enrichment tab in Docker version but seems OK in source version. Stars appearing correctly in differential expression modules (docker+source).

[originally found by AM]

Program freezes using latest updates -- source version

Continuing on my previous issue.

On the source version in R, I pulled the updates and now I get a new error: (the webpage becomes gray and hangs)

Warning in mapply(children, flex, FUN = function(el, flexValue) { :
  longer argument not a multiple of length of shorter
Warning: The 'plotly_restyle' event tied a source ID of 'pcoords' is not registered. In order to obtain this event data, please add `event_register(p, 'plotly_restyle')` to the plot (`p`) that you wish to obtain event data from.
DBG [BiomarkerBoard:<input_pdx_select>]  reacted
Warning: Error in <observer>: object 'COLLECTIONS' not found
  44: <observer> [boards/ClusteringBoard.R#1405]
   1: shiny::runApp
Warning in mapply(children, flex, FUN = function(el, flexValue) { :
  longer argument not a multiple of length of shorter

Unlocking package internals

Why do we unlock package internals?

omicsplayground/shiny/boards/FunctionalBoard.R

Line 280 in 6b63672

unlockBinding("geneannot.map", as.environment("package:pathview"))

I'm asking because this requires the package to be loaded in the global environment; we can't use the namespace for pathview

Warning - gx-heatmap

omicsplayground/R/gx-heatmap.r

Line 1226 in c129c58

warning("Discrepancy: Rowv is FALSE, while dendrogram is `",

Is this warning supposed to fire only when dendrogram <- "none", or all the time?

[R] Remove double sourced functions / files

There is a double sourcing of pgx-functions.R and pgx-files.R - in global.R and pgx-include.R

[R] Refactor project structure into ui.R / server.R / global.R

With an application as large as this, it is worth splitting out the app.R file into separate ui.R and server.R files. The UI and server logic are two separate pieces of functionality and are easier to deal with in separate files.

The creation of constants in the app.R and app-init.R files can be moved to the global.R file. This file automatically gets sourced before anything else, so these variables will always be created.
The global.R file doesn't need to be sourced explicitly
Any function definitions can be moved to a utils.R file.

rename master to main

Hi there!

Can someone with sufficient rights rename master to main?

https://www.git-tower.com/learn/git/faq/git-rename-master-to-main

Cheers

Yellow message box does not disappear

I (as well as several other users) have noticed this frequent issue, which appears to be OS- and browser-independent. The yellow message boxes do not disappear once a scroll down menu has been closed and remain on the browser even when changing module (see picture below).

Error message when using the "Similar Experiments" tab (discovered by K.S.)

An "out of bounds" error message (see picture below) appears when trying to visualise the leading edge graph.

cytoplot not working

I have a dataset (~900 cells, 8000 genes) of scRNAseq that is loaded in and working pretty decently. However, the cytoplot graph is throwing an error. Looking at the logs I get:

Warning: Error in kde2d: bandwidths must be strictly positive 
  187: stop
  186: kde2d
  185: pgx.cytoPlot
  184: <reactive>
  182: .func
  179: contextFunc
  178: env$runWith
  171: ctx$run
  170: self$.updateValue
  168: func
  167: renderPlot
  165: func
  125: drawPlot
  111: <reactive:plotObj>
   95: drawReactive
   82: origRenderFunc
   81: output$scell-sc_cytoplot-renderfigure
    1: runApp

I tried something suggested here: https://stackoverflow.com/questions/53075331/error-using-geom-density-2d-in-r-computation-failed-in-stat-density2d-b and replaced the following code:

    z1 <- kde2d( x1[j1], x2[j1], n=50)
    z2 <- kde2d( x1[j2], x2[j2], n=50)
    z3 <- kde2d( x1[j3], x2[j3], n=50)
    z4 <- kde2d( x1[j4], x2[j4], n=50)

with

    z1 <- kde2d( x1[j1], x2[j1], n=50, h = c(ifelse(bandwidth.nrd(x1[j1]) == 0, 0.1, bandwidth.nrd(x1[j1])), ifelse(bandwidth.nrd(x2[j1]) == 0, 0.1, bandwidth.nrd(x2[j1]))))
    z2 <- kde2d( x1[j2], x2[j2], n=50, h = c(ifelse(bandwidth.nrd(x1[j2]) == 0, 0.1, bandwidth.nrd(x1[j2])), ifelse(bandwidth.nrd(x2[j2]) == 0, 0.1, bandwidth.nrd(x2[j2]))))
    z3 <- kde2d( x1[j3], x2[j3], n=50, h = c(ifelse(bandwidth.nrd(x1[j3]) == 0, 0.1, bandwidth.nrd(x1[j3])), ifelse(bandwidth.nrd(x2[j3]) == 0, 0.1, bandwidth.nrd(x2[j3]))))
    z4 <- kde2d( x1[j4], x2[j4], n=50, h = c(ifelse(bandwidth.nrd(x1[j4]) == 0, 0.1, bandwidth.nrd(x1[j4])), ifelse(bandwidth.nrd(x2[j4]) == 0, 0.1, bandwidth.nrd(x2[j4]))))

to pre-compute the bandwidth value. That somewhat worked (some genes would plot), but gave a new error on the genes that didn't work (that I know are expressed in the dataset):

Warning: Error in contour.default: increasing 'x' and 'y' values expected
  188: stop
  187: contour.default
  185: pgx.cytoPlot
  184: <reactive>
  182: .func
  179: contextFunc
  178: env$runWith
  171: ctx$run
  170: self$.updateValue
  168: func
  167: renderPlot
  165: func
  125: drawPlot
  111: <reactive:plotObj>
   95: drawReactive
   82: origRenderFunc
   81: output$scell-sc_cytoplot-renderfigure
    1: runApp

Thoughts on what might be going on, or what else to try?
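
The repeated `ifelse()` calls above can be folded into a small helper, sketched below. `safe_bandwidth` and `safe_kde2d` are hypothetical names and the toy data only mimic a sparse expression vector. The extra guard reflects one possible cause of the second error, stated as an assumption: if all values in a quadrant are identical, `kde2d()` builds a constant grid, which `contour()` rejects because it expects increasing values.

```r
library(MASS)

# Bandwidth that never returns zero; 0.1 is the fallback used in the post.
safe_bandwidth <- function(v, fallback = 0.1) {
  h <- bandwidth.nrd(v)
  if (!is.finite(h) || h <= 0) fallback else h
}

# Wrapper around kde2d(); returns NULL when a group is too small or constant,
# so the caller can skip the contour for that quadrant.
safe_kde2d <- function(x, y, n = 50) {
  if (length(x) < 2 || diff(range(x)) == 0 || diff(range(y)) == 0) {
    return(NULL)
  }
  kde2d(x, y, n = n, h = c(safe_bandwidth(x), safe_bandwidth(y)))
}

# Toy data: mostly zeros in x1, so the interquartile range is zero and
# bandwidth.nrd() returns 0.
set.seed(1)
x1 <- c(rep(0, 10), 3, 5)
x2 <- rnorm(12)
j1 <- seq_along(x1)
z1 <- safe_kde2d(x1[j1], x2[j1])
if (!is.null(z1)) contour(z1)
```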

Refactor Orca server

A significant portion of the startup time is taken up by creating the Orca server (~half a minute for me). It seems the only use of it is to download static plotly plots. It would be worthwhile looking into an alternative such as plotly::orca and removing the initOrca function.
There is some issue between shiny and orca on Windows (at least for me) where it doesn't like the temporary file path, but can be resolved by making a temporary file within the project and copying that file to the file argument in downloadHandler.

drug connectivity - incorrect number of dimensions

Drug connectivity gives "incorrect number of dimensions" if the dataset has only 1 comparison.
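
A common cause of this message in R, offered here as a guess and not as a confirmed diagnosis, is that subsetting a one-column matrix silently drops it to a vector. The toy example below shows the effect and the `drop = FALSE` remedy; the matrix and its names are made up.

```r
# Toy fold-change matrix with a single comparison (one column).
fc <- matrix(c(1.2, -0.5, 0.8), ncol = 1,
             dimnames = list(c("g1", "g2", "g3"), "treated_vs_control"))

dropped <- fc[c("g1", "g2"), ]                # becomes a plain named vector
kept    <- fc[c("g1", "g2"), , drop = FALSE]  # stays a 2 x 1 matrix

# Indexing the vector like a matrix reproduces the error message.
err <- try(dropped[, 1], silent = TRUE)
```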

[Axel GSE166321]

[R] Put settings into separate config file using the {config} package

define absolute paths in config file
add OPTIONS file settings
add settings (USER_MODE, DEV, DEBUG, WATERMARK, etc) from global.R to config file
add separate environments in the config file (for example: local development, production, ...)
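
A sketch of how the items above could look with the {config} package. The keys, values and paths are illustrative assumptions; only the setting names USER_MODE, DEBUG and WATERMARK come from the issue. The YAML is written to a temporary file so that the example is self-contained.

```r
# Hypothetical config.yml contents; values are examples only.
yaml_text <- c(
  "default:",
  "  user_mode: PRO",
  "  watermark: true",
  "  debug: true",
  "  data_dir: /path/to/data",
  "production:",
  "  debug: false",
  "  watermark: false"
)
cfg_file <- tempfile(fileext = ".yml")
writeLines(yaml_text, cfg_file)

# config::get() is called with its namespace so that base::get() is not masked.
cfg_dev  <- config::get(file = cfg_file, config = "default")
cfg_prod <- config::get(file = cfg_file, config = "production")
```

Values missing from the `production` section, such as `user_mode`, are inherited from `default`.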

[R] Define specific versions of packages for local installation

The local installation needs to be able to work without errors. There are several packages that are being used which are no longer available on CRAN/BioConductor. If these packages are required for the application to run, it would help to install the specific version of the package that works from the archives.