mani2012 / pathostat
The purpose of this package is to perform Statistical Analysis on the PathoScope generated reports files.
We need some kind of object to work with until the PathoStat object is ready to go. For the time being, we can just add a plain old phyloseq object.
Check out this article.
http://shiny.rstudio.com/articles/modules.html
In order to work together effectively, and to make PathoStat as flexible as possible, we are going to need to write code that can be easily plugged into the PathoStat core. In the above article, the RStudio folks have written up one possible way to do it, and I think it makes sense. I'm also open to other ideas.
I've started to do this for the Relative Abundance module, and I put an idea of what this might look like in this branch: https://github.com/mani2012/PathoStat/blob/newRAplots/R/relativeAbundance.R
My proposal is that each "top level" module (relative abundance, diversity, differential expression, etc.) will live in its own file within the R/ directory. The module will be loaded into the shiny server. If you want to make additional tabs within that module, I would suggest that you create these as submodules of the top-level module. This way all the code as well as the UI elements for a tab/module are in one place and should be easy to change in the future.
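To make the proposal concrete, here is a minimal sketch of what one top-level module file could look like. The names relativeAbundanceUI and relativeAbundanceServer and the counts argument (a reactive returning a taxa-by-sample count matrix) are hypothetical, not the code in the linked branch. It uses moduleServer(), the module interface available in shiny 1.5.0 and later; older versions of the linked article describe callModule() instead.

```r
library(shiny)

# UI half of a hypothetical top-level module; would live in its own file under R/
relativeAbundanceUI <- function(id) {
  ns <- NS(id)  # namespaces every input/output id belonging to this module
  tagList(
    plotOutput(ns("barplot"))
  )
}

# Server half: `counts` is assumed to be a reactive returning a
# taxa-by-sample numeric matrix
relativeAbundanceServer <- function(id, counts) {
  moduleServer(id, function(input, output, session) {
    output$barplot <- renderPlot({
      # scale each sample (column) so that it sums to 1
      rel <- prop.table(counts(), margin = 2)
      barplot(rel, las = 2, ylab = "Relative abundance")
    })
  })
}
```

The app would then call relativeAbundanceUI("ra") in its UI and relativeAbundanceServer("ra", counts = reactive(my_matrix)) in its server function, where my_matrix stands for whatever count data the app holds. A submodule for an extra tab would follow the same pattern and be called from inside the parent module's server function.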
PathoStat should have a unified color scheme, plus a way to easily select complementary/contrasting colors.
For example, with abundance barplots, members of the same phylum could be the same base color, OTUs within each phylum can be a shade of that base color. This is easier to read than random/sequential colors for each OTU.
Example from schizophrenia paper:
fig1.pdf
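One possible way to get "same base color per phylum, one shade per OTU" is sketched below using only base R. The function name phylum_shades, the example phyla, and the base colors are all made up for illustration; this is not an existing PathoStat function.

```r
# Give each OTU a shade of its phylum's base color.
# `phyla` is the phylum label of each OTU; `base_colors` is a named
# character vector mapping phylum -> base color (both are assumptions).
phylum_shades <- function(phyla, base_colors) {
  shades <- character(length(phyla))
  for (p in unique(phyla)) {
    members <- which(phyla == p)
    # ramp from the base color toward white, dropping the final pure white
    ramp <- grDevices::colorRampPalette(c(base_colors[[p]], "white"))
    shades[members] <- ramp(length(members) + 1)[seq_along(members)]
  }
  shades
}

phyla <- c("Firmicutes", "Firmicutes", "Bacteroidetes")
base_colors <- c(Firmicutes = "#FF0000", Bacteroidetes = "#0000FF")
otu_colors <- phylum_shades(phyla, base_colors)
```

The first OTU in each phylum gets the base color itself and later ones get progressively lighter tints, so phyla stay visually grouped in a stacked barplot.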
Currently, PathoStat takes a data matrix, batch information, and condition information as inputs, and is limited to those options. Instead, we're going to coerce any input (PathoScope report, .biom files, etc) into a phyloseq-class object, and pass that object along. This allows for the user to attach any number of covariates or phenotypic data along with the data matrix. This also means a general overhaul of almost every function in the package.
In its current form, PathoStat accepts "batch" and "condition" as possible discrete variables, and gives the user the option to color/group data (in various plots) by either of those. However, we're adding functionality: PathoStat will accept any number of covariates, such as patient age, weight, race, disease status, whatever. We still want to let users color/group data based on these things, but that doesn't make much sense for continuous variables. Without binning, how do you group people by weight? You can, however, order data by continuous variables. We want to at least distinguish between the two types, and we may want to add functionality for continuous variables.
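As a rough sketch of how the two types could be told apart, the helper below labels each column of a sample-data table. The name classify_covariates and the example columns are hypothetical. It assumes numeric columns are continuous, which would mislabel integer-coded categories such as a batch stored as 1, 2, 3.

```r
# Label each covariate as "continuous" (numeric) or "discrete" (anything else).
# Assumption: numeric storage implies a continuous variable.
classify_covariates <- function(sample_df) {
  vapply(sample_df, function(col) {
    if (is.numeric(col)) "continuous" else "discrete"
  }, character(1))
}

samples <- data.frame(
  age    = c(30.5, 41, 58),
  batch  = factor(c("a", "b", "a")),
  status = c("case", "control", "case"),
  stringsAsFactors = FALSE
)
covariate_types <- classify_covariates(samples)
```

The UI could then offer "color/group by" only for the discrete covariates and "order by" for the continuous ones.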
From coremicrobiome.R
Hi,
I launched PathoStat interactively, but the Shiny app does not plot anything (I tried with the example data too). Do you know what might be happening?
Thanks!
For those not at BU, we had a conversation today about how different analyses have different filtering requirements for the data. For example, you should not filter low-abundance OTUs for alpha diversity calculations, but there are other situations where you might want to filter for analysis or visualization. So we concluded:
There are other details that need to be sorted out, such as how to track if users upload pre-filtered data, etc.
We want to let users submit a range of data formats to PathoStat, but we want all of the processing, analyses, and outputs to be of a standard format. If we can coerce input data types (PathoScope reports, .biom files, QIIME output, etc.) into phyloseq objects, it will make everything easier to work with. This carries the added advantage of compatibility with a bunch of outside packages/tools.
Core OTU tab uses this, other modules probably will as well.
The core microbiome functions depend on functions in https://github.com/microbiome/microbiome, which introduces a lot of additional dependencies.
runPathoStat fails if BatchQC is not installed:
Error in loadNamespace(name) : there is no package called ‘BatchQC’
I also found that BatchQC cannot be installed using biocLite; it must be installed with devtools::install_github.
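One way to fail with a clearer message is to check for the package before it is used. The helper below is a sketch with a hypothetical name (check_dependency), not code from PathoStat; requireNamespace() is base R.

```r
# Stop early with a readable message if a required package is missing.
check_dependency <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is required but not installed.", pkg),
         call. = FALSE)
  }
  invisible(TRUE)
}
```

Calling check_dependency("BatchQC") near the top of runPathoStat would replace the loadNamespace error with this message.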
From coremicrobiome.R
Metagenomics as a field is starting to move to longitudinal/temporal analysis and visualizations. Yet, there are not many tools or packages with this type of functionality. This would be a new feature for PathoStat in the form of a tab.
Sample data and code for alluvial plot:
alluvial_paper.R
https://www.dropbox.com/s/d8hpyj0bt8ddtmv/sample_data.zip?dl=0
There are a few issues with the way we currently handle the taxonomy table in PathoStat. I hope this can start a discussion about how we want to handle these things.
tax.name is hard-coded in ui.R. I've solved this without hard-coding in the core OTU module by getting the ranks from the PathoStat object using rank_names(). However, I don't think that phyloseq enforces whether the rank names are ordered. I suggest we override rank_names() and somehow enforce the hierarchical order.
The taxonomy cache is loaded without checking that it is consistent with the PathoID reports. We may need to go and fetch missing taxonomy IDs.
This should not be relevant once we start caching the PathoStat object.
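For the rank-ordering point above, one option is to sort whatever rank names come back against a fixed hierarchy. The sketch below is base R only; the function name order_ranks and the canonical list are assumptions, and real data may use other labels (for example "superkingdom" or "domain"), which this version simply appends at the end.

```r
# Assumed hierarchy, from broadest to most specific rank
canonical_ranks <- c("kingdom", "phylum", "class", "order",
                     "family", "genus", "species")

# Reorder rank names to follow the hierarchy (case-insensitive match).
# Ranks not in `canonical` keep their original order and go last.
order_ranks <- function(ranks, canonical = canonical_ranks) {
  idx <- match(tolower(ranks), canonical)
  known <- ranks[!is.na(idx)][order(idx[!is.na(idx)])]
  unknown <- ranks[is.na(idx)]
  c(known, unknown)
}

sorted_ranks <- order_ranks(c("Genus", "Phylum", "Species", "Kingdom"))
```

An overridden rank_names() could pass its result through a helper like this, so the UI always lists ranks from broadest to most specific.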