ouija's Introduction

Ouija

Ouija is a probabilistic pseudotime framework. Ouija:

  • infers pseudotimes from a small number of marker genes, letting you understand why the pseudotimes have been learned in terms of those genes (A)
  • provides parameter estimates (with uncertainty) for interpretable gene regulation behaviour (such as the peak time or the upregulation time) (B)
  • has a Bayesian hypothesis test to find genes regulated before others along the trajectory (C)
  • identifies metastable states, i.e. discrete cell types along the continuous trajectory (D)

Getting started

Installation

# install.packages("devtools")
devtools::install_github("kieranrcampbell/ouija")

To build the Ouija vignette, install using:

devtools::install_github("kieranrcampbell/ouija", local = FALSE, 
                          args = "--preclean", build_vignettes = TRUE)

Model fitting

The input is a cell-by-gene expression matrix that is non-negative and represents logged gene expression values. We recommend using log2(TPM + 1). The input can be either a matrix or a SingleCellExperiment (use of the SingleCellExperiment infrastructure is highly encouraged for single-cell analyses). By default, the logcounts assay of a SingleCellExperiment is used.
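As a minimal sketch of preparing such an input, assuming you start from a genes-by-cells TPM matrix (the simulated `tpm` object and the gene and cell names are made up for illustration), the recommended transform and orientation could look like this:

```r
# Simulated TPM values, genes in rows and cells in columns (illustrative data only)
set.seed(1)
tpm <- matrix(rexp(5 * 20, rate = 0.01), nrow = 5, ncol = 20,
              dimnames = list(paste0("gene", 1:5), paste0("cell", 1:20)))

# Log-transform as recommended, then transpose to cell-by-gene
Y <- t(log2(tpm + 1))

# Sanity checks: cells in rows, genes in columns, no negative values
stopifnot(nrow(Y) == 20, ncol(Y) == 5, all(Y >= 0))
```

The resulting `Y` has one row per cell and one column per gene, with gene names kept as column names.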

To fit the pseudotimes, pass the input data to the ouija function:

library(ouija)
data(example_gex) # synthetic gene expression data bundled
oui <- ouija(example_gex)
pseudotimes <- map_pseudotime(oui)

The map_pseudotime function extracts the maximum a posteriori (MAP) estimates of the pseudotimes.

For further usage options see the vignette. A prebuilt vignette can be found here.

Authors

Kieran Campbell & Christopher Yau
Wellcome Trust Centre for Human Genetics, University of Oxford

Artwork

Artwork by cwcyau, the mysterious Banksy-esque artist of the statistical genomics world.

ouija's People

Contributors

kieranrcampbell

ouija's Issues

Error in svd when response_type="transient"

Firstly thanks for adding the transient option!

However, when response_type="transient" (i.e. no genes are designated as switching), svd throws an error:

library(ouija)
data(example_gex)
oui <- ouija(example_gex, response_type="transient")
Error in svd(x, nu = 0, nv = k) : a dimension is zero

Presumably this is due to attempting prcomp on no genes:

ouija/R/ouija.R

Line 170 in 1ebc4ea

pc1 <- prcomp(Y_switch)$x[,1]

Perhaps just do a PCA of the full Y?

Also in the readme the data is called synth_gex rather than example_gex.
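The suggested fallback can be sketched in base R. This is only an illustration of the idea, not the package's code: `first_pc` and `is_switch` are hypothetical names, and the data are simulated.

```r
set.seed(1)
Y <- matrix(rexp(50 * 6), nrow = 50, ncol = 6)   # 50 cells x 6 genes (simulated)
is_switch <- rep(FALSE, ncol(Y))                 # every gene marked transient

# Selecting the switch-like genes leaves a matrix with zero columns,
# and prcomp() fails on it; capture the error message instead of stopping
Y_switch <- Y[, is_switch, drop = FALSE]
err <- tryCatch(prcomp(Y_switch), error = function(e) conditionMessage(e))

# Hypothetical helper: use the switch-like genes if there are any, else all of Y
first_pc <- function(Y, is_switch) {
  Y_init <- if (any(is_switch)) Y[, is_switch, drop = FALSE] else Y
  prcomp(Y_init)$x[, 1]
}
pc1 <- first_pc(Y, is_switch)
```

Here `pc1` has one value per cell even though no gene is designated as switching.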

Inference without modeling dropout?

Hi Kieran, thanks for the cool package. I am interested in learning more about Bayesian methods, so your other work seems interesting as well!

Recently there has been discussion about UMI count data in scRNA-seq not being zero-inflated. Instead, it is recommended to model UMI counts using a negative binomial (or even a Poisson) distribution. (I can share some papers if you'd like.)

For this reason, I was wondering whether there is a way to omit the explicit modeling of zero counts. I would also appreciate your thoughts on using the aforementioned distributions to model the gene counts directly instead of the log-transformed CPM data.

Thanks!
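To illustrate the point about zeros, here is a small worked comparison. The mean and dispersion values are arbitrary and chosen only for illustration; they are not estimates from any dataset.

```r
mu   <- 1     # arbitrary mean UMI count for a lowly expressed gene
size <- 0.5   # arbitrary negative binomial dispersion (smaller = more overdispersed)

# Probability of observing a zero count under each distribution
p0_pois <- dpois(0, lambda = mu)              # equals exp(-mu)
p0_nb   <- dnbinom(0, mu = mu, size = size)   # equals (size / (size + mu))^size

c(poisson = p0_pois, negative_binomial = p0_nb)
```

With these values both distributions already assign substantial probability to zero, and the negative binomial more so, without any separate dropout component.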

error: "Unknown or uninitialised column: 'Gene'."

Hello,
thanks for developing this tool! I am very eager to try it out.

My problem is that I seem to be "losing" the gene information in the ouija fit, although calling ouija works in principle.

I am working on an SCE in which I can still see the gene names I am subsetting for:

SCEsmart[smart.top[1:10],]
class: SingleCellExperiment
dim: 10 94
metadata(1): log.exprs.offset
assays(2): counts logcounts
rownames(10): Jam2 Lgals3 ... Tcl1 Cthrc1
rowData names(12): ENSEMBL SYMBOL ... log10_total_counts AveCount
colnames(94): X00h_01_S24947 X00h_02_S24948 ... X48h_33_S25022 X48h_34_S25023
colData names(39): cellID timepoint ... pct_counts_in_top_500_features_Mito phases
reducedDimNames(0):
spikeNames(0):

options(mc.cores = parallel::detectCores())
oui <- ouija(SCEsmart[smart.top[1:10], ],
             single_cell_experiment_assay = "logcounts",
             inference_type = "hmc")

print(oui)
A Ouija fit with 94 cells and 10 marker genes
Inference type: Hamiltonian Monte Carlo
MCMC info: 10000 iterations on 1 chains
(Gene behaviour) Switch/transient: 10 / 0

plot_expression(oui)
Here I no longer see the gene names. Any downstream analysis produces errors like:

plot_switch_times(oui)
Unknown or uninitialised column: 'Gene'.
Unknown or uninitialised column: 'Gene'.
Error in `$<-.data.frame`(`*tmp*`, "Gene", value = integer(0)) : replacement has 0 rows, data has 10

gene_regs <- gene_regulation(oui)
Error: All columns in a tibble must be 1d or 2d objects:
* Column gene_i is NULL
* Column gene_j is NULL
Call rlang::last_error() to see a backtrace

I have to confess that my understanding of the underlying data structure is limited. Do you have any suggestions as to what could cause this behaviour and how I could fix it?

Thank you for any help!
Best,
Merrit

Neverending sampling chain 1, cycling CPU

Hi Kieran,
Thanks for developing ouija. I'm testing it on a complete 1386-cell x 28000-gene matrix of single-cell RNA-seq counts. I tried 200 GB to 1 TB of memory and 1 to 4 CPUs. It uses a steady 600 GB of memory and cycles between 1 and 2 CPUs. With the code from the README, it has not finished (converged?) in over a day. Is the matrix too big for ouija?

I also notice there are lots of warnings before it says "SAMPLING FOR MODEL 'ouija' NOW (CHAIN 1)", and no other notices beyond that. Thanks for your advice.

library(ouija)
library(Seurat)
load("Seurat.Object.RData")
options(mc.cores = parallel::detectCores())
oui <- ouija(as.matrix(seurat@data))
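Since the README describes Ouija as inferring pseudotimes from a small number of marker genes, one option is to subset to a marker panel before fitting. The sketch below uses a simulated genes-by-cells matrix as a stand-in for the Seurat data slot, and the marker names are hypothetical. It also transposes the result, because the README asks for a cell-by-gene matrix.

```r
set.seed(1)
# Stand-in for as.matrix(seurat@data): genes in rows, cells in columns (simulated)
expr <- matrix(rexp(2000 * 100), nrow = 2000, ncol = 100,
               dimnames = list(paste0("gene", 1:2000), paste0("cell", 1:100)))

# Hypothetical marker panel; replace with genes known to vary along the trajectory
markers <- c("gene5", "gene42", "gene300", "gene1999")

# Keep only the markers present in the data, then transpose to cell-by-gene
Y <- t(expr[intersect(markers, rownames(expr)), , drop = FALSE])
dim(Y)

# oui <- ouija(Y)   # fit on the reduced matrix (requires the ouija package)
```

Whether this resolves the runtime for a given dataset is untested here; it only reduces the problem to the scale the README describes.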

Update documentation

As it stands, the documentation for the main function is pretty poor. In particular:

  • The title is "Fit a Ouija object." - could be more informative
  • The description is totally non-informative
  • The "return" section is close to useless
  • The description should contain a section on how to control inference - this can be taken from the main vignette
  • The description should at least reference the downstream analysis functions or reference to the relevant vignette section
  • The arguments sent to rstan should be documented and defaults provided!

Can't load sce data

Hi,
I am new to Ouija but quite excited about using it.
I have transformed my Seurat object into a SingleCellExperiment object. Unfortunately, Ouija doesn't seem to like this. I get this:

oui <- ouija(sample1.sce)
Error in t.default(x@assays[[single_cell_experiment_assay]]) : 
  argument is not a matrix

Am I doing something wrong?
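One possible workaround, assuming (this is not confirmed for this report) that the assay is stored in a non-base format such as a sparse matrix, is to build the cell-by-gene matrix yourself and pass that to ouija instead of the object. The sketch uses a small sparse matrix from the Matrix package as a stand-in for the assay; `to_cell_by_gene` is a hypothetical helper, not part of Ouija.

```r
library(Matrix)

# Stand-in for an assay extracted from the object (genes x cells, sparse)
assay_sparse <- Matrix(c(0, 1.5, 0, 2, 0, 0.5), nrow = 3, ncol = 2, sparse = TRUE,
                       dimnames = list(c("g1", "g2", "g3"), c("c1", "c2")))

# Hypothetical helper: convert to a dense base matrix and transpose to cells x genes
to_cell_by_gene <- function(expr) {
  Y <- t(as.matrix(expr))
  stopifnot(is.matrix(Y), is.numeric(Y), all(Y >= 0))
  Y
}

Y <- to_cell_by_gene(assay_sparse)

# oui <- ouija(Y)   # requires the ouija package
```

With a real object, the stand-in would be replaced by the assay you want to use, for example the logcounts assay mentioned in the README.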

Ouija not working w/ SCESet object

Excited to try this out!

I didn't have any joy though, applying ouija to an SCESet object (see attached pic).

Looks like ouija could be clobbering the scater NAMESPACE on load? Or trying to access exprs() from scater when it actually comes from BiocGenerics/Biobase?

Let me know if you can't replicate - I'm using scater devel at the moment.

Unclear licensing terms

Please add a LICENSE (or LICENSE.md or LICENSE.txt) file to the repository, ideally using an OSI-approved (https://opensource.org/) open source licence compatible with the wider R bioinformatics community, to maximise uptake and reuse, e.g. the MIT license.

Ouija with precomputed pseudotime

Apologies in advance, as I realise this is partially a clone of #5. I'm looking for several of the functionalities of ouija, but with a precomputed pseudotime from another pseudotime method. In particular, plot_switch_times (with the error bounds) as well as the gene regulation and metastable state testing would be interesting. I haven't been able to find similar functionality in switchde. Is there a good way to do this using either package?
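As a rough illustration of what a switch-time estimate on a fixed pseudotime involves, here is a base R sketch that fits a sigmoid to one gene by least squares. It uses neither the ouija nor the switchde API. The sigmoid parameterisation, the simulated data and all names are assumptions made for the example, and a plain least-squares fit gives point estimates only, so it does not reproduce the error bounds asked about.

```r
set.seed(1)
n  <- 200
pt <- runif(n)   # stand-in for a precomputed pseudotime in [0, 1]

# Assumed sigmoid: expression rises from 0 to 2 * mu0, with midpoint (switch time) t0
sigmoid <- function(t, mu0, k, t0) 2 * mu0 / (1 + exp(-k * (t - t0)))

# Simulated expression for one gene with a true switch time of 0.3
y <- sigmoid(pt, mu0 = 2, k = 10, t0 = 0.3) + rnorm(n, sd = 0.2)

# Least-squares fit of (mu0, k, t0) with the default Nelder-Mead optimiser
sse <- function(p) sum((y - sigmoid(pt, p[1], p[2], p[3]))^2)
fit <- optim(c(mu0 = mean(y), k = 5, t0 = 0.5), sse)

t0_hat <- unname(fit$par["t0"])   # estimated switch time
```

Repeating the fit per gene would give a set of switch-time point estimates along the supplied pseudotime.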
