speedi's Introduction

SPEEDI

Overview of SPEEDI

Single-cell Pipeline for End to End Data Integration (SPEEDI) is a fully automated, end-to-end pipeline that facilitates single-cell data analysis and improves robustness and reproducibility. SPEEDI computationally infers batch labels and automates the application of state-of-the-art processing and analysis tools. Additionally, SPEEDI implements a reference-based cell type annotation method coupled with a majority-vote system. SPEEDI takes raw count feature-by-barcode single-cell data matrices as input and outputs an integrated and annotated single-cell object, a log file with auto-selected analysis parameters, and a set of preliminary analyses.
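To make the majority-vote idea concrete, here is a minimal base R sketch. It is not SPEEDI's implementation: the function name majority_vote, its arguments, and the toy data are all hypothetical. It assumes each cell already has a cluster assignment and a label predicted from a reference, and it gives every cell in a cluster the label that is most common within that cluster.

```r
# Hypothetical helper (not part of SPEEDI): assign each cell the most
# frequent reference-predicted label within its cluster.
majority_vote <- function(clusters, predicted_labels) {
  stopifnot(length(clusters) == length(predicted_labels))
  # One winning label per cluster; ties go to the label that sorts first.
  winners <- vapply(split(predicted_labels, clusters), function(labels) {
    counts <- table(labels)
    names(counts)[which.max(counts)]
  }, character(1))
  # Map each cell back to its cluster's winning label.
  unname(winners[as.character(clusters)])
}

# Toy example: 7 cells in 2 clusters with noisy per-cell predictions.
clusters <- c(0, 0, 0, 1, 1, 1, 1)
predicted <- c("B", "B", "T", "T", "T", "NK", "T")
voted <- majority_vote(clusters, predicted)
print(voted)
```

In this toy input, the single "T" call in cluster 0 and the single "NK" call in cluster 1 are overruled by the dominant label of their cluster, which is the smoothing effect a majority vote is meant to provide.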

Using the SPEEDI Website

The SPEEDI Website allows users to upload their single-cell datasets to our server for processing. Users can then view and download results once processing completes. Please visit the website to learn more!

Running SPEEDI Locally

To install the SPEEDI R package locally, you can use devtools and BiocManager:

if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
devtools::install_github('FunctionLab/SPEEDI', repos = BiocManager::repositories())

All R-related dependencies should be installed automatically. Note that Rtools is required to install the SPEEDI R package on Windows. To learn how to use the SPEEDI R package, please view the SPEEDI vignette.

Citing SPEEDI

The SPEEDI manuscript is currently under review.

Need Help?

If you encounter any issues using SPEEDI, feel free to contact a SPEEDI administrator ([email protected]).

speedi's People

Contributors

williamthistle, yuanwang0
speedi's Issues

Handling of SeuratData References

Originally, LoadReference used data() to load a given reference (e.g., data("kidneyref")). However, this doesn't seem to work with these reference datasets; instead, you get a warning like:

Warning message:
In data("kidneyref") : data set ‘kidneyref’ not found

Note that the data command does work for non-reference data distributed by SeuratData. This issue has been reported by other users: satijalab/seurat-data#53

We need to figure out how to successfully use these references. I will do some more testing and report back!

Turn SPEEDI into R package and submit to Bioconductor (or CRAN)

This issue contains two steps:

  1. Prepare SPEEDI as an R package (keeping in mind Bioconductor standards)
  2. Submit SPEEDI to Bioconductor and go through the review process

Before SPEEDI is accepted to Bioconductor, we can provide instructions for how to install SPEEDI directly from GitHub. Reviewers may need to use this approach if we are still waiting for Bioconductor approval when we submit the paper.

Importantly, Bioconductor follows a release schedule where packages are released every ~six months (usually in April and October). If this schedule is completely at odds with the publication schedule, we may need to consider submitting to CRAN instead, but I think this is unlikely.

Are the functions in utils.R necessary?

It seems like the functions in utils.R are all connected with the PCA function in that file. Is this PCA function actually used anywhere now? I saw a commented-out line using it in the original VisualizeIntegration function (since removed), but I don't see it used anywhere else. It looks like we're just using the standard Seurat::RunPCA. Can we delete utils.R?

Add Doublet Detection to SPEEDI?

Should we consider adding doublet detection as an optional step in SPEEDI? The relevant code can be run immediately after the Seurat object is created. Something like this could serve as a starting point:

# Load Seurat (provides as.SingleCellExperiment and as.Seurat) and the doublet package
library(Seurat)
library(scDblFinder)
# Find doublets, treating each value of the "sample" column as a separate sample
sc_obj <- as.Seurat(scDblFinder(as.SingleCellExperiment(sc_obj), samples = "sample"))
# See distribution of doublets in each sample
doublet_sc_obj <- subset(x = sc_obj, subset = scDblFinder.class %in% "doublet")
print(table(doublet_sc_obj$sample))
rm(doublet_sc_obj)
# Remove doublets, keeping only cells classified as singlets
sc_obj <- subset(x = sc_obj, subset = scDblFinder.class %in% "singlet")
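As a possible complement to the snippet above, the per-sample doublet distribution could also be summarised straight from the cell metadata, without building a temporary subset object. The sketch below uses a small hypothetical data frame (meta) standing in for the object's metadata; only the column names sample and scDblFinder.class are taken from the code above, and everything else is made up for illustration.

```r
# Hypothetical per-cell metadata, standing in for the metadata of sc_obj
# after scDblFinder has added its scDblFinder.class column.
meta <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2"),
  scDblFinder.class = c("singlet", "doublet", "singlet", "singlet", "singlet")
)

# Fix the class levels so a "doublet" column exists even if none were called.
cell_class <- factor(meta$scDblFinder.class, levels = c("singlet", "doublet"))

# Counts of singlets and doublets per sample.
class_counts <- table(meta$sample, cell_class)

# Fraction of each sample's cells that were called doublets.
doublet_rate <- prop.table(class_counts, margin = 1)[, "doublet"]

print(class_counts)
print(round(doublet_rate, 2))
```

Reporting a rate rather than a raw count makes samples of different sizes easier to compare before deciding whether to remove the doublets.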
