SCDC: Bulk Gene Expression Deconvolution by Multiple Single-Cell RNA Sequencing References

SCDC is a deconvolution method for bulk RNA-seq that leverages cell-type-specific gene expression from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results across scRNA-seq datasets produced in different laboratories and at different times, implicitly addressing batch-effect confounding.
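As a toy illustration of the idea of integrating results from several references, the sketch below combines two sets of estimated cell-type proportions with a weighted average. The matrices, the weights, and the helper `combine_props()` are all hypothetical and chosen for illustration; this is not SCDC's weight-selection procedure, only a minimal picture of what "combining per-reference results" can look like.

```r
# Toy proportions (samples x cell types) from two hypothetical references.
prop_ref1 <- matrix(c(0.6, 0.4,
                      0.3, 0.7), nrow = 2, byrow = TRUE,
                    dimnames = list(c("bulk1", "bulk2"), c("alpha", "beta")))
prop_ref2 <- matrix(c(0.5, 0.5,
                      0.2, 0.8), nrow = 2, byrow = TRUE,
                    dimnames = dimnames(prop_ref1))

# Hypothetical weights, one per reference (non-negative, summing to 1).
w <- c(0.7, 0.3)

# Weighted average of per-reference proportion matrices (illustrative helper).
combine_props <- function(prop_list, w) {
  stopifnot(length(prop_list) == length(w),
            all(w >= 0),
            abs(sum(w) - 1) < 1e-8)
  combined <- Reduce(`+`, Map(function(p, wi) wi * p, prop_list, w))
  combined / rowSums(combined)  # renormalise so each sample sums to 1
}

ens <- combine_props(list(prop_ref1, prop_ref2), w)
print(ens)
```

In practice the weights would be chosen from the data rather than fixed by hand; see the vignettes for how the package handles this.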

SCDC framework

Citation

Meichen Dong, Aatish Thennavan, Eugene Urrutia, Yun Li, Charles M Perou, Fei Zou, Yuchao Jiang, SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references, Briefings in Bioinformatics, bbz166, https://doi.org/10.1093/bib/bbz166

License: MIT

Installation

You can install the released version of SCDC from GitHub with:

if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("meichendong/SCDC")

If installation fails because of the 'xbioc' dependency, install it first with:

install.packages("remotes")
remotes::install_github("renozao/xbioc")

Vignettes

Please see the vignettes page.

The SCDC paper is published in Briefings in Bioinformatics.

Questions regarding the package can be emailed to: [email protected]

FAQs / Notes

  • When the single-cell dataset contains only one subject/individual, use the SCDC_qc_ONE() and SCDC_prop_ONE() functions.

Aspects that could affect the deconvolution results:

  • data format: are the bulk and single-cell samples both raw counts, or otherwise in the same format? We expect the data format to be consistent and comparable.
  • gene filtering: did you filter out lowly expressed genes, ribosomal genes, and mitochondrial genes? These genes may affect the downstream analysis.
  • cell size and library size factors: for a single cell, do you think the sum of all gene counts (the library size) reflects its real cell size? This is one of our assumptions: the ratio of library sizes between cell types reflects the ratio of real cell sizes between cell types. If it does not hold, you can manually input the cell size factor when constructing the "basis matrix".
  • similar cell types: are there cell types that could confound the analysis? For example, cell types that have very similar profiles or marker genes.
  • missing major cell types / technical issues: do you expect the sequencing procedure to make a big difference between bulk and single-cell data even when the technique is the same? Sometimes single-cell reference data lose information for some cell types. For example, there may be fat cells in your bulk samples that are absent from the single-cell data.
  • deconvolution using a single reference dataset: did you first try a single reference dataset to check whether the results make sense in general? Have you tried other methods such as Bisque or CIBERSORTx? If results from other "one-reference" deconvolution methods make more sense, you can input these directly into our ENSEMBLE step.
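
Several of the checks above can be run on a plain count matrix before calling any SCDC function. The following is a minimal base-R sketch on simulated data. It assumes human gene symbols ("MT-" prefix for mitochondrial genes, "RPL"/"RPS" for ribosomal genes); the helper names, the toy genes, and the `min_cells` threshold are hypothetical and not part of the SCDC API.

```r
set.seed(1)

# Toy single-cell count matrix (genes x cells); all names are hypothetical.
genes <- c("MT-CO1", "RPL13", "RPS6", "INS", "GCG", "KRT19", "LOWGENE")
cell_type <- rep(c("alpha", "beta"), each = 5)
sc_counts <- matrix(rpois(length(genes) * length(cell_type), lambda = 20),
                    nrow = length(genes),
                    dimnames = list(genes, paste0("cell", seq_along(cell_type))))
sc_counts["LOWGENE", ] <- 0  # a gene that is never detected

# 1. Data format: raw counts should be non-negative whole numbers.
is_raw_counts <- function(m) all(m >= 0) && all(m == round(m))

# 2. Gene filtering: drop mitochondrial, ribosomal, and rarely detected genes.
#    `min_cells` is an arbitrary illustrative threshold.
filter_genes <- function(m, min_cells = 3) {
  is_mito <- grepl("^MT-", rownames(m))
  is_ribo <- grepl("^RP[LS]", rownames(m))
  is_low  <- rowSums(m > 0) < min_cells
  m[!(is_mito | is_ribo | is_low), , drop = FALSE]
}

# 3. Library size: mean total counts per cell within each cell type.
#    Ratios between types are what the cell-size assumption relies on.
lib_size_by_type <- function(m, cell_type) {
  tapply(colSums(m), cell_type, mean)
}

# 4. Similar cell types: correlation between mean expression profiles.
type_profile_cor <- function(m, cell_type) {
  profiles <- sapply(split(seq_len(ncol(m)), cell_type),
                     function(idx) rowMeans(m[, idx, drop = FALSE]))
  cor(profiles)
}

raw_ok    <- is_raw_counts(sc_counts)
filtered  <- filter_genes(sc_counts)
lib_sizes <- lib_size_by_type(sc_counts, cell_type)
type_cor  <- type_profile_cor(filtered, cell_type)

print(raw_ok)
print(rownames(filtered))
print(lib_sizes["alpha"] / lib_sizes["beta"])  # library-size ratio between types
print(type_cor)
```

A library-size ratio far from what is known about the real cell sizes, or an off-diagonal correlation close to 1, would be a hint to revisit the corresponding item in the list above.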
