downsample

Downsample the columns of a count matrix so that every column has the same sum.

Installation

devtools::install_github('scottyler89/downsample')

Examples

How to downsample a count matrix

library("downsample")
## For illustrative purposes, we'll generate a
## negative binomial distributed count matrix
## (20000 rows x 1000 columns, size = 0.1, prob = 0.5)
in_mat <- matrix(rnbinom(1000 * 20000, size = 0.1, prob = 0.5), ncol = 1000, nrow = 20000)

In the context of scRNAseq, you can think of this as your count matrix with cells in columns and genes in rows.

Next, you'll typically want to look at the distribution of the column sums (colSums) and filter out the low-sum garbage columns that frequently make it through.

mat_col_sums <- colSums(in_mat)
hist(log10(mat_col_sums),breaks=20)
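
As a numeric complement to the histogram, the sketch below prints quantiles of the column sums and the fraction of columns that would survive a few candidate cutoffs. It runs on a small simulated matrix so that it is self-contained; the names toy, candidate_cutoffs and frac_kept are illustrative, and picking cutoffs from quantiles is an assumption made here, not something the package prescribes.

```r
set.seed(3)
# Small simulated matrix so the snippet runs on its own (hypothetical data)
toy <- matrix(rnbinom(500 * 100, size = 0.1, prob = 0.05), nrow = 500, ncol = 100)
toy_col_sums <- colSums(toy)

# Quantiles give a numeric view of the same distribution the histogram shows
print(quantile(toy_col_sums, probs = c(0.01, 0.05, 0.25, 0.5)))

# Fraction of columns retained at a few candidate cutoffs
# (quantile-based candidates are an assumption for illustration)
candidate_cutoffs <- quantile(toy_col_sums, probs = c(0.01, 0.05, 0.10))
frac_kept <- sapply(candidate_cutoffs, function(cut) mean(toy_col_sums >= cut))
print(frac_kept)
```

On real data the cutoff is usually still chosen by eye from the histogram; a table like this only helps to see how many cells each choice would cost.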

In real scRNAseq data this spread will be much bigger, will likely be multimodal, and will almost always have a long tail of low-depth cells. In a more typical case, we'll filter out that long tail of low-count cells and use a cutoff for the minimum number of UMIs that we actually want to include in the final cleaned dataset.

umi_cutoff <- 1900
keep_cols <- which(mat_col_sums>=umi_cutoff)
## in this case we're keeping ~95% of cells as noted below
print(length(keep_cols)/dim(in_mat)[2])
## now we'll actually do the downsampling
out_mat <- downsample_mat(in_mat[,keep_cols])
## Now you'll see that all 'cells' have the same total UMI
print(head(colSums(out_mat)))
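
To give an intuition for what equalizing column sums involves, here is a minimal base R sketch that draws a fixed number of counts from each column without replacement. It is not the package's implementation: downsample_col_sketch is a hypothetical helper written for this illustration, and using the smallest column sum as the target depth is an assumption.

```r
# Hypothetical helper, not part of the downsample package:
# draw `target` counts without replacement from one column.
downsample_col_sketch <- function(counts, target) {
  stopifnot(sum(counts) >= target)
  # Expand the column into one entry per observed count, labelled by row index
  pool <- rep(seq_along(counts), times = counts)
  # Sample `target` of those entries without replacement
  picked <- pool[sample.int(length(pool), target)]
  # Tally the sampled entries back into one slot per row
  tabulate(picked, nbins = length(counts))
}

set.seed(1)
toy <- matrix(rnbinom(50 * 4, size = 2, prob = 0.2), nrow = 50, ncol = 4)
# Assumption: downsample every column to the smallest column sum
target <- min(colSums(toy))
toy_ds <- apply(toy, 2, downsample_col_sketch, target = target)
print(colSums(toy_ds))
```

Because the draw is without replacement, no entry in the result can exceed the corresponding original count, and every column ends up with exactly the target sum.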

If you're using downsampling as a way to normalize for UMI depth across different datasets that will be merged, then you'll want to use the same cutoff for each dataset!
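
A small sketch of the filtering step with one shared cutoff is shown below. The two matrices are simulated stand-ins for separate experiments, and the names datasets, shared_cutoff and filtered are illustrative only; the downsampling itself is left to downsample_mat, used as in the example above.

```r
set.seed(2)
# Two simulated datasets standing in for separate experiments (hypothetical data)
datasets <- list(
  batch_a = matrix(rnbinom(200 * 30, size = 0.5, prob = 0.2), nrow = 200, ncol = 30),
  batch_b = matrix(rnbinom(200 * 40, size = 0.5, prob = 0.1), nrow = 200, ncol = 40)
)

# One cutoff, chosen once and reused for every dataset
shared_cutoff <- 400

# Keep only the columns at or above the shared cutoff; drop = FALSE keeps
# the result a matrix even if a single column survives
filtered <- lapply(datasets, function(m) m[, colSums(m) >= shared_cutoff, drop = FALSE])

# Fraction of cells each dataset retains under the shared cutoff
print(sapply(names(datasets), function(n) ncol(filtered[[n]]) / ncol(datasets[[n]])))
```

The deeper dataset will usually retain more of its cells, which is expected: the point of the shared cutoff is that every retained cell in every dataset has at least the same minimum depth.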
