clustifyrdata's Introduction

clustifyrdata

!!!! This repository has been archived. Please see clustifyrdatahub. !!!!

clustifyrdata provides 42 external data sets for cell-type assignment with clustifyr and reproducible scripts to build data objects.

Commonly used references:

name                   desc                                              ntypes  ngenes  org    from_pub
ref_MCA                Mouse Cell Atlas                                  713     8601    mouse  from
ref_tabula_muris_drop  Tabula Muris (10X)                                112     23341   mouse  from
ref_tabula_muris_facs  Tabula Muris (SmartSeq2)                          175     23341   mouse  from
ref_mouse.rnaseq       Mouse RNA-seq from 28 cell types                  28      21214   mouse  from
ref_moca_main          Mouse Organogenesis Cell Atlas (main cell types)  37      26183   mouse  from
ref_immgen             Mouse sorted immune cells                         253     22134   mouse  from
ref_hema_microarray    Human hematopoietic cell microarray               38      21246   human  from
ref_cortex_dev         Human cortex development scRNA-seq                47      56864   human  from
ref_pan_indrop         Human pancreatic cell scRNA-seq (inDrop)          14      20125   human  from
ref_pan_smartseq2      Human pancreatic cell scRNA-seq (SmartSeq2)       12      25525   human  from

See the reference page for the available data sets and the individual ref download page. Additionally, these data sets will be made available as a Bioconductor ExperimentHub package (clustifyrdatahub).

Data sets have uniform prefixes / suffixes:

  • ref_* : the prebuilt reference expression matrix.

  • *_matrix : single-cell RNA expression matrix.

  • *_avg : average expression calculated from a single-cell RNA expression matrix.

  • *_meta : metadata from a single-cell RNA-seq experiment.

  • *_vargenes : variable genes used for dimension reduction, determined by Seurat.

  • *_markers : marker genes determined by Seurat.

  • *_M3Drop : variable genes used for dimension reduction as determined by M3Drop.
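
The naming scheme above can be checked mechanically. The following is a minimal sketch in base R, assuming only the prefixes and suffixes listed; `describe_dataset()` is a hypothetical helper written for illustration and is not part of clustifyrdata.

```r
# Hypothetical helper (not provided by clustifyrdata): map a data set name
# to the kind of object implied by its prefix or suffix.
describe_dataset <- function(name) {
  kinds <- c(
    "^ref_"      = "prebuilt reference expression matrix",
    "_matrix$"   = "single-cell RNA expression matrix",
    "_avg$"      = "average expression from a single-cell matrix",
    "_meta$"     = "metadata from a single-cell RNA-seq experiment",
    "_vargenes$" = "variable genes (Seurat)",
    "_markers$"  = "marker genes (Seurat)",
    "_M3Drop$"   = "variable genes (M3Drop)"
  )
  # Test the name against each anchored pattern; keep the first match.
  hit <- vapply(names(kinds), grepl, logical(1), x = name)
  if (any(hit)) unname(kinds[hit][1]) else NA_character_
}

# Example names taken from the tables in this README.
vapply(
  c("ref_MCA", "pan_indrop_meta", "pbmc4k_markers_M3Drop"),
  describe_dataset,
  character(1)
)

# With the package installed, data(package = "clustifyrdata") lists the
# real object names to feed into the helper.
```

Names that follow neither convention, such as older `*_ref` objects, return `NA` with this helper.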

Installation

N.B.: clustifyrdata is a large data package (nearly 350 MB uncompressed).

# install.packages("pak")
pak::pkg_install("rnabioco/clustifyrdata")

clustifyrdata's People

Contributors

jayhesselberth, kriemo, raysinensis, sidhantpuntambekar

clustifyrdata's Issues

problem installing clustifyrdata

After installing devtools, I get the following error when trying to install clustifyrdata:

devtools::install_github("rnabioco/clustifyrdata")

Downloading GitHub repo rnabioco/clustifyrdata@master
✓  checking for file ‘/private/var/folders/fq/_zr6nbj12vlg13hz53xj2vgr0000gn/T/RtmpSyRqhZ/remotes163bb3b598210/rnabioco-clustifyrdata-8582364/DESCRIPTION’ ...
─  preparing ‘clustifyrdata’:
✓  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  looking to see if a ‘data/datalist’ file should be added
   /Library/Frameworks/R.framework/Resources/bin/build: line 10: 91104 Done                    echo 'tools:::.build_packages()'
        91105 Killed: 9               | R_DEFAULT_PACKAGES= LC_COLLATE=C "${R_HOME}/bin/R" --no-restore --slave --args ${args}
Error: Failed to install 'clustifyrdata' from GitHub:
  System command 'R' failed, exit status: 137, stdout + stderr:
E> * checking for file ‘/private/var/folders/fq/_zr6nbj12vlg13hz53xj2vgr0000gn/T/RtmpSyRqhZ/remotes163bb3b598210/rnabioco-clustifyrdata-8582364/DESCRIPTION’ ... OK
E> * preparing ‘clustifyrdata’:
E> * checking DESCRIPTION meta-information ... OK
E> * checking for LF line-endings in source and make files and shell scripts
E> * checking for empty or unneeded directories
E> * looking to see if a ‘data/datalist’ file should be added
E> /Library/Frameworks/R.framework/Resources/bin/build: line 10: 91104 Done                    echo 'tools:::.build_packages()'
E>      91105 Killed: 9               | R_DEFAULT_PACKAGES= LC_COLLATE=C "${R_HOME}/bin/R" --no-restore --slave --args ${args}
> 

tibble issue when building vignettes

@raysinensis I'm seeing lots of these errors when building vignettes. Are these out of date?

   Error: processing vignette 'otherformats.Rmd' failed with diagnostics:
   `df` must be a data frame without row names in `column_to_rownames()`.
   --- failed re-building ‘otherformats.Rmd’
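
For context, tibble's `column_to_rownames()` raises this error when its input already carries row names. The sketch below is a base R illustration of that situation and one way around it, not a diagnosis of the vignette code; the toy data frame and its values are made up.

```r
# Toy data frame that already has (non-automatic) row names.
df <- data.frame(
  gene = c("CD3E", "CD79A"),
  expr = c(1.5, 2.0),
  row.names = c("a", "b")
)

# Drop the existing row names first; this is the base R counterpart of
# tibble::remove_rownames().
rownames(df) <- NULL

# Then promote the column to row names, which is what
# column_to_rownames(df, "gene") would do.
rownames(df) <- df$gene
df$gene <- NULL

df
```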

ERROR when working with seurat4

I have a Seurat v4 object but found that it does not work with clustifyrdata:

> SeuratIntegrate
An object of class Seurat 
30479 features across 191241 samples within 2 assays 
Active assay: integrated (2000 features, 2000 variable features)
 1 other assay present: RNA
 2 dimensional reductions calculated: pca, umap
> res <- clustify(
+     input = SeuratIntegrate,
+     ref_mat = ref_hema_microarray,
+     cluster_col = "integrated_snn_res.0.4"
+ )
found variable genes in SCT slot
Error in object_data.Seurat(s_object, "var.genes", n_genes) : 
  trying to get slot "var.features" from an object of a basic class ("NULL") with no slots

Any ideas on how to solve the problem?

Better annotate data sources

  • Edit for correct links (@source), use DOI when appropriate. Can also add @seealso for other refs.

  • Don't use @format; roxygen2 adds it automatically.

  • Use @family tags to group matrix, meta, etc together.

  • Remove list from README page and link to improved reference.

Change compression of data files

They'll likely be smaller if you use devtools::use_data(..., compress = "xz")

> tools::checkRdaFiles("data")
                                    size ASCII compress version
data/gtex_bulk_matrix.rda        4291717 FALSE    bzip2       2
data/immgen_ref.rda             17368211 FALSE    bzip2       2
data/mouse.rnaseq_ref.rda        3425344 FALSE    bzip2       2
data/pan_indrop_avg.rda          1285269 FALSE    bzip2       2
data/pan_indrop_markers.rda       250580 FALSE    bzip2       2
data/pan_indrop_matrix.rda      19337051 FALSE    bzip2       2
data/pan_indrop_meta.rda          152107 FALSE    bzip2       2
data/pan_indrop_vargenes.rda        9632 FALSE    bzip2       2
data/pan_smartseq2_avg.rda       1803778 FALSE    bzip2       2
data/pan_smartseq2_matrix.rda   31234423 FALSE    bzip2       2
data/pan_smartseq2_meta.rda        42874 FALSE    bzip2       2
data/pan_smartseq2_vargenes.rda    18535 FALSE    bzip2       2
data/pbmc4k_avg.rda              1025525 FALSE    bzip2       2
data/pbmc4k_markers.rda           146440 FALSE    bzip2       2
data/pbmc4k_markers_M3Drop.rda     11754 FALSE    bzip2       2
data/pbmc4k_matrix.rda           8269555 FALSE    bzip2       2
data/pbmc4k_meta.rda               98494 FALSE    bzip2       2
data/pbmc4k_vargenes.rda           10593 FALSE    bzip2       2
data/pbmc5_markers.rda            191771 FALSE    bzip2       2
data/pbmc5_matrix.rda           18104217 FALSE    bzip2       2
data/pbmc5_meta.rda               171490 FALSE    bzip2       2
data/pbmc_bulk_matrix.rda         591426 FALSE    bzip2       2
data/pbmc_pca.rda                   2279 FALSE     gzip       2
data/ref_tabula_muris_drop.rda  10525886 FALSE    bzip2       2
data/ref_tabula_muris_facs.rda  20130052 FALSE    bzip2       2
data/yan_ref.rda                  284913 FALSE    bzip2       2
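
A minimal sketch of the suggestion, using only base R and the tools package on a made-up toy matrix in a temporary directory (nothing here touches the package's real data files). It saves the same object with bzip2 and xz, prints the same report as above, and then recompresses an existing file in place. In current toolchains the `use_data()` helper is provided by usethis rather than devtools.

```r
# Toy stand-in for a package data object (values are made up).
set.seed(1)
toy_matrix <- matrix(rpois(2e5, lambda = 1), nrow = 2000)

# Work in a scratch directory rather than a real package's data/ folder.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
bz_file <- file.path(data_dir, "toy_bzip2.rda")
xz_file <- file.path(data_dir, "toy_xz.rda")

save(toy_matrix, file = bz_file, compress = "bzip2")
save(toy_matrix, file = xz_file, compress = "xz")

# Report size and compression type for every .rda file in the directory.
tools::checkRdaFiles(data_dir)

# Existing files can be recompressed without re-running the build scripts.
# After this call the file named toy_bzip2.rda is xz-compressed.
tools::resaveRdaFiles(bz_file, compress = "xz")
tools::checkRdaFiles(bz_file)
```

Whether xz actually yields smaller files depends on the data, so comparing the `size` column before and after is the safest check.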

Download PBMC5

Have you downloaded the pbmc5 data set (PBMCs of a healthy donor, 5' gene expression)?
I cannot download it successfully. Could you share it with me if you have? Thanks.

s_small3 warning

Is there a way to get rid of this message when loading the package? Is this an attribute that wasn't stripped?

Warning: namespace ‘Seurat’ is not available and has been replaced
by .GlobalEnv when processing object ‘s_small3’
