Comments (13)
Hello! Just checking, were you able to resolve this? I'm facing the same issue using CellBender v0.2.0.
from cellbender.
Just to add that I'm also facing the same issue, and @GreenGilad's suggestion doesn't work for me. I'm using 10X V3 and the latest CellBender 0.2, running on macOS Catalina 10.15.7. These are the errors I get when I try GreenGilad's solution:
> data.data <- Read10X_h5(filename = data.file, use.names = TRUE)
Error in `[[.H5File`(infile, paste0(genome, "/", feature_slot)) :
An object with name matrix/gene_names does not exist in this group
> f <- hdf5r::H5File$new(data.file, mode="a")
Error in H5File.open(filename, mode, file_create_pl, file_access_pl) :
HDF5-API Errors:
error #000: ../../src/hdf5-1.12.0/src/H5F.c in H5Fopen(): line 793: unable to open file
class: HDF5
major: File accessibility
minor: Unable to open file
error #001: ../../src/hdf5-1.12.0/src/H5VLcallback.c in H5VL_file_open(): line 3500: open failed
class: HDF5
major: Virtual Object Layer
minor: Can't open object
error #002: ../../src/hdf5-1.12.0/src/H5VLcallback.c in H5VL__file_open(): line 3465: open failed
class: HDF5
major: Virtual Object Layer
minor: Can't open object
error #003: ../../src/hdf5-1.12.0/src/H5VLnative_file.c in H5VL__native_file_open(): line 100: unable to open file
class: HDF5
major: File accessibility
minor: Unable to open file
error #004: ../../src/hdf5-1.12.0/src/H5Fint.c in H5F_open(): line 1590: file is already open for read-only
class: HDF5
major: File accessibility
minor: Unable to open file
I have also tried to load the same data with scanpy, and I get an error there too. I noticed that another issue already covers this, so I'll add the details of my error to #57.
Okay, so the if statement that @GreenGilad mentioned above is indeed the problem. Since we create the file using PyTables, it will have a PYTABLES_FORMAT_VERSION attribute.
My proposal is to substitute their current if statement
https://github.com/satijalab/seurat/blob/fe93b05745e55ec2f66e3f0b4c4196aad9f4d5a7/R/preprocessing.R#L1155
with
if (hdf5r::existsGroup(infile, 'matrix'))
For now, I think the code below is a potential workaround. I may try to submit a pull request to Seurat to incorporate this, as I think it makes more sense than relying on a version attribute from PyTables.
library(Matrix)

# Read a CellBender (or CellRanger) H5 file, detecting the layout from the
# presence of the 'matrix' group rather than from a PyTables attribute.
ReadCB_h5 <- function(filename, use.names = TRUE, unique.features = TRUE) {
  if (!requireNamespace('hdf5r', quietly = TRUE)) {
    stop("Please install hdf5r to read HDF5 files")
  }
  if (!file.exists(filename)) {
    stop("File not found")
  }
  infile <- hdf5r::H5File$new(filename = filename, mode = 'r')
  # Close the file even if an error occurs below, so no handle is left open
  on.exit(infile$close_all(), add = TRUE)
  genomes <- names(x = infile)
  output <- list()
  if (hdf5r::existsGroup(infile, 'matrix')) {
    # cellranger version 3
    message('CellRanger version 3+ format H5')
    if (use.names) {
      feature_slot <- 'features/name'
    } else {
      feature_slot <- 'features/id'
    }
  } else {
    message('CellRanger version 2 format H5')
    if (use.names) {
      feature_slot <- 'gene_names'
    } else {
      feature_slot <- 'genes'
    }
  }
  for (genome in genomes) {
    counts <- infile[[paste0(genome, '/data')]]
    indices <- infile[[paste0(genome, '/indices')]]
    indptr <- infile[[paste0(genome, '/indptr')]]
    shp <- infile[[paste0(genome, '/shape')]]
    features <- infile[[paste0(genome, '/', feature_slot)]][]
    barcodes <- infile[[paste0(genome, '/barcodes')]]
    # The file stores zero-based row indices; sparseMatrix expects one-based
    sparse.mat <- sparseMatrix(
      i = indices[] + 1,
      p = indptr[],
      x = as.numeric(x = counts[]),
      dims = shp[],
      giveCsparse = FALSE
    )
    if (unique.features) {
      features <- make.unique(names = features)
    }
    rownames(x = sparse.mat) <- features
    colnames(x = sparse.mat) <- barcodes[]
    sparse.mat <- as(object = sparse.mat, Class = 'dgCMatrix')
    # Split v3 multimodal
    if (infile$exists(name = paste0(genome, '/features'))) {
      types <- infile[[paste0(genome, '/features/feature_type')]][]
      types.unique <- unique(x = types)
      if (length(x = types.unique) > 1) {
        message("Genome ", genome, " has multiple modalities, returning a list of matrices for this genome")
        sparse.mat <- sapply(
          X = types.unique,
          FUN = function(x) {
            return(sparse.mat[which(x = types == x), ])
          },
          simplify = FALSE,
          USE.NAMES = TRUE
        )
      }
    }
    output[[genome]] <- sparse.mat
  }
  # Return the matrix itself when there is only one genome
  if (length(x = output) == 1) {
    return(output[[1]])
  } else {
    return(output)
  }
}
Loading a CellBender remove-background output file in a legacy CellRanger v2 format:
Loading a CellBender remove-background output file in the newer CellRanger v3+ format:
(I had to subset to "Gene Expression" to successfully use CreateSeuratObject)
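To illustrate why checking for the group behaves differently from checking for the attribute, here is a minimal sketch on a toy file. It assumes hdf5r is installed. The toy file, its attribute value "2.1", and the way the attribute-based test is written are my assumptions for illustration; the test is not copied from Seurat.

```r
library(hdf5r)

# Toy file mimicking the situation described above: a v3-style 'matrix'
# group in a file that also carries a PyTables root attribute.
# (File contents and the attribute value are made up for this sketch.)
toy_path <- tempfile(fileext = ".h5")
toy <- H5File$new(toy_path, mode = "w")
toy$create_group("matrix")
h5attr(toy, "PYTABLES_FORMAT_VERSION") <- "2.1"
toy$close_all()

infile <- H5File$new(toy_path, mode = "r")
# Attribute-based test (assumed form): any file carrying the PyTables
# attribute is treated as the older layout.
v3_by_attribute <- !infile$attr_exists("PYTABLES_FORMAT_VERSION")
# Group-based test proposed above: looks at the layout itself.
v3_by_group <- existsGroup(infile, "matrix")
infile$close_all()

print(c(by_attribute = v3_by_attribute, by_group = v3_by_group))
```

On a file like this the two tests disagree, which is the mismatch that makes the reader look for matrix/gene_names.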
A temporary workaround that seemed to solve the problem in the CellRanger V3 case (where the matrix group has the features group) is to remove the "PYTABLES_FORMAT_VERSION" attribute using:
f <- hdf5r::H5File$new("file_name.h5", mode="a")
f$attr_delete("PYTABLES_FORMAT_VERSION")
f$close_all()
In the CellRanger V2 case, the attribute is present and the "gene_names" slot is also present, so there are no problems there.
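A slightly more defensive variant of this workaround, as a minimal sketch: it edits a copy rather than the original output and only deletes the attribute if it is there. It assumes hdf5r is installed. The helper name strip_pytables_attr and the toy file are hypothetical and not part of hdf5r, Seurat, or CellBender.

```r
library(hdf5r)

# Hypothetical helper: copy the file, then drop the root attribute from the
# copy only if it is present. Returns (invisibly) whether it was removed.
strip_pytables_attr <- function(path, out_path,
                                attr_name = "PYTABLES_FORMAT_VERSION") {
  stopifnot(file.copy(path, out_path, overwrite = TRUE))
  f <- H5File$new(out_path, mode = "a")
  # Close the handle even if something below fails
  on.exit(f$close_all(), add = TRUE)
  removed <- f$attr_exists(attr_name)
  if (removed) {
    f$attr_delete(attr_name)
  }
  invisible(removed)
}

# Toy demonstration on a made-up file that only carries the attribute
src <- tempfile(fileext = ".h5")
toy <- H5File$new(src, mode = "w")
h5attr(toy, "PYTABLES_FORMAT_VERSION") <- "2.1"
toy$close_all()

dst <- tempfile(fileext = ".h5")
was_removed <- strip_pytables_attr(src, dst)

# Re-open the copy read-only to confirm the attribute is gone
check <- H5File$new(dst, mode = "r")
still_there <- check$attr_exists("PYTABLES_FORMAT_VERSION")
check$close_all()
```

Working on a copy keeps the original CellBender output untouched in case other tools rely on the attribute.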
Thanks for letting me know, this is helpful. Are you using Seurat3?
Yes. Using Seurat 3.2 (which I recently updated to from 3.1).
I compared the code of the Read10X_h5 function with the previous version to check that the problem isn't on the Seurat side.
Thanks. Okay, yes, I will definitely make it a priority to have Seurat input compatibility by the time we release this version officially.
Just to add, I am using CellRanger v3. Looking into the source code, it seems like having CellRanger v3 will alter the group creation for the output file generation, as compared to v0.1.0 where the group creation is standard regardless of which Cell Ranger version. Thank you so much for your help on this!
Oh, I see I forgot my promise above! I will take another look and see if I can figure out a small change that will enable Seurat 3 to automatically read these h5 files. I admit I do not yet understand exactly what the issue is.
scanpy loading has (hopefully) been addressed in a satisfactory way in #57.
I will continue to look into Seurat loading.
I'm not sure if my issue is related, but I'm mostly interested in understanding the issue here.
For example, if I try to read this 'all_timepoints_subsampled.h5' file into R, I get this error:
DA <- Read10X_h5(filename = subset_h5_path,
use.names = F,
unique.features = TRUE)
Error in `[[.H5File`(infile, paste0(genome, "/shape")) :
An object with name X/shape does not exist in this group
I tried all the solutions mentioned above, but none seems to work.
If I understood correctly, the problem is that the h5 file was generated with Cell Ranger (version 2.1.0) and aligned to GRCh37/hg19, and this file format is deprecated and no longer supported by Seurat, which is why I get the error message, right? Similar to what happened in satijalab/seurat#732.
I'm running Seurat v3.3.3 on R v3.6.2.
Thanks for any elucidation.
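One way to see which layout a given file actually has is to list its contents before calling a reader. Below is a minimal sketch, assuming hdf5r is installed. The toy file and its top-level group X are made up to mimic the error above: the reader treats each top-level name as a genome and then looks for <name>/shape.

```r
library(hdf5r)

# Made-up toy file with a top-level group 'X' that has no 'shape' dataset
toy_path <- tempfile(fileext = ".h5")
toy <- H5File$new(toy_path, mode = "w")
grp <- toy$create_group("X")
grp[["data"]] <- c(1, 2, 3)
grp[["indices"]] <- c(0L, 1L, 2L)
grp[["indptr"]] <- c(0L, 3L)
toy$close_all()

infile <- H5File$new(toy_path, mode = "r")
# Recursive listing of every group and dataset in the file
contents <- infile$ls(recursive = TRUE)
# The dataset the reader asks for in the error message above
has_shape <- infile$exists("X/shape")
infile$close_all()

print(contents$name)
print(has_shape)
```

If the listing shows neither a matrix group nor per-genome groups containing shape, the file is probably not in a layout that the readers discussed in this thread expect.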
@Ni-Ar yes I think you are probably right about that