Comments (3)
Hi, yes, the SHA1 sum on the download page does not seem to match the file. I have tested the file on S3 and it works as expected. I will look into it further. If the download were corrupted, the file could not be opened with archs4py (e.g. archs4py.ls(...)). I am also getting: ae96de0519b9f008b0dc3a9f944ee9007daf2f6a
from archs4.
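For anyone who wants to repeat the checksum comparison locally, here is a minimal sketch using only the standard library. The function names are made up for illustration, and the published value is whatever the download page lists.

```python
import hashlib


def sha1_of_file(path, block_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in blocks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file.
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published(path, published_sha1):
    """Compare the local digest with a published value (hypothetical helper)."""
    return sha1_of_file(path) == published_sha1.strip().lower()
```

Note that a matching digest only shows that the local copy is byte-identical to the hosted file. It says nothing about whether every compressed chunk inside the HDF5 file can be decoded.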
Hi, thanks for looking into this.
I can load most of the data, but am having trouble with these four samples:
- GSM6998368
- GSM6998371
- GSM6998380
- GSM6998386
If I try to load these specific samples with h5py, I get OSError: Can't synchronously read data (inflate() failed), which I thought might mean there is corruption localized to a particular chunk.
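One way to test the localized-corruption idea is to read the affected samples one column at a time and record which reads fail. The sketch below assumes the expression matrix is stored as genes x samples at `data/expression` and the accessions at `meta/samples/geo_accession`. Both paths are assumptions about the file layout, and the function names are hypothetical.

```python
def find_unreadable_columns(dataset, columns):
    """Return {column_index: error message} for columns whose read raises OSError."""
    failures = {}
    for col in columns:
        try:
            # Slicing one column forces every chunk that covers it to be decompressed.
            dataset[:, col]
        except OSError as err:
            failures[col] = str(err)
    return failures


def scan_file(path, accessions,
              expression_path="data/expression",            # assumed dataset path
              accession_path="meta/samples/geo_accession"):  # assumed dataset path
    """Map each requested accession that cannot be read to its error message."""
    import h5py  # imported lazily so the helper above can be used without h5py

    with h5py.File(path, "r") as f:
        raw = f[accession_path][:]
        names = [s.decode() if isinstance(s, bytes) else s for s in raw]
        index = {name: i for i, name in enumerate(names)}
        cols = [index[a] for a in accessions]
        failures = find_unreadable_columns(f[expression_path], cols)
    return {names[col]: msg for col, msg in failures.items()}
```

Running `scan_file(file, ["GSM6998368", "GSM6998371", "GSM6998380", "GSM6998386"])` should list only the accessions whose chunks fail to inflate, if the layout assumptions hold.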
If I try to load these samples with archs4py, it just looks like they have zero counts:
>>> a4.data.samples(file, ["GSM6998368","GSM6998371","GSM6998380","GSM6998386"])[:].sum()
GSM6998368 0
GSM6998371 0
GSM6998380 0
GSM6998386 0
dtype: uint64
>>>
Edit: I noticed archs4py.data.get_sample returns an array of zeros on exception, and doesn't raise. When I modified this function to raise the exception, it was the same exception raised from h5py: OSError: Can't synchronously read data (inflate() failed)
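To illustrate why the zeros are misleading, here is a small sketch contrasting the two behaviors. This is not archs4py's code. `read_lenient` only mimics the fallback described above, `read_strict` shows the alternative, and `reader` is a hypothetical callable that returns the counts for one sample index.

```python
def read_lenient(reader, sample_index, n_genes):
    """Mimic the described fallback: any read error becomes a column of zeros."""
    try:
        return reader(sample_index)
    except Exception:
        # The caller cannot tell a failed read from a sample with no counts.
        return [0] * n_genes


def read_strict(reader, sample_index):
    """Let the read error propagate, adding which sample failed."""
    try:
        return reader(sample_index)
    except OSError as err:
        raise OSError(f"sample index {sample_index} could not be read: {err}") from err
```

With the strict variant, a bad chunk surfaces as an error at load time instead of as zero counts that look like real data downstream.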
Actually, this doesn't seem to be linked to the SHA1 mismatch, because I have the same problem loading 5 samples from mouse_gene_v2.5.h5, even though that file's hash matches the download page:
- GSM3723071
- GSM7230982
- GSM7230984
- GSM7230985
- GSM7230988