Comments (6)
It is not. `TransposedDataset` is a transposed `HDF5.Dataset`; the type was created to avoid reading the entire dataset into memory. So all of `TransposedDataset`'s operations are forwarded to `HDF5.Dataset`, after permuting the indices if necessary. `HDF5.Dataset` does not define `strides` either.
from muon.jl.
Makes sense, except...
An HDF5 `Dataset` is not a Julia `AbstractArray`. Instead it provides `read` and `readmmap` (when possible), which return such an array. The result does support `strides` etc.
In contrast, `TransposedDataset` does claim to be an `AbstractArray`, but does not support `strides`. It also works similarly to an HDF5 `Dataset`, but it does not support `readmmap`, `ismappable` and `iscontiguous`.
So the API of `TransposedDataset` provides neither the complete API of an HDF5 `Dataset` nor the complete API of a Julia `AbstractArray`.
Intuitively, it should do both. That is, provide `readmmap`, `ismappable` and `iscontiguous`, just like an HDF5 `Dataset`; and also provide `strides` (when possible), just like a Julia `AbstractArray`.
Naturally, `readmmap` would need to return the `Transpose` of the memory-mapped matrix (using the `LinearAlgebra` package). That's a zero-copy view of the data, so it would still be memory-mapped.
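To illustrate the zero-copy point without involving HDF5 at all, here is a minimal sketch that uses only the `Mmap` and `LinearAlgebra` standard libraries. The scratch file stands in for a contiguous dataset, and the names `path`, `A` and `At` are made up for the example; nothing here is Muon.jl or HDF5.jl API.

```julia
using Mmap, LinearAlgebra

# Write six Float64 values to a scratch file (a stand-in for a contiguous dataset).
path, io = mktemp()
write(io, Float64.(1:6))
close(io)

# Memory-map the file as a 2x3 column-major matrix.
A = open(path) do f
    Mmap.mmap(f, Matrix{Float64}, (2, 3))
end

# `transpose` wraps the mapped matrix lazily; no data is copied.
At = transpose(A)

@show size(At)          # dimensions are swapped relative to A
@show parent(At) === A  # the wrapper shares the mapped memory
@show strides(A)        # the mapped matrix itself is an ordinary strided Array
```

Since `At` only wraps `A`, anything read through it still comes from the mapping.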
`strides` is not part of the `AbstractArray` interface. `AbstractArray` only mandates that `size` and `getindex` are implemented; everything else is optional. `strides` is part of the strided array interface.
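As a rough sketch of what that means in practice: the wrapper below (a hypothetical `LazyTransposed`, not the actual `TransposedDataset` code) defines only `size` and `getindex`, and generic array functions already work on it, while `strides` has no applicable method.

```julia
# Hypothetical wrapper for illustration: only `size` and `getindex` are defined.
struct LazyTransposed{T} <: AbstractMatrix{T}
    parent::Matrix{T}
end

Base.size(A::LazyTransposed) = reverse(size(A.parent))
Base.getindex(A::LazyTransposed, i::Int, j::Int) = A.parent[j, i]

B = [1 2 3; 4 5 6]
W = LazyTransposed(B)

@show W[3, 1]                        # forwards to B[1, 3]
@show sum(W)                         # generic fallbacks work through getindex
@show collect(W) == permutedims(B)   # materializing gives the transposed matrix
@show applicable(strides, B)         # a plain Matrix is strided
@show applicable(strides, W)         # the wrapper has no `strides` method
```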
Technically correct; however, not supporting `strides` for arrays which are actually strided (memory-mappable vectors and matrices) disables all sorts of optimizations when actually working with these arrays.
My workaround for now is to memory-map the array using the internal `dset` (and transpose it), completely ignoring the fact that `TransposedDataset` is an `AbstractArray`.
Of course this runs into issue #24, so it only works for annotations written by the Python `anndata` package...
Note that the strided array interface also mandates an implementation of `Base.unsafe_convert`, which returns a pointer to the memory block where the array is stored. This is impossible with either `HDF5.Dataset` or `TransposedDataset`.
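For contrast, this is roughly what the pointer requirement looks like for an ordinary in-memory `Array`, where it can be satisfied. It is only a sketch on a plain matrix (the variable names are arbitrary); consumers of the strided interface combine the pointer with `strides` to address elements directly.

```julia
A = Float64[1 3 5; 2 4 6]   # ordinary in-memory matrix
s1, s2 = strides(A)         # element strides along each dimension
i, j = 2, 3

# Keep A alive while working with its raw pointer.
x = GC.@preserve A begin
    p = Base.unsafe_convert(Ptr{Float64}, A)          # pointer to the first element
    unsafe_load(p, 1 + (i - 1) * s1 + (j - 1) * s2)   # 1-based element offset
end

@show x == A[i, j]
```

A dataset that lives on disk and is read on demand has no such memory block to point to, which is the obstacle described above.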
Good point. So I guess the `readmmap` workaround is the only option, which depends on #24 (for data written by the Julia package).
Thanks!