mappedtensor's People

Contributors

dylanmuir, farhi, ingiehong, morroth, patricialuna

mappedtensor's Issues

a=cast(a) could be implemented with strClass

When casting the entire MappedTensor, it seems this could be a trivial operation: just return the header with strClass changed. Then there would be no need to bring the array into memory, as happens currently.

>> a=MappedTensor(100);
>> b=single(a);
Warning: --- MappedTensor: Warning: This command will allocate memory for the
entire tensor! 
> In MappedTensor.MappedTensor>MappedTensor.cast at 1368
  In MappedTensor.MappedTensor>MappedTensor.single at 1352 

Add overloaded `prod` method

Hi Dylan,

A small bug: an invalid imaginary index is not detected as such, and a misleading error is reported instead:

>> a=MappedTensor(1);
>> a(i)=1
Error using MappedTensor>ConvertColonsCheckLims (line 1861)
*** MappedTensor: Index exceeds matrix dimensions.
Error in MappedTensor>mt_write_data (line 1715)
   [vnLinearIndices, vnDataSize] = ConvertColonsCheckLims(sSubs.subs,
   vnTensorSize, hRepSumFunc);
Error in MappedTensor/subsasgn (line 503)
            mt_write_data(mtVar.hShimFunc, mtVar.hRealContent, subs,
            mtVar.vnOriginalSize, mtVar.strClass, mtVar.nHeaderBytes, tfData ./
            mtVar.fRealFactor, mtVar.bBigEndian, mtVar.hRepSumFunc,
            mtVar.hChunkLengthFunc); 

Also I wonder if you would consider adding these utility functions?

length
ndims
mean
prod
isfloat
isinteger

>> a=MappedTensor(1,'class','single');
>> isfloat(a)
ans =
     0
>> a=MappedTensor(1,'class','int8');
>> isinteger(a)
ans =
     0

Also, would you consider setting properties (SetAccess = private, GetAccess = public) so we can see the internals of the class using the details() function?

Last thing, this is a little tricky, what should MappedTensor return for class()? Should it be of the contained values (single, double, int8, etc.)? I'm asking since a lot of (my) code checks for class and so MappedTensors require extra handling.

Thanks for your help!
Mark

Overloaded `plus` method bug

Sorry, another bug report.

>> a=MappedTensor(100);
>> a=a+1;
Error using MappedTensor/SliceFunction (line 1128)
Not enough input arguments.
Error in  +  (line 772)
         SliceFunction(mtVar, fhAddition, nSliceDim); 

Indexing a mapped tensor after permutation

Hi,

First off, fantastic utility and much appreciated. Makes handling my 5-10GB data sets so much easier than the rather clunky and slow Matlab native memmapfile. I am using the MappedTensor version from Matlab File Exchange. I have a particular use case wherein I permute my tensor, and then attempt to populate with some values. Simple example:

mt = MappedTensor(100,200,10);
mt = permute(mt,[3,1,2]);
[nr,nc,nf] = size(mt);
mt(:,:,1) = randn(nr,nc);

This gives the following error ...

Error using mapped_tensor_shim
*** mapped_tensor_shim: In an assignment A(I) = B, the number of elements in B and I must be the
same.

Error in MappedTensor>mt_write_data (line 1837)
hShimFunc('write_chunks', hDataFile, mnFileChunkIndices, vnUniqueDataIndices, vnDataSize, strClass, nHeaderBytes, cast(tData, strClass), double(bBigEndian));

Error in indexing (line 598)
mt_write_data(mtVar.hShimFunc, mtVar.hRealContent, S, mtVar.vnOriginalSize, mtVar.strStorageClass, mtVar.nHeaderBytes, tfData, mtVar.bBigEndian, mtVar.hRepSumFunc, mtVar.hChunkLengthFunc);

Should I be accessing the permuted tensor differently? Any help would be appreciated.

Cheers,

Robert

Shortcut casting still causes errors

As before, the order of the cast/scale operations is fixed so it is going to be wrong some of the time. I don't think there is an easy remedy :(

>> a=MappedTensor(1);
>> a(1)=1.3;
>> a=uint8(2*a);a(1)
ans =
    2
>> a=1.3;
>> a=uint8(2*a);a(1)
ans =
    3

Constructor fails with complex input

When converting a regular matrix to a MappedTensor, it fails if the matrix is complex.

>> a=MappedTensor(1,'Convert');
>> a=MappedTensor(1i,'Convert');
Error using mapped_tensor_shim
*** mapped_tensor_shim: 'bReadOnly' must be a logical value.

Error in MappedTensor/make_complex (line 1389)
         mtVar.hCmplxContent = mtVar.hShimFunc('open', mtVar.strCmplxFilename, mtVar.strMachineFormat,
         mtVar.bReadOnly);

Error in MappedTensor (line 382)
            make_complex(mtVar);

Error in MappedTensor (line 270)
                        mtVar = MappedTensor(size(tfSourceTensor), varargin{:}, 'Like', tfSourceTensor);

Shortcut casting causes errors

It looks like the strClass way of casting added here causes problems if the scaling factor is not 1. The order of the operations is scale then cast with the MappedTensor, even if they were applied in a different order, which creates inconsistencies.

Example:

>> a=MappedTensor(1);
>> a(1)=1.5;
>> a=2*uint8(a);
>> a(1)
ans =
    3
>> a=1.5;
>> a=2*uint8(a);
>> a(1)
ans =
    4
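
The mismatch comes down to where the rounding happens. A minimal sketch in plain MATLAB (no MappedTensor involved; the variable names are only illustrative) shows that the two orders give different results for the same input, which appears to be what the example above runs into:

```matlab
x = 1.5;

% Cast first, then scale: uint8(1.5) rounds to 2, and 2*2 gives 4.
castThenScale = 2 * uint8(x);

% Scale first, then cast: 2*1.5 is exactly 3, so the cast leaves 3.
scaleThenCast = uint8(2 * x);

fprintf('cast then scale: %d\n', castThenScale);
fprintf('scale then cast: %d\n', scaleThenCast);
```

A single stored scale factor applied in one fixed order can only reproduce one of these two results.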

I really like the shortcut way of doing things but it would need to track the order of the operations if there are more than 1 of them, as discussed a little bit here.

Probably best to revert this change in the meantime - sorry for the lack of insight when requesting it!

class char silently converts to double

While tinkering with the code to add a new constructor I kept seeing this problem with char arrays being converted to double.

>> b=MappedTensor(1,3,'class','char')
b = 
  MappedTensor object, containing: char [1 3].
  Methods
>> b(:)='t'
b = 
  MappedTensor object, containing: double [1 3].
  Methods

It seems to happen on lines 303-4

         % - Initialise dimension order
         mtVar.vnDimensionOrder = 1:numel(vnTensorSize);

but that doesn't make any sense to me, so I'm kind of stuck. I don't use char arrays much, but it's bizarre...

Data is corrupted when subsampling a large file (8 GB) on Windows

Dear Doctor Muir,

I am having a problem displaying a sub-sampled version of an 8 GB mapped tensor file with the "imagesc" function.
If I choose the x and y gap per frame to be equal to or greater than 5, the image is destroyed at the 234th frame, which is around 2 GB of data.
It always happens at the 234th frame, regardless of where I start displaying. For example, if I start at frame 234 the image is destroyed immediately.
The data type used was "single".
Displaying all the frames with a sub-sampling of 4x4 or less results in no errors at all.
The mex'ed version was used. Trying to use the non-mex'ed version did not work.
I would be happy to get your insight into why this is happening.

Best regards,
Jerome

Direct conversion of a tensor to a `MappedTensor` in the constructor

Do you think this is a good idea for a constructor?

>> a=rand(50,'single');
>> a=MappedTensor(a);
Error using MappedTensor (line 266)
*** MappedTensor: Error: 'vnTensorSize' must be a positive integer vector.

It gives a simple way to dump the variable to disk if it's getting large, e.g.

>> a=randn(50);
>> w=whos('a');
>> if w.bytes > 1e9; a=MappedTensor(a); end

There is obviously some ambiguity if the argument is a vector so maybe it would need a preceding string, e.g. a=MappedTensor('convert',a), or something like that.

Empty index causes error

>> a=MappedTensor(1);
>> a([])
Error using mapped_tensor_shim
*** mapped_tensor_shim: Could not allocate data buffer.

Error in MappedTensor>mt_read_data (line 1763)
   tData = hShimFunc('read_chunks', hDataFile, mnFileChunkIndices, vnLinearIndices,
   vnReverseSort, vnDataSize, strClass, nHeaderBytes, double(bBigEndian));

Error in MappedTensor/my_subsref (line 466)
            tfData = mtVar.fRealFactor .* mt_read_data(mtVar.hShimFunc,
            mtVar.hRealContent, S, vnReferencedTensorSize, mtVar.strStorageClass,
            mtVar.nHeaderBytes, mtVar.bBigEndian, mtVar.hRepSumFunc,
Error in MappedTensor/subsref (line 375)
               [varargout{1}] = my_subsref(mtVar, subs);

As compared to a standard matrix:

>> b=zeros(1);
>> b([])
ans =
     []

Also the empty matrix is not caught in the constructor:

>> a=MappedTensor([]);
Attempted to access vnTensorSize(0); index must be a positive integer or logical.

Error in MappedTensor (line 312)
         if (vnTensorSize(end) == 1) && (numel(vnTensorSize) > 2)

Sorry for all these nit-picky bugs, I'm just putting them up here as I run into them.

Allow integer types for index variables

In MappedTensor.m, you could generalize the validation check to allow all integer types (int16, int32, etc.). Line 1590: validateattributes(oRefs, {'numeric'}, {'integer', 'real', 'positive'});

For the validateattributes() function, Matlab's documentation states:

  "The string 'numeric' is an
   abbreviation for the classes uint8, uint16, uint32, uint64,
   int8, int16, int32, int64, single, double."
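
As a quick check of the behaviour described above, here is a small sketch using only the built-in validateattributes. The index vector `idx` is a made-up example and not taken from MappedTensor.m:

```matlab
idx = uint32([1 2 3]);   % hypothetical index vector, for illustration only

% Accepts any numeric class, including the integer types.
validateattributes(idx, {'numeric'}, {'integer', 'real', 'positive'});

% Restricting the class list to floating point rejects the same indices.
try
    validateattributes(idx, {'single', 'double'}, {'integer', 'real', 'positive'});
    disp('accepted');
catch err
    disp(['rejected: ' err.message]);
end
```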

Works for me. Thanks for creating this file, it's a real improvement over memmapfile.

Error with uint indexes

I was getting this error when accessing TIFFStack data using uint32 indexes:

Error using TIFFStack>isvalidsubscript
*** TIFFStack: Subscript indices must either be real positive integers or logicals.

I solved this by changing line 1488 in TIFFStack.m from:
validateattributes(oRefs, {'single', 'double'}, {'integer', 'real', 'positive'});
to:
validateattributes(oRefs, {'numeric'}, {'integer', 'real', 'positive'});

Seems like a logical fix to me, but if I'm not understanding something please let me know.

Cheers,
Marcel

340 Gigabyte data and error while creating .mex from mapped_tensor_shim.c

Dylan,
I use Windows 7 x64. When I try to compile the .mex function from mapped_tensor_shim.c I get an error. I deleted the whole of line 33, which contains "#define UINT64_C(c) c ## i64", and it then compiled normally. I could use MappedTensor.

I tried to work with 340 Gigabytes of data.
mtVar = MappedTensor([r_path r_file],[8240 42654189], 'Class', 'uint8');
So I have a uint8 matrix with size(mtVar) = [8240 42654189]. Then I use the following command to get the data (I need 42654189 elements, which weigh about 42 Megabytes):
tic; mtVar(1,:); toc
The elapsed time is almost 9 hours. By the way, it doesn't require much RAM or CPU, unlike memmapfile, which consumes 5 Gigabytes of RAM (all my RAM) in 5 minutes and hangs my machine.
Is it possible to speed up access to such data?
By the way, if I try to get 10^5 elements:
tic; mtVar(1,1:10^5); toc
It takes about 66 seconds. If I then rerun this command, the elapsed time is less than 1 second. Why?

dylan

Does it have to inherit from a handle object?

One thing that is tricky with handle object is that copies all point to the same object. This can be non-intuitive - best illustrated with an example:

>> a=MappedTensor(1,'Convert');
>> b=2*a;
>> a(1)

ans =

     2
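
The same aliasing can be seen with any handle class. A minimal sketch using the built-in containers.Map (chosen only because it is a handle class that needs no extra files; it is unrelated to MappedTensor itself):

```matlab
m1 = containers.Map();   % containers.Map is a built-in handle class
m1('x') = 1;

m2 = m1;                 % copies the handle, not the underlying data
m2('x') = 2;

disp(m1('x'));           % the change made through m2 is visible via m1
```

With a value class, the assignment `m2 = m1` would instead behave like a copy, and `m1` would keep its original contents.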

I made a few changes to avoid the handle inheritance. The main thing was to declare a cleanup function (trivial) and it seems to work just fine. I haven't fully tested it so there may be some other side-effects.

I've uploaded the changes here. Sorry I don't really understand how github works properly to do clever things like a pull request.
