{"id":27290307,"url":"https://github.com/dylanmuir/mappedtensor","last_synced_at":"2025-04-11T21:28:07.354Z","repository":{"id":31185570,"uuid":"34746140","full_name":"DylanMuir/MappedTensor","owner":"DylanMuir","description":"Better memory-mapped files in Matlab","archived":false,"fork":false,"pushed_at":"2020-05-18T20:11:11.000Z","size":1155,"stargazers_count":11,"open_issues_count":2,"forks_count":6,"subscribers_count":2,"default_branch":"master","last_synced_at":"2023-10-20T19:34:03.830Z","etag":null,"topics":["lazy-loading","matlab","matlab-tensor","memory-mapped","tensor"],"latest_commit_sha":null,"homepage":"dylan-muir.com/articles/mapped_tensor","language":"MATLAB","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DylanMuir.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-04-28T17:44:45.000Z","updated_at":"2023-03-08T09:55:03.000Z","dependencies_parsed_at":"2022-08-07T16:15:28.016Z","dependency_job_id":null,"html_url":"https://github.com/DylanMuir/MappedTensor","commit_stats":null,"previous_names":[],"tags_count":1,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DylanMuir%2FMappedTensor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DylanMuir%2FMappedTensor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DylanMuir%2FMappedTensor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DylanMuir%2FMappedTensor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DylanMuir","download_url":"https://codeload.github.com/DylanMuir/MappedTensor/ta
r.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248482391,"owners_count":21111321,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lazy-loading","matlab","matlab-tensor","memory-mapped","tensor"],"created_at":"2025-04-11T21:28:06.735Z","updated_at":"2025-04-11T21:28:07.342Z","avatar_url":"https://github.com/DylanMuir.png","language":"MATLAB","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Transparent lazy data access for matlab\n\nBy default, [Matlab][1] matrices must be fully loaded into memory. This can make allocating and working with\nhuge matrices a pain, especially if you only _really_ need access to a small portion of the matrix at a time.\n[`memmapfile`][2] allows the data for a matrix to be stored on disk, but you can't access the matrix transparently\nin functions that don't expect a [`memmapfile`][2] object without reading in the whole matrix. `MappedTensor` is\na matlab class that looks like a simple matlab tensor, with all the data stored on disk.\n\nA few extra niceties over [`memmapfile`][2] are included, such as built-in per-slice access; fast addition,\nsubtraction, multiplication and division by scalars; fast negation; permutation; complex support.\n\nTensor data is automatically allocated on disk in a temporary file, which is removed when all referencing\nobjects are cleared. Existing binary files can also be accessed. 
`MappedTensor` is a handle class, which means\nthat assigning an existing mapped tensor to another variable _will not_ make a copy, but both variables will point\nto the same data. Changing the data in one variable will change both variables.\n\nMappedTensor internally uses `mex` functions, which need to be compiled the first time MappedTensor is used. If\ncompilation fails then slower, non-mex versions will be used.\n\n## Download and install\n\nDownload [MappedTensor][3]  \nUnzip the `@MappedTensor` directory to somewhere on the [Matlab][1] path. The *@* (at) symbol is important,\nas it signals to [Matlab][1] that this is a class directory. Then type:\n```matlab\naddpath /path/to/MappedTensor\n```\n\n**Note**: MappedTensor provides an accelerated MEX function for performing\nfile reads and writes.  MappedTensor will attempt to compile this\nfunction when a MappedTensor variable is first created.  This requires\nmex to be configured correctly for your system.  If compilation fails,\nthen a slower pure Matlab version will be used.\n\n## Creating a MappedTensor object\n\n```M = MappedTensor([dim1 dim2...], ...)```\n```M = MappedTensor(dim1, dim2, ...)```\n```M = MappedTensor(dim, ...)```\nallocates a MappedTensor object to store an array\nwith specified size [d1 d2...]. When only one dimension value is specified, \na square array [d d] is allocated.\n\n```M = MappedTensor(FILENAME, [d1 d2...], 'FORMAT', class, ...)``` \nconstructs a \nMappedTensor object that re-uses an existing map file FILENAME for an array\nwith dimensions [d1 d2...]. The full size, class, and offset of the file must  \nbe known and specified in advance.  This file will not be removed when all \nhandle references are destroyed.\n\n```M = MappedTensor(ARRAY)``` \nconstructs a MappedTensor object that maps a \nnumeric array ARRAY into a temporary file. 
The array *must be 2D or \nmore* (neither a scalar nor a vector, which would conflict with the dimension syntax).\nThis syntax requires the initial array to be allocated in memory first, which limits the \nsize of the MappedTensor. For large arrays, it is more efficient to \npre-allocate the object with specified dimensions or the 'Size' property\nand then set the content in chunks.\n\n```M = MappedTensor(..., PROP1, VALUE1, PROP2, VALUE2, ...)``` \nconstructs a\nMappedTensor object, and sets the properties of that object that are named in\nthe argument list (PROP1, PROP2, etc.) to the given values (VALUE1, VALUE2,\netc.). All property name arguments must be quoted strings (e.g.,\n'writable'). Any properties that are not specified are given their default\nvalues.\n\n**Note**: When a variable containing a MappedTensor object goes out of scope\nor is otherwise cleared, the memory map is automatically closed.\nYou may also call the DELETE method to force clear the object.\n\n## Tensor properties\n\nAll properties can be accessed with the syntax `M.property`. All these properties can also be set when building the tensor.\n\n| Property | Description |\n|----------|-------------|\n| Data  |  The actual Data |\n| Filename  |  Binary data file name on disk (real part of tensor) |\n| FilenameCmplx  |  Binary data file name on disk (complex part of tensor) |\n| Format  |  The class of this mapped tensor |\n| MachineFormat  |  The desired machine format of the mapped file |\n| Offset  |  The number of bytes to skip at the beginning of the file |\n| Temporary  |  A flag which records whether a temporary file was created |\n| Writable  |  Whether the data may be written to |\n\nThe use of these properties is detailed below, especially when setting up an initial tensor.\n\n##### `Format`: Char string (defaults to 'double').\nFormat of the contents of the mapped region. \nFormat specifies that the mapped data is to be accessed as a single\nvector of type specified by Format's value. 
\nSupported values are 'int8', 'int16', 'int32', 'int64', \n'uint8', 'uint16', 'uint32', 'uint64', 'single', and 'double'.\nComplex arrays are supported. Sparse arrays are not supported.\nYou can later change the storage class of the object with the CAST \nmethod, however this is usually not recommended.\n\n##### `Offset`: Non-negative integer (defaults to 0).\nNumber of bytes from the start of the file to the start of the\nmapped region. Offset 0 represents the start of the file. This \nallows skipping over the beginning of an (existing) binary file, by\nthrowing away the specified number of header bytes. You can use the \nmethods FREAD and FWRITE to access this header region.\n\n##### `Writable`: True or false (defaults to false).\nAccess level which determines whether or not the Data property (see\nbelow) may be assigned to.\nThis property can be changed after object creation.\n\n##### `Temporary`: True or false (defaults to true when created from array).\nWhen false, the associated file is kept when the object is cleared.\nSuch files can be reused later. When the object is created from\nan array, Temporary is true. When creating from an existing map file,\nTemporary is false.\nWhen saving an object, the Temporary state is set to false.\nThis property can be changed after object creation.\n\n##### `MachineFormat`: big-endian ('ieee-be') or little-endian ('ieee-le')\nIf not specified, the machine-native format will be used.\n\n##### `Data`: array\nArray to assign to the mapped object.\nThis property can be changed after object creation.\nYou can also set the Data with syntax: \n```matlab\nset(M, 'Data', array)\nM(:)            = whole_array;\nM([ 1 3 5... ]) = slice; \n```\n\n##### `Filename`: Char array.\nContains the name of the file being mapped. You can also get the\nmapped file with FILEPARTS.\n\n##### `FilenameCmplx`: Char array.\nContains the name of the file being mapped (complex part). 
You can also get the\nmapped file with FILEPARTS.\n\n## Additional Name/Value pair options at build only\n\n##### `TempDir`: Directory path\nDirectory where the mapped file(s) should be stored. The default path\nis e.g. TMPDIR or /tmp. You may also use /dev/shm on Linux systems\nto map the file into memory. This can be very efficient in terms of I/O, and\ncan be coupled with tensor compression with the `pack` method.\n\n##### `Like`: array\nThe dimensions and class of the specified array are used to preallocate a new\nobject. Note that sparse arrays are not supported.\n\n##### `Size`: [d1 d2 ...] array\nVector which specifies the size of the mapped array. This is the same as specifying dimensions as first arguments (see above).\n\nThe tensor properties may also be read or changed after the MappedTensor object\nhas been created, using the GET and SET methods. For example,\n```matlab\nset(M, 'Writable', true); % or M.Writable = true;\n```\nchanges the Writable property of M to true.\n\n## Loading from data files\n\nThe LOAD method lazily imports binary data sets with the syntax\n```matlab\nm = load(MappedTensor, 'filename');\n```\n\nwith the following data formats.\n\n| Extension         | Description               |\n|-------------------|---------------------------|\n| EDF               | ESRF Data Format  (2D)        |\n| POS               | Atom Probe Tomography  (4 columns)   |\n| NPY               | Python NumPy array  (nD)      |\n| MRC MAP CCP4 RES  | MRC/CCP4/MAP electron density map (3D) |\n| MAR               | MAR CCD image (2D)            |\n| IMG MCCD          | ADSC X-ray detector image SMV (2D) |\n\n## Using the array\n\nThe MappedTensor array can be used in most cases just as a normal Matlab\narray, as many class methods have been defined to match the usual behaviour.\n\nYou may access the array with indices as in `M(I,J,..)`. 
The full tensor content\nis retrieved with `M(:)` as a column, or `M(:,:)` as pages, and finally as \n`M.Data` to get the raw shaped array.\n\nMost standard Matlab operators just work transparently with MAPPEDTENSOR.\nYou may use single objects, and even arrays of tensors for vectorized\nprocessing, such as in:\n\n`\nm=MappedTensor(rand(100)); n=copyobj(m); p=2*[m n];\n`\n\nThese objects contain a reference to the actual data. After n=m, both variables\naccess the same data. To make a copy, use the COPYOBJ method.\n\nTransparent casting to other classes is supported in O(1) time. Note that\ndue to transparent casting and transparent O(1) scaling, rounding may\noccur in a different class to the returned data, and therefore may not\nmatch Matlab rounding precisely. If this is an issue, index the tensor\nand then scale the returned values rather than rely on O(1) scaling of\nthe entire tensor.\n\nTo work efficiently on very large arrays, it is recommended to employ the\nARRAYFUN method, which applies a function FUN along a given dimension. This\nis done transparently for many unary and binary operators (binary operators use ARRAYFUN2).\n\nThe NUMEL method returns 1 on a single object, and the number of elements\nin vectors of objects. To get the number of elements in a single object, \nuse NUMEL2(M) or PROD(SIZE(M)). This behaviour allows most methods to be \nvectorized on sequences of tensors.\n\nIf you need to handle many such tensors, it may be a good idea to compress them\nwith `pack(m)` while you are not using them. This can be done for instance right\nafter loading content. Decompression is performed transparently while you access\nthe tensor array. Consider re-compressing afterwards to save disk/memory. 
\nCompression is usually extremely efficient on data with low randomness.\n\nAn efficient processing pipeline could be:\n- load tensors \n- compress them with `pack`\n- do whatever you need (extraction is performed automatically)\n- recompress as soon as possible with `pack`\n\nA list of available methods is shown below.\n\n| Method | Description |\n|--------|-------------|\n| abs  |   Absolute value. (unary op) |\n| acos  |   Inverse cosine, result in radians. (unary op) |\n| acosh |   Inverse hyperbolic cosine. (unary op) |\n| addlistener |   Add listener for event. |\n| all  |  True if all elements of a tensor are nonzero. (unary op) |\n| and  |  \u0026 Logical AND. (binary op) |\n| any  |  True if any element of a tensor is nonzero. (unary op) |\n| arrayfun  |  Apply a function on the entire array, in slices. |\n| arrayfun2  |  Apply a function on two similar arrays, in slices. |\n| asin  |  Inverse sine, result in radians. (unary op) |\n| asinh  |  Inverse hyperbolic sine. (unary op) |\n| atan  |  Inverse tangent, result in radians. (unary op) |\n| atanh  |  Inverse hyperbolic tangent. (unary op) |\n| cast  |  Cast a variable to a different data type or class. |\n| ceil  |  Round towards plus infinity. (unary op) |\n| char  |  Convert tensor representation to character array (string). |\n| conj  |  Complex conjugate. (unary op) |\n| copyobj  |  Make deep copy of array. |\n| cos  |  Cosine of argument in radians. (unary op) |\n| cosh  |  Hyperbolic cosine. (unary op) |\n| ctranspose  |  ' Complex conjugate transpose. |\n| cumprod  |  Cumulative product of elements. (unary op) |\n| cumsum  |  Cumulative sum of elements. (unary op) |\n| del2  |  Discrete Laplacian. (unary op) |\n| delete  |  Delete the file, if a temporary file was created for this variable |\n| disp  |  Display array (long). |\n| display  |  Display array (short). |\n| double  |  Convert tensor representation to double precision (float64). 
|\n| end  |  Last index in an indexing expression |\n| eq  |  == Equal. (binary op) |\n| exp  |  Exponential. (unary op) |\n| fileparts  |  Return the files associated with the data |\n| find  |  Find indices of nonzero elements. (unary op) |\n| findobj  |  Find objects matching specified conditions. |\n| findprop  |  Find property of MATLAB handle object. |\n| floor  |  Round towards minus infinity. (unary op) |\n| fread  |  Read binary data from file. |\n| fwrite  |  Write binary data to file. |\n| ge  |  \u003e= Greater than or equal. (binary op) |\n| get  |  Get MATLAB object properties. |\n| getdisp  |  Specialized MATLAB object property display. |\n| gt  |  \u003e Greater than. (binary op) |\n| imag  |  Complex imaginary part. (unary op) |\n| int16  |  Convert tensor representation to signed 16-bit integer. |\n| int32  |  Convert tensor representation to signed 32-bit integer. |\n| int64  |  Convert tensor representation to signed 64-bit integer. |\n| int8  |  Convert tensor representation to signed 8-bit integer. |\n| ipermute  |  Inverse permute array dimensions. |\n| ischar  |  True for character array (string). |\n| isempty  |  True for empty array. |\n| isequal  |  True if arrays are numerically equal. (binary op) |\n| isfinite  |  True for finite elements. (unary op) |\n| isfloat  |  True for floating point arrays, both single and double. |\n| isinf  |  True for infinite elements. (unary op)  |\n| isinteger  |  True for arrays of integer data type. |\n| islogical  |  True for logical array. |\n| ismatrix  |  True if array is a matrix (not a scalar). |\n| isnan  |  True for Not-a-Number. (unary op) |\n| isnumeric  |  True for numeric arrays. |\n| isreal  |  True for real array. |\n| isscalar  |  True if array is a scalar. |\n| isvalid  |  Test handle validity. |\n| ldivide  |  .\\ Left array divide. (binary op) |\n| le  |  \u003c= Less than or equal. (binary op) |\n| length  |  Length of vector. |\n| load | Lazy loading from data files. 
|\n| loadobj  |  Load filter for objects. |\n| log  |  Natural logarithm. (unary op) |\n| log10  |  Common (base 10) logarithm. (unary op) |\n| logical  |  Convert tensor representation to logical (true/false). |\n| lt  |  \u003c Less than. (binary op) |\n| max  |  Largest component. |\n| mean  |  Average or mean value. (unary op) |\n| median  |  Median value. (unary op) |\n| min  |  Smallest component. |\n| minus  |  - Minus. (binary op) |\n| mldivide  |  \\ Backslash or left matrix divide. (binary op) |\n| mpower  |  ^ Matrix power. (binary op) |\n| mrdivide  |  / Slash or right matrix divide. (binary op) |\n| mtimes  |  * Matrix multiply. (binary op) |\n| ndims  |  Number of dimensions. |\n| ne  |  ~= Not equal. (binary op) |\n| nonzeros  |  Nonzero matrix elements. (unary op) |\n| norm  |  Matrix or tensor norm. (unary op) |\n| not  |  ~ Logical NOT. (unary op) |\n| notify  |  Notify listeners of event. |\n| numel  |  Number of objects in a vector. Use `prod(size(M))` or `numel2` for number of elements in an object. |\n| numel2  |  Number of elements in an array, same as `prod(size(M))` |\n| or  |  \\| Logical OR. (binary op) |\n| pack | Compress mapped data files |\n| permute  |  Permute array dimensions |\n| plot  |  Plot an array. |\n| plus  |  + Plus. (binary op) |\n| power  |  .^ Array power. (binary op) |\n| prod  |  Product of elements. (unary op) |\n| rdivide  |  ./ Right array divide. (binary op) |\n| real  |  Real part. (unary op) |\n| reducevolume  |  Reduce an array size |\n| reshape  |  Reshape array. |\n| round  |  Round towards nearest integer. (unary op) |\n| runtest  |  Run a set of tests on object methods |\n| saveobj  |  Save filter for objects. |\n| set  |  Set MATLAB object property values. |\n| setdisp  |  Specialized MATLAB object property display. |\n| sign  |  Signum function. (unary op) |\n| sin  |  Sine of argument in radians. (unary op) |\n| single  |  Convert tensor representation to single precision (float32). 
|\n| sinh  |  Hyperbolic sine. (unary op) |\n| size  |  Get original tensor size, and extend dimensions if necessary |\n| sqrt  |  Square root. (unary op) |\n| subsasgn  |  Subscripted assignment |\n| subsref  |  Subscripted reference. |\n| sum  |  Sum of elements. |\n| tan  |  Tangent of argument in radians. (unary op) |\n| tanh  |  Hyperbolic tangent. (unary op) |\n| times  |  .* Array multiply. (binary op) |\n| transpose  |  .' Transpose. |\n| uint16  |  Convert tensor representation to unsigned 16-bit integer. |\n| uint32  |  Convert tensor representation to unsigned 32-bit integer. |\n| uint64  |  Convert tensor representation to unsigned 64-bit integer. |\n| uint8  |  Convert tensor representation to unsigned 8-bit integer. |\n| uminus  |  - Unary minus. (unary op) |\n| unpack | Decompress mapped data files |\n| uplus  |  + Unary plus (copyobj). |\n| var  |  Variance. (unary op) |\n| version  |  Return class version |\n| xor  |  Logical EXCLUSIVE OR. (binary op) |\n\n\n## Examples\n\n```matlab\n   % To create a mapped file for a given input array:\n   % A temporary file is created to hold the data.\n   m = MappedTensor(rand(100,100,100));\n\n   % To reuse a previously existing mapped file:\n   m = MappedTensor('records.dat', [100 100 100], ...\n         'format','double', 'writable', true);\n   m(:) = rand(100, 100, 100);  % assign new data\n   m(1:2:end) = 0;\n```\n\n## Publications\n\nThis work was published in [Frontiers in Neuroinformatics][4]: DR Muir and BM Kampa. 2015. [_FocusStack and StimServer:\nA new open source MATLAB toolchain for visual stimulation and analysis of two-photon calcium neuronal imaging data_][5],\n**Frontiers in Neuroinformatics** 8 _85_. DOI: [10.3389/fninf.2014.00085](http://dx.doi.org/10.3389/fninf.2014.00085).\nPlease cite our publication in lieu of thanks, if you use this code.\n\nThis version of the code has been heavily revamped by\n\u003cemmanuel.farhi@synchrotron-soleil.fr\u003e. 
Please cite the following\npublication:\n- E. Farhi et al., J. Neut. Res., 17 (2013) 5. DOI: 10.3233/JNR-130001\n\n[1]: http://www.mathworks.com\n[2]: http://www.mathworks.com/help/techdoc/ref/memmapfile.html\n[3]: https://github.com/farhi/MappedTensor/releases/latest\n[4]: http://www.frontiersin.org/neuroinformatics\n[5]: http://dx.doi.org/10.3389/fninf.2014.00085\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdylanmuir%2Fmappedtensor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdylanmuir%2Fmappedtensor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdylanmuir%2Fmappedtensor/lists"}