{"id":26999838,"url":"https://github.com/tofuproject/datastock","last_synced_at":"2025-04-04T03:18:04.232Z","repository":{"id":37790762,"uuid":"463159175","full_name":"ToFuProject/datastock","owner":"ToFuProject","description":"Provides a generic DataStock class, useful for containing classes and multiple data arrays, with interactive plots","archived":false,"fork":false,"pushed_at":"2025-03-31T05:00:23.000Z","size":1074,"stargazers_count":3,"open_issues_count":17,"forks_count":0,"subscribers_count":3,"default_branch":"devel","last_synced_at":"2025-03-31T05:45:12.609Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ToFuProject.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-24T13:30:42.000Z","updated_at":"2025-03-31T04:58:45.000Z","dependencies_parsed_at":"2023-11-14T02:40:40.695Z","dependency_job_id":"d66c6a81-f1ac-402f-9298-4b6055fa5847","html_url":"https://github.com/ToFuProject/datastock","commit_stats":{"total_commits":446,"total_committers":6,"mean_commits":74.33333333333333,"dds":0.6883408071748879,"last_synced_commit":"a7c8bf415f9d04b2481d7b1851ae0099ed3a53fa"},"previous_names":[],"tags_count":48,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToFuProject%2Fdatastock","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToFuProject%2Fdatastock/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToFuProject%2Fdatast
ock/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToFuProject%2Fdatastock/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ToFuProject","download_url":"https://codeload.github.com/ToFuProject/datastock/tar.gz/refs/heads/devel","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247112756,"owners_count":20885606,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-04T03:18:03.522Z","updated_at":"2025-04-04T03:18:04.221Z","avatar_url":"https://github.com/ToFuProject.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Conda]( https://anaconda.org/conda-forge/datastock/badges/version.svg)](https://anaconda.org/conda-forge/datastock)\n[![](https://anaconda.org/conda-forge/datastock/badges/downloads.svg)](https://anaconda.org/conda-forge/datastock)\n[![](https://anaconda.org/conda-forge/datastock/badges/latest_release_date.svg)](https://anaconda.org/conda-forge/datastock)\n[![](https://anaconda.org/conda-forge/datastock/badges/platforms.svg)](https://anaconda.org/conda-forge/datastock)\n[![](https://anaconda.org/conda-forge/datastock/badges/license.svg)](https://github.com/conda-forge/datastock/blob/master/LICENSE.txt)\n[![](https://anaconda.org/conda-forge/datastock/badges/installer/conda.svg)](https://anaconda.org/conda-forge/datastock)\n[![](https://badge.fury.io/py/datastock.svg)](https://badge.fury.io/py/datastock)\n\n\n\ndatastock\n=========\n\nProvides a generic class for storing multiple heterogeneous numpy 
arrays with non-uniform shapes and built-in interactive visualization routines.\nAlso stores the relationships between arrays (e.g. matching dimensions).\nAlso provides an elegant way of storing objects of various categories that depend on the stored arrays.\n\n\nThe full power of datastock is unveiled when using the DataStock class and sub-classing it for your own use.\n\nA simpler and more straightforward use is also possible if you are just looking for a ready-to-use interactive visualization tool for 1d, 2d and 3d numpy arrays, through a shortcut function.\n\n\nInstallation:\n-------------\n\ndatastock is available on PyPI and anaconda.org\n\n```\npip install datastock\n```\n\n```\nconda install -c conda-forge datastock\n```\n\nExamples:\n=========\n \n\nStraightforward array visualization:\n------------------------------------\n\n```python\nimport numpy as np\nimport datastock as ds\n\n# any 1d, 2d or 3d array\naa = np.random.random((100, 100, 100))\n\n# plot interactive figure using shortcut to method\ndax = ds.plot_as_array(aa)\n```\n\nNow do **shift + left click** on any axes; the rest of the interactive commands are automatically printed in your Python console.\n\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"https://github.com/ToFuProject/datastock/blob/devel/README_figures/DirectVisualization_3d.png\" width=\"600\" alt=\"Direct 3d array visualization\"/\u003e\n\u003c/p\u003e\n\n\nThe DataStock class:\n--------------------\n\nYou will want to instantiate the DataStock class (which is the core of datastock) if:\n* You have many numpy arrays, not just one, especially if they do not have the same shape\n* You want to define a variety of objects from these data arrays (DataStock can be seen as a class storing many sub-classes)\n\n\nDataStock has 3 main dict attributes:\n* `dref`: to store the size of each dimension, each under a unique key\n* `ddata`: to store all numpy arrays, each under a unique key\n* `dobj`: to store any number of arbitrary sub-dicts, each containing a category of 
object\n\nThanks to dref, the class knows the relationships between all numpy arrays.\nIn particular, it knows which arrays share the same references / dimensions.\n\n\n```python\nimport numpy as np\nimport datastock as ds\n\n# -----------\n# Define data\n# Here: time-varying profiles representing velocity measurements across the radius of a tube\n# we assume 5 measurement campaigns were conducted, each yielding a different number of measurements, all sampled on 80 radial points\n\nnc = 5\nnx = 80\nlnt = [100, 90, 80, 120, 110]\n\nx = np.linspace(1, 2, nx)\nlt = [np.linspace(0, 10, nt) for nt in lnt]\nlprof = [(1 + np.cos(t)[:, None]) * x[None, :] for t in lt]\n\n# ------------------\n# Populate DataStock\n\n# instantiate\ncoll = ds.DataStock()\n\n# add references (i.e.: store size of each dimension under a unique key)\ncoll.add_ref(key='nc', size=nc)\ncoll.add_ref(key='nx', size=nx)\nfor ii, nt in enumerate(lnt):\n    coll.add_ref(key=f'nt{ii}', size=nt)\n\n# add data depending on these references\n# you can, optionally, specify units, physical dimensionality (ex: distance, time...), quantity (ex: radius, height, ...) 
and name (to your liking)\n\ncoll.add_data(key='x', data=x, dimension='distance', quant='radius', units='m', ref='nx')\nfor ii, nt in enumerate(lnt):\n    coll.add_data(key=f't{ii}', data=lt[ii], dimension='time', units='s', ref=f'nt{ii}')\n    coll.add_data(key=f'prof{ii}', data=lprof[ii], dimension='velocity', units='m/s', ref=(f'nt{ii}', 'x'))\n\n# print in the console the content of coll\ncoll\n```\n\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"https://github.com/ToFuProject/datastock/blob/devel/README_figures/DataStock_refdata.png\" width=\"600\" alt=\"Direct 3d array visualization\"/\u003e\n\u003c/p\u003e\n\nYou can see that DataStock stores the relationships between each array and each reference.\nExplicitly specifying the references is only necessary if there is an ambiguity (i.e. several references have the same size, like nx and nt2 in our case).\n\n\n```python\n# plot any array interactively\ndax = coll.plot_as_array('x')\ndax = coll.plot_as_array('t0')\ndax = coll.plot_as_array('prof0')\ndax = coll.plot_as_array('prof0', keyX='t0', keyY='x', aspect='auto')\n```\n\nYou can then decide to store any object category.\nLet's create a 'campaign' category to store the characteristics of each measurement campaign,\nand let's add a 'campaign' parameter to each profile data array.\n\n```python\n# add arbitrary object category as sub-dict of self.dobj\nfor ii in range(nc):\n    coll.add_obj(\n        which='campaign',\n        key=f'c{ii}',\n        start_date=f'{ii}.04.2022',\n        end_date=f'{ii+5}.05.2022',\n        operator='Barnaby' if ii \u003e 2 else 'Jack Sparrow',\n        comment='leak on tube' if ii == 1 else 'none',\n        index=ii,\n    )\n\n# create new 'campaign' parameter for data arrays\ncoll.add_param('campaign', which='data')\n\n# tag each data with its campaign\nfor ii in range(nc):\n    coll.set_param(which='data', key=f't{ii}', param='campaign', value=f'c{ii}')\n    coll.set_param(which='data', key=f'prof{ii}', param='campaign', 
value=f'c{ii}')\n\n# print in the console the content of coll\ncoll\n```\n\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"https://github.com/ToFuProject/datastock/blob/devel/README_figures/DataStock_Obj.png\" width=\"600\" alt=\"Direct 3d array visualization\"/\u003e\n\u003c/p\u003e\n\nDataStock also provides a built-in object selection method that returns all\nobjects matching a criterion, as int indices, bool indices or keys.\n\n```\nIn [9]: coll.select(which='campaign', index=2, returnas=int)\nOut[9]: array([2])\n\n# list of 2 =\u003e return all matches inside the interval\nIn [10]: coll.select(which='campaign', index=[2, 4], returnas=int)\nOut[10]: array([2, 3, 4])\n\n# tuple of 2 =\u003e return all matches outside the interval\nIn [11]: coll.select(which='campaign', index=(2, 4), returnas=int)\nOut[11]: array([0, 1])\n\n# return as keys\nIn [12]: coll.select(which='campaign', index=(2, 4), returnas=str)\nOut[12]: array(['c0', 'c1'], dtype='\u003cU2')\n\n# return as bool indices\nIn [13]: coll.select(which='campaign', index=(2, 4), returnas=bool)\nOut[13]: array([ True,  True, False, False, False])\n\n# You can combine as many constraints as needed\nIn [17]: coll.select(which='campaign', index=[2, 4], operator='Barnaby', returnas=str)\nOut[17]: array(['c3', 'c4'], dtype='\u003cU2')\n```\n\nYou can also decide to sub-class DataStock to implement methods and visualizations specific to your needs.\n\n\nOther useful built-in methods:\n------------------------------\n\nDataStock provides built-in methods like:\n* `get_nbytes()`: return a tuple (size, dsize) where:\n    - size is the total size of all data stored in the instance in bytes\n    - dsize is a dict with the detail (size for each item in each sub-dict of the instance)\n* `save()`: will save the instance\n* `coll.load()`: will load a saved 
instance\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftofuproject%2Fdatastock","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftofuproject%2Fdatastock","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftofuproject%2Fdatastock/lists"}