https://github.com/jcmgray/xyzpy
Efficiently generate and analyse high dimensional data.
distributed multidimensional-arrays pandas parallel plot xarray
- Host: GitHub
- URL: https://github.com/jcmgray/xyzpy
- Owner: jcmgray
- License: mit
- Created: 2016-05-24T18:41:21.000Z (about 9 years ago)
- Default Branch: main
- Last Pushed: 2024-09-05T17:58:17.000Z (9 months ago)
- Last Synced: 2025-04-30T03:49:40.508Z (about 1 month ago)
- Topics: distributed, multidimensional-arrays, pandas, parallel, plot, xarray
- Language: Python
- Homepage: http://xyzpy.readthedocs.io
- Size: 23.6 MB
- Stars: 69
- Watchers: 3
- Forks: 10
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE

[`xyzpy`](https://github.com/jcmgray/xyzpy) is a Python library for efficiently generating, manipulating and plotting data with many dimensions, of the type that often occurs in numerical simulations. It is built entirely on top of the labelled N-dimensional array library [`xarray`](http://xarray.pydata.org). The project's documentation is hosted on [readthedocs](http://xyzpy.readthedocs.io).
The aim is to take the pain and errors out of generating and exploring data with a high number of possible parameters. This means:
- you don't have to write super nested for loops
- you don't have to remember which arrays/dimensions belong to which variables/parameters
- you don't have to parallelize over or distribute runs yourself
- you don't have to worry about loading, saving and merging disjoint data
- you don't have to guess when a set of runs is going to finish
- you don't have to write batch submission scripts or leave the notebook to use SGE, PBS or SLURM

As well as the ability to automatically parallelize over runs, ``xyzpy``
provides the ``Crop`` object, which allows runs and results to be written to disk.
These can then be run by any process with access to the files - e.g. a batch system
such as SGE, PBS or SLURM - or just serve as a convenient persistent progress mechanism.

Once your data has been aggregated into an ``xarray.Dataset`` or ``pandas.DataFrame``,
there exist many powerful visualization tools such as
[`seaborn`](https://seaborn.pydata.org), [`altair`](https://altair-viz.github.io) and
[`holoviews`](https://holoviews.org) / [`hvplot`](https://hvplot.holoviz.org).
To these, ``xyzpy`` also adds a simple 'oneliner' interface for interactively plotting the data
using [`bokeh`](https://bokeh.pydata.org), or for static, publication-ready figures
using [`matplotlib`](https://matplotlib.org), whilst being able to see the dependence on
up to 4 dimensions at once.
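
To make the first two points in the list above concrete, here is a minimal standard-library sketch of sweeping a function over every combination of named parameters without nested loops, keeping each result labelled by the parameters that produced it. This is not the `xyzpy` API: `simulate`, `combos` and `run_all` are hypothetical names used only for illustration, and `xyzpy` itself returns labelled `xarray` objects rather than a plain dictionary.

```python
from itertools import product


def simulate(a, b, c):
    # Hypothetical stand-in for an expensive numerical simulation.
    return a * b + c


# Named parameter values to sweep over (illustrative values only).
combos = {"a": [1, 2], "b": [10, 20], "c": [0, 5]}


def run_all(fn, combos):
    """Evaluate fn on every parameter combination, keyed by labelled parameters."""
    names = list(combos)
    results = {}
    # product() replaces one nested for loop per parameter.
    for values in product(*(combos[name] for name in names)):
        params = dict(zip(names, values))
        # Key each result by its (name, value) pairs, so there is no need
        # to remember which positional axis belongs to which parameter.
        results[tuple(params.items())] = fn(**params)
    return results


results = run_all(simulate, combos)
print(len(results))
print(results[(("a", 2), ("b", 10), ("c", 5))])
```

Adding another parameter here only means adding another entry to `combos`; the loop itself does not change.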
Please see the [docs](http://xyzpy.readthedocs.io) for more information.