https://github.com/hyoklee/h5p

HDF5 for Parallel
mpi parallel-computing

# h5p

## Goals

* Rewrite the HDF5 data model and file format for parallel processing on distributed systems.
* Optimize the HDF5 library & tools for parallel processing on distributed systems.
* Improve security and reliability for parallel processing on distributed systems.

## Problems

The existing Parallel HDF5 library / MPI-IO has some issues:

* Can't build.
* Can't test.
* Can't scale.

## Background

PnetCDF can't create NetCDF-4/HDF5 files, only NetCDF-3 files.
To get HDF5 output, you need a NetCDF-3 to NetCDF-4 conversion tool.

Parquet works well on distributed systems.
To convert Parquet to HDF5, you need to use Pandas.

## Solutions

Hide MPI/Dask/Spark calls behind a single API, so the backend is selected with one call.

```c
#include

h5p_use("mpi"); /* replace mpi with dask or spark */

/* Write a dataset to a file addressed by URL. */
H5P_FILE* fp = h5p_open("s3://test.h5p", "w");
h5p_write(fp, "/g/d", data);
h5p_close(fp);

/* Reopen the same file for reading; fp is reused, not redeclared. */
fp = h5p_open("s3://test.h5p", "r");
data = h5p_read(fp, "/g/d");
h5p_close(fp);
```
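
One way the backend selection above could work internally is a table of function pointers keyed by backend name. The following is only a minimal sketch under stated assumptions, not the project's implementation: the `h5p_backend` struct, the stub writers, `h5p_write_values`, `h5p_backend_name`, and the `int` return value of `h5p_use` are all hypothetical and exist only for illustration. The stubs print what they would do instead of calling MPI, Dask, or Spark.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical backend interface: each runtime supplies its own writer. */
typedef struct {
    const char *name;
    int (*write)(const char *path, const double *buf, size_t n);
} h5p_backend;

/* Stub writers. Real ones would call MPI-IO, Dask, or Spark here. */
static int write_mpi(const char *path, const double *buf, size_t n) {
    (void)buf;
    printf("[mpi] would write %zu values to %s\n", n, path);
    return (int)n;
}

static int write_dask(const char *path, const double *buf, size_t n) {
    (void)buf;
    printf("[dask] would write %zu values to %s\n", n, path);
    return (int)n;
}

static int write_spark(const char *path, const double *buf, size_t n) {
    (void)buf;
    printf("[spark] would write %zu values to %s\n", n, path);
    return (int)n;
}

/* Registry of known backends (assumed set, taken from the names above). */
static const h5p_backend backends[] = {
    {"mpi", write_mpi},
    {"dask", write_dask},
    {"spark", write_spark},
};

/* Currently selected backend; NULL until h5p_use() succeeds. */
static const h5p_backend *current = NULL;

/* Select a backend by name. Returns 0 on success, -1 if the name is
   unknown, in which case the previous selection is left unchanged. */
int h5p_use(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++) {
        if (strcmp(backends[i].name, name) == 0) {
            current = &backends[i];
            return 0;
        }
    }
    return -1;
}

/* Name of the active backend, or NULL if none has been selected. */
const char *h5p_backend_name(void) {
    return current ? current->name : NULL;
}

/* Forward a write to the active backend. Returns the number of values
   handed to the backend, or -1 if no backend has been selected. */
int h5p_write_values(const char *path, const double *buf, size_t n) {
    if (current == NULL) {
        return -1;
    }
    return current->write(path, buf, n);
}
```

With this layout, calling `h5p_use("dask")` before `h5p_write_values("/g/d", data, 3)` routes the write to the Dask stub, and the calling code never references a runtime-specific function. Adding a backend means adding one row to the table.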

## Experiments

* bin/h.bat: test script for Intel oneAPI on Windows
* bin/d.bat: debugging script for Intel oneAPI on Windows