https://github.com/frobnitzem/mpi_list
A package for working with lists distributed over MPI
data-science hpc map-reduce mpi4py
- Host: GitHub
- URL: https://github.com/frobnitzem/mpi_list
- Owner: frobnitzem
- License: mit
- Created: 2021-04-02T23:16:25.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2023-09-08T17:48:35.000Z (over 2 years ago)
- Last Synced: 2025-03-16T16:39:59.033Z (10 months ago)
- Topics: data-science, hpc, map-reduce, mpi4py
- Language: Python
- Homepage: https://mpi-list.readthedocs.io/en/latest/
- Size: 45.9 KB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.rst
- License: LICENSE.txt
- Authors: AUTHORS.rst
README
========
mpi list
========
This package implements the `DFM` class.
The `DFM` is a useful abstraction for working with
lists distributed over a set of MPI ranks.
The acronym stands for distributed free monoid,
which is just a fancy way to say it's a list.
If you're familiar with Spark, a `DFM` is like an RDD,
but it only holds a list.
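
To make the "distributed free monoid" idea concrete, here is a minimal
pure-Python sketch that uses neither MPI nor this package.
It assumes that a list is split into consecutive blocks, one per rank,
and shows that concatenating the blocks in rank order gives back the original list.
The helper `split_blocks` is hypothetical and is not part of `mpi_list`;
the package's actual splitting rule may differ.

.. code-block:: python

    from functools import reduce
    from operator import add

    def split_blocks(items, procs):
        """Split items into `procs` consecutive blocks (hypothetical helper).

        Block sizes differ by at most one; earlier blocks get the extras.
        """
        base, extra = divmod(len(items), procs)
        blocks, start = [], 0
        for rank in range(procs):
            size = base + (1 if rank < extra else 0)
            blocks.append(items[start:start + size])
            start += size
        return blocks

    data = list(range(10))
    blocks = split_blocks(data, 3)  # one block per simulated rank

    # The monoid operation is list concatenation, with [] as its identity,
    # so joining the blocks in rank order recovers the full list.
    rejoined = reduce(add, blocks, [])
    print(blocks)
    print(rejoined == data)
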
Quick Start
===========
.. code-block:: python

    from mpi_list import Context, DFM

    C = Context() # calls MPI_Init via mpi4py

    # After each of the three lines below:
    # 1. each rank now has 1000//C.procs consecutive numbers
    # 2. each rank now has a list of strings
    # 3. only numbers containing a '2' remain
    dfm = C . iterates(1000) \
            . map(lambda i: f"String {i}") \
            . filter(lambda s: '2' in s)

    if C.rank == 0:
        # Caution! Uncommenting this will deadlock your program.
        # Collective calls must be called by all ranks!
        #print( dfm . head(10) )
        pass

    # This is OK, since all ranks now have 'ans'
    ans = dfm.head(10)
    if C.rank == 0:
        print( ans )

    ans = dfm . filter(lambda s: len(s) <= len("String nn")) \
              . collect()
    if ans is not None: # only rank 0 gets "collect"
        print( ans )

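
If you want to check what the pipeline above should produce without
launching MPI, the following single-process sketch simulates it with
plain Python lists.
It is only an illustration: the number of ranks, the block layout, and
the `head` helper are assumptions, and none of this is how `DFM` is
implemented.

.. code-block:: python

    procs = 4   # hypothetical number of ranks
    n = 1000

    # Assumed layout: simulated rank r holds one consecutive block of range(n).
    blocks = [list(range(r * n // procs, (r + 1) * n // procs))
              for r in range(procs)]

    # map and filter are applied block by block; no block needs data
    # from another block to do this.
    blocks = [[f"String {i}" for i in b] for b in blocks]
    blocks = [[s for s in b if '2' in s] for b in blocks]

    def head(blocks, k):
        """Return the first k elements in rank order (hypothetical helper)."""
        out = []
        for b in blocks:
            out.extend(b[:k - len(out)])
            if len(out) >= k:
                break
        return out

    first = head(blocks, 10)

    # Equivalent of the final filter followed by collect:
    # keep only the short strings and concatenate every block.
    short = [s for b in blocks for s in b if len(s) <= len("String nn")]

    print(first)
    print(len(short))

Unlike this simulation, the real `head` and `collect` calls are collective,
so every rank has to call them.
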
Launch your program with `mpirun python my_prog.py`.
If you're using a supercomputer, consider installing
spindle,
and then use `spindle mpirun python my_prog.py`.
.. _pyscaffold-notes:
Note
====
This project has been set up using PyScaffold 4.0.1. For details and usage
information on PyScaffold see https://pyscaffold.org/.