{"id":20433751,"url":"https://github.com/intelpython/sharded-array-for-python","last_synced_at":"2025-06-29T04:38:32.817Z","repository":{"id":232610882,"uuid":"784730144","full_name":"IntelPython/sharded-array-for-python","owner":"IntelPython","description":null,"archived":false,"fork":false,"pushed_at":"2025-04-04T14:32:22.000Z","size":870,"stargazers_count":2,"open_issues_count":4,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-12T21:06:17.667Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IntelPython.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-10T12:49:41.000Z","updated_at":"2025-01-23T16:58:17.000Z","dependencies_parsed_at":"2024-07-12T15:14:07.532Z","dependency_job_id":"04943ec5-9466-4199-bbe9-aaa0062ec9f6","html_url":"https://github.com/IntelPython/sharded-array-for-python","commit_stats":null,"previous_names":["intelpython/sharded-array-for-python"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsharded-array-for-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsharded-array-for-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsharded-array-for-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelPython%2Fsharded-arra
y-for-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IntelPython","download_url":"https://codeload.github.com/IntelPython/sharded-array-for-python/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248631685,"owners_count":21136562,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-15T08:20:52.219Z","updated_at":"2025-04-12T21:06:23.196Z","avatar_url":"https://github.com/IntelPython.png","language":"C++","readme":"[![CI](https://github.com/IntelPython/sharded-array-for-python/actions/workflows/ci.yml/badge.svg?event=push)](https://github.com/IntelPython/sharded-array-for-python/actions/workflows/ci.yml)\n\n***This software package is not ready for production use and is merely a proof-of-concept implementation.***\n\n# Sharded Array For Python\n\nAn array implementation following the [array API as defined by the data-API consortium](https://data-apis.org/array-api/latest/index.html).\nParallel and distributed execution is currently MPI/CSP-like. Support for a controller-worker execution model will be added in a later version.\n\n## Setting up build environment\n\nInstall MLIR/LLVM and Intel® Extension for MLIR (IMEX, see https://github.com/intel/mlir-extensions). 
Make sure you use `-DLLVM_ENABLE_RTTI=ON` when configuring LLVM and use build target `all`.\n\n```bash\ngit clone --recurse-submodules https://github.com/IntelPython/sharded-array-for-python\ncd sharded-array-for-python\nconda create --file conda-env.txt --name sharpy\nconda activate sharpy\nexport MPIROOT=$CONDA_PREFIX\nexport MLIRROOT=\u003cyour-MLIR-install-dir\u003e\nexport IMEXROOT=\u003cyour-IMEX-install-dir\u003e\n```\n\n## Building Sharded Array For Python\n\n```bash\npython -m pip install .\n```\n\nIf your compiler does not default to a recent (e.g. g++ \u003e= 9) version, try something like `CC=gcc-9 CXX=g++-9 python setup.py develop`\n\n## Running Tests\n\n```bash\n# single rank\npytest test\n# distributed on multiple ($N) ranks/processes\nmpirun -n $N python -m pytest test\n```\n\n## Running\n\n```python\nimport sharpy as sp\nsp.init(False)\na = sp.arange(0, 10, 1, sp.int64)\n#print(a)       # should trigger compilation\nb = sp.arange(0, 100, 10, sp.int64)\n#print(b.dtype) # should _not_ trigger compilation\nc = a * b\n#print(c)\nd = sp.sum(c, [0])\n#del b          # generated function should _not_ return b\nprint(a, c, d) # printing of c (not a!) 
should trigger compilation\nsp.fini()\n```\n\nAssuming the above is in a file `simple.py`, a single-process run is executed like\n\n```bash\npython simple.py\n```\n\nand a multi-process run is executed like\n\n```bash\nmpirun -n 5 python simple.py\n```\n\n### Distributed Execution without mpirun\n\nInstead of using mpirun to launch a set of ranks/processes, you can tell the runtime to\nspawn ranks/processes for you by setting SHARPY_MPI_SPAWN to the number of desired MPI processes.\nAdditionally, set SHARPY_MPI_EXECUTABLE and SHARPY_MPI_EXE_ARGS.\nSHARPY_MPI_HOSTS can be used to control the hosts used for spawning processes.\n\nThe following command will run the stencil example on 3 MPI ranks:\n\n```bash\nSHARPY_FALLBACK=numpy \\\n  SHARPY_MPI_SPAWN=2 \\\n  SHARPY_MPI_EXECUTABLE=`which python` \\\n  SHARPY_MPI_EXE_ARGS=\"examples/stencil-2d.py 10 2000 star 2\" \\\n  python examples/stencil-2d.py 10 2000 star 2\n```\n\n## Contributing\n\nPlease set up pre-commit hooks like this:\n\n```bash\npre-commit install -f -c ./.pre-commit-config.yaml\npre-commit autoupdate\n```\n\n## Overview\n\n### Deferred Execution\n\nTypically, operations do not get executed immediately. Instead, the function only returns a transparent object (a future).\nThe actual computation is deferred by creating a promise/deferred object and queuing it for later. This is not visible to users; they can use sharpy like any other numpy-like library.\n\nComputation happens only when actual data is needed, that is when\n\n- the values of array elements are cast to bool, int, float or string\n- the array is printed (a special case of the above)\n\nIn the background, a worker thread handles deferred objects: it dequeues them from the FIFO queue and asks them to generate MLIR until computation is actually needed. Objects can either generate MLIR or provide a run() function for immediate execution. 
In the latter case, the current MLIR function is executed before run() is called to make sure potential dependencies are met.\n\n### Distribution\n\nArrays and operations on them are transparently distributed across multiple processes. The respective functionality is handled partly by this library and partly by the IMEX dist dialect.\nIMEX relies on a runtime library for complex communication tasks and for inspecting the runtime configuration, such as the number of processes and the process id (MPI rank).\nSharded Array For Python provides this library functionality in a separate dynamic library \"idtr\".\n\nRight now, data is split in the first dimension (only). Each process knows the partition it owns. For optimization, partitions can overlap.\n\nSharded Array For Python currently supports one execution mode: CSP/SPMD/explicitly-distributed execution, meaning all processes execute the same program; execution is replicated on all processes. Data is typically not replicated but distributed among processes. 
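\n\nAs a minimal sketch of this SPMD model (using the `sharpy.spmd` helpers described under Other Functionality below; this snippet is illustrative, not taken from the upstream examples, and exact signatures may differ):\n\n```python\nimport sharpy as sp\n\nsp.init(False)\na = sp.arange(0, 8, 1, sp.int64)  # distributed along the first dimension\nlocal = sp.spmd.get_locals(a)     # this rank's partition as a numpy array\nfull = sp.spmd.gather(a)          # single, contiguous numpy copy of all data\nsp.fini()\n```\n\nSince execution is SPMD, every rank executes these same calls.\n\n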
The distribution is handled automatically; all operations on Sharded Arrays For Python can be viewed as collective operations.\n\nLater, we'll add a Controller-Worker/implicitly-distributed execution mode, in which only a single process executes the program and distributes data and work to worker processes.\n\n### Array API Coverage\n\nCurrently, only a subset of the Array API is covered by Sharded Array For Python:\n\n- elementwise binary operations\n- elementwise unary operations\n- subviews (getitem with slices)\n- assignment (setitem with slices)\n- `empty`, `zeros`, `ones`, `linspace`, `arange`\n- reduction operations over all dimensions (max, min, sum, ...)\n- type promotion\n- many cases of shape broadcasting\n\n### Other Functionality\n\n- `sharpy.to_numpy` converts a sharded array into a numpy array.\n- `sharpy.numpy.from_function` allows creating a sharded array from a function (similar to numpy)\n- In addition to the Array API, Sharded Array For Python provides functionality for interacting with sharded arrays in a distributed environment.\n  - `sharpy.spmd.gather` gathers the distributed array and forms a single, local and contiguous copy of the data as a numpy array\n  - `sharpy.spmd.get_locals` returns the local part of the distributed array as a numpy array\n- sharpy allows providing a fallback array implementation. If SHARPY_FALLBACK is set to a Python package, sharpy will call that package whenever a given function is not provided, passing sharded arrays as (gathered) numpy arrays.\n\n## Environment variables\n\n### Compile time variables\n\nRequired to compile Sharded Array For Python:\n\n- `MLIRROOT`: Set path to MLIR install root.\n- `IMEXROOT`: Set path to Intel MLIR Extensions install root.\n\n### Optional runtime variables\n\n- `SHARPY_VERBOSE`: Set verbosity level. Accepted values 0-4, default 0.\n- `SHARPY_FALLBACK`: Python package to call in case a function is not found in `sharpy`. 
For example, setting `SHARPY_FALLBACK=numpy` means that calling `sharpy.linalg.norm` would call `numpy.linalg.norm` for the entire (gathered) array.\n- `SHARPY_PASSES`: Set MLIR pass pipeline. To see the current pipeline, run with `SHARPY_VERBOSE=1`.\n- `SHARPY_FORCE_DIST`: Force code generation in distributed mode even if executed on a single process.\n- `SHARPY_USE_CACHE`: Use in-memory JIT compile cache. Default 1.\n- `SHARPY_OPT_LEVEL`: Set MLIR JIT compiler optimization level. Accepted values 0-3, default 3.\n- `SHARPY_NO_ASYNC`: Do not use asynchronous MPI communication.\n- `SHARPY_SKIP_COMM`: Skip all MPI communications. For debugging purposes only; can lead to incorrect results.\n\nDevice support:\n\n- `SHARPY_DEVICE`: Set desired device, e.g. `\"cpu\"` or `\"gpu:0\"`. By default the CPU is used.\n- `SHARPY_GPUX_SO`: Force path to GPU driver library file.\n\nFor [spawning MPI processes](#distributed-execution-without-mpirun):\n\n- `SHARPY_MPI_SPAWN`: Number of MPI processes to spawn.\n- `SHARPY_MPI_EXECUTABLE`: The executable to spawn.\n- `SHARPY_MPI_EXE_ARGS`: Arguments to pass to the executable.\n- `SHARPY_MPI_HOSTS`: Comma-separated list of hosts for MPI.\n- `PYTHON_EXE`: Path to Python executable. Will be used if `SHARPY_MPI_EXECUTABLE` is undefined.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintelpython%2Fsharded-array-for-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fintelpython%2Fsharded-array-for-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintelpython%2Fsharded-array-for-python/lists"}