{"id":18422045,"url":"https://github.com/spcl/arrow-matrix","last_synced_at":"2025-04-07T14:32:18.861Z","repository":{"id":225316216,"uuid":"764570426","full_name":"spcl/arrow-matrix","owner":"spcl","description":"Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication","archived":false,"fork":false,"pushed_at":"2024-03-25T16:17:26.000Z","size":72,"stargazers_count":15,"open_issues_count":1,"forks_count":3,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-03-22T19:45:08.374Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spcl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-28T10:17:15.000Z","updated_at":"2024-12-11T06:45:47.000Z","dependencies_parsed_at":"2024-11-06T04:29:37.119Z","dependency_job_id":"3ae2800b-9ce9-4f19-86f1-d4e7f55922e1","html_url":"https://github.com/spcl/arrow-matrix","commit_stats":null,"previous_names":["spcl/arrow-matrix"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Farrow-matrix","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Farrow-matrix/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Farrow-matrix/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Farrow-matrix/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/spcl","download_url":"
https://codeload.github.com/spcl/arrow-matrix/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247670072,"owners_count":20976497,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T04:27:49.921Z","updated_at":"2025-04-07T14:32:18.537Z","avatar_url":"https://github.com/spcl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Arrow Matrix Decomposition - Fast SpMM for Tall-Skinny Matrices\n\nWe propose a novel approach to iterated sparse matrix dense matrix multiplication, a fundamental computational kernel in scientific computing and graph neural network training. In cases where matrix sizes exceed the memory of a single compute node, data transfer becomes a bottleneck. An approach based on dense matrix multiplication algorithms leads to sub-optimal scalability and fails to exploit the sparsity in the problem. To address these challenges, we propose decomposing the sparse matrix into a small number of highly structured matrices called arrow matrices, which are connected by permutations. Our approach enables communication-avoiding multiplications, achieving a polynomial reduction in communication volume per iteration for matrices corresponding to planar graphs and other minor-excluded families of graphs. 
Our evaluation demonstrates that our approach outperforms a state-of-the-art method for sparse matrix multiplication on matrices with hundreds of millions of rows, offering near-linear strong and weak scaling.\n\nThis project contains the code for the paper\n[Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication, Gianinazzi et al., PPoPP 2024](https://dl.acm.org/doi/10.1145/3627535.3638496)\n\n## Key Features\n\n**Scalable and Distributed Computing**: With support for mpi4py and Cray-MPICH, the module is designed for scalability, facilitating distributed computing across multiple nodes and GPUs.\n\n**Efficient SpMM Operations**: By integrating CSRMM kernels and leveraging GPU acceleration, our module offers highly efficient SpMM operations suitable for large-scale scientific computing tasks.\n\n**Advanced Decomposition Techniques**: The use of linear arrangement frameworks and pruning, coupled with the innovative decomposition algorithm, ensures optimal performance and resource utilization in SpMM operations.\n\n**Compatibility and Versatility**: The implementation's reliance on widely-used and well-supported libraries and frameworks ensures broad compatibility and application across various computing environments and use cases.\n\n## Installation\n\nThe package can be installed using pip:\n```\npip install -e .\n```\nTo enable GPU support, you additionally need to install [CuPy](https://docs.cupy.dev/en/stable/install.html).\n\nFor example:\n```commandline\npip install cupy-cuda11x\n```\nor:\n```commandline\npip install cupy-cuda12x\n```\n\n\nTo verify the installation, you can run the tests:\n```\ncd scripts\nchmod +x run_tests.sh\n./run_tests.sh\n```\n\n## Quick Start\n\nUsing the arrow matrix SpMM requires two steps:\n\n1) decompose the matrix\n2) perform the SpMM\n\nWe provide two implementations of step 1, one in Python and one in Julia.\nThe Python implementation may be called from the `arrow_decompose` commandline 
call.\n\nExample usage (.mat input):\n```commandline\narrow_decompose --dataset_dir ~/data --dataset_name graph1 graph2 --format 'matlab' --width 10000\n```\n\nExample usage (Matrix Market .mtx input):\n```commandline\narrow_decompose --dataset_dir ~/data --dataset_name graph1 graph2 --format 'mtx' --width 10000\n```\nOptions:\n* For a directed graph, pass `--directed True`.\n* To visualize the arrow matrices, pass `--visualize True`.\n* Pass `save_input_graph True` to save the input graph in order to speed up later invocations of the script.\n\nThe Julia implementation may be called from the `ArrowDecompositionMain.jl` script.\nIt is necessary to convert its output to the npy format using the `convert_to_csr.jl` script.\n\nTo multiply the decomposed matrix with random right-hand sides 10 times, you can use the `spmm_arrow` commandline call.\n```commandline\nmpiexec -n 8 spmm_arrow --path ./data/graph1_B --width 10000 --features 16 --device cpu --iterations 10\n```\n\nTo use your custom right-hand sides, you need to use the `ArrowDecompositionMPI` class directly, as defined in `arrow_dec_mpi.py`.\nTo see how to use that class, refer to `arrow_bench.py`.\n\n## Provided SpMM Implementations\n\n### Arrow Matrix\n\nThe arrow matrix decomposition-based kernel can be invoked via the spmm_arrow entry point.\nIt requires that the arrow matrices have been decomposed and are available in the specified directory.\n\nExample usage:\n```commandline\nmpiexec -n 8 ./scripts/spmm_arrow_main.py --path ./data/graph1_B --width 10000 --features 16 --device gpu --iterations 10\n```\n\n### 1.5D A-Stationary\n\nThe 1.5D A-Stationary-based kernel can be invoked via the spmm_15d entry point.\n\nExample usage:\n```commandline\nmpiexec -n 8 ./scripts/spmm_15d_main.py --dataset file --file /path/to/matrix.mat --iterations 10 --device gpu\n```\n\nTo run a benchmark with a random float32 sparse matrix having 100,000 vertices and 1,000,000 edges on a GPU:\n```commandline\nmpiexec -n 8 
./scripts/spmm_15d_main.py --vertices 100000 --edges 1000000 --device gpu\n```\n\n### Hypergraph-Partitioning-Based PETSc-style\n\nThe hypergraph partitioning-based kernel can be invoked via the spmm_petsc entry point.\n\nExample usage:\n```commandline\npython ./scripts/spmm_petsc_main.py --type float64 --file matrix.part.1.slice.2.npz --gpu-tiling True\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Farrow-matrix","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspcl%2Farrow-matrix","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Farrow-matrix/lists"}