{"id":26436174,"url":"https://github.com/anderson101866/cualgo","last_synced_at":"2025-07-14T08:07:30.217Z","repository":{"id":199484595,"uuid":"702647703","full_name":"anderson101866/cualgo","owner":"anderson101866","description":"A cross-platform Pytnon library for fundamental algorithm with GPU-accelerated computing","archived":false,"fork":false,"pushed_at":"2023-12-14T04:31:08.000Z","size":1581,"stargazers_count":26,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-06-23T06:04:38.666Z","etag":null,"topics":["algorithm","cuda","gpu","gpu-acceleration","gpu-computing","numpy","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anderson101866.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-10-09T18:05:19.000Z","updated_at":"2025-03-17T08:00:01.000Z","dependencies_parsed_at":"2023-12-20T13:24:40.664Z","dependency_job_id":"9d0e3baf-994d-473c-82eb-57f581bc2373","html_url":"https://github.com/anderson101866/cualgo","commit_stats":{"total_commits":32,"total_committers":1,"mean_commits":32.0,"dds":0.0,"last_synced_commit":"0c64ecd0f7942d86d5737288cdf67b54205958f0"},"previous_names":["anderson101866/cualgo"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/anderson101866/cualgo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anderson101866%2Fcualgo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anderson101866%2Fcualgo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/anderson101866%2Fcualgo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anderson101866%2Fcualgo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anderson101866","download_url":"https://codeload.github.com/anderson101866/cualgo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anderson101866%2Fcualgo/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265262545,"owners_count":23736411,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","cuda","gpu","gpu-acceleration","gpu-computing","numpy","python"],"created_at":"2025-03-18T08:15:21.131Z","updated_at":"2025-07-14T08:07:30.109Z","avatar_url":"https://github.com/anderson101866.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CuAlgo\n\nCuAlgo is a Python library benefiting from GPU-accelerated computing, featuring a collection of fundamental algorithms implemented with CUDA. 
Currently, it includes the Floyd-Warshall algorithm for graph analysis, showcasing the potential of GPU acceleration.\n\n[![PyPI package](https://repology.org/badge/version-for-repo/pypi/python:cualgo.svg?header=lastest%20version)](https://repology.org/project/python:cualgo/versions) [![PyPI - Version](https://img.shields.io/pypi/v/cualgo)](https://pypi.org/project/cualgo/) [![Python Versions](https://img.shields.io/pypi/pyversions/cualgo.svg)](https://pypi.org/project/cualgo/) [![CuAlgo build](https://github.com/anderson101866/cualgo/actions/workflows/python-app.yml/badge.svg)](https://github.com/anderson101866/cualgo/actions/workflows/python-app.yml)\n\n## Key Features\n#### Graph Algorithms: \n - Floyd-Warshall algorithm\n\n## Why CuAlgo?\n\n- **Significant Speedup**: Experience substantial performance gains with CuAlgo's GPU-accelerated algorithms compared to their CPU counterparts.\n- **User-Friendly Python Interface**: CuAlgo provides a convenient interface for Python users. It is compatible with **NumPy**, allowing easy data interchange with existing scientific computing workflows, so Python developers can leverage GPU acceleration without delving into CUDA programming details.\n- **Cross-Platform Compatibility**: Built with CMake, CuAlgo can be compiled on various operating systems.\n\n## Performance Evaluation\nThe Floyd-Warshall implementation is evaluated on datasets of sizes N=40, N=1000, and N=2000. 
This section presents the efficiency improvements achieved through GPU acceleration.\n\n### Methodology\n- **CPU Version**: The algorithm is executed on the CPU without GPU acceleration.\n- **CPU (12 threads) Version**: Runs on the CPU with 12 threads using OpenMP.\n- **GPU (Unoptimized) Version**: Initial GPU implementation with parallelism similar to that of the GPU (Optimized) Version below.\n- **GPU (Optimized) Version**: GPU implementation with optimizations, including loop/block unrolling, dynamic parallelism, and coalesced memory access, to make efficient use of GPU resources.\n\n\u003cimg src=\"https://github.com/anderson101866/cualgo/assets/15830675/9d6d4b2e-d4fa-4db1-9a52-fd3d42d325cc\" width=\"600\"\u003e\n\u003cimg src=\"https://github.com/anderson101866/cualgo/assets/15830675/4e3a0fd1-ff81-4d92-9531-b06c1483a9d0\" width=\"600\"\u003e\n\nThe charts illustrate the speedup achieved by CuAlgo's GPU-accelerated algorithms over CPU-based implementations. Notably, the optimized GPU version outperforms both the unoptimized GPU and CPU versions when N grows large, emphasizing the impact of optimization on algorithm efficiency.\n\n#### Hardware and Software Information:\n| \u003c!--  --\u003e | \u003c!--                            --\u003e |\n|-----------|-------------------------------------|\n| CPU       | AMD Ryzen 9 5900X 12-Core Processor |\n| GPU       | NVIDIA GeForce RTX 3060 Ti - 8GB    |\n| RAM       | 32GB DDR4 3600 MHz                  |\n| CUDA Toolkit Version | 12.2                     |\n| GPU Driver Version   | 537.13                   |\n\n\n\n## Prerequisites\n(On Linux, you also need the GCC compiler with C++ support[^GCC_ONLY] and GNU make.)\n1. Latest [NVIDIA GPU driver](https://www.nvidia.com.tw/Download/index.aspx)\n2. *Python 3.7+ with pip available*\n3. *Latest CUDA toolkit installed with nvcc compiler. [(download here)](https://developer.nvidia.com/cuda-downloads)*\n\n**NOTE: [Recommended]** You can skip steps 2 and 3 
by using [conda](https://repo.anaconda.com/archive/); see [Installation](#Installation) below.\n\n## Installation\n### Linux / Windows [Recommended]:\n```bash\nconda install cuda -c nvidia\npython -m pip install --upgrade pip setuptools\npip install cualgo\n```\n### Windows (without conda):\n1. Install the latest NVIDIA GPU driver yourself\n2. `python -m pip install --upgrade pip setuptools \u0026\u0026 pip install cualgo`\n\n\n## Sample Code\n\nNumPy arrays are supported:\n```python\nfrom cualgo import graph as cg\nimport numpy as np\n# Adjacency matrix: graph[i][j] is the weight of the edge from i to j; np.inf means no edge\ngraph = np.array([\n    [0     , 7     , np.inf, 8],\n    [np.inf, 0     , 5     , np.inf],\n    [np.inf, np.inf, 0     , 2],\n    [np.inf, np.inf, np.inf, 0]\n], dtype=np.float64)\n# Print the all-pairs shortest-path distances\nprint(cg.floydwarshall(graph))\n# [[0.0, 7.0, 12.0, 8.0], [inf, 0.0, 5.0, 7.0], [inf, inf, 0.0, 2.0], [inf, inf, inf, 0.0]]\n```\n\nOr simply pass a 2D Python `list`:\n```python\nfrom cualgo import graph as cg\n# Large sentinel value standing in for a missing edge\nINF = 9999\ngraph = [\n    [0  , 7  , INF, 8],\n    [INF, 0  , 5  , INF],\n    [INF, INF, 0  , 2],\n    [INF, INF, INF, 0]\n]\nprint(cg.floydwarshall(graph))\n# [[0, 7, 12, 8], [9999, 0, 5, 7], [9999, 9999, 0, 2], [9999, 9999, 9999, 0]]\n```\n\n[^GCC_ONLY]: GCC is more compatible with CUDA's compiler than clang.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanderson101866%2Fcualgo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanderson101866%2Fcualgo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanderson101866%2Fcualgo/lists"}