{"id":13995433,"url":"https://github.com/rxwei/cuda-swift","last_synced_at":"2025-04-29T18:31:47.276Z","repository":{"id":43618297,"uuid":"70857440","full_name":"rxwei/cuda-swift","owner":"rxwei","description":"Parallel Computing Library for Linux and macOS \u0026 NVIDIA CUDA Wrapper","archived":false,"fork":false,"pushed_at":"2017-03-27T21:11:07.000Z","size":323,"stargazers_count":82,"open_issues_count":1,"forks_count":8,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-05T18:54:02.117Z","etag":null,"topics":["cublas","cuda","gpu","parallel","swift"],"latest_commit_sha":null,"homepage":"","language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rxwei.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-10-13T23:53:36.000Z","updated_at":"2025-01-01T23:10:06.000Z","dependencies_parsed_at":"2022-11-29T13:20:14.511Z","dependency_job_id":null,"html_url":"https://github.com/rxwei/cuda-swift","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rxwei%2Fcuda-swift","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rxwei%2Fcuda-swift/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rxwei%2Fcuda-swift/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rxwei%2Fcuda-swift/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rxwei","download_url":"https://codeload.github.com/rxwei/cuda-swift/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","ki
nd":"github","repositories_count":251559991,"owners_count":21609118,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cublas","cuda","gpu","parallel","swift"],"created_at":"2024-08-09T14:03:24.539Z","updated_at":"2025-04-29T18:31:46.994Z","avatar_url":"https://github.com/rxwei.png","language":"Swift","funding_links":[],"categories":["Swift"],"sub_categories":[],"readme":"# cuda-swift\n\nThis project provides a native Swift interface to CUDA with the following\nmodules:\n\n- [x] CUDA Driver API `import CUDADriver`\n- [x] CUDA Runtime API `import CUDARuntime`\n- [x] NVRTC - CUDA Runtime Compiler `import NVRTC`\n- [x] cuBLAS - CUDA Basic Linear Algebra Subprograms `import CuBLAS`\n- [x] Warp - GPU Acceleration Library `import Warp` ([Thrust](https://github.com/thrust/thrust) counterpart)\n\nAny machine with CUDA 7.0+ and a CUDA-capable GPU is supported. Xcode Playground\nis supported as well. Please refer to [Usage](#Usage)\nand [Components](#Components).\n\n## Quick look\n\n### Value types\n\nCUDA Driver, Runtime, cuBLAS, and NVRTC (runtime compiler) are wrapped in\nnative Swift types. 
Warp provides higher-level value types, `DeviceArray` and\n`DeviceValue`, with copy-on-write semantics.\n\n```swift\nimport Warp\n\n/// Initialize two arrays on device\nvar x: DeviceArray\u003cFloat\u003e = [1.0, 2.0, 3.0, 4.0, 5.0]\nlet y: DeviceArray\u003cFloat\u003e = [1.0, 2.0, 3.0, 4.0, 5.0]\n\n/// Scalar map operations\nx.incrementElements(by: 2) // x =\u003e [3.0, 4.0, 5.0, 6.0, 7.0] on device\nx.multiplyElements(by: 2) // x =\u003e [6.0, 8.0, 10.0, 12.0, 14.0] on device\n\n/// Addition\nx.formElementwise(.addition, with: y) // x =\u003e [7.0, 10.0, 13.0, 16.0, 19.0] on device\n\n/// Dot product\nx • y // =\u003e 225.0\n\n/// Sum\nx.sum() // =\u003e 65\n\n/// Absolute sum\nx.sumOfAbsoluteValues() // =\u003e 65\n\n/// Transform by 1-place math functions\nx.transform(by: .sin)\nx.transform(by: .tanh)\nx.transform(by: .ceil)\n\n/// Elementwise operation\nx.formElementwise(.addition, with: y)\nx.formElementwise(.subtraction, with: y)\nx.formElementwise(.multiplication, with: y)\nx.formElementwise(.division, with: y)\n\n/// Fill with the same value\nvar z = y\nz.fill(with: 10.0)\n\n/// Composite assignment\nx.assign(from: .subtraction, left: y, multipliedBy: 100.0, right: z)\n```\n\n### Runtime compilation\n\n#### Compile source string to PTX\n```swift\nimport NVRTC\nimport CUDADriver\nimport Warp\n\nlet source: String =\n    \"extern \\\"C\\\" __global__ void saxpy(float a, float *x, float *y, float *out, int n) {\"\n  + \"    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;\"\n  + \"    if (tid \u003c n) out[tid] = a * x[tid] + y[tid];\"\n  + \"}\"\nlet ptx = try Compiler.compile(source)\n```\n\n#### JIT-compile and load PTX using Driver API within a device context\n```swift\ntry Device.main.withContext { context in\n    let module = try Module(ptx: ptx)\n    let function = module.function(named: \"saxpy\")!\n    \n    let x: DeviceArray\u003cFloat\u003e = [1, 2, 3, 4, 5, 6, 7, 8]\n    let y: DeviceArray\u003cFloat\u003e = [2, 3, 4, 5, 6, 7, 8, 9]\n    var 
result = DeviceArray\u003cFloat\u003e(capacity: 8)\n\n    try function\u003c\u003c\u003c(1, 8)\u003e\u003e\u003e[.float(1.0), .constPointer(to: x), .constPointer(to: y), .pointer(to: \u0026result), .int(8)]\n    /// result =\u003e [3, 5, 7, 9, 11, 13, 15, 17] on device\n}\n```\n\n## Package Information\n\nAdd a dependency:\n\n```swift\n.Package(url: \"https://github.com/rxwei/cuda-swift\", majorVersion: 1)\n```\n\nYou may use the `Makefile` in this repository for your own project. No extra path\nconfiguration is needed.\n\nOtherwise, specify the paths to your CUDA headers and libraries when invoking `swift build`.\n\n#### macOS\n```\nswift build -Xcc -I/usr/local/cuda/include -Xlinker -L/usr/local/cuda/lib\n```\n\n#### Linux\n```\nswift build -Xcc -I/usr/local/cuda/include -Xlinker -L/usr/local/cuda/lib64\n```\n\n## Components\n\n### Core\n\n- [x] CUDADriver - CUDA Driver API\n    - [x] `Context`\n    - [x] `Device`\n    - [x] `Function`\n    - [x] `PTX`\n    - [x] `Module`\n    - [x] `Stream`\n    - [x] `Unsafe(Mutable)DevicePointer\u003cT\u003e`\n    - [x] `DriverError` (all error codes from CUDA C API)\n- [x] CUDARuntime - CUDA Runtime API\n    - [x] `Unsafe(Mutable)DevicePointer\u003cT\u003e`\n    - [x] `Device`\n    - [x] `Stream`\n    - [x] `RuntimeError` (all error codes from CUDA C API)\n- [x] NVRTC - CUDA Runtime Compiler\n    - [x] `Compiler`\n- [x] CuBLAS - GPU Basic Linear Algebra Subprograms (in-progress)\n    - [x] Level 1 BLAS operations\n    - [x] Level 2 BLAS operations (GEMV)\n    - [x] Level 3 BLAS operations (GEMM)\n- [x] Warp - GPU Acceleration Library ([Thrust](https://github.com/thrust/thrust) counterpart)\n    - [x] `DeviceArray\u003cT\u003e` (generic array in device memory)\n    - [x] `DeviceValue\u003cT\u003e` (generic value in device memory)\n    - [x] Accelerated vector operations\n    - [x] Type-safe kernel argument helpers\n\n### Optional\n\n- [x] Swift Playground\n  - CUDADriver works in the playground. 
But other modules cause the \"couldn't lookup\n    symbols\" problem for which we don't have a solution until Xcode is fixed.\n  - To use the playground, open the Xcode workspace file, and add a library for\n    every modulemap under `Frameworks`.\n\n## Dependencies\n\n- [CCUDA (CUDA C System Module)](https://github.com/rxwei/CCUDA)\n\n## License\n\nMIT License\n\nCUDA is a registered trademark of NVIDIA Corporation.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frxwei%2Fcuda-swift","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frxwei%2Fcuda-swift","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frxwei%2Fcuda-swift/lists"}