{"id":33294383,"url":"https://github.com/chelsea0x3b/cudarc","last_synced_at":"2026-02-09T17:05:23.288Z","repository":{"id":59502111,"uuid":"537639358","full_name":"chelsea0x3b/cudarc","owner":"chelsea0x3b","description":"Safe rust wrapper around CUDA toolkit","archived":false,"fork":false,"pushed_at":"2025-11-15T17:46:29.000Z","size":4145,"stargazers_count":968,"open_issues_count":15,"forks_count":123,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-11-15T18:25:07.919Z","etag":null,"topics":["cublas","cuda","cuda-kernels","cuda-programming","cuda-toolkit","cudnn","curand","gpu","gpu-acceleration","nccl","nvrtc","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chelsea0x3b.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"chelsea0x3b","patreon":"dfdx","open_collective":null,"ko_fi":"coreylowman","tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2022-09-16T22:40:38.000Z","updated_at":"2025-11-15T16:46:59.000Z","dependencies_parsed_at":"2022-09-18T04:11:11.717Z","dependency_job_id":"53db7fce-2685-4f2f-bc0e-e73bb199eee9","html_url":"https://github.com/chelsea0x3b/cudarc","commit_stats":{"total_commits":156,"total_committers":8,"mean_commits":19.5,"dds":0.3205128205128205,"last_synced_commit":"beb35342df7387bcae24
81a60b60acfa95777825"},"previous_names":["chelsea0x3b/cudarc"],"tags_count":75,"template":false,"template_full_name":null,"purl":"pkg:github/chelsea0x3b/cudarc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chelsea0x3b%2Fcudarc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chelsea0x3b%2Fcudarc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chelsea0x3b%2Fcudarc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chelsea0x3b%2Fcudarc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chelsea0x3b","download_url":"https://codeload.github.com/chelsea0x3b/cudarc/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chelsea0x3b%2Fcudarc/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285834729,"owners_count":27239502,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-22T02:00:05.934Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cublas","cuda","cuda-kernels","cuda-programming","cuda-toolkit","cudnn","curand","gpu","gpu-acceleration","nccl","nvrtc","rust"],"created_at":"2025-11-18T01:00:46.654Z","updated_at":"2025-11-22T18:02:40.544Z","avatar_url":"https://github.com/chelsea0x3b.png","language":"Rust","funding_links":["https://github.
com/sponsors/chelsea0x3b","https://patreon.com/dfdx","https://ko-fi.com/coreylowman"],"categories":["Rust"],"sub_categories":[],"readme":"# cudarc: minimal and safe api over the cuda toolkit\n\n[![](https://dcbadge.vercel.app/api/server/AtUhGqBDP5)](https://discord.gg/AtUhGqBDP5)\n[![crates.io](https://img.shields.io/crates/v/cudarc?style=for-the-badge)](https://crates.io/crates/cudarc)\n[![docs.rs](https://img.shields.io/docsrs/cudarc?label=docs.rs%20latest\u0026style=for-the-badge)](https://docs.rs/cudarc)\n\nCheck out cudarc on [crates.io](https://crates.io/crates/cudarc) and [docs.rs](https://docs.rs/cudarc/latest/cudarc/).\n\n**Contributions welcome!**\n\nSafe CUDA wrappers for:\n\n| library | dynamic load | dynamic link | static link |\n| --- | --- | --- | --- |\n| [CUDA driver](https://docs.nvidia.com/cuda/cuda-driver-api/index.html) | ✅ | ✅ | ❌ |\n| [NVRTC](https://docs.nvidia.com/cuda/nvrtc/index.html) | ✅ | ✅ | ✅ |\n| [cuRAND](https://docs.nvidia.com/cuda/curand/index.html) | ✅ | ✅ | ✅ |\n| [cuBLAS](https://docs.nvidia.com/cuda/cublas/index.html) | ✅ | ✅ | ✅ |\n| [cuBLASLt](https://docs.nvidia.com/cuda/cublas/#using-the-cublaslt-api) | ✅ | ✅ | ✅ |\n| [NCCL](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/) | ✅ | ✅ | ✅ |\n| [cuDNN](https://docs.nvidia.com/deeplearning/cudnn/backend/latest/api/overview.html) | ✅ | ✅ | ✅ |\n| [cuSPARSE](https://docs.nvidia.com/cuda/cusparse/) | ✅ | ✅ | ✅ |\n| [cuSOLVER](https://docs.nvidia.com/cuda/cusolver/) | ✅ | ✅ | ❌ |\n| [cuFILE](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html#introduction) | ✅ | ✅ | ✅ |\n| [CUPTI](https://docs.nvidia.com/cupti/) | ✅ | ✅ | ✅ |\n| [nvtx](https://nvidia.github.io/NVTX/) | ✅ | ✅ | ❌ |\n\nCUDA versions supported:\n- 11.4-11.8\n- 12.0-12.9\n- 13.0\n\nCUDNN versions supported:\n- 9.12.0\n\nNCCL versions supported:\n- 2.28.3\n\n# Configuring CUDA version\n\nSelect cuda version with one of:\n- `-F cuda-version-from-build-system`: At build time will get the cuda 
toolkit version using `nvcc`\n    - `-F fallback-latest`: can be used to control behavior if this fails. default is not enabled, which will cause the build\n      script to panic. if `-F fallback-latest` is enabled, we will use the highest bindings we have.\n- `-F cuda-\u003cmajor\u003e0\u003cminor\u003e0` to build for a specific version of cuda\n\n# Configuring linking\n\nBy default we use `-F dynamic-loading`, which will not require any libraries to be present at build time.\n\nYou can also enable `-F dynamic-linking` or `-F static-linking` for your use case.\n\n# Getting started\n\nIt's easy to create a new device and transfer data to the gpu:\n\n```rust\n// Get a stream for GPU 0\nlet ctx = cudarc::driver::CudaContext::new(0)?;\nlet stream = ctx.default_stream();\n\n// copy a rust slice to the device\nlet inp = stream.clone_htod(\u0026[1.0f32; 100])?;\n\n// or allocate directly\nlet mut out = stream.alloc_zeros::\u003cf32\u003e(100)?;\n```\n\nYou can also use the nvrtc api to compile kernels at runtime:\n\n```rust\nlet ptx = cudarc::nvrtc::compile_ptx(\"\nextern \\\"C\\\" __global__ void sin_kernel(float *out, const float *inp, const size_t numel) {\n    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;\n    if (i \u003c numel) {\n        out[i] = sin(inp[i]);\n    }\n}\")?;\n\n// Dynamically load it into the device\nlet module = ctx.load_module(ptx)?;\nlet sin_kernel = module.load_function(\"sin_kernel\")?;\n```\n\n`cudarc` provides a very simple interface to launch kernels using a builder pattern to specify kernel arguments:\n\n```rust\nlet mut builder = stream.launch_builder(\u0026sin_kernel);\nbuilder.arg(\u0026mut out);\nbuilder.arg(\u0026inp);\nbuilder.arg(\u0026100usize);\nunsafe { builder.launch(LaunchConfig::for_num_elems(100)) }?;\n```\n\nAnd of course it's easy to copy things back to host after you're done:\n\n```rust\nlet out_host: Vec\u003cf32\u003e = stream.clone_dtoh(\u0026out)?;\nassert_eq!(out_host, [1.0; 100].map(f32::sin));\n```\n\n# 
Design\n\nGoals are:\n1. As safe as possible (there will still be a lot of unsafe due to ffi \u0026 async)\n2. As ergonomic as possible\n3. Allow mixing of high level `safe` apis, with low level `sys` apis\n\nTo that end there are three levels to each wrapper (by default the safe api is exported):\n```rust\nuse cudarc::driver::{safe, result, sys};\nuse cudarc::nvrtc::{safe, result, sys};\nuse cudarc::cublas::{safe, result, sys};\nuse cudarc::cublaslt::{safe, result, sys};\nuse cudarc::curand::{safe, result, sys};\nuse cudarc::nccl::{safe, result, sys};\n```\n\nwhere:\n1. `sys` is the raw ffi apis generated with bindgen\n2. `result` is a very small wrapper around sys to return `Result` from each function\n3. `safe` is a wrapper around result/sys to provide safe abstractions\n\n*Heavily recommend sticking with safe APIs*\n\n# License\n\nDual-licensed to be compatible with the Rust project.\n\nLicensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 or the MIT license http://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchelsea0x3b%2Fcudarc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchelsea0x3b%2Fcudarc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchelsea0x3b%2Fcudarc/lists"}