{"id":19121967,"url":"https://github.com/rurumimic/cuda","last_synced_at":"2026-06-18T08:31:16.429Z","repository":{"id":208848669,"uuid":"722511345","full_name":"rurumimic/cuda","owner":"rurumimic","description":"compute unified device architecture","archived":false,"fork":false,"pushed_at":"2026-02-18T09:58:41.000Z","size":5807,"stargazers_count":0,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-02-18T14:25:48.739Z","etag":null,"topics":["cuda","deep-learning","gpu","nvidia"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rurumimic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-11-23T10:02:08.000Z","updated_at":"2026-02-18T09:58:46.000Z","dependencies_parsed_at":"2024-12-16T12:18:42.607Z","dependency_job_id":"63c09bf9-fec8-4c8d-bd2d-079fa86e9998","html_url":"https://github.com/rurumimic/cuda","commit_stats":{"total_commits":3,"total_committers":1,"mean_commits":3.0,"dds":0.0,"last_synced_commit":"0e8c4d030ae6a8fa06b92bc1abff4b85489d9208"},"previous_names":["rurumimic/cuda"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rurumimic/cuda","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rurumimic%2Fcuda","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rurumimic%2Fcuda/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/rurumimic%2Fcuda/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rurumimic%2Fcuda/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rurumimic","download_url":"https://codeload.github.com/rurumimic/cuda/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rurumimic%2Fcuda/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34483274,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-18T02:00:06.871Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","deep-learning","gpu","nvidia"],"created_at":"2024-11-09T05:19:16.483Z","updated_at":"2026-06-18T08:31:16.423Z","avatar_url":"https://github.com/rurumimic.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CUDA\n\n- nvidia developer\n  - [cuda-toolkit](https://developer.nvidia.com/cuda-toolkit)\n  - [gpu compute capability](https://developer.nvidia.com/cuda-gpus)\n- docs\n  - [quick start](https://docs.nvidia.com/cuda/cuda-quick-start-guide/index.html)\n  - [support compiler](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/#host-compiler-support-policy)\n  - [best practices guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/)\n  - [cuda c++ programming 
guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/)\n- source\n  - [samples](https://developer.nvidia.com/cuda-code-samples)\n  - github: [nvidia/cuda-samples](https://github.com/nvidia/cuda-samples)\n- repos\n  - [cutlass](https://github.com/NVIDIA/cutlass)\n\n---\n\n## GPU Compute Capability\n\n- [gpu compute capability](https://developer.nvidia.com/cuda-gpus)\n\n```bash\nnvidia-smi --query-gpu=compute_cap --format=csv\n\ncompute_cap\n8.6\n```\n\n---\n\n## Code\n\n### Samples\n\n```bash\ngit clone https://github.com/NVIDIA/cuda-samples.git\n```\n\n#### c++11_cuda\n\n- Introduction: [c++11_cuda](https://github.com/NVIDIA/cuda-samples/tree/master/Samples/0_Introduction/c++11_cuda)\n\n```bash\ncd Samples/0_Introduction/c++11_cuda\n```\n\n##### Compile\n\n```bash\nmake HOST_COMPILER=clang++ SMS=\"86\" dbg=1\nmake HOST_COMPILER=g++ SMS=\"86\" dbg=1\nmake HOST_COMPILER=g++-13 SMS=\"86\" dbg=1\n```\n\n##### Run\n\n```bash\n./c++11_cuda\n\nGPU Device 0: \"Ampere\" with compute capability 8.6\n\nRead 3223503 byte corpus from ./warandpeace.txt\ncounted 107310 instances of 'x', 'y', 'z', or 'w' in \"./warandpeace.txt\"\n```\n\n---\n\n## Docs\n\n- [install](docs/install.md)\n- [clang](docs/clang.md): format\n- [api](docs/api.md): driver, runtime\n- [huggingface](docs/huggingface.md)\n  - [text embeddings inference](docs/text.embeddings.inference.md)\n- [docker](docs/docker.md)\n- nvidia\n  - [triton](docs/triton.md)\n  - [libnvidia-container](docs/libnvidia.container.md)\n  - [dynamo](docs/dynamo.md)\n  - [tensorRT](docs/tensorrt.md), src/[tensorrt](src/tensorrt/README.md)\n- [leetgpu](docs/leetgpu.md)\n\n---\n\n## Code\n\n- Hello CUDA: [hello_cuda](src/hello_cuda/README.md), [hello_cuda with C++](src/hello_cuda_cpp/README.md)\n- Thread: [thread_layout](src/thread_layout/README.md)\n- Device: [device_query](src/device_query/README.md)\n- Vector: [vector_add](src/vector_add/README.md)\n- Matrix\n  - add: [matrix_add](src/matrix_add/README.md), 
[matrix_add_large](src/matrix_add_large/README.md)\n  - mul: [matrix_mul](src/matrix_mul/README.md), [matrix_mul_shared_memory](src/matrix_mul_shared_memory/README.md), [matrix_mul_shared_memory_large](src/matrix_mul_shared_memory_large/README.md)\n- TensorRT: [tensorrt](src/tensorrt/README.md)\n- Sync: [sync](src/sync/README.md), [streams + event](src/streams/README.md)\n\n---\n\n## Ref\n\n- [CUDA Books archive](https://developer.nvidia.com/cuda-books-archive)\n- book: [Programming Massively Parallel Processors](https://www.oreilly.com/library/view/programming-massively-parallel/9780323984638)\n- book: [CUDA Programming](https://github.com/bluekds/CUDA_Programming)\n- book: [The Art of HPC](https://theartofhpc.com/)\n- youtube: [CUDA Programming Course – High-Performance Computing with GPUs](https://www.youtube.com/watch?v=86FAWCzIe_4)\n- youtube: [GPU MODE](https://www.youtube.com/@GPUMODE)\n- [GPU Glossary](https://modal.com/gpu-glossary)\n- UIUC: [Introduction to Parallel Programming with CUDA](https://newfrontiers.illinois.edu/news-and-events/introduction-to-parallel-programming-with-cuda/)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frurumimic%2Fcuda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frurumimic%2Fcuda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frurumimic%2Fcuda/lists"}