{"id":27125041,"url":"https://github.com/baro-00/cpp-cuda-lab","last_synced_at":"2026-05-04T07:37:36.261Z","repository":{"id":286085915,"uuid":"960318687","full_name":"Baro-00/cpp-cuda-lab","owner":"Baro-00","description":"Experimental C++ projects using NVIDIA CUDA for parallel computing. Learning \u0026 testing GPU kernels","archived":false,"fork":false,"pushed_at":"2025-04-04T08:25:38.000Z","size":4,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-04T09:29:05.013Z","etag":null,"topics":["cpp","cuda"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Baro-00.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-04T08:22:14.000Z","updated_at":"2025-04-04T08:26:02.000Z","dependencies_parsed_at":"2025-04-04T09:29:26.732Z","dependency_job_id":"d412110d-24fd-4209-8bab-2eefda7b9627","html_url":"https://github.com/Baro-00/cpp-cuda-lab","commit_stats":null,"previous_names":["baro-00/cpp-cuda-lab"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baro-00%2Fcpp-cuda-lab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baro-00%2Fcpp-cuda-lab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baro-00%2Fcpp-cuda-lab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Baro-00%2Fcpp-cuda-lab/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/Baro-00","download_url":"https://codeload.github.com/Baro-00/cpp-cuda-lab/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247669128,"owners_count":20976330,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","cuda"],"created_at":"2025-04-07T14:29:05.891Z","updated_at":"2026-05-04T07:37:36.233Z","avatar_url":"https://github.com/Baro-00.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CUDA Programming\r\n\r\n## Introduction\r\n\r\n**CUDA** (*Compute Unified Device Architecture*) is a parallel computing platform and API model created by NVIDIA. It enables developers to utilize the power of NVIDIA GPUs for general-purpose processing (GPGPU). CUDA allows programmers to leverage parallelism in GPU cores to dramatically speed up computations, which is especially useful in scientific calculations, image processing, machine learning, and data-intensive tasks.\r\n\r\n### Prerequisites\r\n\r\nBefore getting started with CUDA programming, ensure you have:\r\n\r\n- NVIDIA GPU compatible with CUDA\r\n\r\n- [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads)\r\n\r\n- [Visual Studio IDE](https://visualstudio.microsoft.com/pl/downloads/) (just for binaries)\r\n\r\n### Supported Hardware\r\n\r\nMake sure your NVIDIA GPU supports CUDA. 
You can check compatibility [here](https://developer.nvidia.com/cuda-gpus).\r\n\r\n### Documentation\r\n\r\n[CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/)\r\n\r\n\u003e **Hint**: Source code with CUDA integration has `.cu` extension.\r\n\r\n---\r\n\r\n## Getting started\r\n\r\n### Simple CUDA Example (*Vector Addition*)\r\n\r\nFile: `vector_add/vector_add.cu`\r\n\r\n``` cpp\r\n#include \u003ciostream\u003e\r\n#include \u003ccuda_runtime.h\u003e\r\n\r\n// Kernel function executed on GPU\r\n__global__ void vectorAdd(const float *A, const float *B, float *C, int N) {\r\n    int i = blockIdx.x * blockDim.x + threadIdx.x;\r\n    if (i \u003c N) {\r\n        C[i] = A[i] + B[i];\r\n    }\r\n}\r\n\r\nint main() {\r\n    int N = 1024;\r\n    size_t size = N * sizeof(float);\r\n\r\n    float *h_A = new float[N];\r\n    float *h_B = new float[N];\r\n    float *h_C = new float[N];\r\n\r\n    for (int i = 0; i \u003c N; ++i) {\r\n        h_A[i] = i * 1.0f;\r\n        h_B[i] = i * 2.0f;\r\n    }\r\n\r\n    float *d_A, *d_B, *d_C;\r\n    cudaMalloc(\u0026d_A, size);\r\n    cudaMalloc(\u0026d_B, size);\r\n    cudaMalloc(\u0026d_C, size);\r\n\r\n    cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);\r\n    cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);\r\n\r\n    int threadsPerBlock = 256;\r\n    int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;\r\n\r\n    vectorAdd\u003c\u003c\u003cblocksPerGrid, threadsPerBlock\u003e\u003e\u003e(d_A, d_B, d_C, N);\r\n\r\n    cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost);\r\n\r\n    std::cout \u003c\u003c \"First 5 results:\\n\";\r\n    for (int i = 0; i \u003c 5; ++i) {\r\n        std::cout \u003c\u003c h_A[i] \u003c\u003c \" + \" \u003c\u003c h_B[i] \u003c\u003c \" = \" \u003c\u003c h_C[i] \u003c\u003c \"\\n\";\r\n    }\r\n\r\n    cudaFree(d_A);\r\n    cudaFree(d_B);\r\n    cudaFree(d_C);\r\n\r\n    delete[] h_A;\r\n    delete[] h_B;\r\n    delete[] h_C;\r\n\r\n    return 0;\r\n}\r\n```\r\n\r\n### Build and 
run\r\n\r\nFor compiling CUDA applications, use the provided NVIDIA compiler (`nvcc`).\r\n\r\n#### Using *x64 Native Tools Command Prompt for VS 2022*\r\n\r\nOpen `x64 Native Tools Command Prompt for VS 2022`\r\n\r\n**Build**\r\n\r\n``` console\r\nnvcc -o vector_add vector_add.cu\r\n```\r\n\r\n**Run**\r\n\r\n``` console\r\nvector_add.exe\r\n```\r\n\r\n#### Using *PowerShell*\r\n\r\nTo compile CUDA code using PowerShell, you must first load Visual Studio environment variables.\r\n\r\nRun the following command in PowerShell to initialize the Visual Studio environment:\r\n\r\n``` console\r\ncmd /c \"`\"C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Auxiliary\\Build\\vcvars64.bat`\" \u0026\u0026 powershell\"\r\n```\r\n\r\nAfter that, you can build and run your CUDA application:\r\n\r\n``` console\r\nnvcc -o vector_add.exe vector_add.cu\r\n.\\vector_add.exe\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaro-00%2Fcpp-cuda-lab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbaro-00%2Fcpp-cuda-lab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaro-00%2Fcpp-cuda-lab/lists"}