https://github.com/baro-00/cpp-cuda-lab

Experimental C++ projects using NVIDIA CUDA for parallel computing. Learning & testing GPU kernels

Host: GitHub
URL: https://github.com/baro-00/cpp-cuda-lab
Owner: Baro-00
License: mit
Created: 2025-04-04T08:22:14.000Z (over 1 year ago)
Default Branch: master
Last Pushed: 2025-04-04T08:25:38.000Z (over 1 year ago)
Last Synced: 2025-04-04T09:29:05.013Z (over 1 year ago)
Topics: cpp, cuda
Language: Cuda
Homepage:
Size: 3.91 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# CUDA Programming

## Introduction

**CUDA** (*Compute Unified Device Architecture*) is a parallel computing platform and programming model created by NVIDIA. It lets developers use NVIDIA GPUs for general-purpose processing (GPGPU). By spreading work across many GPU threads, CUDA can significantly speed up data-parallel workloads such as scientific computing, image processing, and machine learning.

### Prerequisites

Before getting started with CUDA programming, ensure you have:

- NVIDIA GPU compatible with CUDA

- [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads)

- [Visual Studio IDE](https://visualstudio.microsoft.com/pl/downloads/) (needed only for its compiler binaries: on Windows, `nvcc` uses the MSVC compiler `cl.exe` for host code)

### Supported Hardware

Make sure your NVIDIA GPU supports CUDA. You can check compatibility [here](https://developer.nvidia.com/cuda-gpus).

### Documentation

[CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/)

> **Hint**: CUDA source files use the `.cu` extension.

---

## Getting started

### Simple CUDA Example (*Vector Addition*)

File: `vector_add/vector_add.cu`

``` cpp
#include <iostream>
#include <cuda_runtime.h>  // CUDA runtime API (cudaMalloc, cudaMemcpy, cudaFree)

// Kernel function executed on GPU: each thread adds one pair of elements
__global__ void vectorAdd(const float *A, const float *B, float *C, int N) {
    // Global index of this thread across the whole grid
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // The grid may hold more threads than elements, so guard the access
    if (i < N) {
        C[i] = A[i] + B[i];
    }
}

int main() {
    int N = 1024;
    size_t size = N * sizeof(float);

    // Host (CPU) buffers
    float *h_A = new float[N];
    float *h_B = new float[N];
    float *h_C = new float[N];

    for (int i = 0; i < N; ++i) {
        h_A[i] = i * 1.0f;
        h_B[i] = i * 2.0f;
    }

    // Device (GPU) buffers
    float *d_A, *d_B, *d_C;
    cudaMalloc(&d_A, size);
    cudaMalloc(&d_B, size);
    cudaMalloc(&d_C, size);

    // Copy the inputs to the GPU
    cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);

    // Round the block count up so that every element gets a thread
    int threadsPerBlock = 256;
    int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;

    vectorAdd<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_C, N);

    // Copy the result back; this call waits for the kernel to finish
    cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost);

    std::cout << "First 5 results:\n";
    for (int i = 0; i < 5; ++i) {
        std::cout << h_A[i] << " + " << h_B[i] << " = " << h_C[i] << "\n";
    }

    // Release device and host memory
    cudaFree(d_A);
    cudaFree(d_B);
    cudaFree(d_C);

    delete[] h_A;
    delete[] h_B;
    delete[] h_C;

    return 0;
}
```
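The launch configuration in `main` rounds the block count up, so the grid can contain more threads than there are elements; that is why the kernel checks `i < N`. The host-only sketch below emulates this index arithmetic on the CPU to make it visible. `blocksFor` and `emulateVectorAdd` are hypothetical helper names used for illustration; they are not part of this repository or of the CUDA API.

```cpp
#include <vector>

// Ceiling division: smallest block count whose threads cover n elements.
// Mirrors (N + threadsPerBlock - 1) / threadsPerBlock from main().
int blocksFor(int n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical CPU emulation of the vectorAdd launch. The two loops stand in
// for blockIdx.x and threadIdx.x. Assumes A, B and C all have the same size.
// Returns how many emulated threads were rejected by the bounds guard.
int emulateVectorAdd(const std::vector<float>& A, const std::vector<float>& B,
                     std::vector<float>& C, int threadsPerBlock) {
    const int n = static_cast<int>(A.size());
    const int blocks = blocksFor(n, threadsPerBlock);
    int idle = 0;
    for (int block = 0; block < blocks; ++block) {
        for (int thread = 0; thread < threadsPerBlock; ++thread) {
            int i = block * threadsPerBlock + thread;  // global index
            if (i < n) {
                C[i] = A[i] + B[i];
            } else {
                ++idle;  // without the guard this would write out of bounds
            }
        }
    }
    return idle;
}
```

With `N = 1024` and 256 threads per block the division is exact, so no thread is idle. With `N = 1000`, `blocksFor` still returns 4, which means 1024 threads are emulated and 24 of them fall through the guard.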

### Build and run

Compile CUDA applications with `nvcc`, the NVIDIA compiler that ships with the CUDA Toolkit. The commands below assume the current directory is `vector_add`.

#### Using *x64 Native Tools Command Prompt for VS 2022*

Open `x64 Native Tools Command Prompt for VS 2022`

**Build**

``` console
nvcc -o vector_add vector_add.cu
```

**Run**

``` console
vector_add.exe
```

#### Using *PowerShell*

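To compile CUDA code using PowerShell, you must first load the Visual Studio environment variables.

Run the following command in PowerShell to initialize the Visual Studio environment. It starts a nested PowerShell session that inherits the variables set by `vcvars64.bat`. The path assumes a default Visual Studio 2022 Community installation, so adjust it for other editions or install locations:

``` console
cmd /c "`"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat`" && powershell"
```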

After that, you can build and run your CUDA application:

``` console
nvcc -o vector_add.exe vector_add.cu
.\vector_add.exe
```
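Whichever shell is used, the program only prints the first five sums, so a wrong result further into the array would go unnoticed. One way to check the whole output is to compare it against a CPU reference. The sketch below is host-only C++ and assumes the same inputs as `vector_add.cu`; `referenceAdd` and `allClose` are hypothetical helper names, not functions from this repository or from CUDA.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference for the computation the kernel performs: c[i] = a[i] + b[i].
// Assumes a and b have the same size.
std::vector<float> referenceAdd(const std::vector<float>& a,
                                const std::vector<float>& b) {
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// True when both vectors have the same size and every pair of elements
// differs by at most tol.
bool allClose(const std::vector<float>& got,
              const std::vector<float>& expected, float tol = 1e-5f) {
    if (got.size() != expected.size()) {
        return false;
    }
    for (std::size_t i = 0; i < got.size(); ++i) {
        if (std::fabs(got[i] - expected[i]) > tol) {
            return false;
        }
    }
    return true;
}
```

To use it with the example, the contents of `h_A`, `h_B` and `h_C` would be copied into `std::vector<float>` objects after the device-to-host `cudaMemcpy`, and `allClose(result, referenceAdd(a, b))` should then return `true`.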
