https://github.com/alessiobugetti/integral-image-processing
Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU
https://github.com/alessiobugetti/integral-image-processing
cuda gpu-acceleration integral-image numba parallel-computing pycuda
Last synced: 3 months ago
JSON representation
Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU
- Host: GitHub
- URL: https://github.com/alessiobugetti/integral-image-processing
- Owner: AlessioBugetti
- License: gpl-3.0
- Created: 2025-02-20T22:46:40.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-03-07T14:44:56.000Z (about 1 year ago)
- Last Synced: 2025-03-07T15:34:01.090Z (about 1 year ago)
- Topics: cuda, gpu-acceleration, integral-image, numba, parallel-computing, pycuda
- Language: Cuda
- Homepage:
- Size: 706 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# CUDA-Based Integral Image Computation
## Table of Contents
- [Overview](#overview)
- [Integral Image](#integral-image)
- [Formula](#formula)
- [Repository Structure](#repository-structure)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Usage](#usage)
- [C++](#c)
- [Python](#python)
- [Cuda Kernels](#cuda-kernels)
- [Included Kernels](#included-kernels)
- [Performance](#performance)
- [License](#license)
- [Author](#author)
## Overview
This project implements integral image computation for grayscale images using CUDA. It leverages GPU parallel processing to achieve high performance.
The parallel computation on the GPU is based on the alternation of two kernels in the following order:
1. Row-wise Scan
2. Transpose
3. Row-wise Scan
4. Transpose
Specifically, for the scan kernel, two versions are provided: a naive implementation and an optimized one.
### Integral Image
An integral image, also known as a summed-area table, is a representation that allows for fast computation of the sum of values in a rectangular subset of an image.
For example:

#### Formula
Given an input image $i(x, y)$, the integral image $I(x, y)$ is computed as:
$$
I(x,y) = i(x,y) + I(x-1,y)+I(x,y-1)-I(x-1,y-1)
$$
## Repository Structure
```plaintext
.
├── python/
│ ├── pycuda_test.py # Python script using pyCUDA for invoking CUDA kernels and managing the workflow
│ └── numba_test.py # Python script using Numba for invoking CUDA kernels and managing the workflow
├── c++/
│ ├── main.cu # CUDA source file containing benchmarking logic
│ └── kernel.cu # CUDA kernel definitions
└── integralimage # Script for compiling the project and running benchmarks
```
## Prerequisites
- CUDA-capable NVIDIA GPU
- CUDA Toolkit
- C++ compiler
- CMake
- Python 3.x (for Python interface)
- Python Libraries:
- numpy
- pycuda
- numba
## Installation
1. Clone the repository:
```sh
git clone https://github.com/AlessioBugetti/integral-image-processing.git
cd integral-image-processing
```
2. Install Python dependencies:
```sh
pip install -r python/requirements.txt
```
3. Ensure the CUDA environment is set up:
- Install NVIDIA drivers.
- Install the CUDA Toolkit.
- Verify with ```nvcc --version```.
## Usage
### C++
```sh
./integralimage build
./integralimage run
```
### Python
```sh
python pycuda_test.py
```
or
```sh
python numba_test.py
```
## Cuda Kernels
### Included Kernels:
- `SumRows`: Naively computes the row-wise scan (prefix sum) of a matrix
- `SinglePassRowWiseScan`: Optimized computation of the row-wise scan (prefix sum) of a matrix
- `Transpose`: Transposes a matrix using block-level tiling with shared memory
## Performance
The implementation includes benchmarking capabilities that measure:
- Sequential CPU execution time
- CUDA execution time for the naive implementation of the integral image computation
- CUDA execution time for the optimized implementation of the integral image computation
- Speedup ratios compared to the CPU implementation for both the naive and optimized implementations
- Measurements are averaged over multiple iterations to ensure reliable results.
## License
This project is licensed under the GPL-3.0-only License. See the [`LICENSE`](LICENSE) file for more details.
## Author
Alessio Bugetti - alessiobugetti98@gmail.com