https://github.com/colesmcintosh/pycuda-numpy-vector-ops
Accelerating NumPy Vector Operations with PyCUDA
- Host: GitHub
- URL: https://github.com/colesmcintosh/pycuda-numpy-vector-ops
- Owner: colesmcintosh
- Created: 2025-04-06T22:33:57.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-06T22:40:44.000Z (about 1 year ago)
- Last Synced: 2025-04-09T22:08:41.200Z (about 1 year ago)
- Topics: cuda-programming, numpy, pycuda
- Language: Jupyter Notebook
- Homepage: https://colab.research.google.com/drive/1ZhezfdgtMdAxHiMzUbVaizith8ILLreq?usp=sharing
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Accelerating NumPy Vector Operations with PyCUDA
This notebook demonstrates how to accelerate large-scale NumPy operations using GPU programming in Python via [PyCUDA](https://documen.tician.de/pycuda/).
We compare a CPU-based NumPy implementation with a GPU-accelerated PyCUDA implementation of the same fused multiply-add (FMA) operation:
> The operation is defined as $c[i] = a[i] \times b[i] + d[i]$.
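
For reference, the CPU side of the comparison can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's actual code: the function name `fma_numpy`, the vector length `n`, the `float32` dtype, and the use of `time.perf_counter` for timing are all assumptions made for the example.

```python
import time

import numpy as np


def fma_numpy(a, b, d):
    """CPU reference: c[i] = a[i] * b[i] + d[i], computed element-wise."""
    return a * b + d


n = 1_000_000  # vector length; an arbitrary choice for illustration
rng = np.random.default_rng(0)
a = rng.random(n, dtype=np.float32)
b = rng.random(n, dtype=np.float32)
d = rng.random(n, dtype=np.float32)

# Wall-clock timing of the CPU computation only (array creation excluded).
start = time.perf_counter()
c = fma_numpy(a, b, d)
cpu_ms = (time.perf_counter() - start) * 1e3
print(f"NumPy: {cpu_ms:.3f} ms for {n} elements")
```

Note that NumPy evaluates `a * b + d` as two passes (a multiply, then an add) with an intermediate array, so this baseline is not a hardware-fused operation.
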
The notebook uses:
- Pinned (page-locked) memory for faster host-device transfers
- CUDA streams for asynchronous execution
- Event timing for accurate benchmarks
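
The three techniques above can be combined roughly as follows. This is a hedged sketch under stated assumptions rather than the notebook's implementation: the names `fma_kernel`, `fma_gpu`, `KERNEL_SRC`, and `HAVE_GPU`, the block size of 256 threads, and the vector length are hypothetical choices. The `try`/`except` guard is only there so the example skips the GPU run on machines without PyCUDA or a CUDA device.

```python
import numpy as np

try:
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule
    HAVE_GPU = True
except Exception:  # PyCUDA is missing or no usable CUDA device was found
    HAVE_GPU = False

# Hypothetical kernel: one thread per element, guarded against overrun.
KERNEL_SRC = r"""
__global__ void fma_kernel(const float *a, const float *b, const float *d,
                           float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] * b[i] + d[i];
}
"""


def fma_gpu(a, b, d, threads_per_block=256):
    """Return (c, elapsed_ms) for c = a * b + d computed on the GPU."""
    n = a.size
    kernel = SourceModule(KERNEL_SRC).get_function("fma_kernel")
    stream = cuda.Stream()

    # Pinned (page-locked) host buffers; truly asynchronous copies need them.
    host_in = []
    for arr in (a, b, d):
        buf = cuda.pagelocked_empty(n, np.float32)
        buf[:] = arr
        host_in.append(buf)
    host_out = cuda.pagelocked_empty(n, np.float32)

    dev_in = [cuda.mem_alloc(buf.nbytes) for buf in host_in]
    dev_out = cuda.mem_alloc(host_out.nbytes)

    # Events recorded on the stream bracket the copies and the kernel.
    start, end = cuda.Event(), cuda.Event()
    start.record(stream)
    for dev, buf in zip(dev_in, host_in):
        cuda.memcpy_htod_async(dev, buf, stream)
    blocks = (n + threads_per_block - 1) // threads_per_block
    kernel(*dev_in, dev_out, np.int32(n),
           block=(threads_per_block, 1, 1), grid=(blocks, 1), stream=stream)
    cuda.memcpy_dtoh_async(host_out, dev_out, stream)
    end.record(stream)
    end.synchronize()  # wait until all work queued on the stream is done
    return host_out.copy(), start.time_till(end)  # time_till returns ms


if HAVE_GPU:
    rng = np.random.default_rng(0)
    a, b, d = (rng.random(1_000_000, dtype=np.float32) for _ in range(3))
    c_gpu, gpu_ms = fma_gpu(a, b, d)
    # Tolerance-based check: the GPU compiler may fuse the multiply and add
    # into a single rounding step, so the last bit can differ from NumPy.
    assert np.allclose(c_gpu, a * b + d, rtol=1e-6, atol=1e-6)
    print(f"PyCUDA (transfers + kernel): {gpu_ms:.3f} ms")
else:
    print("PyCUDA or a CUDA device is unavailable; skipping the GPU run.")
```

In this sketch the measured interval includes both host-device transfers and the kernel. Recording extra events around the kernel launch alone would separate compute time from transfer time.
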
The result is a validated performance comparison between NumPy on the CPU and PyCUDA on the GPU.