https://github.com/marcramonmoreno/pycuda_pearson_correlation_function
Pearson Correlation Function Accelerated with PyCuda
https://github.com/marcramonmoreno/pycuda_pearson_correlation_function
Last synced: 2 months ago
JSON representation
Pearson Correlation Function Accelerated with PyCuda
- Host: GitHub
- URL: https://github.com/marcramonmoreno/pycuda_pearson_correlation_function
- Owner: MarcRamonMoreno
- License: gpl-3.0
- Created: 2023-11-12T17:32:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-16T06:24:07.000Z (over 1 year ago)
- Last Synced: 2025-02-12T17:59:50.690Z (4 months ago)
- Language: Jupyter Notebook
- Size: 19.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
## GPU-Accelerated Pearson Correlation Coefficient Computation
# OverviewThis Python script utilizes PyCUDA, a Python interface to CUDA (Compute Unified Device Architecture), for performing high-performance computing on NVIDIA GPUs. It focuses on calculating Pearson's correlation coefficients, a measure of the linear correlation between two sets of data, with GPU acceleration.
Key FeaturesGPU Acceleration: Uses PyCUDA to leverage GPU for efficient computation, especially suitable for large datasets.
Reduction Kernels: Includes custom reduction kernels for summing elements, summing squares, and summing products.
Pearson Correlation Function: A function to compute Pearson's correlation coefficient between two data arrays.# Prerequisites
NVIDIA GPU with CUDA support
Python environment
PyCUDA installed (pip install pycuda)# Usage
Initialization: The script begins by importing necessary modules and setting up reduction kernels using PyCUDA.
Reduction Kernels:
sum_kernel: Sums the elements of an array.
sum_sq_kernel: Sums the squares of the elements of an array.
product_sum_kernel: Sums the products of corresponding elements from two arrays.Pearson Correlation Coefficient Calculation:
The function pearson_correlation(x, y) takes two arrays x and y and computes their Pearson correlation coefficient.
The function handles data transfer between CPU and GPU, and performs necessary computations on the GPU.Sample Data Generation:
The script generates random sample data for demonstration purposes, representing tobacco use, cancer cases, and infertility cases.
The Pearson correlation coefficient is then calculated for tobacco-cancer and tobacco-infertility datasets.# Example
python
N = 50000000 # Number of data points
np.random.seed(0) # Seed for reproducibility
tobacco_use = np.random.rand(N)
cancer_cases = np.random.rand(N)
infertility_cases = np.random.rand(N)# # Compute Pearson correlation
correlation_tobacco_cancer = pearson_correlation(tobacco_use, cancer_cases)
correlation_tobacco_infertility = pearson_correlation(tobacco_use, infertility_cases)# Limitations
The script requires an NVIDIA GPU with CUDA support.
The size of the data is limited by the GPU's memory capacity.# License
This script is released under the MIT License.
Feel free to modify the content to better suit your project's specifics.