https://github.com/marcramonmoreno/pycuda_pearson_correlation_function

Pearson Correlation Function Accelerated with PyCuda

Host: GitHub
URL: https://github.com/marcramonmoreno/pycuda_pearson_correlation_function
Owner: MarcRamonMoreno
License: gpl-3.0
Created: 2023-11-12T17:32:00.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2023-11-16T06:24:07.000Z (over 1 year ago)
Last Synced: 2025-02-12T17:59:50.690Z (4 months ago)
Language: Jupyter Notebook
Size: 19.5 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

## GPU-Accelerated Pearson Correlation Coefficient Computation

# Overview

This Python script uses PyCUDA, a Python interface to NVIDIA's CUDA (Compute Unified Device Architecture) platform, to compute Pearson's correlation coefficient on the GPU. The coefficient measures the strength of the linear relationship between two sets of data.

# Key Features

- GPU Acceleration: Uses PyCUDA to run the computation on the GPU, which is especially suitable for large datasets.
- Reduction Kernels: Includes custom reduction kernels for summing elements, summing squares, and summing products.
- Pearson Correlation Function: A function to compute Pearson's correlation coefficient between two data arrays.
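The three kinds of reduction listed above (sums, sums of squares, sums of products) are exactly the quantities needed by the standard single-pass form of Pearson's coefficient. The README does not state which formula the script uses, so the expression below is an assumption about how the sums combine, with $n$ the number of data points:

$$
r = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}
         {\sqrt{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}\;
          \sqrt{n \sum_i y_i^2 - \left(\sum_i y_i\right)^2}}
$$

Each of the five sums is an independent reduction over the input arrays, which is what makes the computation a good fit for the GPU.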

# Prerequisites

- NVIDIA GPU with CUDA support
- Python environment with NumPy
- PyCUDA installed (`pip install pycuda`)
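To check the prerequisites before running the script, a small probe along the following lines may help. It is a sketch, not part of the repository: `cuda_device_count` is a hypothetical helper name, and it only reports whether PyCUDA can be imported and how many CUDA devices it sees.

```python
import importlib.util


def cuda_device_count():
    """Return the number of CUDA devices PyCUDA can see, or 0 if unavailable.

    Hypothetical helper for illustration; not part of the original script.
    """
    if importlib.util.find_spec("pycuda") is None:
        return 0  # PyCUDA is not installed in this environment
    try:
        import pycuda.driver as cuda

        cuda.init()  # the driver API must be initialized before querying devices
        return cuda.Device.count()
    except Exception:
        return 0  # PyCUDA is installed, but no usable driver or device was found


device_count = cuda_device_count()
print("CUDA devices visible to PyCUDA:", device_count)
```

A result of 0 means the GPU path cannot run on this machine.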

# Usage

1. Initialization: The script begins by importing the necessary modules and setting up the reduction kernels with PyCUDA.
2. Reduction Kernels:
   - `sum_kernel`: Sums the elements of an array.
   - `sum_sq_kernel`: Sums the squares of the elements of an array.
   - `product_sum_kernel`: Sums the products of corresponding elements from two arrays.
3. Pearson Correlation Coefficient Calculation:
   - The function `pearson_correlation(x, y)` takes two arrays `x` and `y` and computes their Pearson correlation coefficient.
   - The function handles data transfer between CPU and GPU and performs the necessary computations on the GPU.
4. Sample Data Generation:
   - The script generates random sample data for demonstration purposes, representing tobacco use, cancer cases, and infertility cases.
   - The Pearson correlation coefficient is then calculated for the tobacco-cancer and tobacco-infertility datasets. Because the three arrays are generated independently, both coefficients should be close to zero.
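The steps above can be sketched as follows. This is not the repository's code: the function names (`pearson_from_sums`, `five_sums_cpu`, `five_sums_gpu`, `pearson_reference`) are hypothetical, the kernels are written with PyCUDA's `ReductionKernel` on the assumption that the script does something similar, and double precision is an assumed choice. The CPU path uses NumPy only and can serve as a reference when checking GPU results.

```python
import math

import numpy as np


def pearson_from_sums(n, sx, sy, sxx, syy, sxy):
    """Combine the five sums into Pearson's r (standard single-pass formula)."""
    num = n * sxy - sx * sy
    den = math.sqrt(n * sxx - sx * sx) * math.sqrt(n * syy - sy * sy)
    return num / den


def five_sums_cpu(x, y):
    """NumPy reference: sum(x), sum(y), sum(x^2), sum(y^2), sum(x*y)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return x.sum(), y.sum(), np.dot(x, x), np.dot(y, y), np.dot(x, y)


def five_sums_gpu(x, y):
    """Same five sums on the GPU; requires PyCUDA and a CUDA-capable device."""
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.reduction import ReductionKernel

    # Each kernel maps every element with map_expr, then adds the results.
    sum_k = ReductionKernel(np.float64, neutral="0", reduce_expr="a+b",
                            map_expr="x[i]", arguments="double *x")
    sum_sq_k = ReductionKernel(np.float64, neutral="0", reduce_expr="a+b",
                               map_expr="x[i]*x[i]", arguments="double *x")
    prod_k = ReductionKernel(np.float64, neutral="0", reduce_expr="a+b",
                             map_expr="x[i]*y[i]",
                             arguments="double *x, double *y")

    # Host-to-device transfer; the inputs must fit in GPU memory.
    x_gpu = gpuarray.to_gpu(np.ascontiguousarray(x, dtype=np.float64))
    y_gpu = gpuarray.to_gpu(np.ascontiguousarray(y, dtype=np.float64))

    # .get() copies each scalar result back to the host.
    return (float(sum_k(x_gpu).get()), float(sum_k(y_gpu).get()),
            float(sum_sq_k(x_gpu).get()), float(sum_sq_k(y_gpu).get()),
            float(prod_k(x_gpu, y_gpu).get()))


def pearson_reference(x, y, sums=five_sums_cpu):
    """Pearson's r from whichever five-sum backend is passed in."""
    return pearson_from_sums(len(x), *sums(x, y))


# Small CPU-only demonstration with correlated data.
rng = np.random.default_rng(0)
x = rng.random(1000)
y = 0.5 * x + rng.random(1000)
r = pearson_reference(x, y)
print("Pearson r (CPU reference):", r)
```

On a machine with a CUDA device, `pearson_reference(x, y, sums=five_sums_gpu)` would exercise the GPU path, and the two results should agree up to floating-point rounding.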

# Example

```python
import numpy as np

# pearson_correlation is the function defined in this script.

N = 50000000  # Number of data points
np.random.seed(0)  # Seed for reproducibility

# Independent random samples standing in for real measurements
tobacco_use = np.random.rand(N)
cancer_cases = np.random.rand(N)
infertility_cases = np.random.rand(N)

# Compute Pearson correlation
correlation_tobacco_cancer = pearson_correlation(tobacco_use, cancer_cases)
correlation_tobacco_infertility = pearson_correlation(tobacco_use, infertility_cases)
```

# Limitations

- The script requires an NVIDIA GPU with CUDA support.
- The size of the data is limited by the GPU's memory capacity.
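One possible way around the memory limit is to process the arrays in chunks and accumulate the partial sums, since each sum over the full arrays is just the total of the sums over the chunks. The sketch below is an illustration of that idea, not something the script implements: `pearson_chunked` and `chunk_size` are hypothetical names, and the per-chunk sums are computed with NumPy where a GPU version would transfer and reduce one chunk at a time.

```python
import math

import numpy as np


def pearson_chunked(x, y, chunk_size=1_000_000):
    """Pearson's r accumulated chunk by chunk (hypothetical helper).

    Only one chunk of each array needs to be resident at a time, so the
    chunk size can be chosen to fit the available (GPU) memory.
    """
    n = len(x)
    sx = sy = sxx = syy = sxy = 0.0
    for start in range(0, n, chunk_size):
        xc = np.asarray(x[start:start + chunk_size], dtype=np.float64)
        yc = np.asarray(y[start:start + chunk_size], dtype=np.float64)
        # A GPU version would copy xc and yc to the device and reduce them here.
        sx += float(xc.sum())
        sy += float(yc.sum())
        sxx += float(np.dot(xc, xc))
        syy += float(np.dot(yc, yc))
        sxy += float(np.dot(xc, yc))
    num = n * sxy - sx * sy
    den = math.sqrt(n * sxx - sx * sx) * math.sqrt(n * syy - sy * sy)
    return num / den


rng = np.random.default_rng(0)
x = rng.random(10_000)
y = x + rng.random(10_000)
r_chunked = pearson_chunked(x, y, chunk_size=1024)
print("Pearson r (chunked):", r_chunked)
```

The trade-off is extra host-to-device transfers, one pair per chunk, in exchange for a bounded memory footprint.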

# License

This script is released under the GPL-3.0 license; see the LICENSE file in the repository.

