https://github.com/bd2720/accesspatterns

Comparing chunked vs. striped memory access patterns for CPU and GPU code using the CUDA toolkit in C.
c cache cuda cuda-toolkit performance-analysis performance-testing profiling

Host: GitHub
URL: https://github.com/bd2720/accesspatterns
Owner: bd2720
Created: 2024-05-12T21:11:07.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-05-12T21:12:26.000Z (about 1 year ago)
Last Synced: 2025-01-31T12:34:48.039Z (5 months ago)
Topics: c, cache, cuda, cuda-toolkit, performance-analysis, performance-testing, profiling
Language: Cuda
Homepage:
Size: 1.95 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.txt

README

access-cpu: Uses pthreads to demonstrate that chunked memory access is faster
than striped access on the CPU. This is because each thread is scheduled on
the CPU for a period of time, one thread after another. A thread therefore
gets the most out of the cache when it reads memory sequentially (chunked):
every element of a cache line it loads is used by that same thread. Striped
access is slow because a given thread uses only a fraction (1 / NTHREADS) of
each cache line it touches.

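access-gpu: Uses CUDA to demonstrate that striped memory access is faster
than chunked access on the GPU. This is because GPU threads execute together
on a per-block basis (in groups of 32 called warps) and share the same cache.
With an interleaved (striped) pattern, neighboring threads read neighboring
addresses at the same moment, so threads that execute together can be served
from the same cache line. With a chunked pattern, each of those threads reads
from a different region of memory at every step.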

General Findings:

pthread speedup, 1 -> 10 threads (bad access, striped):      1.15x
pthread speedup, 1 -> 10 threads (good access, chunked):     4-5x
cuda speedup, <<<1,1>>> -> <<<8,64>>> (bad access, chunked):  9x
cuda speedup, <<<1,1>>> -> <<<8,64>>> (good access, striped): 256x
