{"id":22752783,"url":"https://github.com/grindelfp/cuda-texture-memory","last_synced_at":"2025-03-30T07:17:06.180Z","repository":{"id":267493170,"uuid":"901194705","full_name":"GrindelfP/cuda-texture-memory","owner":"GrindelfP","description":"Exercise on using texture memory in CUDA.","archived":false,"fork":false,"pushed_at":"2024-12-16T17:27:35.000Z","size":13,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-30T07:17:02.376Z","etag":null,"topics":["cuda","texture-memory"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GrindelfP.png","metadata":{"files":{"readme":"README.adoc","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-10T08:03:28.000Z","updated_at":"2024-12-16T17:27:39.000Z","dependencies_parsed_at":"2024-12-10T17:45:59.503Z","dependency_job_id":"165eb5b6-06fc-4cd2-8c74-d72f194ddc68","html_url":"https://github.com/GrindelfP/cuda-texture-memory","commit_stats":null,"previous_names":["grindelfp/cuda-texture-memory"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GrindelfP%2Fcuda-texture-memory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GrindelfP%2Fcuda-texture-memory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GrindelfP%2Fcuda-texture-memory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GrindelfP%2Fcuda-texture-memory/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GrindelfP","download_url":"https://codeload.github.com/GrindelfP/cuda-texture-memory/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246285817,"owners_count":20752958,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","texture-memory"],"created_at":"2024-12-11T06:07:19.523Z","updated_at":"2025-03-30T07:17:06.164Z","avatar_url":"https://github.com/GrindelfP.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"= CUDA Texture Memory Convolution\n\nThis project demonstrates the implementation of a 2D convolution using CUDA on the GPU. The code compares the performance of two memory types—global memory and texture memory—by applying a Gaussian-like kernel to an input signal. 
The kernel is applied to signals of varying sizes, and the performance is measured for each memory type.\n\n== Features\n\n- Convolution implemented using two different memory access methods:\n  * Global Memory: Direct access to data in GPU memory.\n  * Texture Memory: Uses CUDA's Texture Object API for optimized data access.\n- Benchmarking of convolution performance on both CPU and GPU.\n- Comparison of execution times and speedup achieved by using GPU versus CPU.\n\n== Requirements\n\n- NVIDIA GPU with CUDA support.\n- CUDA Toolkit installed.\n- A C++ compiler supporting C++11 or later.\n- The program includes the following headers (`cuda_runtime.h` comes from the CUDA Toolkit, the rest from the C++ standard library):\n  * `\u003ccuda_runtime.h\u003e`\n  * `\u003ciostream\u003e`\n  * `\u003ccmath\u003e`\n  * `\u003ccstdio\u003e`\n  * `\u003cchrono\u003e`\n\n== Code Explanation\n\nThe project defines two CUDA kernels for performing the convolution:\n\n- **`Conv_Glb` (Global memory kernel)**: This kernel reads the input signal directly from the global memory of the GPU. It iterates over a 2D neighborhood of each pixel and applies a Gaussian-like kernel.\n\n- **`Conv_Tex` (Texture memory kernel)**: This kernel uses the CUDA Texture Object API to access texture memory. Texture fetches are served through a read-only cache optimized for 2D spatial locality, which can outperform plain global memory reads when neighboring threads read neighboring elements, as in a convolution.\n\n=== Key Variables and Parameters\n\n- **`BLOCK_SIZE`**: The size of each block in the GPU grid. 
A value of `16` is used for this example.\n- **`PI`**: The value of Pi, used in the kernel calculations.\n- **`delta`**: Defines the size of the convolution window (half the size of the block).\n- **`W`, `H`**: Width and height of the input signal.\n- **`dS`**: The input signal data on the GPU.\n- **`dConv`**: The convolution result stored in GPU memory.\n- **`hS`, `hConv`**: Host memory for input signal and convolution result.\n- **`hdConv`, `hdConvText`**: Device memory for storing the results of convolution using global and texture memory.\n\n=== Performance Comparison\n\nThe program benchmarks the time taken to perform the convolution on:\n\n1. **CPU**: A single-threaded CPU implementation.\n2. **GPU (Global memory)**: Using global memory for data access.\n3. **GPU (Texture memory)**: Using texture memory for optimized data access.\n\nExecution times are measured for each method, and the speedup (GPU vs CPU) is displayed in the console.\n\n=== Output\n\nUpon execution, the program prints the following performance metrics for each signal size:\n\n- CPU execution time (in milliseconds).\n- GPU execution time using global memory (in milliseconds).\n- GPU execution time using texture memory (in milliseconds).\n- Speedup factor (GPU vs CPU) for both global and texture memory methods.\n\nExample output:\n----\nCPU time: 124.5 ms\nGPU (Global memory) time: 35.2 ms\nGPU (Texture memory) time: 28.7 ms\nSpeedup (Global): 3.54x\nSpeedup (Texture): 4.33x\n----\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrindelfp%2Fcuda-texture-memory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgrindelfp%2Fcuda-texture-memory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrindelfp%2Fcuda-texture-memory/lists"}