{"id":20095976,"url":"https://github.com/avicted/gpu-time-measurement","last_synced_at":"2025-07-04T06:07:24.885Z","repository":{"id":133527556,"uuid":"522054986","full_name":"Avicted/gpu-time-measurement","owner":"Avicted","description":"Cuda Template for AMD GPUs on Linux","archived":false,"fork":false,"pushed_at":"2023-07-04T11:27:33.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-13T03:42:09.602Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Avicted.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-08-06T21:12:59.000Z","updated_at":"2023-07-04T11:27:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"f81d1850-f556-4dff-a641-8408105c17cf","html_url":"https://github.com/Avicted/gpu-time-measurement","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Avicted%2Fgpu-time-measurement","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Avicted%2Fgpu-time-measurement/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Avicted%2Fgpu-time-measurement/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Avicted%2Fgpu-time-measurement/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Avicted","download_url":"https://cod
eload.github.com/Avicted/gpu-time-measurement/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241535214,"owners_count":19978106,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T16:57:12.609Z","updated_at":"2025-03-02T16:24:07.269Z","avatar_url":"https://github.com/Avicted.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cuda Template for AMD GPUs on Linux\n\nThe template launches an empty kernel and measures how long the kernel takes\nto start up, execute, and finish.\n\n```bash\n# Install the required packages\nsudo pacman -S opencl-amd opencl-amd-dev\n\n# Add the ROCm compiler, scripts, and executables to your PATH,\n# typically in ~/.bashrc or ~/.zshrc.\n# Afterwards, source the config file (source ~/.zshrc) or start a new terminal session.\nexport PATH=\"/opt/rocm-5.5.0/bin:$PATH\" # \u003c--- change the version to the one installed\n```\n\n## Run\n\n```bash\nmake\n```\n\n## Example output\n\n```bash\nCleaning\nrm -r build 2\u003e /dev/null || true\nrm -r code/*.cu 2\u003e /dev/null || true\n\nCreating directories\nmkdir -p build\n\nHipifying the Cuda C++ code to HIP C++ code\nhipify-perl ./code/main.cpp -o ./code/main.cpp.hip.cu\n\nBuilding the program\nhipcc -O3  ./code/main.cpp.hip.cu -o ./build/gpu_signal_processing.out\n\nRunning the executable\n./build/gpu_signal_processing.out\n        Starting the program\n   Found 1 CUDA devices\n         Device AMD Radeon RX 6900 XT                    = device 0\n         compute capability  
         =         10.3\n         totalGlobalMemory            =        17.16 GB\n         l2CacheSize                  =     4194304 B\n         regsPerBlock                 =       65536\n         multiProcessorCount          =          40\n         maxThreadsPerMultiprocessor  =        2048\n         sharedMemPerBlock            =       65536 B\n         warpSize                     =          32\n         clockRate                    =     2660.00 MHz\n         maxThreadsPerBlock           =        1024\n         maxGridSize                  =    2147483647 x 2147483647 x 2147483647\n         maxThreadsDim                =    1024 x 1024 x 1024\n   Using CUDA device 0\n\n====================================================================\nblocksInGrid:\t{1, 1, 1} blocks.\nthreadsInBlock:\t1024 threads.\nnumber of threads: 1024\n        The program took 164043 microseconds\n        The program took 164 milliseconds\n        The program took 0.164043 seconds\n        To execute the GPU kernel\n\nThe program has been built and runned successfully!\n\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favicted%2Fgpu-time-measurement","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Favicted%2Fgpu-time-measurement","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favicted%2Fgpu-time-measurement/lists"}