{"id":18828573,"url":"https://github.com/shreya888/learning-cuda-with-cpp-and-pytorch","last_synced_at":"2026-05-07T13:40:03.206Z","repository":{"id":260490285,"uuid":"878841534","full_name":"shreya888/Learning-CUDA-with-Cpp-and-PyTorch","owner":"shreya888","description":"My notes, code, \u0026 insights will be recorded here while learning CUDA with C++ and PyTorch","archived":false,"fork":false,"pushed_at":"2024-11-10T22:52:12.000Z","size":372,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-20T02:21:32.749Z","etag":null,"topics":["cpp","cuda","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shreya888.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-26T08:56:15.000Z","updated_at":"2024-11-10T22:52:15.000Z","dependencies_parsed_at":"2024-10-31T16:44:00.326Z","dependency_job_id":null,"html_url":"https://github.com/shreya888/Learning-CUDA-with-Cpp-and-PyTorch","commit_stats":null,"previous_names":["shreya888/learning-cuda-with-cpp-and-pytorch"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shreya888%2FLearning-CUDA-with-Cpp-and-PyTorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shreya888%2FLearning-CUDA-with-Cpp-and-PyTorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shreya888%2FLearning-CUDA-with-Cpp-and-PyTorch/releases","manifests_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shreya888%2FLearning-CUDA-with-Cpp-and-PyTorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shreya888","download_url":"https://codeload.github.com/shreya888/Learning-CUDA-with-Cpp-and-PyTorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239763622,"owners_count":19692812,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","cuda","pytorch"],"created_at":"2024-11-08T01:31:40.767Z","updated_at":"2026-05-07T13:40:03.133Z","avatar_url":"https://github.com/shreya888.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Learning CUDA with C++ \u0026 PyTorch: Notes and Code\nThis repository documents key concepts, exercises, and insights gained from my exploration of CUDA (Compute Unified Device Architecture) programming, with examples in both C++ and PyTorch. This repo focuses on understanding and implementing CUDA to enhance code performance in C++ and PyTorch. 
The repository contains commented code examples, conceptual explanations, and a curated list of resources.\n\n\n## Repository Structure\n```\n/Learning-CUDA-with-Cpp-and-PyTorch\n├── README.md                          # Main README with project overview, structure, learning points, and detailed list of resources\n├── Cpp                                # Code and explanations focused on CUDA in C++\n│   ├── README.md                      # Overview of C++ section and how to navigate it\n│   ├── CUDA Exercises                 # Folder for CUDA exercises from OLCF CUDA Training Series\n│   │   ├── hw1                        # Homework 1 folder\n│   │   │   ├── \u003cfile_name\u003e.cu         # CUDA implementations for various exercises in hw1\n│   │   │   └── README.md              # Notes for hw1 implementations and new code/CUDA insights learned\n│   │   └── ...                        # Additional homework folders\n└── PyTorch                            # Code and explanations focused on CUDA in PyTorch\n    ├── README.md                      # Overview of PyTorch section and how to navigate it\n    ├── GPU Mode                       # Folder for PyTorch code examples with output logs for GPU Mode lectures\n    │   ├── \u003cfile_name\u003e.py             # Various example scripts with comments\n    │   ├── output_logs                # Folder for output log files\n    │   └── README.md                  # Notes on CUDA using PyTorch and code implementations\n\n```\n\n\n## Terms to Remember:\n1. GPU Kernel (functions)\n2. Thread, Block, Grid\n3. Host (CPU), Device (GPU)\n4. Streaming Processors (SPs or cores), Streaming Multiprocessor (SM or multiprocessor)\n5. Warp (smallest unit of execution on the device, generally 32 threads)\n6. CPU: SIMD (Single Instruction, Multiple Data)\n7. GPU: SIMT (Single Instruction, Multiple Threads)\n8. Physical (SMs and SPs) vs. logical (blocks, threads) organization\n9. 
Prefetching, Data Transfer, Global memory, Local memory, Registers, Constant Memory\n10. Synchronization and Asynchronous Execution, Multithreading, Latency Hiding, Thread divergence\n11. Unified memory, Zero Copy, Page faults; cudaMemAdvise\u003c(Un)SetReadMostly | (Un)SetPreferredLocation | (Un)SetAccessedBy\u003e, cudaMemPrefetchAsync\n12. Pageable and Pinned Memory; cudaMallocHost\n\n## My Current Learning List:\n1. https://github.com/gpu-mode/lectures\n2. https://www.youtube.com/@GPUMODE\n3. https://www.youtube.com/@pmpp-book\n4. https://github.com/olcf/cuda-training-series\n5. https://github.com/CisMine/Parallel-Computing-Cuda-C - studied till chapter 11 today (MUST READ - very simplified topics with great analogies and examples, easy to follow)\n6. https://forums.developer.nvidia.com\n7. https://discuss.pytorch.org\n8. https://pytorch.org/docs\n9. https://wandb.ai/wandb/trace/reports/Using-the-PyTorch-Profiler-with-W-B--Vmlldzo5MDE3NjU\n10. https://gist.github.com/mingfeima/e08310d7e7bb9ae2a693adecf2d8a916\n11. https://users.wfu.edu/choss/CUDA/lectures.html - studied till lecture 6 (MUST READ - well explained code and concepts)\n12. https://developer.nvidia.com/blog/easy-introduction-cuda-c-and-c\n13. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide\n14. https://docs.nvidia.com/cuda/cuda-c-programming-guide\n15. https://www.amazon.com/Programming-Massively-Parallel-Processors-Hands/dp/0323912311\n16. https://edoras.sdsu.edu/~mthomas/docs/cuda/cuda_by_example.book.pdf\n17. https://github.com/srush/GPU-Puzzles\n18. 
https://github.com/gpu-mode/resource-stream?tab=readme-ov-file - comprehensive list of CUDA/GPU resources\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshreya888%2Flearning-cuda-with-cpp-and-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshreya888%2Flearning-cuda-with-cpp-and-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshreya888%2Flearning-cuda-with-cpp-and-pytorch/lists"}