{"id":44044301,"url":"https://github.com/fzj-jsc/tutorial-multi-gpu","last_synced_at":"2026-02-07T21:35:18.479Z","repository":{"id":39797131,"uuid":"409504932","full_name":"FZJ-JSC/tutorial-multi-gpu","owner":"FZJ-JSC","description":"Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial","archived":false,"fork":false,"pushed_at":"2025-12-03T14:53:48.000Z","size":191365,"stargazers_count":335,"open_issues_count":0,"forks_count":68,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-12-06T19:23:31.923Z","etag":null,"topics":["cuda","exascale-computing","gpu","hpc","isc22","isc23","isc24","isc25","mpi","multi-gpu","nccl","nvshmem","sc21","sc22","sc23","sc24","sc25","supercomputing"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FZJ-JSC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":".zenodo.json","notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-09-23T08:12:06.000Z","updated_at":"2025-12-04T09:36:21.000Z","dependencies_parsed_at":"2023-10-03T10:28:51.280Z","dependency_job_id":"267ae2f7-20c6-4fea-bbca-91ab94fd03bb","html_url":"https://github.com/FZJ-JSC/tutorial-multi-gpu","commit_stats":{"total_commits":206,"total_committers":12,"mean_commits":"17.166666666666668","dds":0.5097087378640777,"last_synced_commit":"840c9ac87075672ea758f8cc1c636a0bec984c57"},"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/FZJ-JSC/tutorial-multi-gpu","reposi
tory_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FZJ-JSC%2Ftutorial-multi-gpu","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FZJ-JSC%2Ftutorial-multi-gpu/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FZJ-JSC%2Ftutorial-multi-gpu/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FZJ-JSC%2Ftutorial-multi-gpu/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FZJ-JSC","download_url":"https://codeload.github.com/FZJ-JSC/tutorial-multi-gpu/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FZJ-JSC%2Ftutorial-multi-gpu/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29208735,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-07T20:33:12.493Z","status":"ssl_error","status_checked_at":"2026-02-07T20:30:47.381Z","response_time":63,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","exascale-computing","gpu","hpc","isc22","isc23","isc24","isc25","mpi","multi-gpu","nccl","nvshmem","sc21","sc22","sc23","sc24","sc25","supercomputing"],"created_at":"2026-02-07T21:35:17.928Z","updated_at":"2026-02-07T21:35:18.474Z","avatar_url":"https://github.com/FZJ-JSC.png","language":"Cuda","readme":"# SC25 Tutorial: 
Efficient Distributed GPU Programming for Exascale\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5745504.svg)](https://doi.org/10.5281/zenodo.5745504)\n\n\nRepository with talks and exercises of our Efficient Distributed GPU Programming for Exascale tutorial, to be held at [SC25](https://sc25.conference-program.com/presentation/?id=tut113\u0026sess=sess252).\n\n## Coordinates\n\n* Date: 16 November 2025\n* Occasion: SC25 Tutorial\n* Tutors: Simon Garcia de Gonzalo (SNL), Andreas Herten (JSC), Lena Oden (Uni Hagen), David Appelhans (NVIDIA); with support by Markus Hrywniak (NVIDIA) and Jiri Kraus (NVIDIA)\n\n\n## Setup\n\nThe tutorial is interactive, with introductory lectures and practical exercises to apply the knowledge. The exercises have been derived from the Jacobi solver implementations available in [NVIDIA/multi-gpu-programming-models](https://github.com/NVIDIA/multi-gpu-programming-models).\n\nWalk-through (only possible on-site at SC25!):\n\n* Sign up at JuDoor\n* Open Jupyter JSC: https://jupyter.jsc.fz-juelich.de\n* Create new Jupyter instance on [JUWELS, using training26XX account, on **LoginNodeBooster**](https://jupyter.jsc.fz-juelich.de/workshops/sc25mg)\n* Source course environment: `source $PROJECT_training26XX/env.sh`\n* Sync material: `jsc-material-sync`\n* Locally install NVIDIA Nsight Systems: https://developer.nvidia.com/nsight-systems\n\n\n1. Lecture: Tutorial Overview, Introduction to System + Onboarding *Andreas*\n2. Lecture: MPI-Distributed Computing with GPUs *Simon*\n3. Hands-on: Multi-GPU Parallelization\n4. Lecture: Performance / Debugging Tools *David*\n5. Lecture: Optimization Techniques for Multi-GPU Applications *Simon*\n6. Hands-on: Overlap Communication and Computation with MPI\n7. Lecture: Overview of NCCL and NVSHMEM in MPI *Lena*\n8. Hands-on: Using NCCL and NVSHMEM\n9. Lecture: Device-initiated Communication with NVSHMEM *David*\n10. Hands-on: Using Device-Initiated Communication with NVSHMEM\n11. 
Lecture: Conclusion and Outline of Advanced Topics *Andreas*\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffzj-jsc%2Ftutorial-multi-gpu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffzj-jsc%2Ftutorial-multi-gpu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffzj-jsc%2Ftutorial-multi-gpu/lists"}