{"id":22291935,"url":"https://github.com/nrmancuso/big-bang","last_synced_at":"2025-06-13T02:03:20.429Z","repository":{"id":103245334,"uuid":"342316763","full_name":"nrmancuso/big-bang","owner":"nrmancuso","description":"CUDA and OpenMp NBody simulation based on data from the Milky Way and Andromeda Galaxies","archived":false,"fork":false,"pushed_at":"2023-11-21T04:00:59.000Z","size":42731,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-25T21:47:49.598Z","etag":null,"topics":["c","cuda-kernels","cuda-programming","nbody-simulation","openmp-parallelization","parallel-computing","space"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nrmancuso.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-25T17:00:55.000Z","updated_at":"2023-11-21T03:49:38.000Z","dependencies_parsed_at":"2023-11-21T04:41:45.254Z","dependency_job_id":null,"html_url":"https://github.com/nrmancuso/big-bang","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nrmancuso/big-bang","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nrmancuso%2Fbig-bang","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nrmancuso%2Fbig-bang/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nrmancuso%2Fbig-bang/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nrman
cuso%2Fbig-bang/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nrmancuso","download_url":"https://codeload.github.com/nrmancuso/big-bang/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nrmancuso%2Fbig-bang/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259565534,"owners_count":22877345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","cuda-kernels","cuda-programming","nbody-simulation","openmp-parallelization","parallel-computing","space"],"created_at":"2024-12-03T17:19:02.477Z","updated_at":"2025-06-13T02:03:20.279Z","avatar_url":"https://github.com/nrmancuso.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CUDA/OpenMP Hybrid Big Bang N-Body Simulation\n\n![bigbang](https://github.com/nrmancuso/big-bang/blob/main/gif/big-bang.gif)\n\n### ABOUT\n\nThis program is adapted from Professor Stephen Siegel's sequential 2-D \nnbody simulation program, by Jiaman Wang and Nick Mancuso for our CISC372\nfinal project. We have used CUDA and OpenMP to parallelize the original \nprogram and improve performance, so that we can create animations of\nmuch larger galaxies.\n\nThe translation unit(s) that we have created are much larger than the original\nones, with the largest consisting of 72,000 bodies. Our animations were inspired\nby the Big Bang, and the bodies that we have used simulate the birth of the \nMilky Way and Andromeda galaxies. 
As the animation progresses, you can see \nthe two galaxies forming.\n\nYou can find our animation here: https://vimeo.com/491779612\n\nMuch of our data was obtained from the research \nfound at https://arxiv.org/abs/astro-ph/950901 by John Dubinsky, and we liberally\nmanipulated the data for the best appearance in 2 dimensions. \n\nWe have used the discretized universal gravitation equation provided in the \nsequential version of the program, with a few small modifications in the case\nof the GPU-friendly function.  Our algorithms/logic stay fairly close to the\noriginal sequential version, other than the parallelized sections. Each OpenMP \nthread has one of two responsibilities, depending on how the user chooses to\ndistribute the bodies (via `GPU_BODY_PROP`). The thread either: \n\n(1) is the \"manager\" of one GPU device, or \n(2) is responsible for a block-distributed share of the bodies.\n\nIf an OpenMP thread is a \"manager\" of a GPU, this means that this thread will\ncall the CUDA kernel and handle most of the data copying responsibilities. \nIf an OpenMP thread is responsible for updating the states of bodies, then \nit functions similarly to the sequential version for the bodies that the thread\nowns. \n\nWhen updating the states of bodies, each thread in the GPU is responsible \nfor calculating the interactions between \"its\" body and all other \nbodies; so essentially, one GPU thread is in charge of one body.\n\n### PERFORMANCE\n\nWe have run numerous tests to determine that having the GPU handle ALL bodies\nconsistently provides the best performance. Please take a look at \n`nbody/graphs/big_bang_p100_diff-body-prop.pdf` to see how running time changes as the\nproportion of bodies updated by the GPU increases.  In this case, the OpenMP threads act\nas managers for each GPU, handling memory copying and kernel calls.\n\nUsing one of the original translation units, `galaxy.c`, we have run both strong \nand weak scaling experiments.  
The results of these runs can be found in the \nfollowing graphs:\n\n`nbody/graphs/Galaxy_Strong_50-50-Efficiency.pdf`\n`nbody/graphs/Galaxy_Strong_50-50-Speedup.pdf`\n`nbody/graphs/Galaxy_Strong_50-50-Time.pdf`\n\n`nbody/graphs/galaxy-weak-scaling-no-gpu-Efficiency.pdf`\n`nbody/graphs/galaxy-weak-scaling-no-gpu-Speedup.pdf`\n`nbody/graphs/galaxy-weak-scaling-no-gpu-Time.pdf`\n\nAll of the strong scaling runs used OpenMP threads (1 -\u003e 16) and the body updates\nwere split 50/50 between the GPU and CPU, with 1001 bodies.\n\nThe weak scaling experiments increased the number of bodies by roughly 62 each time,\nand all other variables were consistent with the strong scaling version.\n\nIn order to test which type of GPU was the most effective for this animation, we\ntested 72,000 bodies on both types of GPU nodes, and threw in an MPI-only run for \ngood measure. Please see the `nbody/graphs/big_bang_p100vsk80vsMPI-time.pdf` file \nfor more information.\n\nFinally, we pitted two runs with different proportions of bodies being updated\nby the CPU and GPU against each other in a weak scaling experiment.  The results\ncan be found in `nbody/graphs/big_bang_weak-scaling-50-90-1k80.pdf`.\n\n### GETTING STARTED\n\nThe source code for our program is located in the `nbody/` directory.\nThere you will find all the translation units and the main driver program.\nAdditionally, there is a \"configuration\" file, `config.c`, where users can specify\nadditional or different colors for the animations.\n\nTo test our version of nbody, `nbody_omp.cu`, against the original sequential\nversion of nbody:\n\n```````````````````````````\n$ make test\n\n```````````````````````````\n\nThis will use the original translation units to produce both parallel and \nsequential versions of each animation, then `diff` the results. 
Note that\ndepending on which machine you are testing on and how many GPUs you want\nto use, you will want to comment/uncomment the relevant makefile lines accordingly:\n\n```````````````````````````\n\n\t# below is for single gpu on grendel\n\tOMP_NUM_THREADS=$(NPROCS) $(CUDARUN) ./$\u003c $@ $(GPU_BODY_PROP)\n\t# below is for two gpus on grendel\n\t#OMP_NUM_THREADS=$(NPROCS) srun --unbuffered -n 1 --gres=gpu:2 ./$\u003c $@ $(GPU_BODY_PROP)\n\t# below is for bridges interactive mode\n\t#./$\u003c $@ $(GPU_BODY_PROP)\n\n`````````````````````````````\n\n`GPU_BODY_PROP` is the proportion of bodies that the GPU will update.  This\nproportion can be adjusted at the top of the makefile.\n\nIn order to compile and link all of our translation units, run:\n\n````````````````````````````\nmake big_bangs\n\n````````````````````````````\n\nWhen running compiled/linked translation units manually, you must specify \nthe proportion of bodies that the GPU will update via a command-line argument:\n\n`````````````````````````````\n./big_bang8.exec big_bang8.anim 0.50\n\n`````````````````````````````\nThis would run `big_bang8.exec`, create an animation called `big_bang8.anim`, and\nthe GPU would update half of the bodies. To see more usage examples of our \nprogram, you can check out the bridges scripts, found in `nbody/graphs/bridges`.  The\nrange of `GPU_BODY_PROP` is from 0.0 to 1.0, inclusive. 
\n\nIn general, the syntax for the executables generated from the translation units\nis:\n\n`````````````````````````````\n\u003cmachine-specific env variables\u003e ./\u003ctranslation unit name\u003e.exec \u003coutput file name\u003e.anim \u003cgpu bodies proportion\u003e\n\n`````````````````````````````\n\nNOTE: This program MUST be linked to translation units; it cannot be run\non its own!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnrmancuso%2Fbig-bang","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnrmancuso%2Fbig-bang","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnrmancuso%2Fbig-bang/lists"}