{"id":18686217,"url":"https://github.com/mateuszk098/parallel-programming-examples","last_synced_at":"2026-04-27T16:31:02.672Z","repository":{"id":143199766,"uuid":"556918644","full_name":"mateuszk098/parallel-programming-examples","owner":"mateuszk098","description":"Simple parallel programming examples with CUDA, MPI and OpenMP.","archived":false,"fork":false,"pushed_at":"2023-02-10T19:12:48.000Z","size":41,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-18T19:07:24.434Z","etag":null,"topics":["cpp","cuda","mpi","openmp","parallel-programming"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mateuszk098.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-24T19:05:41.000Z","updated_at":"2023-12-10T15:07:29.000Z","dependencies_parsed_at":"2023-06-03T18:45:34.823Z","dependency_job_id":null,"html_url":"https://github.com/mateuszk098/parallel-programming-examples","commit_stats":null,"previous_names":["mateuszk098/parallel-programming-examples"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mateuszk098/parallel-programming-examples","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mateuszk098%2Fparallel-programming-examples","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mateuszk098%2Fparallel-programming-examples/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mateuszk098
%2Fparallel-programming-examples/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mateuszk098%2Fparallel-programming-examples/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mateuszk098","download_url":"https://codeload.github.com/mateuszk098/parallel-programming-examples/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mateuszk098%2Fparallel-programming-examples/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32345802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T23:26:28.701Z","status":"online","status_checked_at":"2026-04-27T02:00:06.769Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","cuda","mpi","openmp","parallel-programming"],"created_at":"2024-11-07T10:26:38.839Z","updated_at":"2026-04-27T16:31:02.656Z","avatar_url":"https://github.com/mateuszk098.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# **Parallel Programming Examples**\n\nUsage of parallel programming tools such as `CUDA`, `MPI` and `OpenMP` with minimalist code examples, e.g. Game of Life, Matrix Multiplication and $\\pi$ calculation.\n\n_**NOTE: The last time I created and tested the following guide was in 2021. 
As of now, some things may have changed.**_\n\n---\n\n## **Table of Contents**\n\n* [**OpenMP**](#openmp)\n* [**CUDA**](#cuda)\n* [**MPI**](#mpi)\n\n---\n\n## **1. OpenMP** \u003ca id=\"openmp\"\u003e\u003c/a\u003e\n\n`OpenMP` is a multi-platform programming interface that enables shared-memory parallel programming. `OpenMP` can be used from C, C++ and Fortran, on different platforms such as Windows and Unix. It consists of compiler directives, library routines and environment variables that control how code is executed in parallel.\n\nThe `OpenMP` interface is a component of the GNU Compiler Collection (`GCC`), a set of open-source compilers developed by the GNU Project. The `GCC` compiler is therefore highly recommended for use with `OpenMP`, although it is not required (the Intel compiler, for example, also supports `OpenMP`).\n\n**INSTALLATION AND CONFIGURATION ON LINUX SYSTEMS:**\n\nStart the terminal and update the package repositories:\n\n```bash\n\u003e\u003e\u003e sudo apt-get update\n```\n\nThen install the `build-essential` package, which includes `gcc`, `g++` and `make`:\n\n```bash\n\u003e\u003e\u003e sudo apt-get install build-essential\n```\n\nWe can also install the manual pages for GNU/Linux development, but it is not necessary:\n\n```bash\n\u003e\u003e\u003e sudo apt-get install manpages-dev\n```\n\nTo check the `GCC` version, type:\n\n```bash\n\u003e\u003e\u003e gcc --version\n```\n\n**INSTALLATION AND CONFIGURATION ON WINDOWS 10:**\n\nOn Windows, we need `MinGW`, a port of `GCC` providing a free, open environment and tools that allow us to compile native executables for the Windows platform. To do this, we go to: [https://sourceforge.net/projects/mingw/](https://sourceforge.net/projects/mingw/) and download `MinGW` - Minimalist GNU for Windows. Once installed, we check the compiler in the command line:\n\n```bash\n\u003e\u003e\u003e gcc -v\n```\n\nMake sure that you've installed `GCC` with the POSIX thread model. 
If you get a message that the command is not recognised, add the `MinGW` binary directory - \"`../MinGW/bin`\" - to your system's `PATH` environment variable.\n\nThe use of `OpenMP` requires including the following header in the C/C++ code:\n\n```c++\n#include \u003comp.h\u003e\n```\n\nIt is also required to specify the appropriate flag during compilation (use `g++` rather than `gcc` for C++ sources):\n\n```bash\n\u003e\u003e\u003e g++ -fopenmp -pedantic -pipe -O3 -march=native main.cpp -o main\n```\n\nThe above flags mean:\n\n- `-fopenmp` enables the `OpenMP` directives,\n- `-pedantic` warns about code that does not strictly conform to the ISO standard,\n- `-pipe` uses pipes instead of temporary files between compilation stages, which can speed up the build,\n- `-O3` imposes a high degree of optimisation (be careful with this),\n- `-march=native` generates code tuned for the machine on which it is compiled.\n\nOnly the `-fopenmp` flag is required for `OpenMP` to work. The others are optional but worth using to optimise the code. We can read more about `GNU GCC` here: [https://gcc.gnu.org/](https://gcc.gnu.org/).\n\n---\n\n## **2. CUDA** \u003ca id=\"cuda\"\u003e\u003c/a\u003e\n\n`CUDA` is Nvidia's architecture for massively parallel processors (mainly graphics cards), allowing the GPU to solve general numerical problems much more efficiently than a traditional sequential general-purpose processor.\n\nWorking with Nvidia `CUDA` requires a dedicated graphics card from Nvidia that supports `CUDA` technology. If you have one, you can go to [https://developer.nvidia.com/cuda-downloads](https://developer.nvidia.com/cuda-downloads) to download the `CUDA` Toolkit (select the appropriate operating system, architecture, version, etc.). 
After choosing the suitable options, you will also be shown a simple guide on what to do next to install the `CUDA` Toolkit.\n\nNext, make sure that the Nvidia compiler works and check its version:\n\n```bash\n\u003e\u003e\u003e nvcc --version\n```\n\nAnd check that the driver sees your graphics card:\n\n```bash\n\u003e\u003e\u003e nvidia-smi\n```\n\nThe compilation of a program using the `CUDA` architecture is performed as follows:\n\n```bash\n\u003e\u003e\u003e nvcc main.cu -o main\n```\n\nYou can also provide information for the compiler about the compute capability of your graphics card:\n\n```bash\n\u003e\u003e\u003e nvcc -arch=sm_75 main.cu -o main\n```\n\nThe `-arch=sm_75` flag tells the compiler to generate code for a graphics card with compute capability $7.5$.\n\nThe `CUDA` Toolkit also provides the `nvprof` utility to view the operations performed on the graphics card and their execution times (on recent toolkits, `nvprof` has been superseded by the Nsight tools). To obtain such statistics, run it in the following way on Windows:\n\n```bash\n\u003e\u003e\u003e nvprof main.exe\n```\n\nWe can find a lot of useful information about `CUDA` in the official documentation:\n[https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html). I strongly encourage you to familiarise yourself with it.\n\n---\n\n## **3. MPI** \u003ca id=\"mpi\"\u003e\u003c/a\u003e\n\nThe Message Passing Interface (`MPI`) is a standard for passing messages between the processes of a parallel program running on one or more computers. `MPI` is currently the most widely used communication model in computer clusters and supercomputers.\n\nThere are several implementations of `MPI`, including `OpenMPI`, `MPICH` and `MSMPI`. On Linux, we can choose between `OpenMPI` and `MPICH`, while `MSMPI` is the Windows implementation. 
Before going any further, we should ensure that the `GCC` compiler is installed.\n\n**INSTALLATION AND CONFIGURATION OF `MPICH` ON LINUX SYSTEMS:**\n\nStart the terminal and update the package repositories:\n\n```bash\n\u003e\u003e\u003e sudo apt-get update\n```\n\nWe then install the `mpich` package:\n\n```bash\n\u003e\u003e\u003e sudo apt-get install mpich\n```\n\nWe can now check the version of the installed `MPI` compiler wrapper (since `mpic++` wraps `g++`, this actually reports the `GCC` version):\n\n```bash\n\u003e\u003e\u003e mpic++ --version\n```\n\nHere you can find out more about `MPICH`: [https://www.mpich.org/](https://www.mpich.org/).\n\n**The installation process under Windows is complex, and I do not recommend using `MPI` on the Windows platform..., but if you want to have fun, you have to choose `MSMPI`.**\n\nMore about `MSMPI`: [https://learn.microsoft.com/en-us/message-passing-interface/microsoft-mpi](https://learn.microsoft.com/en-us/message-passing-interface/microsoft-mpi).\n\nWe can use `MPI` to communicate between machines over the local network. In this way, you can run a program that is executed in parallel by the allocated processes on more than one machine. To carry out such a task, we need a minimum of two machines connected via the local network. The machines communicate via the `SSH` protocol and exchange data via the `NFS` protocol. Step-by-step instructions on how this can be implemented on Linux systems are available at: [https://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/](https://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmateuszk098%2Fparallel-programming-examples","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmateuszk098%2Fparallel-programming-examples","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmateuszk098%2Fparallel-programming-examples/lists"}