{"id":15641576,"url":"https://github.com/ysh329/openmp-101","last_synced_at":"2025-10-04T09:19:44.735Z","repository":{"id":48200765,"uuid":"165330523","full_name":"ysh329/OpenMP-101","owner":"ysh329","description":"Learn OpenMP examples step by step","archived":false,"fork":false,"pushed_at":"2025-01-18T00:24:23.000Z","size":8067,"stargazers_count":91,"open_issues_count":4,"forks_count":15,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-30T10:07:51.724Z","etag":null,"topics":["exercise","guide","guidebook","hpc","omp","omp-parallel","openmp","openmp-parallelization","tutorial"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ysh329.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-12T00:47:34.000Z","updated_at":"2025-03-13T12:12:59.000Z","dependencies_parsed_at":"2025-02-28T14:17:24.902Z","dependency_job_id":"cac732ef-673a-4c11-999b-f74a93219c5c","html_url":"https://github.com/ysh329/OpenMP-101","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysh329%2FOpenMP-101","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysh329%2FOpenMP-101/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysh329%2FOpenMP-101/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysh329%2FOpenMP-101/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/ysh329","download_url":"https://codeload.github.com/ysh329/OpenMP-101/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247471521,"owners_count":20944158,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["exercise","guide","guidebook","hpc","omp","omp-parallel","openmp","openmp-parallelization","tutorial"],"created_at":"2024-10-03T11:43:02.923Z","updated_at":"2025-10-04T09:19:39.692Z","avatar_url":"https://github.com/ysh329.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OpenMP-101\n\n\u003e ## Optimization Notice\n\u003e ![opt-img-01](./assets/opt-notice-en_080411.gif)\n\u003e ![opt-img-zh](./assets/Chinese.gif)\n\n## 0. Fast Guide: OMP in Caffe\n\n### 0.1 What is OMP\n\n- An easy, portable, and scalable way to parallelize applications for many cores, using a multi-threaded, shared-memory model (like pthreads)\n- A standard API\n- OMP pragmas are supported by the major C/C++ and Fortran compilers (gcc, icc, etc.).  
\n \nMany good tutorials are available online: \n- https://hpc.llnl.gov/tuts/openMP/\n- http://openmp.org/mp-documents/omp-hands-on-SC08.pdf \n\n### 0.2 OpenMP programming model \n\n![omp program model](./assets/omp1.png)\n\n### 0.3 Example\n\nnaive implementation\n\n```c\n#define N (100)\nint main(int argc, char *argv[])\n{\n    int idx;\n    float a[N], b[N], c[N];\n    \n    /* initialize the inputs */\n    for(idx=0; idx\u003cN; ++idx)\n    {\n        a[idx] = b[idx] = 1.0;\n    }\n    \n    /* element-wise vector add */\n    for(idx=0; idx\u003cN; ++idx)\n    {\n        c[idx] = a[idx] + b[idx];\n    }\n    return 0;\n}\n```\n\nomp implementation\n\n```c\n#include \u003comp.h\u003e\n#define N (100)\nint main(int argc, char *argv[])\n{\n    int idx;\n    float a[N], b[N], c[N];\n    /* the loop variable of a parallel for is private to each thread */\n    #pragma omp parallel for\n    for(idx=0; idx\u003cN; ++idx)\n    {\n        a[idx] = b[idx] = 1.0;\n    }\n    #pragma omp parallel for\n    for(idx=0; idx\u003cN; ++idx)\n    {\n        c[idx] = a[idx] + b[idx];\n    }\n    return 0;\n}\n```\n\nomp implementation that also prints the thread IDs\n\n```c\n#include \u003comp.h\u003e\n#include \u003cstdio.h\u003e\n#include \u003cstdlib.h\u003e\n#define N (100)\nint main(int argc, char *argv[])\n{\n    int nthreads, idx;\n    float a[N], b[N], c[N];\n    /* omp_get_num_threads() returns 1 outside a parallel region,\n       so query the maximum team size instead */\n    nthreads = omp_get_max_threads();\n    printf(\"Max number of threads = %d\\n\", nthreads);\n    #pragma omp parallel for\n    for(idx=0; idx\u003cN; ++idx)\n    {\n        a[idx] = b[idx] = 1.0;\n    }\n    #pragma omp parallel for\n    for(idx=0; idx\u003cN; ++idx)\n    {\n        /* declared inside the loop so each thread has its own copy;\n           a shared tid would be a data race */\n        int tid = omp_get_thread_num();\n        c[idx] = a[idx] + b[idx];\n        printf(\"Thread %d: c[%d]=%f\\n\", tid, idx, c[idx]);\n    }\n    return 0;\n}\n```\n\n### 0.4 Compiling, linking etc \n\nYou need to add the OpenMP flag (`-fopenmp` for gcc):\n\n```shell\n# compile using gcc\ngcc -fopenmp omp_vecadd.c -o vecadd\n\n# compile using icc (older icc releases used -openmp)\nicc -qopenmp omp_vecadd.c -o vecadd\n```\n\nControl the number of threads by setting the `OMP_NUM_THREADS` environment variable on the command line:\n\n```shell\nexport OMP_NUM_THREADS=8 \n```\n\n### 0.5 Exercise\n\n1. Implement\n  - vector dot-product: c=\u003cx,y\u003e\n  - matrix-matrix multiply\n  - 2D matrix convolution\n2. 
Add OpenMP support to the ReLU and max-pooling layers \n\n\u003e ## note\n\u003e Synchronization and critical sections:\n\u003e - accumulate into a thread-private variable and combine the results in a critical section; this avoids false sharing on a shared array\n\u003e - BUT don't put critical sections inside tight loops, because doing so serializes execution\n\n### 0.6 Tips to Improve Performance for Popular Deep Learning Frameworks on CPUs\n\n[improve_performance_for_deep_learning_frameworks_on_cpu](./improve_performance_for_deep_learning_frameworks_on_cpu.md)\n\n## Tutorial1: Introduction to OpenMP\n\nThe Introduction to OpenMP video tutorial by Intel’s Tim Mattson is available online.\n\n- [video](https://www.youtube.com/playlist?list=PLLX-Q6B8xqZ8n8bwjGdzBJ25X2utwnoEG)\n- [slide](https://www.openmp.org/wp-content/uploads/Intro_To_OpenMP_Mattson.pdf)\n- [exercise](https://www.openmp.org/wp-content/uploads/Mattson_OMP_exercises.zip)\n\nOutline:\n\n### Unit 1: Getting started with OpenMP\n\n- Module 1: Introduction to parallel programming\n- Module 2: The boring bits: Using an OpenMP compiler (hello world)\n- Discussion 1: Hello world and how threads work\n\n### Unit 2: The core features of OpenMP\n- Module 3: Creating Threads (the Pi program)\n- Discussion 2: The simple Pi program and why it sucks\n- Module 4: Synchronization (Pi program revisited)\n- Discussion 3: Synchronization overhead and eliminating false sharing\n- Module 5: Parallel Loops (making the Pi program simple)\n- Discussion 4: Pi program wrap-up\n\n### Unit 3: Working with OpenMP\n- Module 6: Synchronize single masters and stuff\n- Module 7: Data environment\n- Discussion 5: Debugging OpenMP programs\n- Module 8: Skills practice … linked lists and OpenMP\n- Discussion 6: Different ways to traverse linked lists\n\n### Unit 4: A few advanced OpenMP topics\n- Module 9: Tasks (linked lists the easy way)\n- Discussion 7: Understanding Tasks\n- Module 10: The scary stuff … Memory model, atomics, and flush (pairwise synch).\n- Discussion 8: The pitfalls of pairwise synchronization\n- Module 11: Threadprivate Data and 
how to support libraries (Pi again)\n- Discussion 9: Random number generators\n\n### Unit 5: Recapitulation\n\nThanks go to the University Program Office at Intel for making this tutorial available.\n\n## Tutorial2: OpenMP\n\nAuthor: Blaise Barney, Lawrence Livermore National Laboratory\n\n[OpenMP](https://computing.llnl.gov/tutorials/openMP/)\n\n## Tutorial3: OpenMP tutorial | Goulas Programming Soup  \nhttps://goulassoup.wordpress.com/2011/10/28/openmp-tutorial/\n\n## reference\n\n- [lnarmour/omp-tutorial](https://github.com/lnarmour/omp-tutorial)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fysh329%2Fopenmp-101","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fysh329%2Fopenmp-101","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fysh329%2Fopenmp-101/lists"}