{"id":19246361,"url":"https://github.com/dthuerck/aurora-runtime","last_synced_at":"2025-02-23T15:40:52.436Z","repository":{"id":116241912,"uuid":"219381353","full_name":"dthuerck/aurora-runtime","owner":"dthuerck","description":"A CUDA-like runtime API and system for NEC's SX-AURORA TSUBASA.","archived":false,"fork":false,"pushed_at":"2020-01-22T23:43:56.000Z","size":379,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-05T04:41:35.413Z","etag":null,"topics":["aurora-runtime","kernels","ve-offload","ve-udma"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dthuerck.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-03T23:41:06.000Z","updated_at":"2023-08-02T20:09:59.000Z","dependencies_parsed_at":null,"dependency_job_id":"f84fce6f-abe4-4b14-a053-b749d44ee273","html_url":"https://github.com/dthuerck/aurora-runtime","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dthuerck%2Faurora-runtime","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dthuerck%2Faurora-runtime/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dthuerck%2Faurora-runtime/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dthuerck%2Faurora-runtime/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/dthuerck","download_url":"https://codeload.github.com/dthuerck/aurora-runtime/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240339489,"owners_count":19785956,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aurora-runtime","kernels","ve-offload","ve-udma"],"created_at":"2024-11-09T17:31:47.113Z","updated_at":"2025-02-23T15:40:52.381Z","avatar_url":"https://github.com/dthuerck.png","language":"C","readme":"# aurora-runtime\n\nThis package is an attempt to reproduce **NVIDIA's CUDA Runtime API** [1], i.e.,\nto let the user write device _kernels_ and launch them in a quasi-_grid_\nstructure on NEC's SX-Aurora TSUBASA vector engine.\n\nTo that end, we wrap NEC's VE Offload [2] and UDMA [3] APIs such that\nthe usage mimics CUDA's runtime API.\n\n## Installation and Example\n\nInstallation is a breeze! The dependencies on the target\nsystem are:\n\n* python (\u003e= 3.5)\n* cmake (\u003e= 3.10)\n* reasonably new gcc/g++ (e.g. from scl devtoolset-8)\n* NEC Aurora SDK (ncc, libs) - under ``/opt/nec``\n* LLVM-VE (llvm/clang): https://sx-aurora.com/repos/veos/ef_extra under ``/opt/nec``\n\nTo install:\n\n1. Clone this repository:\n    ```\n    $ git clone https://github.com/dthuerck/aurora-runtime.git\n    ```\n2. Download and build dependencies:\n    ```\n    $ cd aurora-runtime\n    $ chmod +x init.sh\n    $ ./init.sh\n    ```\n\nThat's it! 
Now we can build an example application featuring GEMA (256x256\nbatched matrix addition) and GEMM (256x256 batched matrix multiplication):\n\n```\n$ mkdir build \u0026\u0026 cd build\n$ cmake ..\n$ make\n```\n\nFinally, run the example with ``./app-test`` and watch your Aurora hard at work!\n\n## Using the runtime\n\nThe runtime API functions are listed in ``.runtime/include/aurora_runtime.h``;\ntheir usage is demonstrated in the example (see ``app-test.cc``).\n\nThe runtime centers around the concept of a (_virtual_) **processing group**:\nwe write _kernels_, and each kernel is then executed in a batch\nof size ``n`` via offload and OpenMP. Roughly speaking (for people familiar with CUDA),\neach processing group is a block and the batch corresponds to a grid of\nsize ``n``.\n\nThe runtime offers the following variables, which are set inside kernel functions:\n\n* ``__pg__ix``: the index of the processing group (index in the batch)\n* ``__num_pgs``: the batch size / number of processing groups\n* ``__pe__ix`` / ``__pg_size``: reserved for future use\n\nLastly, the most important part: kernels are conventional C functions that carry\nthe annotation ``__ve_kernel__`` and are saved with a ``.cve`` extension.\n\nThe build process is fully automated and supported by CMake. For details,\nplease refer to ``CMakeLists.txt``.\n\n## Creating a new project\n\nIdeally, use this repository as scaffolding:\n\n1. Clone this repository and run ``init.sh``.\n2. Replace ``gema.cve`` and ``gemm.cve`` with your kernels.\n3. Replace ``app-test.cc`` with your application's source.\n4. Change ``CMakeLists.txt`` accordingly.\n\nThat's it!\n\n## Standing on the shoulders of giants...\n\nThis project uses the following packages:\n\n* VE Offload [2]\n* VE UDMA [3]\n* [NEC's LLVM](https://github.com/sx-aurora-dev/llvm-project.git)\n* [pycparser](https://github.com/eliben/pycparser)\n\n## References\n\n1. [NVIDIA CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html)\n2. 
[VE Offload](https://github.com/SX-Aurora/veoffload)\n3. [VE UDMA](https://github.com/SX-Aurora/veo-udma)","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdthuerck%2Faurora-runtime","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdthuerck%2Faurora-runtime","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdthuerck%2Faurora-runtime/lists"}