{"id":34935246,"url":"https://github.com/paintdream/iris","last_synced_at":"2026-01-08T15:18:11.950Z","repository":{"id":37751274,"uuid":"276397769","full_name":"paintdream/iris","owner":"paintdream","description":"Iris is an extensible asynchronous header-only framework written in pure modern C++, including a M:N task scheduler (with coroutine support for C++ 20 optionally) and an advanced DAG-based task dispatcher.","archived":false,"fork":false,"pushed_at":"2025-12-24T15:21:20.000Z","size":1689,"stargazers_count":17,"open_issues_count":1,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-26T06:28:28.312Z","etag":null,"topics":["coroutines","dag","lua","luabinding","multi-threading","thread-pool"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paintdream.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-07-01T14:20:52.000Z","updated_at":"2025-12-24T15:21:24.000Z","dependencies_parsed_at":"2023-12-17T14:27:01.301Z","dependency_job_id":"f6ed4727-f934-4c9d-829d-53d32df25727","html_url":"https://github.com/paintdream/iris","commit_stats":null,"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/paintdream/iris","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paintdream%2Firis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paintdream%2Firis/tag
s","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paintdream%2Firis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paintdream%2Firis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paintdream","download_url":"https://codeload.github.com/paintdream/iris/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paintdream%2Firis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28057669,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-26T02:00:06.189Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coroutines","dag","lua","luabinding","multi-threading","thread-pool"],"created_at":"2025-12-26T18:02:14.824Z","updated_at":"2025-12-26T18:02:23.035Z","avatar_url":"https://github.com/paintdream.png","language":"C","readme":"# Iris\nIris is an extensible asynchronous header-only framework written in pure modern C++, including a M:N task scheduler (with coroutine support for C++ 20 optionally) and an advanced DAG-based task dispatcher.\n\n## Build\n\nIris is header-only. 
All you need to do is include the corresponding header files.\n\nMost Iris classes work with C++ 11 compatible compilers, except for some optional features:\n\n* Lua binding support requires the C++ 17 if-constexpr feature. (Visual Studio 2017+, GCC 7+, Clang 3.9+)\n* Coroutine support for the thread pool scheduler requires the C++ 20 standard coroutine feature. (Visual Studio 2019+, GCC 11+, Clang 14+)\n\nAll examples can be built with the [CMake build system](https://cmake.org/); see CMakeLists.txt for more details.\n\n## License\n\nIris is distributed under the MIT License.\n\n## Concepts\n\nIris provides a simple M:N task scheduler called the **Warp System**, which is inspired by the [Boost](https://www.boost.org/) strand system. Let's start with some basic concepts.\n\n#### Task\n\nA task is the **logical** execution unit in application development. It is usually represented by a function pointer.\n\n#### Thread\n\nA thread is a **native** execution unit provided by the operating system. **Tasks must run in threads**. Different threads may run at the same time.\n\n**Multi-threading**, which aims to run several threads within a program, is an effective approach to making full use of the CPUs in a many-core system. It is usually hard to code and debug, so there are many data structures, programming patterns and frameworks that simplify the coding process for developers. This project is one of them.\n\n#### Thread Pool\n\nThreads are heavy, so it is not efficient to spawn a brand-new thread for every task. A thread pool is a multi-threading facility that makes this more efficient: it maintains a set of threads called \"worker threads\" that are **reused** for running tasks. 
When a new task is submitted, the thread pool schedules it to an idle worker thread if one is available, or queues it until a worker becomes idle.\n\n#### Warp\n\nSome tasks read/write the same objects or visit the same thread-unsafe interfaces, which means they must not run at the same time. See [Race Condition](https://en.wikipedia.org/wiki/Race_condition) for details. Here we call them **conflicting** tasks.\n\nTo make our programs run correctly, we must establish techniques that prevent unexpected conflicts. Here we introduce a new concept: the **Warp**.\n\nA warp is a logical container for a series of conflicting tasks. Tasks belonging to the **same warp** are **mutually excluded** automatically, so **no two of them** can run at the same time, which prevents race conditions. This property is called the **warp restriction**. To make coding easier, we can bind all tasks concerning a given object to a specific warp. In this case, we say the object is bound to a warp context.\n\nTasks in **different** warps, however, **can** run at the same time.\n\n#### Warp System\n\nThe Warp System is a bridge between **warps** and the **thread pool**: programmers commit tasks labeled with a warp to the system, and the system schedules them onto the thread pool. With some techniques applied internally, the result is a conflict-free task flow.\n\nThe thread count **M** of the Warp System is **fixed** when it starts, but the warp count **N** can be adjusted dynamically by programmers. So the Warp System is a flexible M:N task mapping system.\n\n## Quick Start\n\nLet's start with the simple programs in [iris_dispatcher_demo.cpp](iris_dispatcher_demo.cpp).\n\n#### Basic Example: simple explosion\n\nThe Warp System runs on a thread pool, and the first thing is to create one. 
There is a built-in thread pool based on C++ 11 std::thread in [iris_dispatcher.h](iris_dispatcher.h); you can replace it with your own platform-specific implementation.\n\n```C++\nstatic const size_t thread_count = 4;\niris_async_worker_t\u003c\u003e worker(thread_count);\n```\n\nThen we initialize the warps. There is no \"warp system class\": each warp is **individual**, so we just create a vector of them. We call them warp 0, warp 1, etc.\n\nUnlike boost strands, the tasks in a warp are **NOT** ordered by default, which means the **final execution order** may differ from the commit order. You can still enable ordering if you like (see the declaration of \"strand_t\" in the following code), though this is not recommended because ordering may be slightly less efficient than the default setting.\n\n```C++\nstatic const size_t warp_count = 8;\nusing warp_t = iris_warp_t\u003ciris_async_worker_t\u003c\u003e\u003e;\nusing strand_t = iris_warp_t\u003ciris_async_worker_t\u003c\u003e, true\u003e; // behaves like a strand\n\nstd::vector\u003cwarp_t\u003e warps;\nwarps.reserve(warp_count);\nfor (size_t i = 0; i \u003c warp_count; i++) {\n\twarps.emplace_back(worker); // calls iris_warp_t::iris_warp_t(iris_async_worker_t\u003c\u003e\u0026)\n}\n```\n\nThen we can schedule a task to the warp we want. Just call **queue_routine**.\n\n```C++\nwarps[0].queue_routine([]() { /* do operation A on warps[0] */ });\nwarps[0].queue_routine([]() { /* do operation B on warps[0] */ });\n```\n\nThat's all you need to do. According to the warp restriction, operation A and operation B are **never executed at the same time**, since they are in the **same** warp.\n\nOn the other hand, if we queue_routine tasks to different warps, like:\n\n```C++\nwarps[0].queue_routine([]() { /* do operation C */ });\nwarps[1].queue_routine([]() { /* do operation D */ });\n```\n\nthen according to the warp restriction, operation C and operation D **could be executed at the same time**.\n\nHere is an \"explosion\" example. 
In this example, we code a function called \"explosion\", which randomly forks multiple recursive writing operations on an integer array declared here:\n\n```C++\nstatic int32_t warp_data[warp_count] = { 0 };\n```\n\nThe restriction is that warp 0 may only write warp_data[0], warp 1 may only write warp_data[1], and so on:\n\n```C++\nstd::function\u003cvoid()\u003e explosion;\nstatic constexpr size_t split_count = 4;\nstatic constexpr size_t terminate_factor = 100;\n\nexplosion = [\u0026warps, \u0026explosion, \u0026worker]() {\n\tif (worker.is_terminated())\n\t\treturn;\n\n\twarp_t\u0026 current_warp = *warp_t::get_current_warp();\n\tsize_t warp_index = \u0026current_warp - \u0026warps[0];\n\twarp_data[warp_index]++;\n\n\t// simulate working\n\tstd::this_thread::sleep_for(std::chrono::milliseconds(rand() % 40));\n\twarp_data[warp_index]++;\n\n\tif (rand() % terminate_factor == 0) {\n\t\t// randomly terminates\n\t\tworker.terminate();\n\t}\n\n\twarp_data[warp_index]++;\n\t// randomly dispatch to warp\n\tfor (size_t i = 0; i \u003c split_count; i++) {\n\t\twarps[rand() % warp_count].queue_routine(std::function\u003cvoid()\u003e(explosion));\n\t}\n\n\twarp_data[warp_index] -= 3;\n};\n```\n\nThough there are no locks or atomics protecting warp_data, we can still assert that the final value of each warp_data element must be 0, because executions within the same warp never overlap on the timeline.\n\n#### Advanced Example: garbage collection\n\nThere is a function named garbage_collection, which simulates a multi-threaded mark-sweep [garbage collection](http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29) process.\n\nGarbage collection is a technique for finding unreferenced objects and deleting them. Mark-sweep is a basic approach to garbage collection. It consists of three steps:\n\n1. Scan all objects and mark them **unvisited**.\n2. 
Traverse from the root objects through reference relationships, marking every object that is directly or indirectly referenced as **visited**.\n3. Rescan all objects and delete those still marked **unvisited**. Thus all objects not linked to the root objects (i.e. garbage) are deleted.\n\nNow suppose we have the definition of a basic object node as follows:\n\n```C++\nstruct node_t {\n\tsize_t warp_index = 0;\n\tsize_t visit_count = 0; // we do not use std::atomic\u003c\u003e here.\n\tstd::vector\u003csize_t\u003e references;\n};\n\nstruct graph_t {\n\tstd::vector\u003cnode_t\u003e nodes;\n};\n```\n\nTo apply garbage collection, we record every reference from the current node in **references**, and traverse them from the root objects while collecting. We use **visit_count** to record whether the current node has been **visited**.\n\nIf you are experienced in multi-threaded programming, you may point out that **visit_count** should be of type std::atomic\u003csize_t\u003e, because several threads may modify it during the collection process.\n\nBut we have decided to do things differently.\n\nWe split the node visiting operations into multiple warps (recorded by the member **warp_index**). For example, nodes 1-10 are grouped into warp 0 and nodes 11-20 into warp 1, or the nodes are just assigned randomly. Any task operating on nodes in the same warp is protected by the warp system. As a result, the variable **visit_count** is guaranteed to **never** be operated on by multiple threads, and no atomics or locks are required.\n\nTo obey the warp restriction, all we need to do is queue a task to the related node's warp whenever we plan to do something with that node:\n\n```C++\nwarps[target_node.warp_index].queue_routine([]() {\n\t// operations on target_node\n});\n```\n\nSince we have just visited a new node, all its **references** should be added to the next collection step. 
To preserve the warp restriction, we schedule them into their own warps (see the line commented with \u003c------):\n\n```C++\ngraph_t graph;\nstd::function\u003cvoid(size_t)\u003e collector;\nstd::atomic\u003csize_t\u003e collecting_count;\ncollecting_count.store(0, std::memory_order_release);\n\ncollector = [\u0026warps, \u0026collector, \u0026worker, \u0026graph, \u0026collecting_count](size_t node_index) {\n\twarp_t\u0026 current_warp = *warp_t::get_current_warp();\n\tsize_t warp_index = \u0026current_warp - \u0026warps[0];\n\n\tnode_t\u0026 node = graph.nodes[node_index];\n\tassert(node.warp_index == warp_index);\n\n\tif (node.visit_count == 0) {\n\t\tnode.visit_count++; // SAFE: only one thread can visit it\n\n\t\tfor (size_t i = 0; i \u003c node.references.size(); i++) {\n\t\t\tsize_t next_node_index = node.references[i];\n\t\t\tsize_t next_node_warp = graph.nodes[next_node_index].warp_index;\n\t\t\tcollecting_count.fetch_add(1, std::memory_order_acquire);\n\t\t\twarps[next_node_warp].queue_routine(std::bind(collector, next_node_index)); // \u003c------\n\t\t}\n\t}\n\n\tif (collecting_count.fetch_sub(1, std::memory_order_release) == 1) {\n\t\t// all work finished.\n\t\t// ...\n\t}\n};\n```\n\nThat's all; there are **no** explicit locks or atomics on the graph itself. All the dangerous multi-threaded work is done by the Warp System. See the full source code of garbage_collection for more details.\n\n#### Discussion\n\nNow let's get back to the beginning: what is the point of warps? What if we just used atomics or locks?\n\nThe answer has three aspects:\n\n1. Convenience: The only rule you must remember is to **always schedule tasks according to their warps**. There are no lock-ordering requirements, deadlocks, busy-waiting, memory-order problems or atomic myths.\n2. 
High performance: If we abuse locks and atomics everywhere, for example allocating a separate lock for each object and taking a lock or performing an atomic operation whenever we need to visit an object, then the program will get stuck on bus locking, kernel switching and thread switching, which leads to poor performance. The warp concept wraps a series of operations, or a group of objects, into a logical \"scheduling package\", reducing switching and busy-wait costs and making them friendlier to multi-threaded systems.\n3. Flexibility: you can easily adjust the object/task warping rules as you like. For example, allocate more warps and split objects at a finer granularity if you have more CPUs. The system also allows programmers to move an object or a group of tasks from one warp to another dynamically, which is useful for dynamic load balancing.\n\n## Step Further\n\n### In-Warp Parallel\n\nIn the common case, there is only one thread running in a warp context. But what if we want to break the rule temporarily in local code and run some parallelized operations while the warp restriction still holds for other code? 
I know it's unsafe, but I just want to do it.\n\nOpen [iris_dispatcher_demo.cpp](iris_dispatcher_demo.cpp) and you will find a piece of code in the function \"simple_explosion\":\n\n```C++\nstatic constexpr size_t parallel_factor = 11;\nstatic constexpr size_t parallel_count = 6;\nif (rand() % parallel_factor == 0) {\n\t// read-write lock example: multiple reading blocks writing\n\tstd::shared_ptr\u003cstd::atomic\u003cint32_t\u003e\u003e shared_value = std::make_shared\u003cstd::atomic\u003cint32_t\u003e\u003e(-0x7fffffff);\n\tfor (size_t i = 0; i \u003c parallel_count; i++) {\n\t\tcurrent_warp.queue_routine_parallel([shared_value, warp_index]() {\n\t\t\t// only read operations\n\t\t\tstd::this_thread::sleep_for(std::chrono::milliseconds(rand() % 40));\n\t\t\tint32_t v = shared_value-\u003eexchange(warp_data[warp_index], std::memory_order_release);\n\t\t\tassert(v == warp_data[warp_index] || v == -0x7fffffff);\n\t\t});\n\t}\n}\n```\n\nThe function **queue_routine_parallel** queues a special parallelized task on current_warp; such tasks can run at the same time. While any parallelized task is running, other normal tasks on current_warp remain **blocked**. After all parallelized tasks finish, the normal tasks can be scheduled again.\n\n**Parallelized tasks are to normal tasks what read locks are to write locks**. 
It's an advanced feature, so be careful when using it.\n\n### Coroutines\n\nIn C++ 20, we can use coroutines to simplify asynchronous program development.\n\nThe warp system supports coroutine integration; you can find an example in [iris_coroutine_demo.cpp](iris_coroutine_demo.cpp).\n\nTo start a coroutine, just write a function with the return type \"iris_coroutine_t\":\n\n```C++\niris_coroutine_t\u003creturn_type\u003e example(warp_t::async_worker_t\u0026 async_worker, warp_t* warp, int value) {}\n```\n\nIn this coroutine function, you can co_await **iris_switch** to switch to another warp context:\n\n```C++\nif (warp != nullptr) {\n\twarp_t* current = co_await iris_switch(warp);\n\tprintf(\"Switch to warp %p\\n\", warp);\n\tco_await iris_switch((warp_t*)nullptr);\n\tprintf(\"Detached\\n\");\n\tco_await iris_switch(warp);\n\tprintf(\"Attached\\n\");\n\tco_await iris_switch(current);\n\tassert(current == warp_t::get_current_warp());\n}\n```\n\nco_await iris_switch returns the previous warp. Notice that we can switch to a nullptr warp, which means we are detaching from the current warp. Switching from a nullptr warp to a valid warp is also allowed.\n\nWe can also create and await an asynchronous task on a target warp:\n\n```C++\nco_await iris_awaitable(warp, []() {});\n```\n\nIt is equivalent to switching to the warp and switching back, but **iris_awaitable** also allows dispatching early, before waiting:\n\n```C++\nauto awaitable = iris_awaitable(warp, []() {});\nawaitable.dispatch();\n// do something else\nco_await awaitable;\n```\n\niris_coroutine_t\u003creturn_type\u003e is not only a coroutine but also an awaitable object. 
You can also co_await it to chain coroutines into a pipeline.\n\n### DAG-based Task Dispatcher\n\nThe DAG-based task dispatcher, also known as a task graph, is a widely used task dispatching technique for tasks with partial-order dependencies.\n\nWe also provide a DAG-based task dispatcher called iris_dispatcher_t (see the function \"graph_dispatch\" in [iris_dispatcher_demo.cpp](iris_dispatcher_demo.cpp)).\n\nYou can create a dispatcher with:\n\n```C++\niris_dispatcher_t\u003cwarp_t\u003e dispatcher(worker);\n```\n\nAn optional second parameter is a function to be called after all tasks in the dispatcher graph have finished.\n\nTo add a task to the dispatcher, call **allocate**.\n\n```C++\nauto d = dispatcher.allocate(\u0026warps[2], []() { std::cout \u003c\u003c \"Warp 2 task [4]\" \u003c\u003c std::endl; });\nauto a = dispatcher.allocate(\u0026warps[0], []() { std::cout \u003c\u003c \"Warp 0 task [1]\" \u003c\u003c std::endl; });\nauto b = dispatcher.allocate(\u0026warps[1], []() { std::cout \u003c\u003c \"Warp 1 task [2]\" \u003c\u003c std::endl; });\n```\n\nNotice that allocate returns a value of the internal type routine_t*. You can call the **order** function to order the tasks later.\n\n```C++\ndispatcher.order(a, b);\n// dispatcher.order(b, a); // will trigger validate assertion\n\nauto c = dispatcher.allocate(nullptr, []() { std::cout \u003c\u003c \"Warp nil task [3]\" \u003c\u003c std::endl; });\ndispatcher.order(b, c);\n// dispatcher.order(c, a); // will trigger validate assertion\ndispatcher.order(b, d);\n```\n\nThen call **dispatch** to run them.\n\n```C++\ndispatcher.dispatch(a);\ndispatcher.dispatch(b);\ndispatcher.dispatch(c);\ndispatcher.dispatch(d);\n```\n\nFor more flexible dispatching, you can **defer/dispatch** a task dynamically. 
Notice that **defer** must be called while the dispatcher is running, and **BEFORE** the target task actually runs.\n\n```C++\nauto b = dispatcher.allocate(\u0026warps[1], [\u0026dispatcher, d]() {\n\tdispatcher.defer(d);\n\tstd::cout \u003c\u003c \"Warp 1 task [2]\" \u003c\u003c std::endl;\n\tdispatcher.dispatch(d);\n});\n```\n\n### Polling from external thread\n\nIt is common for a thread to block while waiting for some signal to arrive. For example, suppose you are spinning until an atomic variable reaches an expected value (as in a spin lock), and there is nothing to do but spin. In this case, we can \"borrow\" tasks from the thread pool and execute them while the atomic variable is not ready yet.\n\n```C++\nwhile (some_variable.load(std::memory_order_acquire) != expected_value) {\n\t// delay at most 20ms or poll tasks with priority 0 if possible\n\tworker.poll_one(0, std::chrono::milliseconds(20));\n}\n```\n\n### Exiting\n\nWhen exiting, use iris_warp_t::poll to drain all tasks from all warps (including their async_worker's tasks).\n\n```C++\nasync_worker.terminate();\nasync_worker.join();\nwhile (iris_warp_t::poll({ \u0026warp1, \u0026warp2, ... })) {\n\tstd::this_thread::sleep_for(std::chrono::milliseconds(20));\n}\n```\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaintdream%2Firis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaintdream%2Firis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaintdream%2Firis/lists"}