{"id":19115395,"url":"https://github.com/polytypic/io","last_synced_at":"2025-04-19T00:32:45.394Z","repository":{"id":177306541,"uuid":"631822557","full_name":"polytypic/io","owner":"polytypic","description":"IO should be just a library","archived":false,"fork":false,"pushed_at":"2023-07-17T12:20:17.000Z","size":10,"stargazers_count":23,"open_issues_count":3,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-01-28T21:43:32.573Z","etag":null,"topics":["async-io","concurrency","interoperability","parallelism","runtimes"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/polytypic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-04-24T06:16:53.000Z","updated_at":"2023-12-10T07:52:21.000Z","dependencies_parsed_at":null,"dependency_job_id":"e022494e-2f44-41e0-9284-80655385275b","html_url":"https://github.com/polytypic/io","commit_stats":{"total_commits":2,"total_committers":1,"mean_commits":2.0,"dds":0.0,"last_synced_commit":"dbe4d37a98ef958178fe18bb4dae396a5537392e"},"previous_names":["polytypic/io"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polytypic%2Fio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polytypic%2Fio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polytypic%2Fio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polytypic%2Fio/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/polytypic","download_
url":"https://codeload.github.com/polytypic/io/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223786795,"owners_count":17202603,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["async-io","concurrency","interoperability","parallelism","runtimes"],"created_at":"2024-11-09T04:46:16.074Z","updated_at":"2024-11-09T04:46:16.471Z","avatar_url":"https://github.com/polytypic.png","language":"OCaml","readme":"\u003e TL;DR It is not necessary to agree on a single concurrent programming library,\n\u003e or scheduler, that provides IO, because asynchronous IO can be expressed as a\n\u003e library, independent of any particular scheduler, such that IO can then easily\n\u003e be used by libraries and applications and executed on any number of\n\u003e schedulers. Other concurrent runtime facilities such as blocking, timeouts,\n\u003e and cancellation can also be given scheduler independent interfaces. By\n\u003e providing such key concurrent runtime facilities in scheduler independent form\n\u003e we can have an ecosystem of interoperable libraries, multiple schedulers, and\n\u003e avoid unnecessary community split.\n\n**_NOTE_**: _This is still WIP. The basic idea should be clear, but I want to\nexpand upon it a bit. The code in this repository is not meant to provide a full\nimplementation of anything (IO or a scheduler). 
The code is just a proof of\nconcept of the feasibility of the ideas here._\n\n# IO should be just a library\n\nFor concurrent programming in OCaml one has traditionally had to make a choice\nbetween two incompatible ecosystems:\n[Lwt](https://ocsigen.org/lwt/latest/manual/manual) or\n[Async](https://opensource.janestreet.com/async/). This split is typically\nconsidered to be unfortunate as, due to lack of interoperability, it has led to\nduplication of effort, as argued in\n[Abandoning Async](http://rgrinberg.com/posts/abandoning-async/).\n\nWhile entering the multicore era OCaml 5 also got another new major feature,\ncalled [effect handlers](https://v2.ocaml.org/manual/effects.html), which allows\none to express lightweight threads among other things. Somewhat to my surprise\nthis has not \u0026mdash; at least not yet \u0026mdash; led to an explosion of effects\nbased concurrent programming libraries. Perhaps this might be partly due to the\nfear of another community split, which has, in part, motivated the design and\ndevelopment of the [Eio](https://github.com/ocaml-multicore/eio#readme) library\n\u0026mdash; destined to become **_the one_** library for concurrent programming and\nasynchronous IO for OCaml.\n\nI believe it is fair to say that Eio has an opinionated design in a number of\nways. Eio provides a programming model based on\n[capabilities](https://github.com/ocaml-multicore/eio#design-note-capabilities)\nand [structured concurrency](https://github.com/ocaml-multicore/eio#switches)\nwith [cancellation](https://github.com/ocaml-multicore/eio#switches) used as a\nkey coordination mechanism. IO is provided through a\n[flow](https://ocaml-multicore.github.io/eio/eio/Eio/Flow/index.html)\nabstraction.\n\nHowever, those opinionated designs are not what I'd like to draw attention to.\nThe key issue I'd like to discuss is the architecture of Eio. As described in\nEio's documentation, Eio has optimized backends for different platforms. 
See the\nbelow diagram and note the direction of dependencies:\n\n```\n       Applications      Libraries\n             |           |\n             +---------+ |\n             |         | |\n             v         v v\n           Eio_main    Eio \u003c--+\n             |                |\n-  - ---+----+----+--- - -    |\n        |    |    |           |\n        |    |    v           |\n        |    |  Eio_windows +-+\n        |    |                |\n        |    v                |\n        | Eio_posix +---------+\n        |                     |\n        v                     |\n     Eio_linux +--------------+\n```\n\nThe `Eio` library has some common components, but the core loop of Eio is\nactually not a single loop. The `Eio_main` library abstracts that loop and each\nof the backends implements it separately (see\n[linux sched](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_linux/sched.ml#L387),\n[posix sched](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_posix/sched.ml#L314),\n[windows sched](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_windows/sched.ml#L318)).\n\nImagine you would like to implement a different concurrent programming model.\n\nWhy would you want to do that?\n\nWell, perhaps you'd like to use work stealing, as provided by\n[Domainslib](https://github.com/ocaml-multicore/domainslib#readme), motivated by\nthe idea put forth in the thesis\n[Using effect handlers for efficient parallel scheduling \u0026mdash; Bartosz Modelski](https://k-lifo.com/mphil.pdf):\n\n\u003e Modern hardware is so parallel and workloads are so concurrent that there is\n\u003e no single, perfect scheduling strategy across a complex application software\n\u003e stack. 
Therefore, significant performance advantages can be gained from\n\u003e customizing and composing schedulers.\n\nOr perhaps you'd rather not have capabilities, because you feel that they are\nunnecessary or you'd rather wait for typed effects to provide much of the same\nability with convenient type inference.\n\nOr perhaps you'd like to\n[introduce an actor framework](https://discuss.ocaml.org/t/rfc-for-a-distributed-process-actor-model-library/12004/5):\n\n\u003e The “ideal” scheduler would allow automatic distribution of processes across\n\u003e domains with effects etc, which would be the part concretely within OCaml 5\n\u003e territory. It is doable but it would mean needing to reimplementing Eio just\n\u003e to have a scheduler-aware IO layer. It seems easier to just wait for upstream\n\u003e Eio to maybe introduce that.\n\nThose are just particular examples. \u003c!-- selective IO primitives, better\nsupport for parallelism, ... --\u003e\n\nIt would be nice to be able to reuse basic IO facilities for a number of\nreasons. It takes considerable effort to implement efficient IO primitives for\nmultiple platforms. But the bigger problem is that if you would implement your\nown asynchronous IO system like Eio, you'd fork the community.\n\nWhat I'm proposing is that instead of associating IO intimately with a\nscheduler, we introduce a scheduler independent IO layer for OCaml. This layer\nwould provide an interface much like e.g. the `Unix` module of OCaml does. The\nkey difference being that basic IO operations like `read` and `write` would be\nable to block in a scheduler independent manner. 
Code, whether in libraries or\napplications, using that IO layer would then not necessarily be tied to any\nparticular scheduler:\n\n```\n              Applications -----+\n                   |            |\n                   |            v\n                   |       Schedulers: (one or more of)\n                   |\n                   |            +-- Eio\n                   v            |\n Libraries -----\u003e IO \u003c----------+-- Domainslib\n                   |            |\n             +-----+----+       +-- Actor lib\n             |     |    |       |\n             v     |    v       +-- Oslo\n           Linux   |  Windows   |\n                   |            +-- Helsinki\n                   v            |\n                 Posix          .\n                                .\n                                .\n```\n\nHow could that be done?\n\nIt is simpler than you might think. If you look at how IO is integrated into the\nEio backends, you can see a pattern. First of all, each IO backend ultimately has\na blocking operation, much like `Unix.select`, that waits for an IO event or\nreturns after a given timeout has expired (see\n[linux](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_linux/sched.ml#L246),\n[posix](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_posix/sched.ml#L206),\n[windows](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_windows/sched.ml#L211)).\nAdditionally, it is sometimes necessary to break the wait before the timeout\nexpires, such as when a fiber is resumed by a non-IO action, so some wakeup\nmechanism is needed 
(see\n[linux](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_linux/sched.ml#L86),\n[posix](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_posix/sched.ml#L71),\n[windows](https://github.com/ocaml-multicore/eio/blob/75c27bf50e986cc80bdcd1932a48286b56ab620f/lib_eio_windows/sched.ml#L82)).\n\nWhat this means is that we can abstract IO from the point-of-view of a\nscheduler:\n\n```ocaml\nmodule type Io = sig\n  type t\n  (** IO context for a specific domain. *)\n\n  val get_context : unit -\u003e t\n  (** Get IO context for current domain. *)\n\n  val pollf : float -\u003e unit\n  (** Wait for and trigger IO actions on current domain. *)\n\n  val wakeup : t -\u003e unit\n  (** Force [pollf] on specified context to return. *)\nend\n```\n\nThe exact signatures above are subject to minor variations, but the above is\nimplementable.\n\nA scheduler, then, to provide IO, needs to arrange for `Io.pollf` to be called\nperiodically and use `Io.wakeup`, when necessary, to force `Io.pollf` to return.\nFor Eio this would mean that instead of having three slightly different loops,\nthere would be only one. For other schedulers, like Domainslib, this means that\nthey actually become usable.\n\nBut we are not actually done yet. How would fibers waiting for IO events be\nsuspended and resumed? In Eio, the mapping of suspended fibers to e.g. file\ndescriptors is managed by the scheduler loop and fibers are resumed after the\nblocking wait by the scheduler loop. We can avoid that simply by using a\nscheduler independent blocking mechanism such as\n[domain local await](https://github.com/ocaml-multicore/domain-local-await/#readme).\nUsing domain local await the IO layer can suspend and resume fibers waiting for\nIO events without having to directly depend on the scheduler. 
In other words,\nafter `pollf` returns, it has already resumed all the fibers corresponding to\nthe IO events and the scheduler loop should then have fibers to run.\n\n## Concurrent runtime services\n\nMore generally there is a vision for composing schedulers in OCaml as described\nin\n[Composing Schedulers using Effect Handlers](https://kcsrk.info/papers/compose_ocaml22.pdf).\nI'd like to expand on that vision and consider what the key services are that\nlibraries need from a concurrent runtime, and whether we could provide minimal\nabstract interfaces for them so that we could have an ecosystem of\ninteroperable scheduler independent libraries.\n\n### Blocking\n\n- Communication and synchronization abstractions\n  - STM\n  - Promise\n  - Mutex\n  - Semaphore\n  - Async IO\n  - ...\n\n```ocaml\nmodule type Blocking = sig\n  type t = { release : unit -\u003e unit; await : unit -\u003e unit }\n  val prepare_for_await : unit -\u003e t\nend\n```\n\n### Cancellation\n\n- Anything that needs to call the scheduler when it must not be canceled\n  - Condition variable that reacquires the Mutex after `wait`\n  - Protected sections:\n    `Fun.protect ( ... 
) ~finally:(fun () -\u003e (* protected *))`\n\n```ocaml\nmodule type Cancelation = sig\n  val forbid : (unit -\u003e 'a) -\u003e 'a\n  val permit : (unit -\u003e 'a) -\u003e 'a\nend\n```\n\n### Timeouts\n\n```ocaml\nmodule type Timeout = sig\n  val set_timeoutf : float -\u003e (unit -\u003e unit) -\u003e unit -\u003e unit\nend\n```\n\n### IO\n\n```ocaml\nmodule type Io = sig\n  val pollf : float -\u003e unit\n  val wakeup : unit -\u003e unit\nend\n```\n\n### Fibers\n\n```ocaml\nmodule type Fiber = sig\n  type 'a t\n  val spawn : (unit -\u003e 'a) -\u003e 'a t\n  val join : 'a t -\u003e 'a\nend\n```\n\n### Nested parallelism\n\n```ocaml\nmodule type NestedParallelism = sig\n  val par : (unit -\u003e 'a) -\u003e (unit -\u003e 'b) -\u003e 'a * 'b\nend\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpolytypic%2Fio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpolytypic%2Fio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpolytypic%2Fio/lists"}