{"id":16964951,"url":"https://github.com/propensive/turbulence","last_synced_at":"2025-08-16T11:35:42.229Z","repository":{"id":37237057,"uuid":"472295968","full_name":"propensive/turbulence","owner":"propensive","description":"Simple tools for working with data streams in LazyLists in Scala","archived":false,"fork":false,"pushed_at":"2025-02-12T15:56:24.000Z","size":3845,"stargazers_count":10,"open_issues_count":2,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-08-11T17:56:14.346Z","etag":null,"topics":["multiplexing","scala","streaming","streaming-api","streaming-data"],"latest_commit_sha":null,"homepage":"https://soundness.dev/turbulence/","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/propensive.png","metadata":{"files":{"readme":".github/readme.md","changelog":null,"contributing":".github/contributing.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-21T10:53:08.000Z","updated_at":"2025-02-12T15:56:28.000Z","dependencies_parsed_at":"2023-02-19T17:00:45.689Z","dependency_job_id":"e450dc5b-0b2d-48fd-87c1-6115e4b77077","html_url":"https://github.com/propensive/turbulence","commit_stats":null,"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"purl":"pkg:github/propensive/turbulence","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fturbulence","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fturbulence/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fturbulence/releases","manifests_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/propensive%2Fturbulence/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/propensive","download_url":"https://codeload.github.com/propensive/turbulence/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/propensive%2Fturbulence/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270706888,"owners_count":24631768,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["multiplexing","scala","streaming","streaming-api","streaming-data"],"created_at":"2024-10-13T23:44:40.793Z","updated_at":"2025-08-16T11:35:42.185Z","avatar_url":"https://github.com/propensive.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"[\u003cimg alt=\"GitHub Workflow\" src=\"https://img.shields.io/github/actions/workflow/status/propensive/turbulence/main.yml?style=for-the-badge\" height=\"24\"\u003e](https://github.com/propensive/turbulence/actions)\n[\u003cimg src=\"https://img.shields.io/discord/633198088311537684?color=8899f7\u0026label=DISCORD\u0026style=for-the-badge\" height=\"24\"\u003e](https://discord.com/invite/MBUrkTgMnA)\n\u003cimg src=\"/doc/images/github.png\" valign=\"middle\"\u003e\n\n# Turbulence\n\n__Simple tools for working 
with data streams in `LazyList`s__\n\n__Turbulence__ provides interfaces for reading and writing data in streams,\nusing `LazyList`s.\n\nIt also provides a few useful methods for working with `LazyList`s for streaming data as bytes, characters and chunks of data.\n\n## Features\n\n- provides several stream-related operations on `LazyList`s\n- can multiplex several streams into a single stream\n- can cluster together short sequences of events which happen within a predefined period of time\n\n\n## Availability\n\n\n\n\n\n\n\n## Getting Started\n\n## `LazyList`s\n\n### Abstract\n\n`LazyList`s are the core abstraction used by Turbulence for streaming data in\nvarious formats. A `LazyList` is a novel representation of data which makes it\npossible to start processing a stream of data as soon as the data starts\narriving, but before the entire stream has been received.\n\nRemarkably, a `LazyList` can achieve this while remaining _immutable_.\nThis is possible thanks to some nuance in the definition of immutability, and\nby constraining what information a `LazyList` makes available about the state\nof the stream.\n\n#### Immutability\n\nWe might\nnaturally assume that a stream, being a sequence of data which grows as time\npasses and\ndata arrives, must be _mutable_. There is a time when that sequence represents\na small amount of data, and a later time when the same sequence represents\na larger amount of data. Something has _mutated_, surely?\n\nBut no! And here lies the nuance: the `LazyList` _always_ represented the entire\nsequence of data—never a partial amount. 
Operations on the `LazyList` will\nalways produce the same result, regardless of whether they are invoked at the\nmoment the streaming data starts arriving, or after it has finished arriving.\n\nThis is only possible by making the concession that operations may not\nreturn immediately, and may _block_ if\nthey depend on data that has not yet arrived.\n\nFor example, the sum of elements\nof a `LazyList[Int]`, called `xs`, may be calculated by calling `xs.sum`. The\nfirst time `xs.sum` is invoked, it may take a long time to return a result\nwhile the data is arriving. This could take seconds, or longer!\n\nBut the second and third invocations of `xs.sum` would do the same calculation\nto add up all the numbers again. Yet they would run much faster, because all\nthe data\nwould already be accessible directly from memory without blocking. Each\nresult would be the same, confirming the claim of the `LazyList`'s immutability.\n\nHowever, we suffer the limitation that we cannot \"ask\" the `LazyList` whether\nit is \"ready\" by calling a method like\n`xs.ready`. To expose such information would compromise its immutability if\nsometimes it could return `false` and sometimes `true`.\n\nThis inability to query a `LazyList`'s readiness, and to know whether it will\nblock or not might, at first, seem limiting.\n\nIt's not, as it turns out. Though the reason might not be obvious.\nFirstly, analysis of a program's _correctness_ should not care\nabout the passing of time during the evaluation of an expression; it is not\na factor\nin determining whether the program produces the right or the wrong answer. And\nwe would call it a _race condition_ if it were!\n\nInstead, we must work with `LazyList`s with no means to branch conditionally on\nthe state of the stream. 
But instead of branching, we can _fork_ an\nasynchronous thread to\nperform the `LazyList` operations, and block the new thread.\n\n#### Blocking, Rehabilitated\n\nBlocking has earned itself a bad reputation in many programming languages. It\nstarted\nas a useful convenience: if a value is not ready, the runtime\nautomatically waits until\nit is. The alternative would be to _fail_, or to write boilerplate code to\nhandle the _ready_ and _unready_ cases.\n\nBut traditional implementations\nrequire the CPU to do a lot of checking and a lot of waiting. It's acceptable\nfor a few concurrent threads, but the overhead of several concurrent blocking\noperations can accumulate to the point where more CPU clock cycles are spent\nchecking and waiting than doing any real work. For a production application\nrunning across multiple nodes, that might mean that 100 servers are required,\nwhile we know that half of them are doing nothing but checking and waiting.\n\nThat changed with Java 21 which introduced lightweight virtual threads, and\nmade it possible for many orders of magnitude more method calls to be in a\nblocking state\nconcurrently, without significantly impacting the system overhead.\n\nAnd this plays into the hands of `LazyList`s and its various blocking\noperations as fine representations for streams. 
Blocking brings the convenience\nof programming with deterministic immutable values, but without the expense it\nonce did.\n\nAnd it is for this reason that _Turbulence_ exists.\n\n### What is in Turbulence?\n\nThe core of Turbulence includes key interfaces for reading and writing streams\nthat are used extensively in other [Soundness](https://soundness.dev/) modules.\n\nBasic implementations for upstream types are provided for streams of bytes\n(`LazyList[Bytes]`) and streams of characters (`LazyList[Text]`).\n\nInterfaces for communicating with standard I/O are provided through the `In`,\n`Out` and `Err` objects, and capability-aware `print` and `println` methods.\n\nA few general stream-related tools are provided. `Spool`s collate events from\nmultiple sources into `LazyList`s; `Multiplexer`s merge streams.\n\nFinally, Turbulence provides implementations of GZIP and Zlib compression\nalgorithms on byte streams.\n\n### Reading, Writing (and Appending)\n\nTurbulence defines four key typeclass interfaces related to streaming:\n`Readable`, `Writable` and `Aggregable`, and\n\n#### `Funnel`s\n\nA `Funnel` receives asynchronous events, potentially from multiple threads, and puts them into a\n`LazyList`\n\nFor example,\n```scala\nval funnel: Funnel[Int] = Funnel[Int]()\nfunnel.put(2)\nFuture(funnel.put(6))\nval events: LazyList[Int] = funnel.stream\n```\n\nNote that evaluation of the `Funnel#stream` method constructs a `LazyList` which consumes events,\nand should be called exactly once. Later releases of Turbulence will change the API to avoid this\ntrap.\n\n#### Clustering\n\nAn event stream provided by a `LazyList[T]` may yield events irregularly, often with several events\nhappening at the same time. 
A simple event-handling loop, which performs a slow operation, such as,\n```scala\nstream.foreach: event =\u003e\n  slowOperation(event)\n```\nwill incur a time cost for every event in the stream; so if the operation takes one second and ten\nevents arrive at around the same time, it will take about ten seconds from the first event arriving\nuntil the last event is processed.\n\nSometimes it can be quicker to process events in a batch, or the results of processing earlier\nevents can be invalidated by the arrival of later events. In these cases, clustering the events on\na stream can be useful.\n\nThe `LazyList#cluster` extension method can transform a `LazyList[T]` into a `LazyList[List[T]]`. It\nwill group together sequences of events arriving with less than a specified gap in time between\nthem.\n\nFor example,\n```scala\nstream.cluster(1000).foreach: events =\u003e\n  slowOperation(events.last)\n```\nwill effectively ignore all but the last event, but will not start processing an event until 1000ms\nhas passed without any new events.\n\nAs a more complete example, consider the event stream, `stream`, which produces events `0`-`9` at\nthe times shown in the \"Time\" column.\n\nEvent   | Time    | Gap      | `stream`    | `stream.cluster(10)`   | `stream.cluster(100)`     |\n-------:|--------:|---------:|------------:|-----------------------:|--------------------------:|\n`0`     | `4ms`   |          | `0 @ 4ms`   |                        |                           |\n`1`     | `8ms`   | `4ms`    | `1 @ 8ms`   | `{0,1} @ 18ms`         |                           |\n`2`     | `15ms`  | `7ms`    | `2 @ 15ms`  | `{2} @ 25ms`           |                           |\n`3`     | `26ms`  | `11ms`   | `3 @ 26ms`  | `{3} @ 36ms`           |                           |\n`4`     | `75ms`  | `49ms`   | `4 @ 75ms`  |                        |                           |\n`5`     | `80ms`  | `5ms`    | `5 @ 80ms`  |                        |                           |\n`6`     
| `85ms`  | `5ms`    | `6 @ 85ms`  |                        |                           |\n`7`     | `90ms`  | `5ms`    | `7 @ 90ms`  | `{4,5,6,7} @ 100ms`    | `{0,1,2,3,4,5,6,7} @ 190ms` |\n`8`     | `203ms` | `113ms`  | `8 @ 203ms` | `{8} @ 213ms`          | `{8} @ 303ms`             |\n`9`     | `304ms` | `101ms`  | `9 @ 304ms` | `{9} @ 308ms`          | `{9} @ 308ms`             |\n`END`   | `308ms` | `4ms`    |             |                        |                           |\n\nThe event streams `stream.cluster(10)` and `stream.cluster(100)` will produce results at different\ntimes. Note that event `0` is not received on `stream.cluster(100)` until `186ms` after it is\nproduced, and likewise event `4` is not received on `stream.cluster(10)` until `25ms` after it\nfires.\n\nIn the worst-case scenario, a stream steadily producing events with a gap slightly shorter than the\ncluster interval will never produce a value! To mitigate this possibility, an optional second\nparameter can be provided which specifies the maximum number of events to include in a single\nclustered event, for example,\n```scala\nstream.cluster(100, 10)\n```\n\nThe `LazyList#cluster` extension method expects a parameter of the contextual `Timekeeping` type.\n\n#### Multiplexing\n\nMultiple `LazyList` streams may be combined into a single stream by multiplexing them. The\nextension method `LazyList.multiplex` takes a variable number of `LazyList` arguments to construct\na new `LazyList` from two or more existing `LazyList`s, for example:\n```scala\nval combinedStream = LazyList.multiplex(source1, source2, source3)\n```\n\nThe type parameter of the resultant `LazyList` will be the least upper-bound of that of the input\nstreams.\n\n#### Rate-limiting\n\nOften a stream will produce results faster than desired if it is actively consumed. 
The\n`LazyList#rate` method will guarantee a minimum amount of time passes between consecutive values.\nIf the elapsed time since the previous element already exceeds the minimum, it will be yielded\nimmediately.\n\nFor example,\n```scala\nLazyList.from(1).rate(100)\n```\nwill count from `1`, yielding approximately ten numbers per second.\n\nNote that a rate-limited `LazyList` which has already been partially or completely evaluated will\nevaluate without any delay on subsequent reads.\n\n#### Mutable Multiplexing\n\nIt may be desirable to add or remove streams from the set being multiplexed. This is possible with\na `Multiplexer` instance, which takes two type parameters: `K`, the type of the keys with which\nstreams will be associated, and `T`, the type of the elements in the resultant stream.\n\nNew streams may be added to the `Multiplexer` with the `Multiplexer#add` method, which takes a key\nand a stream, and removed with the `Multiplexer#remove` method, taking just the key. For example,\n```scala\nval multiplexer = Multiplexer[Text, Int]()\nmultiplexer.add(t\"Fibonacci\", fib(0, 1).rate(500))\nmultiplexer.add(t\"Naturals\", LazyList.from(1).rate(350))\nmultiplexer.stream.take(10).foreach(println(_))\nmultiplexer.remove(t\"Fibonacci\")\nmultiplexer.stream.take(10).foreach(println(_))\nmultiplexer.close()\n```\n\n#### Tap\n\nSometimes it's useful to have direct control over when a `LazyList` is yielding values and when\nit is \"paused\", using an external trigger. This functionality is provided by a `Tap`, a mutable\nobject which defines two methods, `open()` and `close()`, and holds the tap's current state.\n\nGiven a `LazyList[T]`, `stream`, the `regulate` extension method may be used to specify a `Tap`\nwhich can control it.\n\nFor example,\n```scala\nval tap = Tap()\ndef regulatedStream = stream.regulate(tap)\n```\n\nElsewhere, perhaps in another thread, `tap.close()` and `tap.open()` may be called to pause or\nresume output on the `LazyList`. 
Any events which arise while the `Tap` is closed will be buffered,\nand emitted when it is re-opened. Accessing `isEmpty`, `head` or `tail` on the `LazyList` will, of\ncourse, block while the tap is closed.\n\n#### Pulsar\n\nA `Pulsar` provides a regular stream of `Unit` values at a predefined rate. It may be created\nsimply with the `pulsar` extension method on the `LazyList` object, taking a time duration as its\nonly parameter, for example,\n```scala\nLazyList.pulsar(1000L).foreach:\n  unit =\u003e println(\"Hello\")\n```\nwill print `Hello` once per second, forever.\n\n\n## Status\n\nTurbulence is classified as __fledgling__. For reference, Soundness projects are\ncategorized into one of the following five stability levels:\n\n- _embryonic_: for experimental or demonstrative purposes only, without any guarantees of longevity\n- _fledgling_: of proven utility, seeking contributions, but liable to significant redesigns\n- _maturescent_: major design decisions broadly settled, seeking probatory adoption and refinement\n- _dependable_: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version `1.0.0` or later\n- _adamantine_: proven, reliable and production-ready, with no further breaking changes ever anticipated\n\nProjects at any stability level, even _embryonic_ projects, can still be used,\nas long as caution is taken to avoid a mismatch between the project's stability\nlevel and the required stability and maintainability of your own project.\n\nTurbulence is designed to be _small_. Its entire source code currently consists\nof 1026 lines of code.\n\n## Building\n\nTurbulence will ultimately be built by Fury, when it is published. In the\nmeantime, two possibilities are offered; however, they are acknowledged to be\nfragile, inadequately tested, and unsuitable for anything more than\nexperimentation. They are provided only for the necessity of providing _some_\nanswer to the question, \"how can I try Turbulence?\".\n\n1. 
*Copy the sources into your own project*\n   \n   Read the `fury` file in the repository root to understand Turbulence's build\n   structure, dependencies and source location; the file format should be short\n   and quite intuitive. Copy the sources into a source directory in your own\n   project, then repeat (recursively) for each of the dependencies.\n\n   The sources are compiled against the latest nightly release of Scala 3.\n   There should be no problem to compile the project together with all of its\n   dependencies in a single compilation.\n\n2. *Build with [Wrath](https://github.com/propensive/wrath/)*\n\n   Wrath is a bootstrapping script for building Turbulence and other projects in\n   the absence of a fully-featured build tool. It is designed to read the `fury`\n   file in the project directory, and produce a collection of JAR files which can\n   be added to a classpath, by compiling the project and all of its dependencies,\n   including the Scala compiler itself.\n   \n   Download the latest version of\n   [`wrath`](https://github.com/propensive/wrath/releases/latest), make it\n   executable, and add it to your path, for example by copying it to\n   `/usr/local/bin/`.\n\n   Clone this repository inside an empty directory, so that the build can\n   safely make clones of repositories it depends on as _peers_ of `turbulence`.\n   Run `wrath -F` in the repository root. This will download and compile the\n   latest version of Scala, as well as all of Turbulence's dependencies.\n\n   If the build was successful, the compiled JAR files can be found in the\n   `.wrath/dist` directory.\n\n## Contributing\n\nContributors to Turbulence are welcome and encouraged. 
New contributors may like\nto look for issues marked\n[beginner](https://github.com/propensive/turbulence/labels/beginner).\n\nWe suggest that all contributors read the [Contributing\nGuide](/contributing.md) to make the process of contributing to Turbulence\neasier.\n\nPlease __do not__ contact project maintainers privately with questions unless\nthere is a good reason to keep them private. While it can be tempting to\nrespond to such questions, private answers cannot be shared with a wider\naudience, and it can result in duplication of effort.\n\n## Author\n\nTurbulence was designed and developed by Jon Pretty, and commercial support and\ntraining on all aspects of Scala 3 is available from [Propensive\nO\u0026Uuml;](https://propensive.com/).\n\n\n\n## Name\n\n_Turbulence_ describes multiple interacting flows, or streams, of fluids; this library makes it easier to streamline interacting streams.\n\nIn general, Soundness project names are always chosen with some rationale,\nhowever it is usually frivolous. Each name is chosen more for its\n_uniqueness_ and _intrigue_ than its concision or catchiness, and there is no\nbias towards names with positive or \"nice\" meanings—since many of the libraries\nperform some quite unpleasant tasks.\n\nNames should be English words, though many are obscure or archaic, and it\nshould be noted how willingly English adopts foreign words. 
Names are generally\nof Greek or Latin origin, and have often arrived in English via a romance\nlanguage.\n\n## Logo\n\nThe logo shows a turbulent (and colorful) vortex.\n\n## License\n\nTurbulence is copyright \u0026copy; 2025 Jon Pretty \u0026 Propensive O\u0026Uuml;, and\nis made available under the [Apache 2.0 License](/license.md).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpropensive%2Fturbulence","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpropensive%2Fturbulence","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpropensive%2Fturbulence/lists"}