{"id":21253839,"url":"https://github.com/ebonnal/streamable","last_synced_at":"2025-05-15T20:03:04.110Z","repository":{"id":183722365,"uuid":"669800207","full_name":"ebonnal/streamable","owner":"ebonnal","description":"Pythonic Stream-like manipulation of iterables","archived":false,"fork":false,"pushed_at":"2025-05-15T14:00:37.000Z","size":4221,"stargazers_count":255,"open_issues_count":14,"forks_count":4,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-05-15T14:57:30.628Z","etag":null,"topics":["asyncio","collections","data","data-engineering","decorator-pattern","etl","etl-pipeline","fluent-interface","immutability","iterable","iterator","iterator-pattern","lazy-evaluation","method-chaining","python","python3","reverse-etl","streams","threads","visitor-pattern"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ebonnal.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-07-23T13:21:55.000Z","updated_at":"2025-05-13T21:18:19.000Z","dependencies_parsed_at":"2023-10-15T17:37:49.986Z","dependency_job_id":"86daf803-f7d0-4239-8c29-7cd3b6ad6a0f","html_url":"https://github.com/ebonnal/streamable","commit_stats":null,"previous_names":["bonnal-enzo/kissio","bonnal-enzo/kioss","bonnal-enzo/iterable","ebonnal/streamable"],"tags_count":56,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebonnal%2Fstreamable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/ebonnal%2Fstreamable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebonnal%2Fstreamable/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebonnal%2Fstreamable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ebonnal","download_url":"https://codeload.github.com/ebonnal/streamable/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254414493,"owners_count":22067271,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asyncio","collections","data","data-engineering","decorator-pattern","etl","etl-pipeline","fluent-interface","immutability","iterable","iterator","iterator-pattern","lazy-evaluation","method-chaining","python","python3","reverse-etl","streams","threads","visitor-pattern"],"created_at":"2024-11-21T03:53:15.077Z","updated_at":"2025-05-15T20:03:02.052Z","avatar_url":"https://github.com/ebonnal.png","language":"Python","readme":"[![coverage](https://codecov.io/gh/ebonnal/streamable/graph/badge.svg?token=S62T0JQK9N)](https://codecov.io/gh/ebonnal/streamable)\n[![testing](https://github.com/ebonnal/streamable/actions/workflows/testing.yml/badge.svg?branch=main)](https://github.com/ebonnal/streamable/actions)\n[![typing](https://github.com/ebonnal/streamable/actions/workflows/typing.yml/badge.svg?branch=main)](https://github.com/ebonnal/streamable/actions)\n[![formatting](https://github.com/ebonnal/streamable/actions/workflows/formatting.yml/badge.svg?branch=main)](https://github.com/ebonnal/streamable
/actions)\n[![PyPI](https://github.com/ebonnal/streamable/actions/workflows/pypi.yml/badge.svg?branch=main)](https://pypi.org/project/streamable)\n[![Anaconda-Server Badge](https://anaconda.org/conda-forge/streamable/badges/version.svg)](https://anaconda.org/conda-forge/streamable)\n\n# ༄ `streamable`\n\n### *Pythonic Stream-like manipulation of iterables*\n\n- 🔗 ***Fluent*** chainable lazy operations\n- 🔀 ***Concurrent*** via *threads*/*processes*/`asyncio`\n- 🇹 ***Typed***, fully annotated, `Stream[T]` is an `Iterable[T]`\n- 🛡️ ***Tested*** extensively with **Python 3.7 to 3.14**\n- 🪶 ***Light***, no dependencies\n\n\n\n## 1. install\n\n```bash\npip install streamable\n```\n*or*\n```bash\nconda install conda-forge::streamable \n```\n\n## 2. import\n\n```python\nfrom streamable import Stream\n```\n\n## 3. init\n\nCreate a `Stream[T]` *decorating* an `Iterable[T]`:\n\n```python\nintegers: Stream[int] = Stream(range(10))\n```\n\n## 4. operate\n\nChain ***lazy*** operations (only evaluated during iteration), each returning a new ***immutable*** `Stream`:\n\n```python\ninverses: Stream[float] = (\n    integers\n    .map(lambda n: round(1 / n, 2))\n    .catch(ZeroDivisionError)\n)\n```\n\n## 5. 
iterate\n\nIterate over a `Stream[T]` just as you would over any other `Iterable[T]`, elements are processed *on-the-fly*:\n\n- **collect**\n```python\n\u003e\u003e\u003e list(inverses)\n[1.0, 0.5, 0.33, 0.25, 0.2, 0.17, 0.14, 0.12, 0.11]\n\u003e\u003e\u003e set(inverses)\n{0.5, 1.0, 0.2, 0.33, 0.25, 0.17, 0.14, 0.12, 0.11}\n```\n\n- **reduce**\n```python\n\u003e\u003e\u003e sum(inverses)\n2.82\n\u003e\u003e\u003e from functools import reduce\n\u003e\u003e\u003e reduce(..., inverses)\n```\n\n- **loop**\n```python\n\u003e\u003e\u003e for inverse in inverses:\n\u003e\u003e\u003e    ...\n```\n\n- **next**\n```python\n\u003e\u003e\u003e next(iter(inverses))\n1.0\n```\n\n# 📒 ***Operations***\n\n*A dozen expressive lazy operations and that’s it!*\n\n# `.map`\n\n\u003e Applies a transformation on elements:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\ninteger_strings: Stream[str] = integers.map(str)\n\nassert list(integer_strings) == ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']\n```\n\u003c/details\u003e\n\n## concurrency\n\n\u003e [!NOTE]\n\u003e By default, all the concurrency modes presented below yield results in the upstream order (FIFO). 
Set the parameter `ordered=False` to yield results as they become available (***First Done, First Out***).\n\n### thread-based concurrency\n\n\u003e Applies the transformation via `concurrency` threads:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nimport requests\n\npokemon_names: Stream[str] = (\n    Stream(range(1, 4))\n    .map(lambda i: f\"https://pokeapi.co/api/v2/pokemon-species/{i}\")\n    .map(requests.get, concurrency=3)\n    .map(requests.Response.json)\n    .map(lambda poke: poke[\"name\"])\n)\nassert list(pokemon_names) == ['bulbasaur', 'ivysaur', 'venusaur']\n```\n\u003c/details\u003e\n\n\u003e [!NOTE]\n\u003e `concurrency` is also the size of the buffer containing not-yet-yielded results. **If the buffer is full, the iteration over the upstream is paused** until a result is yielded from the buffer.\n\n\u003e [!TIP]\n\u003e The performance of thread-based concurrency in a CPU-bound script can be drastically improved by using a [Python 3.13+ free-threading build](https://docs.python.org/3/using/configure.html#cmdoption-disable-gil).\n\n### process-based concurrency\n\n\u003e Set `via=\"process\"`:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nif __name__ == \"__main__\":\n    state: List[int] = []\n    # integers are mapped\n    assert integers.map(state.append, concurrency=4, via=\"process\").count() == 10\n    # but the `state` of the main process is not mutated\n    assert state == []\n```\n\u003c/details\u003e\n\n### `asyncio`-based concurrency\n\n\u003e The sibling operation `.amap` applies an async function:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nimport httpx\nimport asyncio\n\nhttp_async_client = httpx.AsyncClient()\n\npokemon_names: Stream[str] = (\n    Stream(range(1, 4))\n  
  .map(lambda i: f\"https://pokeapi.co/api/v2/pokemon-species/{i}\")\n    .amap(http_async_client.get, concurrency=3)\n    .map(httpx.Response.json)\n    .map(lambda poke: poke[\"name\"])\n)\n\nassert list(pokemon_names) == ['bulbasaur', 'ivysaur', 'venusaur']\nasyncio.get_event_loop().run_until_complete(http_async_client.aclose())\n```\n\u003c/details\u003e\n\n## \"starmap\"\n\n\u003e The `star` function decorator transforms a function that takes several positional arguments into a function that takes a tuple:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom streamable import star\n\nzeros: Stream[int] = (\n    Stream(enumerate(integers))\n    .map(star(lambda index, integer: index - integer))\n)\n\nassert list(zeros) == [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n```\n\u003c/details\u003e\n\n\n\n# `.foreach`\n\n\n\n\u003e Applies a side effect on elements:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nstate: List[int] = []\nappending_integers: Stream[int] = integers.foreach(state.append)\n\nassert list(appending_integers) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nassert state == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n```\n\u003c/details\u003e\n\n## concurrency\n\n\u003e Similar to `.map`:\n\u003e - set the `concurrency` parameter for **thread-based concurrency**\n\u003e - set `via=\"process\"` for **process-based concurrency**\n\u003e - use the sibling `.aforeach` operation for **`asyncio`-based concurrency**\n\u003e - set `ordered=False` for ***First Done First Out***\n\n# `.group`\n\n\u003e Groups elements into `List`s:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nintegers_by_5: Stream[List[int]] = integers.group(size=5)\n\nassert list(integers_by_5) == [[0, 1, 2, 3, 4], [5, 6, 7, 8, 
9]]\n```\n\u003c/details\u003e\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nintegers_by_parity: Stream[List[int]] = integers.group(by=lambda n: n % 2)\n\nassert list(integers_by_parity) == [[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]]\n```\n\u003c/details\u003e\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom datetime import timedelta\n\nintegers_within_1_sec: Stream[List[int]] = (\n    integers\n    .throttle(2, per=timedelta(seconds=1))\n    .group(interval=timedelta(seconds=0.99))\n)\n\nassert list(integers_within_1_sec) == [[0, 1, 2], [3, 4], [5, 6], [7, 8], [9]]\n```\n\u003c/details\u003e\n\n\u003e Mix the `size`/`by`/`interval` parameters:\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nintegers_by_parity_by_2: Stream[List[int]] = (\n    integers\n    .group(by=lambda n: n % 2, size=2)\n)\n\nassert list(integers_by_parity_by_2) == [[0, 2], [1, 3], [4, 6], [5, 7], [8], [9]]\n```\n\u003c/details\u003e\n\n\n## `.groupby`\n\n\u003e Like `.group`, but groups into `(key, elements)` tuples:\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nintegers_by_parity: Stream[Tuple[str, List[int]]] = (\n    integers\n    .groupby(lambda n: \"odd\" if n % 2 else \"even\")\n)\n\nassert list(integers_by_parity) == [(\"even\", [0, 2, 4, 6, 8]), (\"odd\", [1, 3, 5, 7, 9])]\n```\n\u003c/details\u003e\n\n\u003e [!TIP]\n\u003e Then *\"starmap\"* over the tuples:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom streamable import star\n\ncounts_by_parity: Stream[Tuple[str, int]] = (\n    integers_by_parity\n    .map(star(lambda parity, ints: (parity, 
len(ints))))\n)\n\nassert list(counts_by_parity) == [(\"even\", 5), (\"odd\", 5)]\n```\n\u003c/details\u003e\n\n# `.flatten`\n\n\u003e Ungroups elements assuming that they are `Iterable`s:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\neven_then_odd_integers: Stream[int] = integers_by_parity.flatten()\n\nassert list(even_then_odd_integers) == [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]\n```\n\u003c/details\u003e\n\n### thread-based concurrency\n\n\u003e Flattens `concurrency` iterables concurrently:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nmixed_ones_and_zeros: Stream[int] = (\n    Stream([[0] * 4, [1] * 4])\n    .flatten(concurrency=2)\n)\nassert list(mixed_ones_and_zeros) == [0, 1, 0, 1, 0, 1, 0, 1]\n```\n\u003c/details\u003e\n\n# `.filter`\n\n\u003e Keeps only the elements that satisfy a condition:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\neven_integers: Stream[int] = integers.filter(lambda n: n % 2 == 0)\n\nassert list(even_integers) == [0, 2, 4, 6, 8]\n```\n\u003c/details\u003e\n\n# `.distinct`\n\n\u003e Removes duplicates:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\ndistinct_chars: Stream[str] = Stream(\"foobarfooo\").distinct()\n\nassert list(distinct_chars) == [\"f\", \"o\", \"b\", \"a\", \"r\"]\n```\n\u003c/details\u003e\n\n\u003e specifying a deduplication `key`:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nstrings_of_distinct_lengths: Stream[str] = (\n    Stream([\"a\", \"foo\", \"bar\", \"z\"])\n    .distinct(len)\n)\n\nassert list(strings_of_distinct_lengths) == [\"a\", \"foo\"]\n```\n\u003c/details\u003e\n\n\u003e 
[!WARNING]\n\u003e During iteration, all distinct elements that are yielded are retained in memory to perform deduplication. However, you can remove only consecutive duplicates without a memory footprint by setting `consecutive_only=True`:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nconsecutively_distinct_chars: Stream[str] = (\n    Stream(\"foobarfooo\")\n    .distinct(consecutive_only=True)\n)\n\nassert list(consecutively_distinct_chars) == [\"f\", \"o\", \"b\", \"a\", \"r\", \"f\", \"o\"]\n```\n\u003c/details\u003e\n\n# `.truncate`\n\n\u003e Ends iteration once a given number of elements have been yielded:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfive_first_integers: Stream[int] = integers.truncate(5)\n\nassert list(five_first_integers) == [0, 1, 2, 3, 4]\n```\n\u003c/details\u003e\n\n\u003e or `when` a condition is satisfied:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfive_first_integers: Stream[int] = integers.truncate(when=lambda n: n == 5)\n\nassert list(five_first_integers) == [0, 1, 2, 3, 4]\n```\n\u003c/details\u003e\n\n\u003e If both `count` and `when` are set, truncation occurs as soon as either condition is met.\n\n# `.skip`\n\n\u003e Skips the first specified number of elements:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nintegers_after_five: Stream[int] = integers.skip(5)\n\nassert list(integers_after_five) == [5, 6, 7, 8, 9]\n```\n\u003c/details\u003e\n\n\u003e or skips elements `until` a predicate is satisfied:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nintegers_after_five: Stream[int] = 
integers.skip(until=lambda n: n \u003e= 5)\n\nassert list(integers_after_five) == [5, 6, 7, 8, 9]\n```\n\u003c/details\u003e\n\n\u003e If both `count` and `until` are set, skipping stops as soon as either condition is met.\n\n# `.catch`\n\n\u003e Catches a given type of exception, and optionally yields a `replacement` value:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\ninverses: Stream[float] = (\n    integers\n    .map(lambda n: round(1 / n, 2))\n    .catch(ZeroDivisionError, replacement=float(\"inf\"))\n)\n\nassert list(inverses) == [float(\"inf\"), 1.0, 0.5, 0.33, 0.25, 0.2, 0.17, 0.14, 0.12, 0.11]\n```\n\u003c/details\u003e\n\n\u003e You can specify an additional `when` condition for the catch:\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nimport requests\nfrom requests.exceptions import ConnectionError\n\nstatus_codes_ignoring_resolution_errors: Stream[int] = (\n    Stream([\"https://github.com\", \"https://foo.bar\", \"https://github.com/foo/bar\"])\n    .map(requests.get, concurrency=2)\n    .catch(ConnectionError, when=lambda error: \"Max retries exceeded with url\" in str(error))\n    .map(lambda response: response.status_code)\n)\n\nassert list(status_codes_ignoring_resolution_errors) == [200, 404]\n```\n\u003c/details\u003e\n\n\u003e It has an optional `finally_raise: bool` parameter to raise the first exception caught (if any) when the iteration terminates.\n\n\u003e [!TIP]\n\u003e Apply side effects when catching an exception by integrating them into `when`:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nerrors: List[Exception] = []\n\ndef store_error(error: Exception) -\u003e bool:\n    errors.append(error)  # applies effect\n    return True  # signals to catch the 
error\n\nintegers_in_string: Stream[int] = (\n    Stream(\"012345foo6789\")\n    .map(int)\n    .catch(ValueError, when=store_error)\n)\n\nassert list(integers_in_string) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nassert len(errors) == len(\"foo\")\n```\n\u003c/details\u003e\n\n\n# `.throttle`\n\n\u003e Limits the number of yields `per` time interval:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom datetime import timedelta\n\nthree_integers_per_second: Stream[int] = integers.throttle(3, per=timedelta(seconds=1))\n\n# takes 3s: ceil(10 integers / 3 per_second) - 1\nassert list(three_integers_per_second) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n```\n\u003c/details\u003e\n\n\n# `.observe`\n\n\u003e Logs the progress of iterations:\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\n\u003e\u003e\u003e assert list(integers.throttle(2, per=timedelta(seconds=1)).observe(\"integers\")) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n```\n\n```\nINFO: [duration=0:00:00.001793 errors=0] 1 integers yielded\nINFO: [duration=0:00:00.004388 errors=0] 2 integers yielded\nINFO: [duration=0:00:01.003655 errors=0] 4 integers yielded\nINFO: [duration=0:00:03.003196 errors=0] 8 integers yielded\nINFO: [duration=0:00:04.003852 errors=0] 10 integers yielded\n```\n\u003c/details\u003e\n\n\u003e [!NOTE]\n\u003e The number of logs will never be overwhelming because they are produced logarithmically (base 2): the 11th log will be produced after 1,024 elements have been yielded, the 21st log after 1,048,576 elements, ...\n\n\u003e [!TIP]\n\u003e To mute these logs, set the logging level above `INFO`:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nimport logging\nlogging.getLogger(\"streamable\").setLevel(logging.WARNING)\n```\n\u003c/details\u003e\n\n# 
`+`\n\n\u003e Concatenates streams:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nassert list(integers + integers) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n```\n\u003c/details\u003e\n\n\n# `zip`\n\n\u003e [!TIP]\n\u003e Use the standard `zip` function:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom streamable import star\n\ncubes: Stream[int] = (\n    Stream(zip(integers, integers, integers))  # Stream[Tuple[int, int, int]]\n    .map(star(lambda a, b, c: a * b * c))  # Stream[int]\n)\n\nassert list(cubes) == [0, 1, 8, 27, 64, 125, 216, 343, 512, 729]\n```\n\u003c/details\u003e\n\n\n## Shorthands for consuming the stream\n\u003e [!NOTE]\n\u003e Although consuming the stream is beyond the scope of this library, it provides two basic shorthands to trigger an iteration:\n\n## `.count`\n\n\n\n\u003e Iterates over the stream until exhaustion and returns the number of elements yielded:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nassert integers.count() == 10\n```\n\u003c/details\u003e\n\n\n## `()`\n\n\n\n\u003e *Calling* the stream iterates over it until exhaustion and returns it:\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nstate: List[int] = []\nappending_integers: Stream[int] = integers.foreach(state.append)\nassert appending_integers() is appending_integers\nassert state == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n```\n\u003c/details\u003e\n\n\n# `.pipe`\n\n\u003e Calls a function, passing the stream as first argument, followed by `*args/**kwargs` if any:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nimport 
pandas as pd\n\n(\n    integers\n    .observe(\"ints\")\n    .pipe(pd.DataFrame, columns=[\"integer\"])\n    .to_csv(\"integers.csv\", index=False)\n)\n```\n\u003c/details\u003e\n\n\u003e Inspired by the `.pipe` from [pandas](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pipe.html) or [polars](https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.pipe.html).\n\n\n# 💡 Notes\n\n## Exceptions are not terminating the iteration\n\n\u003e [!TIP]\n\u003e If any of the operations raises an exception, you can resume the iteration after handling it:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom contextlib import suppress\n\ncasted_ints: Iterator[int] = iter(\n    Stream(\"0123_56789\")\n    .map(int)\n    .group(3)\n    .flatten()\n)\ncollected: List[int] = []\n\nwith suppress(ValueError):\n    collected.extend(casted_ints)\nassert collected == [0, 1, 2, 3]\n\ncollected.extend(casted_ints)\nassert collected == [0, 1, 2, 3, 5, 6, 7, 8, 9]\n```\n\n\u003c/details \u003e\n\n## Extract-Transform-Load\n\u003e [!TIP]\n\u003e **Custom ETL scripts** can benefit from the expressiveness of this library. 
Below is a pipeline that extracts the 67 quadruped Pokémon from the first three generations using [PokéAPI](https://pokeapi.co/) and loads them into a CSV:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nimport csv\nfrom datetime import timedelta\nimport itertools\nimport requests\nfrom streamable import Stream\n\nwith open(\"./quadruped_pokemons.csv\", mode=\"w\") as file:\n    fields = [\"id\", \"name\", \"is_legendary\", \"base_happiness\", \"capture_rate\"]\n    writer = csv.DictWriter(file, fields, extrasaction='ignore')\n    writer.writeheader()\n\n    pipeline: Stream = (\n        # Infinite Stream[int] of Pokemon ids starting from Pokémon #1: Bulbasaur\n        Stream(itertools.count(1))\n        # Limits to 16 requests per second to be friendly to our fellow PokéAPI devs\n        .throttle(16, per=timedelta(seconds=1))\n        # GETs pokemons concurrently using a pool of 8 threads\n        .map(lambda poke_id: f\"https://pokeapi.co/api/v2/pokemon-species/{poke_id}\")\n        .map(requests.get, concurrency=8)\n        .foreach(requests.Response.raise_for_status)\n        .map(requests.Response.json)\n        # Stops the iteration when reaching the 1st pokemon of the 4th generation\n        .truncate(when=lambda poke: poke[\"generation\"][\"name\"] == \"generation-iv\")\n        .observe(\"pokemons\")\n        # Keeps only quadruped Pokemons\n        .filter(lambda poke: poke[\"shape\"][\"name\"] == \"quadruped\")\n        .observe(\"quadruped pokemons\")\n        # Catches errors due to None \"generation\" or \"shape\"\n        .catch(\n            TypeError,\n            when=lambda error: str(error) == \"'NoneType' object is not subscriptable\"\n        )\n        # Writes a batch of pokemons every 5 seconds to the CSV file\n        .group(interval=timedelta(seconds=5))\n        .foreach(writer.writerows)\n        .flatten()\n        .observe(\"written pokemons\")\n   
     # Catches exceptions and raises the 1st one at the end of the iteration\n        .catch(Exception, finally_raise=True)\n    )\n\n    pipeline()\n```\n\u003c/details\u003e\n\n## Visitor Pattern\n\u003e [!TIP]\n\u003e A `Stream` can be visited via its `.accept` method: implement a custom [***visitor***](https://en.wikipedia.org/wiki/Visitor_pattern) by extending the abstract class `streamable.visitors.Visitor`:\n\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom streamable.visitors import Visitor\n\nclass DepthVisitor(Visitor[int]):\n    def visit_stream(self, stream: Stream) -\u003e int:\n        if not stream.upstream:\n            return 1\n        return 1 + stream.upstream.accept(self)\n\ndef depth(stream: Stream) -\u003e int:\n    return stream.accept(DepthVisitor())\n\nassert depth(Stream(range(10)).map(str).foreach(print)) == 3\n```\n\u003c/details\u003e\n\n## Functions\n\u003e [!TIP]\n\u003e The `Stream`'s methods are also exposed as functions:\n\u003cdetails \u003e\u003csummary style=\"text-indent: 40px;\"\u003e👀 show example\u003c/summary\u003e\u003c/br\u003e\n\n```python\nfrom streamable.functions import catch\n\ninverse_integers: Iterator[float] = map(lambda n: 1 / n, range(10))\nsafe_inverse_integers: Iterator[float] = catch(inverse_integers, ZeroDivisionError)\n```\n\u003c/details\u003e\n\n# Contributing\n**Many thanks to our [contributors](https://github.com/ebonnal/streamable/graphs/contributors)!**\n\nFeel very welcome to help us improve `streamable` via issues and PRs; see [CONTRIBUTING.md](CONTRIBUTING.md).\n\n\n# 🙏 Community Highlights – Thank You!\n- [Tryolabs' Top Python libraries of 2024](https://tryolabs.com/blog/top-python-libraries-2024#top-10---general-use) ([LinkedIn](https://www.linkedin.com/posts/tryolabs_top-python-libraries-2024-activity-7273052840984539137-bcGs?utm_source=share\u0026utm_medium=member_desktop), 
[Reddit](https://www.reddit.com/r/Python/comments/1hbs4t8/the_handpicked_selection_of_the_best_python/))\n- [PyCoder’s Weekly](https://pycoders.com/issues/651) x [Real Python](https://realpython.com/)\n- [@PythonHub's tweet](https://x.com/PythonHub/status/1842886311369142713)\n- [Upvoters on our showcase Reddit post](https://www.reddit.com/r/Python/comments/1fp38jd/streamable_streamlike_manipulation_of_iterables/)\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Febonnal%2Fstreamable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Febonnal%2Fstreamable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Febonnal%2Fstreamable/lists"}