{"id":16353426,"url":"https://github.com/lidatong/genstream","last_synced_at":"2025-06-25T10:36:53.585Z","repository":{"id":57433652,"uuid":"115670669","full_name":"lidatong/genstream","owner":"lidatong","description":"Method chaining comes to python","archived":false,"fork":false,"pushed_at":"2018-07-08T22:21:18.000Z","size":26,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-20T09:36:36.330Z","etag":null,"topics":["chaining","generators","method-chaining","python","stream","stream-api","stream-processing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lidatong.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-12-29T00:37:11.000Z","updated_at":"2020-12-20T23:57:51.000Z","dependencies_parsed_at":"2022-08-28T03:01:42.914Z","dependency_job_id":null,"html_url":"https://github.com/lidatong/genstream","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/lidatong/genstream","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lidatong%2Fgenstream","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lidatong%2Fgenstream/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lidatong%2Fgenstream/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lidatong%2Fgenstream/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lidatong","download_url":"https://codeload.github.com/lidatong/genstream/t
ar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lidatong%2Fgenstream/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261853224,"owners_count":23219826,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chaining","generators","method-chaining","python","stream","stream-api","stream-processing"],"created_at":"2024-10-11T01:29:35.557Z","updated_at":"2025-06-25T10:36:53.560Z","avatar_url":"https://github.com/lidatong.png","language":"Python","readme":"# genstream\n\n## Quickstart\n\n`pip install genstream`\n\n```python\nfrom genstream import Stream\n\n\ndef main():\n    a_list_containing_four = (\n        Stream.of(1, 2, 3)\n            .map(lambda x: x * 2)\n            .take(2)\n            .tail()\n            .to(list)  # evaluates to [4]\n    )\n    print(a_list_containing_four)\n\n\nif __name__ == '__main__':\n    main()\n```\n\n\n## Soapbox\nWhile generators are one of Python's best and most distinctive language features, I personally find it tiresome to read \ngenerator code that undergoes successive transformations. The `(x for x in xs)` pattern has a low signal-to-noise ratio, \nespecially when it spans many lines. Can the repetition be abstracted away?\n\nThe `itertools` module is another pain point: the module is very useful, but I don't like the two-argument nature of\nmany of the provided functions. I'm always trying to remember which goes first, the parameterization or the iterable,\nas the ordering is inconsistent across functions (e.g. 
compare `take` with `map`).\n\n**genstream** provides a `Stream` structure that aims to address these two nits. It provides the infix method chaining syntax\n(`.map`, `.filter`, etc.) found in many other programming languages. While I agree with\nthe Python community consensus that `map(f, xs)` is less readable than `(f(x) for x in xs)`, how about `xs.map(f)`? I\nprefer method syntax when sequencing many operations on an iterable.\n\n## Example of reading lines from many large files without running out of memory\n\n```python\n# Located under examples/concat_files.py\nimport os\nfrom genstream import Stream\n\n\ndef read_lines_in_file(filename):\n    with open(filename) as fp:\n        yield from fp\n\n\n# Using `Stream`\ndef concat_files(dirname):\n    return (\n        Stream(os.listdir(dirname))\n            .sort()\n            .map(lambda fname: f\"{dirname}/{fname}\")\n            .bind(read_lines_in_file)\n    )\n\n\n# Using generators\ndef concat_files_gen(dirname):\n    fnames = os.listdir(dirname)\n    sorted_fnames = sorted(fnames)\n    fnames_with_dir = (f\"{dirname}/{fname}\" for fname in sorted_fnames)\n    for fname in fnames_with_dir:\n        yield from read_lines_in_file(fname)\n\n\n# A more concise way, but perhaps less readable\n# In particular, `sorted(os.listdir(dirname))` is very dense\ndef concat_files_concise(dirname):\n    for fname in sorted(os.listdir(dirname)):\n        yield from read_lines_in_file(f\"{dirname}/{fname}\")\n\n\ndef main():\n    for line in concat_files(\"very_large_files\"):\n        print(line, end=\"\")\n\n\nif __name__ == '__main__':\n    main()\n```\n\n## Example using Stream's symbolic operators\n\n```python\nfrom functools import partial\nfrom itertools import count\nfrom operator import add\n\nfrom genstream import Stream\n\n\ndef main():\n    add_one = partial(add, 1)\n    xs = Stream(count(0))  # infinite stream counting from 0\n    one_thru_five = xs[:5] | add_one \u003e list\n    print(one_thru_five)\n\n\nif 
__name__ == '__main__':\n    main()\n```\n\n## Note on implementation\nGiven a particular set of primitive operations (e.g. `__init__` and `reduce`),\nit is possible to derive almost all stream ops in terms of one another.\n\nHowever, the methods on `Stream` instead make calls to a corresponding\n`itertools` function whenever possible. This is primarily for performance\nreasons: itertools is a highly-optimized module implemented in C.\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flidatong%2Fgenstream","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flidatong%2Fgenstream","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flidatong%2Fgenstream/lists"}