{"id":15797266,"url":"https://github.com/laurentrdc/npstreams","last_synced_at":"2025-04-10T04:58:42.678Z","repository":{"id":60722534,"uuid":"98888095","full_name":"LaurentRDC/npstreams","owner":"LaurentRDC","description":"Streaming operations on NumPy arrays","archived":false,"fork":false,"pushed_at":"2024-10-23T19:49:04.000Z","size":2966,"stargazers_count":34,"open_issues_count":0,"forks_count":1,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-10-24T01:36:55.021Z","etag":null,"topics":["numpy","python3","streaming-algorithms"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LaurentRDC.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.rst","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-31T12:47:53.000Z","updated_at":"2024-10-23T19:49:08.000Z","dependencies_parsed_at":"2022-10-03T21:02:14.360Z","dependency_job_id":"7ef87777-7daf-494c-a5d2-eedcfbf078f5","html_url":"https://github.com/LaurentRDC/npstreams","commit_stats":{"total_commits":318,"total_committers":2,"mean_commits":159.0,"dds":0.05974842767295596,"last_synced_commit":"f5fc4ed68598091e2b4fc25412faddee980d2e46"},"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LaurentRDC%2Fnpstreams","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LaurentRDC%2Fnpstreams/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LaurentRDC%2Fnpstreams/releases","manifests_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LaurentRDC%2Fnpstreams/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LaurentRDC","download_url":"https://codeload.github.com/LaurentRDC/npstreams/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248161265,"owners_count":21057554,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["numpy","python3","streaming-algorithms"],"created_at":"2024-10-05T00:05:57.128Z","updated_at":"2025-04-10T04:58:42.638Z","avatar_url":"https://github.com/LaurentRDC.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# npstreams\n\n[![Documentation Build Status](https://readthedocs.org/projects/npstreams/badge/?version=master)](http://npstreams.readthedocs.io) [![PyPI Version](https://img.shields.io/pypi/v/npstreams.svg)](https://pypi.python.org/pypi/npstreams) [![Conda-forge Version](https://img.shields.io/conda/vn/conda-forge/npstreams.svg)](https://anaconda.org/conda-forge/npstreams) [![DOI badge](https://img.shields.io/badge/DOI-10.1186%2Fs40679--018--0060--y-blue)](https://doi.org/10.1186/s40679-018-0060-y)\n\nnpstreams is an open-source Python package for streaming NumPy array\noperations. The goal is to provide tested routines that operate on\nstreams (or generators) of arrays instead of dense arrays.\n\nStreaming reduction operations (sums, averages, etc.) 
can be implemented\nin constant memory, which in turn allows for easy parallelization.\n\nThis approach has been a huge boon when working with lots of images; the\nimages are read one-by-one from disk and combined/processed in a\nstreaming fashion.\n\nThis package is developed in conjunction with other software projects in\nthe [Siwick research group](http://www.physics.mcgill.ca/siwicklab/).\n\n## Motivating Example\n\nConsider the following snippet to combine 50 images from an iterable\n`source`:\n\n```python\nimport numpy as np\n\nimages = np.empty( shape = (2048, 2048, 50) )\nfor index, im in enumerate(source):\n    images[:,:,index] = im\n\navg = np.average(images, axis = 2)\n```\n\nIf the `source` iterable provided 1000 images, the above routine would\nneed to hold all of them in memory at once (roughly 34 GB as `float64`)\nand would not work on most machines. Moreover, what if we want to\ntransform the images one by one before averaging them? What about\nlooking at the average while it is being computed? Let's look at an\nexample:\n\n```python\nimport numpy as np\nfrom npstreams import iaverage\nfrom imageio.v3 import imread  # scipy.misc.imread was removed in SciPy 1.2; imageio is one replacement\n\nstream = map(imread, list_of_filenames)\naveraged = iaverage(stream)\n```\n\nAt this point, the generators `map` and `iaverage` are 'wired' together,\nbut nothing is computed until a value is requested. We can watch the\naverage evolve:\n\n```python\nimport matplotlib.pyplot as plt\nfor avg in averaged:\n    plt.imshow(avg); plt.show()\n```\n\nWe can also use `last` to get at the final average:\n\n```python\nfrom npstreams import last\n\ntotal = last(averaged) # average of the entire stream\n```\n\n## Streaming Functions\n\nnpstreams comes with some streaming functions built-in. Some examples:\n\n-   Numerics : `isum`, `iprod`, `isub`, etc.\n-   Statistics : `iaverage` (weighted mean), `ivar` (single-pass\n    variance), etc.\n\nMore importantly, npstreams gives you all the tools required to build\nyour own streaming function. 
All routines are documented in the [API\nReference on readthedocs.io](http://npstreams.readthedocs.io).\n\n## Benchmarking\n\nnpstreams provides a function for benchmarking common use cases.\n\nTo run the benchmark with default parameters, from the interpreter:\n\n```python\nfrom npstreams import benchmark\nbenchmark()\n```\n\nFrom a command-line terminal:\n\n```bash\npython -c 'import npstreams; npstreams.benchmark()'\n```\n\nThe results will be printed to the screen.\n\n## Future Work\n\nSome of the features I want to implement in this package in the near\nfuture:\n\n-   Optimize the CUDA-enabled routines\n-   More functions : more streaming functions borrowed from NumPy and\n    SciPy.\n\n## API Reference\n\nThe [API Reference on readthedocs.io](http://npstreams.readthedocs.io)\nprovides API-level documentation, as well as tutorials.\n\n## Installation\n\nThe only requirement is NumPy. To have access to CUDA-enabled routines,\nPyCUDA must also be installed. npstreams is available on PyPI; it can be\ninstalled with [pip](https://pip.pypa.io):\n\n```bash\npython -m pip install npstreams\n```\n\nnpstreams can also be installed with the conda package manager, from the\nconda-forge channel:\n\n```bash\nconda config --add channels conda-forge\nconda install npstreams\n```\n\nTo install the latest development version from\n[GitHub](https://github.com/LaurentRDC/npstreams):\n\n```bash\npython -m pip install git+https://github.com/LaurentRDC/npstreams.git\n```\n\nTests can be run using the `pytest` package.\n\n## Citations\n\nIf you find this software useful, please consider citing the following\npublication:\n\n\u003e L. P. René de Cotret, M. R. Otto, M. J. Stern, and B. J. 
Siwick, *An open-source software ecosystem for the interactive exploration of ultrafast electron scattering data*, Advanced Structural and Chemical Imaging 4:11 (2018) [DOI: 10.1186/s40679-018-0060-y.](https://ascimaging.springeropen.com/articles/10.1186/s40679-018-0060-y)\n\n\n## Support / Report Issues\n\nAll support requests and issue reports should be [filed on GitHub as an\nissue](https://github.com/LaurentRDC/npstreams/issues).\n\n## License\n\nnpstreams is made available under the BSD 3-Clause License, the same as\nNumPy. For more details, see\n[LICENSE.txt](https://github.com/LaurentRDC/npstreams/blob/master/LICENSE.txt).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flaurentrdc%2Fnpstreams","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flaurentrdc%2Fnpstreams","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flaurentrdc%2Fnpstreams/lists"}