{"id":15469144,"url":"https://github.com/jmaces/statstream","last_synced_at":"2025-04-23T16:22:08.215Z","repository":{"id":55324493,"uuid":"218998941","full_name":"jmaces/statstream","owner":"jmaces","description":"Statistics for Streaming Data","archived":false,"fork":false,"pushed_at":"2022-08-17T19:15:16.000Z","size":71,"stargazers_count":9,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"master","last_synced_at":"2024-10-09T12:57:04.155Z","etag":null,"topics":["data-science","numpy","statistics","streaming-data"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmaces.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG.rst","contributing":".github/CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.rst","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-01T14:16:37.000Z","updated_at":"2024-10-02T08:19:33.000Z","dependencies_parsed_at":"2022-08-14T21:10:40.716Z","dependency_job_id":null,"html_url":"https://github.com/jmaces/statstream","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmaces%2Fstatstream","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmaces%2Fstatstream/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmaces%2Fstatstream/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmaces%2Fstatstream/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmaces","download_url":"https://codeload.github.com/jmaces/statstream/tar.gz/refs/heads/master","host":{"name":
"GitHub","url":"https://github.com","kind":"github","repositories_count":242131170,"owners_count":20076789,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","numpy","statistics","streaming-data"],"created_at":"2024-10-02T01:52:13.562Z","updated_at":"2025-03-06T01:30:37.256Z","avatar_url":"https://github.com/jmaces.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"=============================================\n``statstream``: Statistics for Streaming Data\n=============================================\n\n.. add project badges here\n.. image:: https://readthedocs.org/projects/statstream/badge/?version=latest\n    :target: https://statstream.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n\n.. image:: https://github.com/jmaces/statstream/actions/workflows/pr-check.yml/badge.svg?branch=master\n    :target: https://github.com/jmaces/statstream/actions/workflows/pr-check.yml?branch=master\n    :alt: CI Status\n\n.. image:: https://codecov.io/gh/jmaces/statstream/branch/master/graph/badge.svg\n  :target: https://codecov.io/gh/jmaces/statstream\n  :alt: Code Coverage\n\n.. image:: https://img.shields.io/badge/code%20style-black-000000.svg\n    :target: https://github.com/psf/black\n    :alt: Code Style: Black\n\n\n.. 
teaser-start\n\n``statstream`` is a lightweight Python package providing data analysis and statistics utilities for streaming data.\n\nIts main goal is to provide **single-pass** variants of conventional `numpy \u003chttps://numpy.org/\u003e`_\ndata analysis and statistics functionality for **streaming** data that is\neither generated on the fly or too large to be handled at once. Data can be\nstreamed in chunks called **mini-batches**, which makes ``statstream``\nextremely useful in combination with machine learning and deep learning\npackages like `keras \u003chttps://keras.io/\u003e`_, `tensorflow \u003chttps://www.tensorflow.org/\u003e`_, or `pytorch \u003chttps://pytorch.org/\u003e`_.\n\n.. teaser-end\n\n\n.. example\n\n``statstream`` functions consume iterators providing batches of data.\nThey compute statistics of these batches and combine them to obtain statistics\nfor the full data set.\n\n.. code-block:: python\n\n   import statstream\n   mean = statstream.streaming_mean(some_iterable)\n\nThe `Overview \u003chttps://statstream.readthedocs.io/en/latest/overview.html\u003e`_ and\n`Examples \u003chttps://statstream.readthedocs.io/en/latest/examples.html\u003e`_ sections\nof our documentation provide more realistic and complete examples.\n\n.. project-info-start\n\nProject Information\n===================\n\n``statstream`` is released under the `MIT license \u003chttps://github.com/jmaces/statstream/blob/master/LICENSE\u003e`_,\nits documentation lives at `Read the Docs \u003chttps://statstream.readthedocs.io/en/latest/\u003e`_,\nthe code on `GitHub \u003chttps://github.com/jmaces/statstream\u003e`_,\nand the latest release can be found on `PyPI \u003chttps://pypi.org/project/statstream/\u003e`_.\nIt’s tested on Python 2.7 and 3.5+.\n\nIf you'd like to contribute to ``statstream`` you're most welcome.\nWe have written a `short guide \u003chttps://github.com/jmaces/statstream/blob/master/.github/CONTRIBUTING.rst\u003e`_ to help you get started!\n\n.. 
project-info-end\n\n\n.. literature-start\n\nFurther Reading\n===============\n\nAdditional information on the algorithmic aspects of ``statstream`` can be found\nin the following works:\n\n- Tony F. Chan \u0026 Gene H. Golub \u0026 Randall J. LeVeque,\n  “Updating formulae and a pairwise algorithm for computing sample variances”,\n  1979\n- Radim Řehůřek,\n  “Scalability of Semantic Analysis in Natural Language Processing”,\n  2011\n\n.. literature-end\n\n\nAcknowledgments\n===============\n\nDuring the setup of this project we were heavily influenced and inspired by\nthe works of `Hynek Schlawack \u003chttps://hynek.me/\u003e`_ and in particular his\n`attrs \u003chttps://www.attrs.org/en/stable/\u003e`_ package and blog posts on\n`testing and packaging \u003chttps://hynek.me/articles/testing-packaging/\u003e`_\nand `deploying to PyPI \u003chttps://hynek.me/articles/sharing-your-labor-of-love-pypi-quick-and-dirty/\u003e`_.\nThank you for sharing your experiences and insights.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmaces%2Fstatstream","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmaces%2Fstatstream","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmaces%2Fstatstream/lists"}