{"id":13516783,"url":"https://github.com/sorenmacbeth/streaming-papers","last_synced_at":"2026-01-06T14:38:32.768Z","repository":{"id":9656352,"uuid":"11594025","full_name":"sorenmacbeth/streaming-papers","owner":"sorenmacbeth","description":"A curated collection of papers on streaming algorithms","archived":false,"fork":false,"pushed_at":"2018-07-27T02:16:25.000Z","size":6236,"stargazers_count":187,"open_issues_count":0,"forks_count":33,"subscribers_count":28,"default_branch":"master","last_synced_at":"2025-02-01T08:13:02.114Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sorenmacbeth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-07-22T22:28:13.000Z","updated_at":"2023-09-08T16:40:55.000Z","dependencies_parsed_at":"2022-08-25T21:41:32.272Z","dependency_job_id":null,"html_url":"https://github.com/sorenmacbeth/streaming-papers","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorenmacbeth%2Fstreaming-papers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorenmacbeth%2Fstreaming-papers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorenmacbeth%2Fstreaming-papers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorenmacbeth%2Fstreaming-papers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sorenmacbeth","download_url":"https://codeload.github.com/sorenmacbeth/streaming-papers/tar.gz/refs/heads/mas
ter","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245768627,"owners_count":20669043,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T05:01:25.815Z","updated_at":"2026-01-06T14:38:32.715Z","avatar_url":"https://github.com/sorenmacbeth.png","language":null,"readme":"streaming-papers\n================\n\nA curated collection of papers on streaming algorithms\n\n### Please Contribute\n\nIf you have papers you want to add, make a pull request. Categories are wide open right now, so just put in a folder that makes sense to you and we'll figure it out.\n\n### Distinct Value Counting\n\n_distinct_value_counting/Probabilistic_Multiplicity_Counting_-_Lieven2010a.pdf_\n\nKnown Implementations\n* [seiflotfy/pmc](https://github.com/seiflotfy/pmc) - Go\n* [sorenmacbeth/runpmc](https://github.com/sorenmacbeth/runpmc) - Clojure\n\n===\n\n#### Data Streams as Random Permutations: the Distinct Element Problem - Helmi, Lumbroso, Martinez, Viola\n\n_distinct_value_counting/data_streams_as_random_permutations.pdf_\n\nKnown Implementations:\n* [cscotta/recordinality](https://github.com/cscotta/recordinality) - Java\n\n### Distribution Functions\n\n===\n\n#### Dynamic Histograms: Capturing Evolving Data Sets - Donko Donjerkovic, Yannis Ioannidis, Raghu Ramakrishnan\n\n_distribution_functions/dynamic-histograms.pdf_\n\nKnown Implementations:\n* [bigmlcom/histogram](https://github.com/bigmlcom/histogram) - Clojure\n* [bmizerany/perks](https://github.com/bmizerany/perks/blob/histo/histogram/histogram.go) - Go\n* 
[d2fn/shades-rb](https://github.com/d2fn/shades-rb) - Ruby\n\n===\n\n#### The P\u003csup\u003e2\u003c/sup\u003e Algorithm for Dynamic Calculation of Quantiles and Histograms Without Storing Observations - Raj Jain, Imrich Chlamtac\n\n_distribution_functions/psqr.pdf_\n\nKnown Implementations:\n* [GNU Scientific Library](https://www.gnu.org/software/gsl/doc/html/rstat.html#quantiles) - C\n* [scassidy/livestats](https://bitbucket.org/scassidy/livestats) (Bitbucket) - Python\n* [absmall/p2](https://github.com/absmall/p2) - C++\n* [jacksonicson/psquared](https://github.com/jacksonicson/psquared) - Java\n\n===\n\n#### Effective Computation of Biased Quantiles over Data Streams - Cormode, Korn, Muthukrishnan, Srivastava\n\n_distribution_functions/bquant.pdf_\n\nKnown Implementations:\n* [bmizerany/perks](https://github.com/bmizerany/perks/blob/histo/quantile/stream.go) - Go\n\n===\n\n### Summary Statistics\n\n#### Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments - Philippe Pebay\n\n_Summary Statistics/one_pass_moments_Pebay.pdf_\n\nKnown Implementations:\n* [Kitware/VTK](https://github.com/Kitware/VTK/) (mirror) - C++ (check in [Filters/Statistics/vtkStatisticsAlgorithm.h](https://github.com/Kitware/VTK/blob/master/Filters/Statistics/vtkStatisticsAlgorithm.h))\n","funding_links":[],"categories":["Technical"],"sub_categories":["ramanihiteshc@gmail.com"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsorenmacbeth%2Fstreaming-papers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsorenmacbeth%2Fstreaming-papers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsorenmacbeth%2Fstreaming-papers/lists"}