{"id":15107719,"url":"https://github.com/osoco/pharopds","last_synced_at":"2025-10-23T02:31:34.610Z","repository":{"id":140852983,"uuid":"175432232","full_name":"osoco/PharoPDS","owner":"osoco","description":"Probabilistic data structures in Pharo Smalltalk.","archived":false,"fork":false,"pushed_at":"2019-05-14T14:34:58.000Z","size":680,"stargazers_count":29,"open_issues_count":0,"forks_count":4,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-01-30T17:12:20.458Z","etag":null,"topics":["bloom-filter","data-structures-and-algorithms","pharo-smalltalk"],"latest_commit_sha":null,"homepage":"","language":"Smalltalk","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/osoco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-03-13T14:01:10.000Z","updated_at":"2023-03-11T08:07:53.000Z","dependencies_parsed_at":null,"dependency_job_id":"ef05da9f-f22b-4fb3-84c4-4a3dc6879dae","html_url":"https://github.com/osoco/PharoPDS","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osoco%2FPharoPDS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osoco%2FPharoPDS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osoco%2FPharoPDS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/osoco%2FPharoPDS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/osoco","download_url":"https://codeload.github.com/
osoco/PharoPDS/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237769067,"owners_count":19363250,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bloom-filter","data-structures-and-algorithms","pharo-smalltalk"],"created_at":"2024-09-25T21:41:13.143Z","updated_at":"2025-10-23T02:31:29.178Z","avatar_url":"https://github.com/osoco.png","language":"Smalltalk","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PharoPDS\n\n[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)\n[![Build Status](https://travis-ci.org/osoco/PharoPDS.svg?branch=master)](https://travis-ci.org/osoco/PharoPDS)\n[![Coverage Status](https://coveralls.io/repos/github/osoco/PharoPDS/badge.svg?branch=master)](https://coveralls.io/github/osoco/PharoPDS?branch=master)\n[![Pharo version](https://img.shields.io/badge/Pharo-7.0-%23aac9ff.svg)](https://pharo.org/download)\n[![Pharo version](https://img.shields.io/badge/Pharo-8.0-%23aac9ff.svg)](https://pharo.org/download)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/osoco/PharoPDS/master/LICENSE)\n\nThe purpose of PharoPDS is to provide some **probabilistic data structures and algorithms** implemented in Pharo.\n\n''Probabilistic data structures'' is a common name for data structures based mostly on different hashing techniques. 
Unlike regular, deterministic data structures, they always provide approximate answers, but with reliable ways to estimate the possible error.\n\nThe potential losses and errors are fully compensated for by extremely low memory requirements, constant query time, and scaling. All these factors make these structures relevant in ''Big Data'' applications.\n\nWe've written some posts about the library and the historical and intellectual background of some ideas behind the approach we have followed:\n\n- [Understanding Bloom filters with Pharo Smalltalk](https://osoco.es/thoughts/2019/05/understanding-bloom-filters-with-pharo-smalltalk/)\n- [Designing media for thought with moldable development](https://osoco.es/thoughts/2019/05/designing-media-for-thought-with-moldable-development/)\n\n## Install PharoPDS\n\nTo install PharoPDS in your Pharo image, find it in the **Pharo Project Catalog** (`World menu` \u003e `Tools` \u003e `Catalog Browser`) and click on the *green mark* icon in the upper right corner to install the latest stable version:\n\n![Pharo Project Catalog with the project selected](doc/images/pharo-project-catalog.png)\n\nAlternatively, you can execute the following script:\n\n```Smalltalk\n    Metacello new\n      baseline: #ProbabilisticDataStructures;\n    \trepository: 'github://osoco/PharoPDS:master/src';\n    \tload\n```\n\nYou can optionally install all the custom extensions and interactive tutorials included with the project by executing the following script, which installs the group 'All':\n\n\n```Smalltalk\n    Metacello new\n      baseline: #ProbabilisticDataStructures;\n    \trepository: 'github://osoco/PharoPDS:master/src';\n    \tload: 'All'\n```\n\nTo add PharoPDS to your own project's baseline, just add this:\n\n```Smalltalk\n    spec\n    \tbaseline: #ProbabilisticDataStructures\n    \twith: [ spec repository: 'github://osoco/PharoPDS:master/src' ]\n```\n\nNote that you can replace *master* with another branch or a tag.\n\n## Data 
Structures\n\nCurrently, PharoPDS provides probabilistic data structures for the following categories of problems:\n\n### Membership\n\nA *membership problem* for a dataset is the task of deciding whether some element belongs to the dataset or not.\n\nThe data structures provided to solve the membership problem are the following:\n\n - **Bloom Filter**.\n\n### Cardinality\n\nThis is still a work in progress.\n\n - **HyperLogLog**\n\n## Moldable development\n\nThis library has been developed trying to apply the ideas behind the **moldable development** approach, so you can expect each data structure to provide its own custom, domain-specific extensions that make it easier for developers to understand and learn it.\n\nFor instance, the following pictures show some of the extensions provided by the Bloom filter:\n\n![Inspector on Bloom Filter - Parameters tab](doc/images/bloom-params-extension.png)\n\n![Inspector on Bloom Filter - FPP tab](doc/images/bloom-fpp-extension.png)\n\n![Inspector on Bloom Filter - Bits tab](doc/images/bloom-bits-extension.png)\n\n![Inspector on Bloom Filter - Analysis](doc/images/bloom-analysis.png)\n\n## Algorithms Browser\n\nIn order to ease the understanding of the inner workings and trade-offs, we provide specific *Playground* tools for each data structure that allow the developer to explore it and get deeper insights.\n\nYou can browse the available algorithm playgrounds through the **PharoPDS Algorithms Browser**. You can open it with the following expression:\n\n```Smalltalk\nPDSAlgorithmsBrowser open \n```\n\n![PDS Algorithms Browser](doc/images/algorithms-browser.png)\n\n## License\n\nPharoPDS is written and supported by developers at **[OSOCO](https://osoco.es)** and published as **free and open source** software under an **[MIT license](LICENSE)**.\n\n## Project dependencies\n\nHashing plays a central role in probabilistic data structures. 
Indeed, choosing appropriate hash functions is crucial to avoid bias and to achieve good performance. In particular, the structures require **non-cryptographic hash functions**, which are provided by the dependency module **[NonCryptographicHashes](https://github.com/osoco/pharo-non-cryptographic-hashes)**.\n\nOther dependencies like **Roassal** or **GToolkit** are optional for production use. Nevertheless, we recommend that you install them in the development image if you want to get some useful tools like the custom Inspector extensions, the algorithms browser, or the interactive tutorials.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fosoco%2Fpharopds","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fosoco%2Fpharopds","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fosoco%2Fpharopds/lists"}