{"id":21259472,"url":"https://github.com/ori88c/weighted-random-item-sampler","last_synced_at":"2026-01-27T05:02:07.226Z","repository":{"id":251650417,"uuid":"836404901","full_name":"ori88c/weighted-random-item-sampler","owner":"ori88c","description":"A weighted random item sampler (selector), where the probability of selecting an item is proportional to its weight. The sampling method utilizes a binary search optimization, making it suitable for performance-demanding applications where the set of items is large and the sampling frequency is high.","archived":false,"fork":false,"pushed_at":"2024-10-26T22:39:10.000Z","size":96,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-01T04:07:34.282Z","etag":null,"topics":["binary-search","nodejs","random-choice","sampler","sampling","select","selector","ts","typescript","weighted","weighted-choice","weighted-random","weighted-sampler","weighted-samples","weighted-sampling","weighted-select"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ori88c.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-07-31T19:22:40.000Z","updated_at":"2024-10-26T22:37:22.000Z","dependencies_parsed_at":"2025-07-11T03:43:12.770Z","dependency_job_id":"1b8abebe-6411-44bd-8aea-5aca54bdd862","html_url":"https://github.com/ori88c/weighted-random-item-sampler","commit_stats":null,"previous_names":["ori88c/weighted-random-item-sampler"],"tags_count":3,"tem
plate":false,"template_full_name":null,"purl":"pkg:github/ori88c/weighted-random-item-sampler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ori88c%2Fweighted-random-item-sampler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ori88c%2Fweighted-random-item-sampler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ori88c%2Fweighted-random-item-sampler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ori88c%2Fweighted-random-item-sampler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ori88c","download_url":"https://codeload.github.com/ori88c/weighted-random-item-sampler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ori88c%2Fweighted-random-item-sampler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28803641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-27T03:44:14.111Z","status":"ssl_error","status_checked_at":"2026-01-27T03:43:33.507Z","response_time":168,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binary-search","nodejs","random-choice","sampler","sampling","select","selector","ts","typescript","weighted","weighted-choice","weighted-random","weighted-sampler","weighted-samples","weighted-sampling","weighted-select"],"created_at":"2024-11-21T04:14:12.609Z","updated_at":"2026-01-27T05:02:07.209Z","avatar_url":"https://github.com/ori88c.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch2 align=\"middle\"\u003eWeighted Random Item Sampler\u003c/h2\u003e\n\nThe `WeightedRandomItemSampler` class implements a random sampler where the probability of selecting an item is proportional to its weight, with **replacement allowed** between samples. In other words, an item can be sampled more than once.\n\nFor example, given items [A, B] with respective weights [5, 12], the probability of sampling item B is 12/5 times the probability of sampling item A (12/17 versus 5/17).\n\nWeights must be positive numbers, but they need not be natural numbers. 
Floating point weights such as 0.95, 5.4, and 119.83 are also supported.\n\nUse case examples include:\n* __Distributed Systems__: The sampler can assist in distributing workloads among servers based on their capacities or current load, ensuring that more capable servers handle a greater number of tasks.\n* __Surveys and Polls__: The sampler can be used to select participants based on demographic weights, ensuring a representative sample.\n* __Attack Simulation__: Randomly select attack vectors for penetration testing based on their likelihood or impact.\n* __ML Model Training__: Select training samples with weights based on their importance or difficulty to ensure diverse and balanced training data.\n\nIf your use case requires sampling each item **exactly once** without replacement, consider using [non-replacement-weighted-random-item-sampler](https://www.npmjs.com/package/non-replacement-weighted-random-item-sampler) instead.\n\n## Table of Contents :bookmark_tabs:\n\n* [Key Features](#key-features)\n* [API](#api)\n* [Use Case Example: Training Samples for a ML model](#use-case-example)\n* [Algorithm](#algorithm)\n* [License](#license)\n\n## Key Features :sparkles:\u003ca id=\"key-features\"\u003e\u003c/a\u003e\n\n- __Weighted Random Sampling :weight_lifting_woman:__: Sampling items with proportional probability to their weight.\n- __With Replacement__: Items can be sampled multiple times.\n- __Efficiency :gear:__: O(log(n)) time and O(1) space per sample, making this class suitable for performance-critical applications where the set of items is large and the sampling frequency is high.\n- __Comprehensive documentation :books:__: The class is thoroughly documented, enabling IDEs to provide helpful tooltips that enhance the coding experience.\n- __Tests :test_tube:__: Fully covered by unit tests.\n- **TypeScript** support.\n- No external runtime dependencies: Only development dependencies are used.\n- ES2020 Compatibility: The `tsconfig` target is set to ES2020, 
ensuring compatibility with ES2020 environments.\n\n## API :globe_with_meridians:\u003ca id=\"api\"\u003e\u003c/a\u003e\n\nThe `WeightedRandomItemSampler` class provides the following method:\n\n* __sample__: Randomly samples an item, with the probability of selecting a given item being proportional to its weight.\n\nIf needed, refer to the code documentation for a more comprehensive description.\n\n## Use Case Example: Training Samples for a ML model :man_technologist:\u003ca id=\"use-case-example\"\u003e\u003c/a\u003e\n\nConsider a component responsible for selecting training-samples for a ML model. By assigning weights based on the importance or difficulty of each sample, we ensure a diverse and balanced training dataset.\n\n```ts\nimport { WeightedRandomItemSampler } from 'weighted-random-item-sampler';\n\ninterface TrainingSampleData {\n  // ...\n}\n\ninterface TrainingSampleMetadata {\n  importance: number; // Weight for sampling.\n  // ...\n}\n\ninterface TrainingSample {\n  data: TrainingSampleData;\n  metadata: TrainingSampleMetadata;\n}\n\nclass ModelTrainer {\n  private readonly _trainingSampler: WeightedRandomItemSampler\u003cTrainingSample\u003e;\n\n  constructor(samples: ReadonlyArray\u003cTrainingSample\u003e) {\n    this._trainingSampler = new WeightedRandomItemSampler(\n      samples, // Items array.\n      samples.map(sample =\u003e sample.metadata.importance) // Respective weights array.\n    );\n  }\n\n  public selectTrainingSample(): TrainingSample {\n    return this._trainingSampler.sample();\n  }\n}\n```\n\n## Algorithm :gear:\u003ca id=\"algorithm\"\u003e\u003c/a\u003e\n\nThis section introduces a foundational algorithm, which will later be optimized. For simplicity, we assume all weights are natural numbers (1, 2, 3, ...). A plausible and efficient solution with **O(1)** time complexity and **O(weights sum)** space complexity involves allocating an array with a size equal to the sum of the weights. 
Each item is assigned a number of cells equal to its weight, and a sample is drawn by picking a cell uniformly at random. For example, given items A and B with respective weights of 1 and 2, we would allocate one cell for item A and two cells for item B. This approach is viable when the number of items and their weights are relatively small. However, it breaks down when weights are not natural numbers (e.g., 5.4, 0.23) or when the total weight is large, which leads to significant memory overhead.\n\nNext, we introduce an optimization over this basic idea. We compute the **prefix sums** of the weights and treat each cell of the prefix sum array as the right endpoint of an **imaginary half-open range**. Using the previous example with items A and B (weights 1 and 2), the prefix sum array is [1, 3]; the first range is [0, 1), and the second range is [1, 3). We then sample a random number (not necessarily a natural number) within the total range [0, 3) and match it to the range that contains it, whose index identifies a specific item. This matching can be performed in **O(log n)** time using a left-biased binary search that finds the leftmost index i such that `randomPoint \u003c prefix_sum[i]`. The binary search is valid because the prefix sum array is strictly increasing, as all weights are positive.\n\n## License :scroll:\u003ca id=\"license\"\u003e\u003c/a\u003e\n\n[Apache 2.0](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fori88c%2Fweighted-random-item-sampler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fori88c%2Fweighted-random-item-sampler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fori88c%2Fweighted-random-item-sampler/lists"}