{"id":15165899,"url":"https://github.com/virgesmith/humanleague","last_synced_at":"2025-10-25T09:31:02.207Z","repository":{"id":24749408,"uuid":"95961787","full_name":"virgesmith/humanleague","owner":"virgesmith","description":"Microsynthesis using quasirandom sampling and/or IPF","archived":false,"fork":false,"pushed_at":"2025-01-30T16:27:27.000Z","size":1738,"stargazers_count":18,"open_issues_count":1,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-01-31T06:51:12.060Z","etag":null,"topics":["c-plus-plus-11","microsynthesis","nodejs","python3","quasirandom","r","sampling-methods"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/virgesmith.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-01T12:03:50.000Z","updated_at":"2025-01-18T14:36:57.000Z","dependencies_parsed_at":"2023-01-14T01:33:02.934Z","dependency_job_id":"e779386e-11ce-49a6-a4de-bd7bf34d2b5a","html_url":"https://github.com/virgesmith/humanleague","commit_stats":{"total_commits":518,"total_committers":7,"mean_commits":74.0,"dds":"0.36872586872586877","last_synced_commit":"9c74c9dcf2b44cb93e49501b7a2d8125156bdc64"},"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/virgesmith%2Fhumanleague","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/virgesmith%2Fhumanleague/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vi
rgesmith%2Fhumanleague/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/virgesmith%2Fhumanleague/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/virgesmith","download_url":"https://codeload.github.com/virgesmith/humanleague/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238112937,"owners_count":19418535,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus-11","microsynthesis","nodejs","python3","quasirandom","r","sampling-methods"],"created_at":"2024-09-27T04:05:57.228Z","updated_at":"2025-10-25T09:31:02.192Z","avatar_url":"https://github.com/virgesmith.png","language":"C++","readme":"# humanleague\n\n[![License](https://img.shields.io/github/license/mashape/apistatus.svg)](https://opensource.org/licenses/MIT)\n\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/humanleague)](https://pypi.org/project/humanleague/)\n[![PyPI version](https://badge.fury.io/py/humanleague.svg)](https://badge.fury.io/py/humanleague)\n[![CRAN\\_Status\\_Badge](https://www.r-pkg.org/badges/version/humanleague)](https://CRAN.R-project.org/package=humanleague)\n\n[![DOI](https://zenodo.org/badge/95961787.svg)](https://zenodo.org/badge/latestdoi/95961787)\n[![status](https://joss.theoj.org/papers/d5aaf6e1c2efed431c506762622473b4/status.svg)](https://joss.theoj.org/papers/d5aaf6e1c2efed431c506762622473b4)\n\n[![python (pip) 
build](https://github.com/virgesmith/humanleague/actions/workflows/python-test.yml/badge.svg)](https://github.com/virgesmith/humanleague/actions/workflows/python-test.yml/badge.svg)\n[![r-cmd-check](https://github.com/virgesmith/humanleague/actions/workflows/r-cmd-check.yml/badge.svg)](https://github.com/virgesmith/humanleague/actions/workflows/r-cmd-check/badge.svg)\n\n[![Codacy Badge](https://app.codacy.com/project/badge/Grade/430da36db15f46978bfccd1ad3243ae9)](https://www.codacy.com/gh/virgesmith/humanleague/dashboard?utm_source=github.com\u0026amp;utm_medium=referral\u0026amp;utm_content=virgesmith/humanleague\u0026amp;utm_campaign=Badge_Grade)\n[![codecov](https://codecov.io/gh/virgesmith/humanleague/branch/main/graph/badge.svg)](https://codecov.io/gh/virgesmith/humanleague)\n\n## Introduction\n\n**Please note ongoing development is for the python version only. R development is currently maintenance-only due to resource constraints.**\n\n*humanleague* is a python *and* an R package for microsynthesising populations from marginal and (optionally) seed data. 
The package is implemented in C++ for performance.\n\nThe package contains algorithms that use a number of different microsynthesis techniques:\n\n- [Iterative Proportional Fitting (IPF)](https://en.wikipedia.org/wiki/Iterative_proportional_fitting)\n- [Quasirandom Integer Sampling (QIS)](http://jasss.soc.surrey.ac.uk/20/4/14.html) (no seed population)\n- Quasirandom Integer Sampling of IPF (QISI): A combination of the two techniques whereby the integral population is sampled (without replacement) from a distribution constructed from a dynamic IPF solution.\n\nThe latter provides a bridge between deterministic reweighting and combinatorial optimisation, offering advantages of both techniques:\n\n- generates high-entropy integral populations\n- can be used to generate multiple populations for sensitivity analysis\n- goes some way to address the 'empty cells' issues that can occur in straight IPF\n- relatively fast computation time\n\nThe algorithms:\n\n- support arbitrary dimensionality for both the marginals and the seed.\n- produce statistical data to ascertain the likelihood/degeneracy of the population (where appropriate).\n\nThe package also contains the following utilities:\n\n- a Sobol sequence generator (implemented as a generator class in python)\n- a function to construct a closest integer population from a discrete univariate probability distribution.\n- an algorithm for sampling an integer population from a discrete multivariate probability distribution, constrained to the marginal sums in every dimension (see [below](#multidimensional-integerisation)).\n- utility functions to convert a population represented as a multidimensional state array into tables of either counts (indexed by state) or individuals.\n\nVersion 1.0.1 reflects the work described in the [Quasirandom Integer Sampling (QIS)](http://jasss.soc.surrey.ac.uk/20/4/14.html) paper.\n\n## Installation\n\n### Python\n\nRequires Python 3.12 or newer. 
The package can be installed using `pip`, e.g.\n\n```bash\npip install humanleague\n```\n\n#### Development\n\n[uv](https://docs.astral.sh/uv/) is highly recommended for managing environments.\n\n```bash\nuv sync --dev\nuv run pytest\n```\n\n### R\n\nOfficial release:\n\n```r\n\u003e install.packages(\"humanleague\")\n```\n\nFor a development version:\n\n```r\n\u003e devtools::install_github(\"virgesmith/humanleague\")\n```\n\nOr, for the legacy version:\n\n```r\n\u003e devtools::install_github(\"virgesmith/humanleague@1.0.1\")\n```\n\n## Documentation and Examples\n\n### R\n\nConsult the package documentation, e.g.\n\n```r\n\u003e library(humanleague)\n\u003e ?humanleague\n```\n\n### Python\n\nThe package now contains type annotations and your IDE should automatically display them, e.g.:\n\n![help](./doc/help.png)\n\nNB type stubs are generated using the `pybind11-stubgen` package, with some [manual corrections](./doc/type-stubs.md).\n\n### Multidimensional integerisation\n\nBuilding on the one-dimensional `integerise` function - which, given a discrete probability distribution and a count, returns the closest integer population to the distribution that sums to the count - a multidimensional equivalent `integerise` is introduced. In one dimension, for example, this:\n\n```python\n\u003e\u003e\u003e import humanleague\n\u003e\u003e\u003e p = [0.1, 0.2, 0.3, 0.4]\n\u003e\u003e\u003e result, stats = humanleague.integerise(p, 11)\n\u003e\u003e\u003e result\narray([1, 2, 3, 5], dtype=int32)\n\u003e\u003e\u003e stats\n{'rmse': 0.3535533905932736}\n```\n\nproduces the optimal (i.e. 
closest possible) integer population to the discrete distribution.\n\nThe `integerise` function generalises this problem and applies it to higher dimensions: given an n-dimensional array of real numbers where the 1-d marginal sums in every dimension are integral (and thus the total population is too), it attempts to find an integral array that also satisfies these constraints.\n\nThe QISI algorithm is repurposed to this end. As it is a sampling algorithm, it cannot guarantee that a solution will be found, nor that any solution it finds is optimal. If it fails, this does not prove that no solution exists for the given input.\n\n```python\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e a = np.array([[ 0.3,  1.2,  2. ,  1.5],\n                  [ 0.6,  2.4,  4. ,  3. ],\n                  [ 1.5,  6. , 10. ,  7.5],\n                  [ 0.6,  2.4,  4. ,  3. ]])\n# marginal sums\n\u003e\u003e\u003e a.sum(axis=0)\narray([ 3., 12., 20., 15.])\n\u003e\u003e\u003e a.sum(axis=1)\narray([ 5., 10., 25., 10.])\n# perform integerisation\n\u003e\u003e\u003e result, stats = humanleague.integerise(a)\n\u003e\u003e\u003e stats\n{'conv': True, 'rmse': 0.5766281297335398}\n\u003e\u003e\u003e result\narray([[ 0,  2,  2,  1],\n       [ 0,  3,  4,  3],\n       [ 2,  6, 10,  7],\n       [ 1,  1,  4,  4]])\n# check marginals are preserved\n\u003e\u003e\u003e (result.sum(axis=0) == a.sum(axis=0)).all()\nTrue\n\u003e\u003e\u003e (result.sum(axis=1) == a.sum(axis=1)).all()\nTrue\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvirgesmith%2Fhumanleague","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvirgesmith%2Fhumanleague","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvirgesmith%2Fhumanleague/lists"}