{"id":19984544,"url":"https://github.com/intellabs/ais-benchmarks","last_synced_at":"2025-05-04T06:33:18.561Z","repository":{"id":148596653,"uuid":"499670223","full_name":"IntelLabs/ais-benchmarks","owner":"IntelLabs","description":"A framework, based on python and numpy, for evaluation of sampling methods","archived":false,"fork":false,"pushed_at":"2023-10-31T15:19:31.000Z","size":1903,"stargazers_count":9,"open_issues_count":15,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2023-10-31T16:31:31.682Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IntelLabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-03T22:43:27.000Z","updated_at":"2023-03-06T01:06:13.000Z","dependencies_parsed_at":"2024-10-25T21:27:27.872Z","dependency_job_id":null,"html_url":"https://github.com/IntelLabs/ais-benchmarks","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2Fais-benchmarks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2Fais-benchmarks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2Fais-benchmarks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2Fais-benchmarks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IntelLabs","download_url":"h
ttps://codeload.github.com/IntelLabs/ais-benchmarks/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224386380,"owners_count":17302652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T04:19:28.040Z","updated_at":"2024-11-13T04:19:28.607Z","avatar_url":"https://github.com/IntelLabs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sampling methods benchmark \n\nA framework, based on Python and NumPy, for evaluating sampling methods. The framework consists of common interfaces \nfor sampling methods, an evaluation methodology with different metrics, and automated generation of results and plots.\n\nWe hope this package helps researchers benchmark their sampling algorithms and compare them to state-of-the-art\nimplementations. If you find this package useful for your research, please consider citing it; see the citation\ninfo below. \n\n### Installation and quickstart\nThe last supported Python version is 3.10. guppy3, which is used for CPU and memory metrics, has not yet been ported to 3.11. 
\nOnce this dependency runs on 3.11, ais-benchmarks should be able to run on 3.11 as well.\n\n### Ubuntu Linux\n```shell\nsudo apt install python python-pip python-tk\ngit clone https://github.com/IntelLabs/ais-benchmarks.git ais-benchmarks\ncd ais-benchmarks\npip install cython scikit-build\npip install -r requirements.txt\npython benchmark/run_benchmark.py\n```\n### Windows\nBoth python3 and git must be in your PATH for Windows PowerShell to find\nthe commands.\n\nInstall Python 3.10:\nhttps://www.python.org/downloads/release/python-3109/\n```shell\ngit clone https://github.com/IntelLabs/ais-benchmarks.git ais-benchmarks\ncd ais-benchmarks\npip install -r requirements.txt\npython benchmark/run_benchmark.py\n```\n\n### Batching\nAll methods treat the first dimension as the batch dimension to support vectorized implementations. Results\nare also returned in batch form, even when the batch contains a single element.\n\n\n### Baselines\nSeveral well-known and state-of-the-art algorithms are provided as baselines for evaluating new method \nimplementations. New algorithms are welcome additions to the repo as they become available.\n\n- Deterministic Mixture Adaptive Importance Sampling (DM-AIS) [1]\n- Mixture Particle Monte Carlo (M-PMC) [2]\n- Layered Adaptive Importance Sampling (LAIS) [3]\n- MCMC Metropolis-Hastings [4]\n- Nested sampling [5]\n- Multi-nested sampling [6]\n- Rejection Importance Sampling [7]\n- Tree-pyramidal Adaptive Importance Sampling (TP-AIS) [9]\n- TODO: Adaptive Population Importance Sampling [8]\n- Hierarchical Adaptive Importance Sampling via Exploration and Exploitation. 
(HiDaisee) [10]\n\n\n### Evaluation methodology\nThe benchmark aims to evaluate multiple aspects of sampling algorithms.\n- **Sampling efficiency**: How good the estimates are with the minimum number of sampling operations, including rejected \nsamples.\n- **Acceptance rate**: Algorithms that reject samples might exhibit high sampling efficiency and provide uncorrelated\n samples. However, they achieve this at the cost of rejecting samples. This metric accounts for the number of rejected \n samples.\n- **Proposal quality**: Measures the similarity of the proposal to the underlying target distribution.\n- **Sample quality**: Measures the similarity of a KDE approximation built from the generated samples to the target \ndistribution.\n- **Time per sample**: Measures the time required to generate a sample. This metric is useful to quantify the \noverhead introduced by algorithms with more sophisticated proposals and adaptation schemes. \n- **Target distribution specific**: Obtains metrics across multiple types of target distributions with different shape\nparameters and modality. Combined results indicate how the methods perform in general, while per-target \nresults are also available for analysis.\n\n\n### Evaluation metrics\nAs part of the evaluation process, several metrics are computed and stored in the results file.\n- Kullback–Leibler divergence (KLD)\n- Sampling runtime (T)\n- Memory usage (MEM)\n\nMetrics TODO:\n- Jensen–Shannon divergence (JSD)\n- Bhattacharyya distance (BD)\n- **Normalized Effective Sample Size (NESS)**: Samplers often provide correlated samples. This metric computes the effective\n number of independent samples; dividing it by the total number of samples generated conveys\n how much each sample is worth. 
[ref]\n- Expected value mean squared error (EV-MSE)\n\n\n### Benchmark configuration and execution\nThe evaluation process uses a benchmark YAML file that defines the target distributions \nused to produce the evaluation results. See an example below: \n\n```yaml\n# Define the target densities used for evaluation, label their categories, and provide evaluation parameters\ntargets:\n    - normal:\n        type: Normal\n        params: {loc: 0, scale: 1}\n        tags: [low_kurtosis, low_skew, unimodal]\n        evaluation:\n            batch_size: 2\n            nsamples: 1000\n            nsamples_eval: 2000\n\n    - gmm:\n        type: GMM\n        params: {loc: [0, -0.2], scale: [0.01, 0.001], weight: [0.5, 0.5]}\n        tags: [high_kurtosis, low_skew, multimodal, close_modes]\n        evaluation:\n            batch_size: 2\n            nsamples: 1000\n            nsamples_eval: 2000\n```\n\nThe benchmark YAML file is complemented by a methods YAML file that defines the configuration of the methods \nto be evaluated using the benchmark file.\n```yaml\n# Define the methods to be evaluated and their parameter configurations\nmethods:\n    - name: rejection\n      type: CRejectionSampling\n      params:\n        scaling: 1\n        proposal: uniform\n        kde_bw: 0.01\n\n    - name: mcmc_mh\n      type: CMetropolisHastings\n      params:\n        n_burnin: 10\n        n_steps: 2\n        kde_bw: 0.01\n        proposal_sigma: 0.1\n```\n\nFinally, the configuration YAML file defines the metrics that will be used, the paths to the results, and\nother options that let ais-benchmarks produce additional outputs such as plots or tables.\n```yaml\nnreps: 3  # Number of times each experiment is repeated\n\nrseed: 0  # Random seed used for reproducibility\n\nmetrics: [KLD, MEM, T]  # Metrics computed\n\ndisplay:\n    value: true             # Display 1D and 2D density plots on the screen for debug\n    display_path: results/ 
  # If value is true, save the debug plots as a .png in the provided path\n\n    animation: {value: true,                    # Compile the sequence of visualized sampling steps into an animation\n                fps: 1,                         # Frames per second used for the animation\n                animation_path: results/anim/}  # If value is true, save the plots as a .png in the provided path\n\noutput:\n    file: results/results.txt\n    make_plots: true\n    plots_path: results/\n    plots_dpi: 1200\n```\n\nA benchmark can be executed on the desired methods by using the appropriate script: \n```\nrun_sampling_bechmark.py benchmark.yaml methods.yaml config.yaml\n```\n\nMore thorough examples of the benchmark, methods, and configuration files can be found in the provided defaults: \n*def_benchmark.yaml*, *def_methods.yaml* and *def_config.yaml*.\n\n\n## Framework extension\n### Implementing new sampling algorithms\nImplement the new method by deriving from the sampling_methods.base class that best suits your case: \nCMixtureSamplingMethod for general sampling methods and CMixtureISSamplingMethod for IS methods.\n\nTODO: Explain here how a sampling method is implemented, the methods that it must implement and how samples and\nweights must be stored and updated. Comment also on the behavior of the importance_sample method and how it\nshould be used when the sampling method is not IS but simulation like MCMC or Nested.\n\nReminder to mention the batching and provide an example.\n\nComment on the optional requirement of methods to handle different dimensionality RV\n\n### Adding target distributions\n\n### Adding metrics\n\n\n\n## Authors\n- Javier Felip Leon: javier.felip.leon@intel.com\n\n## Citation\nFelip et al. Benchmarking sampling algorithms\n\n## References\n[1] Víctor Elvira, Luca Martino, David Luengo, and Mónica F. Bugallo.  
Improving population Monte Carlo: Alternative \nweighting and resampling schemes. Signal Processing, 131:77–91, 2017.\n\n[2] Olivier Cappé, Randal Douc, Arnaud Guillin, Jean Michel Marin, and Christian P. Robert. Adaptive importance \nsampling in general mixture classes. Statistics and Computing, 18(4):447–459, 2008.\n\n[3] L Martino, V Elvira, D Luengo, and J Corander. Layered adaptive importance sampling. Statistics and Computing, \n27(3):599–623, May 2017.\n\n[4] W K Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, \n57(1):97–109, 1970.\n\n[5] J Skilling. Nested Sampling for Bayesian Computations. Bayesian Analysis, (4):833–860, 2006.\n\n[6] F. Feroz, M.P. Hobson, E. Cameron, and A.N. Pettitt. Importance Nested Sampling and the MULTINEST Algorithm.\nArxiv astro physics, 2014.\n\n[7] George Casella, Christian P. Robert, and Martin T. Wells. Generalized Accept-Reject sampling schemes, volume \n45 of Lecture Notes–Monograph Series, pages 342–347. Institute of Mathematical Statistics, Beachwood, Ohio, \nUSA, 2004.\n\n[8] Luca Martino, Victor Elvira, David Luengo, and Jukka Corander. An adaptive population importance sampler: \nLearning from uncertainty. IEEE Transactions on Signal Processing, 63(16):4422–4437, 2015.\n\n[9] Felip et al. Tree pyramid adaptive importance sampling\n\n[10] Lu, Xiaoyu, Tom Rainforth, Yuan Zhou, Jan-Willem van de Meent, and Yee Whye Teh. “On Exploration, \nExploitation and Learning in Adaptive Importance Sampling.” ArXiv:1810.13296 [Cs, Stat], October 31, 2018. \n\n\u003c!-- First Review - 10/31/2023 MRB --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintellabs%2Fais-benchmarks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fintellabs%2Fais-benchmarks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintellabs%2Fais-benchmarks/lists"}