{"id":42889150,"url":"https://github.com/tmaklin/rcgpar","last_synced_at":"2026-01-30T14:55:23.413Z","repository":{"id":55157368,"uuid":"424234558","full_name":"tmaklin/rcgpar","owner":"tmaklin","description":"c++ library for parallel and distributed estimation of mixture model components using variational inference.","archived":false,"fork":false,"pushed_at":"2025-11-15T12:53:10.000Z","size":1247,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-11-15T14:35:00.248Z","etag":null,"topics":["c-plus-plus","hpc","mixture-model","mpi","openmp","variational-inference"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-2.1","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tmaklin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-11-03T13:28:24.000Z","updated_at":"2024-09-11T11:46:03.000Z","dependencies_parsed_at":"2024-08-20T15:52:22.189Z","dependency_job_id":"7a4efd0c-3f80-406a-a860-9636f54da494","html_url":"https://github.com/tmaklin/rcgpar","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/tmaklin/rcgpar","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmaklin%2Frcgpar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmaklin%2Frcgpar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmaklin%2Frcgpar/relea
ses","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmaklin%2Frcgpar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tmaklin","download_url":"https://codeload.github.com/tmaklin/rcgpar/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmaklin%2Frcgpar/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28914895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-30T12:13:43.263Z","status":"ssl_error","status_checked_at":"2026-01-30T12:13:22.389Z","response_time":66,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus","hpc","mixture-model","mpi","openmp","variational-inference"],"created_at":"2026-01-30T14:55:22.755Z","updated_at":"2026-01-30T14:55:23.404Z","avatar_url":"https://github.com/tmaklin.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"**Note** rcgpar has been superseded by\n[mixt](https://codeberg.org/themaklin/mixt) as of 24 November 2025.\n\n# rcgpar - Fit mixture models in HPC environments\nrcgpar provides CPU and GPU implementations of a variational\ninference algorithm for estimating mixture model components from a\nlikelihood matrix in parallel.\n\n## Installation\nrcgpar is currently only available compiled from source.\n## Compiling from source\n### Requirements\n- C++17 
compliant compiler.\n- cmake\n\n#### Optional\n- Compiler with OpenMP support.\n- [LibTorch](https://pytorch.org/get-started/locally/)\n- CUDA Toolkit or ROCm\n\n### Compiling\nClone the rcgpar repository\n```\ngit clone https://github.com/tmaklin/rcgpar\n```\nenter the directory and run\n```\ncd rcgpar\nmkdir build\ncd build\n```\n\n... and follow the instructions below.\n\n#### CPU estimation only\nIn the `build/` directory, run\n```\ncmake ..\nmake\n```\n\nThis creates the `librcgomp` library in `build/lib/`.\n\n#### GPU acceleration\nInstead of the above, run\n```\ncmake -DCMAKE_LIBTORCH_PATH=/absolute/path/to/libtorch ..\nmake\n```\nwhere `/absolute/path/to/libtorch` should be the absolute (!) path to the LibTorch distribution.\n\nThis creates the `librcgomp` and `librcggpu` libraries in `build/lib/`.\n\n## Usage\nLink against `librcgomp` and/or `librcggpu` and include the\n`rcgpar.hpp` header in your project. This header provides four\nfunctions:\n- 'rcgpar::rcg\\_optl\\_omp' using OpenMP\n- 'rcgpar::rcg\\_optl\\_torch' using LibTorch\n- 'rcgpar::em\\_torch' a different algorithm using LibTorch.\n\nThe LibTorch algorithms will run on the GPU if one is\npresent. Otherwise, they will run on the CPU. These algorithms are\nfaster even when run on the CPU, but `rcg_optl_torch` consumes more\nmemory than `rcg_optl_omp`.\n\n### rcg\\_optl\\_omp, rcg\\_optl\\_mpi, rcg\\_optl\\_torch, and em\\_torch\nThese four functions perform the actual model fitting. All have to be called with the following\narguments:\n```\nconst rcgpar::Matrix\u003cdouble\u003e \u0026logl:\n    KxN row-major order matrix containing the log-likelihoods for the observations,\n    where K is the number of components and N is the number of observations.\nconst std::vector\u003cdouble\u003e \u0026log_times_observed:\n    N-dimensional vector which contains the natural logarithm of the number\n\tof times that the N:th row in `logl` should be counted. 
Useful if many\n\trows in the log-likelihood matrix are identical - they can be compressed\n\tby counting them several times via this argument.\nconst std::vector\u003cdouble\u003e \u0026alpha0:\n    N-dimensional vector containing the prior parameters of the Dirichlet\n\tdistribution that is used as a conjugate prior in the model. A good\n\tdefault choice is to set all entries to 1.\nconst double \u0026tol:\n    The estimation process will terminate once the evidence lower bound\n\t(ELBO) changes by less than this value from one iteration to the next.\n\tGood choices are around 1e-6 to 1e-8; adjust according to your needs.\nconst uint16_t maxiters:\n    Maximum number of iterations to run the optimizer for if the tolerance\n\tcriterion is not fulfilled.\nstd::ostream \u0026log:\n    Print status messages here. Silence the messages by supplying a\n\tstd::ofstream that has not been assigned to any file.\n```\n'em\\_torch' requires the extra argument:\n```\nstd::string precision:\n    Either \"float\" or \"double\", which determines the precision of the algorithm.\n```\n\nThe optimizers return a KxN `rcgpar::Matrix\u003cdouble\u003e` type row-major order\nmatrix, where each row is a probability vector assigning the row to\nthe mixture components.\n\nNote: rcg\\_optl\\_mpi assumes that the root process holds the full\n'logl' and 'log\\_times\\_observed' values, which are then distributed\nfrom the root process to other processes. In contrast, 'alpha0',\n'tol', and 'maxiters' are assumed to be present on all processes when\ncalling rcg\\_optl\\_mpi.\n\n### mixture\\_components and mixture\\_components\\_torch\nUse 'rcgpar::mixture\\_components\\(_torch)' to transform the matrix from\nrcg\\_optl\\_omp/mpi/torch into a probability vector containing the relative\ncontributions of each mixture component. 
'mixture\\_components\\(_torch)' takes\nthe following input arguments:\n```\nconst rcgpar::Matrix\u003cdouble\u003e \u0026probs:\n    The matrix returned from rcg_optl_omp/torch, em_torch, or rcg_optl_mpi.\nconst std::vector\u003cdouble\u003e \u0026log_times_observed:\n    The N-dimensional vector of log times observed that was used\n\tas input to the call to rcg_optl_omp/torch, em_torch, or rcg_optl_mpi.\n```\n\n'mixture\\_components\\(_torch)' will return an N-dimensional probability vector\ncontaining the mixture component proportions.\n\n### Creating the input matrix\nrcgpar requires the input log-likelihood matrix formatted with the\ninternal rcgpar::Matrix class. If your input log-likelihoods are\nstored in a flattened vector, you can construct the input object to\nrcg\\_optl\\_omp/mpi with the constructor:\n```\nMatrix\u003cdouble\u003e(std::vector\u003cdouble\u003e \u0026flattened_logl,\n               uint16_t n_mixture_components, uint32_t n_observations)\n```\n\nIf your data is stored in a 2D vector, use the following constructor:\n```\nMatrix\u003cdouble\u003e(std::vector\u003cstd::vector\u003cdouble\u003e\u003e \u0026logl_2D)\n```\n\nNote that both constructors assume the data is stored in row-major\norder.\n\n## License\nThe source code from this project is subject to the terms of the\nLGPL-2.1 license. A copy of the LGPL-2.1 license is supplied with the\nproject, or can be obtained at\nhttps://opensource.org/licenses/LGPL-2.1.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftmaklin%2Frcgpar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftmaklin%2Frcgpar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftmaklin%2Frcgpar/lists"}