{"id":27999268,"url":"https://github.com/probcomp/cgpm","last_synced_at":"2025-10-19T16:42:20.296Z","repository":{"id":46069741,"uuid":"49149711","full_name":"probcomp/cgpm","owner":"probcomp","description":"Library of composable generative population models which serve as the modeling and inference backend of BayesDB.","archived":false,"fork":false,"pushed_at":"2024-07-11T15:39:12.000Z","size":9489,"stargazers_count":25,"open_issues_count":44,"forks_count":11,"subscribers_count":24,"default_branch":"master","last_synced_at":"2025-05-08T22:57:27.855Z","etag":null,"topics":["bayesian-inference","data-analysis","machine-learning","probabilistic-programming","tabular-data"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/probcomp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-01-06T17:16:53.000Z","updated_at":"2023-05-29T11:04:57.000Z","dependencies_parsed_at":"2023-01-18T09:15:43.944Z","dependency_job_id":"be228d03-1aa9-48aa-acc5-39bcdaeefce0","html_url":"https://github.com/probcomp/cgpm","commit_stats":null,"previous_names":[],"tags_count":27,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probcomp%2Fcgpm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probcomp%2Fcgpm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probcomp%2Fcgpm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probcomp%2Fcgpm/
manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/probcomp","download_url":"https://codeload.github.com/probcomp/cgpm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253160790,"owners_count":21863627,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-inference","data-analysis","machine-learning","probabilistic-programming","tabular-data"],"created_at":"2025-05-08T22:57:34.012Z","updated_at":"2025-10-19T16:42:20.206Z","avatar_url":"https://github.com/probcomp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# cgpm\n\n[![Build Status](https://travis-ci.org/probcomp/cgpm.svg?branch=master)](https://travis-ci.org/probcomp/cgpm)\n\nThe aim of this project is to provide a unified probabilistic programming\nframework to express different models and techniques from statistics, machine\nlearning and non-parametric Bayes. It serves as the primary modeling and\ninference runtime system for [bayeslite](https://github.com/probcomp/bayeslite),\nan open-source implementation of BayesDB.\n\nComposable generative population models (CGPM) are a computational abstraction\nfor probabilistic objects. They provide an interface that explicitly\ndifferentiates between the _sampler_ of a random variable from its conditional\ndistribution and the _assessor_ of its conditional density. 
By encapsulating\nmodels as probabilistic programs that implement CGPMs, complex models can be\nbuilt as compositions of sub-CGPMs, and queried in a model-independent way\nusing the Bayesian Query Language.\n\n## Installing\n\n### Conda\n\nThe easiest way to install cgpm is to use the\n[package](https://anaconda.org/probcomp/cgpm) on Anaconda Cloud.\nPlease follow [these instructions](https://github.com/probcomp/iventure/blob/master/docs/conda.md).\n\n### Manual Build\n\n`cgpm` targets Ubuntu 14.04 and 16.04. The package can be installed by cloning\nthis repository and following these instructions. It is _highly recommended_ to\ninstall `cgpm` inside of a virtualenv which was created using the\n`--system-site-packages` flag.\n\n1. Install dependencies from `apt`, [listed here](https://github.com/probcomp/cgpm/blob/71fe62790f466e9dd2149d0f527c584cce19e70f/docker/ubuntu1604#L4-L14).\n\n2. Retrieve and build the source.\n\n    ```\n    % git clone git@github.com:probcomp/cgpm\n    % cd cgpm\n    % pip install --no-deps .\n    ```\n\n3. Verify the installation.\n\n    ```\n    % python -c 'import cgpm'\n    % cd cgpm \u0026\u0026 ./check.sh\n    ```\n\n## Publications\n\nCGPMs, and their integration as a runtime system for\n[BayesDB](http://probcomp.csail.mit.edu/bayesdb/), are described in the following\ntechnical report:\n\n- __Probabilistic Data Analysis with Probabilistic Programming__.\nSaad, F., and Mansinghka, V. [_arXiv preprint, arXiv:1608.05347_](https://arxiv.org/abs/1608.05347), 2017.\n\nFurther applications of cgpm and bayeslite to data analysis tasks can be\nfound in:\n\n- __Probabilistic Search for Structured Data via Probabilistic Programming and Nonparametric Bayes__.\nSaad, F., Casarsa, L., and Mansinghka, V. [_arXiv preprint, arXiv:1704.01087_](https://arxiv.org/abs/1704.01087), 2017.\n\n- __Detecting Dependencies in Sparse, Multivariate Databases Using Probabilistic Programming and Non-parametric Bayes__.\nSaad, F., and Mansinghka, V. 
[_Artificial Intelligence and Statistics (AISTATS)_](http://proceedings.mlr.press/v54/saad17a.html), 2017.\n\n- __A Probabilistic Programming Approach to Probabilistic Data Analysis__.\nSaad, F., and Mansinghka, V. [_Advances in Neural Information Processing Systems (NIPS)_](https://papers.nips.cc/paper/6060-a-probabilistic-programming-approach-to-probabilistic-data-analysis.html), 2016.\n\n\n## Tests\n\nRunning `./check.sh` will run a subset of the tests that are considered complete\nand stable. To launch the full test suite, including continuous integration\ntests, run `py.test` in the root directory. There are more tests in the `tests/`\ndirectory, but those that do not start with `test_` or do start with `disabled_`\nare not considered ready. The tip of every branch merged into master __must__\npass `./check.sh`, and be consistent with the code conventions outlined in\n[HACKING](HACKING).\n\nTo run the full test suite, use `./check.sh --integration tests/`. Note that the\nfull integration test suite requires installing the C++\n[crosscat](https://github.com/probcomp/crosscat) backend.\n\n## License\n\nCopyright (c) 2015-2016 MIT Probabilistic Computing Project\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n   http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprobcomp%2Fcgpm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprobcomp%2Fcgpm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprobcomp%2Fcgpm/lists"}