{"id":24186546,"url":"https://github.com/kylesayrs/gmmpytorch","last_synced_at":"2025-09-21T10:31:47.897Z","repository":{"id":208726123,"uuid":"722240506","full_name":"kylesayrs/GMMPytorch","owner":"kylesayrs","description":"Pytorch implementation of same-family gaussian mixture models with guardrails. Features separable parameter optimization and singularity mitigation","archived":false,"fork":false,"pushed_at":"2025-05-31T01:30:11.000Z","size":620,"stargazers_count":22,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-31T12:20:50.570Z","etag":null,"topics":["gaussian-mixture-models","gmm","machine-learning","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kylesayrs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-11-22T18:30:09.000Z","updated_at":"2025-05-31T01:30:14.000Z","dependencies_parsed_at":"2023-11-25T01:30:48.284Z","dependency_job_id":"5f75dab7-3697-410d-bacb-b2092017096c","html_url":"https://github.com/kylesayrs/GMMPytorch","commit_stats":null,"previous_names":["kylesayrs/gmmpytorch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kylesayrs/GMMPytorch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylesayrs%2FGMMPytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylesayrs%2FGMMPytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylesayrs%2FGMMPyt
orch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylesayrs%2FGMMPytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kylesayrs","download_url":"https://codeload.github.com/kylesayrs/GMMPytorch/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylesayrs%2FGMMPytorch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276229019,"owners_count":25606938,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-21T02:00:07.055Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gaussian-mixture-models","gmm","machine-learning","pytorch"],"created_at":"2025-01-13T12:36:02.582Z","updated_at":"2025-09-21T10:31:47.891Z","avatar_url":"https://github.com/kylesayrs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gaussian Mixture Models in Pytorch #\nImplements gaussian mixture models in pytorch. 
The loss is the mean negative log likelihood of the data, minimized via gradient descent.\n\n\u003cp align=\"center\"\u003e\n\u003cimg width=\"75%\" src=\"assets/5_clusters.png\" alt=\"Example Optimization\"/\u003e\n\u003c/p\u003e\n\n## Usage ##\nRun demo\n```\nusage: demo.py [-h] [--samples SAMPLES] [--components COMPONENTS] [--dims DIMS]\n               [--iterations ITERATIONS]\n               [--family {full,diagonal,isotropic,shared_isotropic,constant}] [--log_freq LOG_FREQ]\n               [--radius RADIUS] [--mixture_lr MIXTURE_LR] [--component_lr COMPONENT_LR]\n               [--visualize VISUALIZE] [--seed SEED]\n\nFit a gaussian mixture model to generated mock data\n\noptions:\n  -h, --help            show this help message and exit\n  --samples SAMPLES     The number of total samples in dataset\n  --components COMPONENTS\n                        The number of gaussian components in mixture model\n  --dims DIMS           The number of data dimensions\n  --iterations ITERATIONS\n                        The number optimization steps\n  --family {full,diagonal,isotropic,shared_isotropic,constant}\n                        Model family, see `Mixture Types`\n  --log_freq LOG_FREQ   Steps per log event\n  --radius RADIUS       L1 bound of data samples\n  --mixture_lr MIXTURE_LR\n                        Learning rate of mixture parameter (pi)\n  --component_lr COMPONENT_LR\n                        Learning rate of component parameters (mus, sigmas)\n  --visualize VISUALIZE\n                        True for visualization at each log event and end\n  --seed SEED           seed for numpy and torch\n\n```\n\nUsage\n```python3\ndata = load_data(...)\n\nmodel = GmmFull(num_components=3, num_dims=2)\n\nloss = model.fit(\n    data,\n    num_iterations=10_000,\n    mixture_lr=1e-5,\n    component_lr=1e-2\n)\n\n# visualize\nprint(f\"Final Loss: {loss:.2f}\")\nplot_data_and_model(data, model)\n```\n\nRun tests\n```bash\npython3 -m pytest tests\n```\n\n## Derivation 
##\nWe start with the probability density function of a multivariate gaussian parameterized by mean $\\mu \\in \\mathbb{R}^{d}$ and the covariance matrix $\\Sigma \\in \\mathrm{S}_{++}^d$ (symmetric positive definite). The PDF describes the relative likelihood of sampling a point $\\mathbf{x}\\in\\mathbb{R}^{d}$ from the distribution.\n\n```math\n\\mathcal{N}(\\mathbf{x} | \\mathbf{\\mu}, \\Sigma) = \\frac{1}{(2\\pi)^{d/2}|\\Sigma|^{1/2}} \\exp\\left(-\\frac{1}{2} (\\mathbf{x} - \\mathbf{\\mu})^T \\Sigma^{-1} (\\mathbf{x} - \\mathbf{\\mu})\\right)\n```\n\nTo describe a mixture of gaussians, we add a parameter $\\pi \\in \\Delta^{K-1}$, whose entry $\\pi_k$ is the probability that a sample comes from the $k$-th of the $K$ gaussian components.\n\n```math\np(\\mathbf{x}) = \\sum_{k=1}^{K} \\pi_k \\mathcal{N}(\\mathbf{x} | \\mathbf{\\mu}_k, \\mathbf{\\Sigma}_k)\n```\n\nGiven a dataset $X \\in \\mathbb{R}^{d \\times N}$ of $N$ points $\\mathbf{x}_i$, we want our model to fit the data. This means maximizing the likelihood that the points could have been sampled from the mixture PDF $p(\\mathbf{x})$. Applying $-\\ln p(\\mathbf{x}_i)$ to each point assigns a low cost to likely points while leaving the cost of an unlikely point ($p \\approx 0$) unbounded. Given a dataset which contains few outliers and sufficiently many components to cover the dataset, these properties make the mean negative log likelihood a suitable choice for our objective function.\n\n```math\nf(X) = - \\frac{1}{N} \\sum_{i=1}^{N} \\ln{ p(\\mathbf{x}_i) }\n```\n\nFor a from-scratch implementation of negative log likelihood backpropagation, see [GMMScratch](https://github.com/kylesayrs/GMMScratch/tree/master).\n\n\n## Gaussian Model Types ##\n| Type       | Description                                                                   |\n| ---------- | ----------------------------------------------------------------------------- |\n| Full       | Fully expressive eigenvalues. 
Data can be skewed in any direction             |\n| Diagonal   | Eigenvalues align with data axes. Dimensional variance is independent         |\n| Isotropic  | Equal variance in all directions. Spherical distributions                     |\n| Shared     | Equal variance in all directions for all components                           |\n| Constant   | Variance is not learned and is equal across all dimensions and components     |\n\nWhile more expressive varieties are able to better fit to real-world data, they require learning more parameters and are often less stable during training. As of now, all but the Constant mixture type have been implemented.\n\n## Comparison to Expectation Maximization (EM) Algorithm ##\nFor more information, see [On Convergence Properties of the EM\nAlgorithm for Gaussian Mixtures](https://dspace.mit.edu/bitstream/handle/1721.1/7195/AIM-1520.pdf?sequence=2).\n\n## Singularity Mitigation ##\nFrom Pattern Recognition and Machine Learning by Christopher M. Bishop, pg. 433:\n\u003e Suppose that one of the components of the mixture model, let us say the jth component, has its mean μ_j exactly equal to one of the data points so that μ_j = x_n for some value of n. If we consider the limit σ_j → 0, then we see that this term goes to infinity and so the log likelihood function will also go to infinity. Thus the maximization of the log likelihood function is not a well posed problem because such singularities will always be present and will occur whenever one of the Gaussian components ‘collapses’ onto a specific data point.\n\nA common solution to this problem is to reset the mean of the offending component whenever a singularity appears. In practice, singularities can be mitigated by clamping the minimum value of elements on the covariance diagonal. In a stochastic environment, a large enough clamp value will allow the model to recover after a few iterations.\n\n## Motivation ##\nThis project is not associated with any course or program. 
Instead, I hope that it serves as an educational tool for exploring the capabilities and engineering behind probabilistic modeling, custom loss functions, and differentiable programming in PyTorch.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkylesayrs%2Fgmmpytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkylesayrs%2Fgmmpytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkylesayrs%2Fgmmpytorch/lists"}