{"id":13471778,"url":"https://github.com/erdogant/distfit","last_synced_at":"2025-05-14T22:09:48.847Z","repository":{"id":41472508,"uuid":"231843440","full_name":"erdogant/distfit","owner":"erdogant","description":"distfit is a python library for probability density fitting.","archived":false,"fork":false,"pushed_at":"2025-05-04T14:47:30.000Z","size":16558,"stargazers_count":385,"open_issues_count":12,"forks_count":27,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-05-04T15:35:00.125Z","etag":null,"topics":["cumulative-distribution-function","density-functions","fitting-curve","hypothesis-testing","kolmogorov-smirnov","pdf","plot","probability-distribution","probability-statistics","pypi","qqplot","sse"],"latest_commit_sha":null,"homepage":"https://erdogant.github.io/distfit","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/erdogant.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["erdogant"],"buy_me_a_coffee":"erdogant","ko_fi":"erdogant","custom":["https://erdogant.github.io/distfit/pages/html/Documentation.html"],"patreon":null,"open_collective":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null}},"created_at":"2020-01-04T23:36:08.000Z","updated_at":"2025-05-04T14:47:33.000Z","dependencies_parsed_at":"2023-12-16T13:00:54.131Z","dependency_job_id":"b8af9dcd-991a-4c11-b1c2-3cd93e30c415","html_url":"https://github.com/erdogant/distfit","commit_stats":{"total_commits":
503,"total_committers":8,"mean_commits":62.875,"dds":"0.17693836978131217","last_synced_commit":"44d00faaced278ec9d8cbc84b3a7febb3254af22"},"previous_names":[],"tags_count":55,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fdistfit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fdistfit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fdistfit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fdistfit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/erdogant","download_url":"https://codeload.github.com/erdogant/distfit/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254235701,"owners_count":22036964,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cumulative-distribution-function","density-functions","fitting-curve","hypothesis-testing","kolmogorov-smirnov","pdf","plot","probability-distribution","probability-statistics","pypi","qqplot","sse"],"created_at":"2024-07-31T16:00:49.197Z","updated_at":"2025-05-14T22:09:43.827Z","avatar_url":"https://github.com/erdogant.png","language":"Jupyter Notebook","funding_links":["https://github.com/sponsors/erdogant","https://buymeacoffee.com/erdogant","https://ko-fi.com/erdogant","https://erdogant.github.io/distfit/pages/html/Documentation.html","https://www.buymeacoffee.com/erdogant)--"],"categories":["Jupyter 
Notebook"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/logo.png\" width=\"600\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n[![Python](https://img.shields.io/pypi/pyversions/distfit)](https://img.shields.io/pypi/pyversions/distfit)\n[![Pypi](https://img.shields.io/pypi/v/distfit)](https://pypi.org/project/distfit/)\n[![Docs](https://img.shields.io/badge/Sphinx-Docs-Green)](https://erdogant.github.io/distfit/)\n[![LOC](https://sloc.xyz/github/erdogant/distfit/?category=code)](https://github.com/erdogant/distfit/)\n[![Downloads](https://static.pepy.tech/personalized-badge/distfit?period=month\u0026units=international_system\u0026left_color=grey\u0026right_color=brightgreen\u0026left_text=PyPI%20downloads/month)](https://pepy.tech/project/distfit)\n[![Downloads](https://static.pepy.tech/personalized-badge/distfit?period=total\u0026units=international_system\u0026left_color=grey\u0026right_color=brightgreen\u0026left_text=Downloads)](https://pepy.tech/project/distfit)\n[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/erdogant/distfit/blob/master/LICENSE)\n[![Forks](https://img.shields.io/github/forks/erdogant/distfit.svg)](https://github.com/erdogant/distfit/network)\n[![Issues](https://img.shields.io/github/issues/erdogant/distfit.svg)](https://github.com/erdogant/distfit/issues)\n[![Project 
Status](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)\n[![DOI](https://zenodo.org/badge/231843440.svg)](https://zenodo.org/badge/latestdoi/231843440)\n[![Medium](https://img.shields.io/badge/Medium-Blog-black)](https://erdogant.github.io/distfit/pages/html/Documentation.html#medium-blog)\n[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://erdogant.github.io/distfit/pages/html/Documentation.html#colab-notebook)\n[![Donate](https://img.shields.io/badge/Support%20this%20project-grey.svg?logo=github%20sponsors)](https://erdogant.github.io/distfit/pages/html/Documentation.html#)\n\u003c!---[![BuyMeCoffee](https://img.shields.io/badge/buymea-coffee-yellow.svg)](https://www.buymeacoffee.com/erdogant)--\u003e\n\u003c!---[![Coffee](https://img.shields.io/badge/coffee-black-grey.svg)](https://erdogant.github.io/donate/?currency=USD\u0026amount=5)--\u003e\n\n# \n### Blogs\n#### [1. How to Find the Best Theoretical Distribution for Your Data](https://erdogant.github.io/distfit/pages/html/Documentation.html#medium-blog)\n\n#### [2. Outlier Detection Using Distribution Fitting in Univariate Datasets](https://towardsdatascience.com/outlier-detection-using-distribution-fitting-in-univariate-data-sets-ac8b7a14d40e)\n\n#### [3. 
Step-by-Step Guide to Generate Synthetic Data by Sampling From Univariate Distributions](https://towardsdatascience.com/step-by-step-guide-to-generate-synthetic-data-by-sampling-from-univariate-distributions-6b0be4221cb1)\n\n\n\n# \n\n### [Documentation pages](https://erdogant.github.io/distfit/)\n\n# \n\n``distfit`` is a python package for probability density fitting of univariate distributions for random variables.\nWith the random variable as an input, distfit can find the best fit for parametric, non-parametric, and discrete distributions.\n\n* For the parametric approach, the distfit library can determine the best fit across 89 theoretical distributions.\n  To score the fit, one of the scoring statistics for the goodness-of-fit test can be used, such as RSS/SSE, Wasserstein,\n  Kolmogorov-Smirnov (KS), or Energy. After finding the best-fitted theoretical distribution, the loc, scale,\n  and arg parameters are returned, such as mean and standard deviation for the normal distribution.\n\n* For the non-parametric approach, the distfit library contains two methods, the quantile and percentile method.\n  Both methods assume that the data does not follow a specific probability distribution. 
In the case of the quantile method,\n  the quantiles of the data are modeled whereas for the percentile method, the percentiles are modeled.\n\n* In case the dataset contains discrete values, the distfit library contains the option for discrete fitting.\n  The best fit is then derived using the binomial distribution.\n\n# \n**⭐️ Star this repo if you like it ⭐️**\n# \n\n\n\n### Installation\n\n##### Install distfit from PyPI\n```bash\npip install distfit\n```\n\n##### Install from github source (beta version)\n```bash\npip install git+https://github.com/erdogant/distfit\n```  \n\n##### Check version\n```python\nimport distfit\nprint(distfit.__version__)\n```\n\n##### The following functions are available after installation:\n\n```python\n# Import library\nfrom distfit import distfit\n\ndfit = distfit()        # Initialize \ndfit.fit_transform(X)   # Fit distributions on empirical data X\ndfit.predict(y)         # Predict the probability of the response variables\ndfit.plot()             # Plot the best fitted distribution (y is included if prediction is made)\n```\n\n\u003chr\u003e\n\n### Examples\n\n# \n\n##### [Example: Quick start to find best fit for your input data](https://erdogant.github.io/distfit/pages/html/Examples.html#)\n\n```python\n\n# [distfit] \u003eINFO\u003e fit\n# [distfit] \u003eINFO\u003e transform\n# [distfit] \u003eINFO\u003e [norm      ] [0.00 sec] [RSS: 0.00108326] [loc=-0.048 scale=1.997]\n# [distfit] \u003eINFO\u003e [expon     ] [0.00 sec] [RSS: 0.404237] [loc=-6.897 scale=6.849]\n# [distfit] \u003eINFO\u003e [pareto    ] [0.00 sec] [RSS: 0.404237] [loc=-536870918.897 scale=536870912.000]\n# [distfit] \u003eINFO\u003e [dweibull  ] [0.06 sec] [RSS: 0.0115552] [loc=-0.031 scale=1.722]\n# [distfit] \u003eINFO\u003e [t         ] [0.59 sec] [RSS: 0.00108349] [loc=-0.048 scale=1.997]\n# [distfit] \u003eINFO\u003e [genextreme] [0.17 sec] [RSS: 0.00300806] [loc=-0.806 scale=1.979]\n# [distfit] \u003eINFO\u003e [gamma     ] [0.05 sec] [RSS: 
0.00108459] [loc=-1862.903 scale=0.002]\n# [distfit] \u003eINFO\u003e [lognorm   ] [0.32 sec] [RSS: 0.00121597] [loc=-110.597 scale=110.530]\n# [distfit] \u003eINFO\u003e [beta      ] [0.10 sec] [RSS: 0.00105629] [loc=-16.364 scale=32.869]\n# [distfit] \u003eINFO\u003e [uniform   ] [0.00 sec] [RSS: 0.287339] [loc=-6.897 scale=14.437]\n# [distfit] \u003eINFO\u003e [loggamma  ] [0.12 sec] [RSS: 0.00109042] [loc=-370.746 scale=55.722]\n# [distfit] \u003eINFO\u003e Compute confidence intervals [parametric]\n# [distfit] \u003eINFO\u003e Compute significance for 9 samples.\n# [distfit] \u003eINFO\u003e Multiple test correction method applied: [fdr_bh].\n# [distfit] \u003eINFO\u003e Create PDF plot for the parametric method.\n# [distfit] \u003eINFO\u003e Mark 5 significant regions\n# [distfit] \u003eINFO\u003e Estimated distribution: beta [loc:-16.364265, scale:32.868811]\n```\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Examples.html#make-predictions\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/example_figP4c.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n# \n\n##### [Example: Plot summary of the tested distributions](https://erdogant.github.io/distfit/pages/html/Examples.html#plot-rss)\n\nAfter we have a fitted model, we can make some predictions using the theoretical distributions. 
\nAfter making some predictions, we can plot again but now the predictions are automatically included.\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Examples.html#plot-rss\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/fig1_summary.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n# \n\n##### [Example: Make predictions using the fitted distribution](https://erdogant.github.io/distfit/pages/html/Examples.html#make-predictions)\n\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Examples.html#make-predictions\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/example_figP1a.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n\n# \n\n##### [Example: Test for one specific distribution](https://erdogant.github.io/distfit/pages/html/Examples.html#fit-for-one-specific-distribution)\n\nThe full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Examples.html#fit-for-one-specific-distribution\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/example_figP3b.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n# \n\n##### [Example: Test for multiple distributions](https://erdogant.github.io/distfit/pages/html/Examples.html#fit-for-multiple-distributions)\n\nThe full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Examples.html#fit-for-multiple-distributions\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/example_figP2b.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n# \n\n\n##### [Example: 
Fit discrete distribution](https://erdogant.github.io/distfit/pages/html/Discrete.html)\n\n\n```python\nfrom scipy.stats import binom\n# Generate random numbers\n\n# Set parameters for the test-case\nn = 8\np = 0.5\n\n# Generate 10000 samples of the distribution of (n, p)\nX = binom(n, p).rvs(10000)\nprint(X)\n\n# [5 1 4 5 5 6 2 4 6 5 4 4 4 7 3 4 4 2 3 3 4 4 5 1 3 2 7 4 5 2 3 4 3 3 2 3 5\n#  4 6 7 6 2 4 3 3 5 3 5 3 4 4 4 7 5 4 5 3 4 3 3 4 3 3 6 3 3 5 4 4 2 3 2 5 7\n#  5 4 8 3 4 3 5 4 3 5 5 2 5 6 7 4 5 5 5 4 4 3 4 5 6 2...]\n\n# Import distfit\nfrom distfit import distfit\n\n# Initialize for discrete distribution fitting\ndfit = distfit(method='discrete')\n\n# Run distfit to determine whether we can find the parameters from the data.\ndfit.fit_transform(X)\n\n# [distfit] \u003efit..\n# [distfit] \u003etransform..\n# [distfit] \u003eFit using binomial distribution..\n# [distfit] \u003e[binomial] [SSE: 7.79] [n: 8] [p: 0.499959] [chi^2: 1.11]\n# [distfit] \u003eCompute confidence interval [discrete]\n\n```\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Discrete.html\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/binomial_plot.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n# \n\n##### [Example: Make predictions on unseen data for discrete distribution](https://erdogant.github.io/distfit/pages/html/Discrete.html#make-predictions)\n\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/distfit/pages/html/Discrete.html#make-predictions\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/distfit/blob/master/docs/figs/binomial_plot_predict.png\" width=\"450\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n# \n\n\n##### [Example: Generate samples based on the fitted distribution](https://erdogant.github.io/distfit/pages/html/Generate.html)\n\n\u003chr\u003e\n\n### Contributors\nSetting up and maintaining distfit has been possible thanks to 
users and contributors. Thanks:\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://github.com/erdogant/distfit/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=erdogant/distfit\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n### Citation\nPlease cite ``distfit`` in your publications if this is useful for your research. See column right for citation information.\n\n### Maintainer\n* Erdogan Taskesen, github: [erdogant](https://github.com/erdogant)\n* Contributions are welcome.\n* If you wish to buy me a \u003ca href=\"https://erdogant.github.io/donate/?currency=USD\u0026amount=5\"\u003eCoffee\u003c/a\u003e for this work, it is very appreciated :)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ferdogant%2Fdistfit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ferdogant%2Fdistfit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ferdogant%2Fdistfit/lists"}