{"id":13415706,"url":"https://github.com/milcent/benford_py","last_synced_at":"2025-03-14T23:31:03.217Z","repository":{"id":26108053,"uuid":"29552320","full_name":"milcent/benford_py","owner":"milcent","description":"Python implementation of Benford's Law tests.","archived":false,"fork":false,"pushed_at":"2022-10-11T08:35:46.000Z","size":11032,"stargazers_count":150,"open_issues_count":3,"forks_count":51,"subscribers_count":13,"default_branch":"master","last_synced_at":"2024-08-30T22:39:01.634Z","etag":null,"topics":["accounting","auditing","benford","benford-compliant","benfords-law","compliance","digit","financial-analysis","fraud-detection","matplotlib","numpy","pandas","python","python3","research","simon-newcomb"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/milcent.png","metadata":{"files":{"readme":"README-pypi.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null}},"created_at":"2015-01-20T21:11:59.000Z","updated_at":"2024-08-01T10:21:20.000Z","dependencies_parsed_at":"2022-07-27T06:16:16.821Z","dependency_job_id":null,"html_url":"https://github.com/milcent/benford_py","commit_stats":null,"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milcent%2Fbenford_py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milcent%2Fbenford_py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milcent%2Fbenford_py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milcent%2Fbenford_py/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/milcent","download_url":"https://codeload.github.com/milcent/benford_py/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243663434,"owners_count":20327299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accounting","auditing","benford","benford-compliant","benfords-law","compliance","digit","financial-analysis","fraud-detection","matplotlib","numpy","pandas","python","python3","research","simon-newcomb"],"created_at":"2024-07-30T21:00:51.481Z","updated_at":"2025-03-14T23:31:03.212Z","avatar_url":"https://github.com/milcent.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"readme":"[![Downloads](https://pepy.tech/badge/benford-py)](https://pepy.tech/project/benford-py)\n\n# Benford for Python\n\n--------------------------------------------------------------------------------\n\n**Citing**\n\n\nIf you find *Benford_py* useful in your research, please consider adding the following citation:\n\n```bibtex\n@misc{benford_py,\n      author = {Marcel, Milcent},\n      title = {{Benford_py: a Python Implementation of Benford's Law Tests}},\n      year = {2017},\n      publisher = {GitHub},\n      journal = {GitHub repository},\n      howpublished = {\\url{https://github.com/milcent/benford_py}},\n}\n```\n\n--------------------------------------------------------------------------------\n\n`current version = 0.5.0`\n\n### See [release notes](https://github.com/milcent/benford_py/releases/) for features in this 
and in older versions\n\n### Python versions \u003e= 3.6\n\n### Installation\n\nBenford_py is a package on PyPI, so you can install it with pip:\n\n`pip install benford_py`\n\nor\n\n`pip install benford-py`\n\nOr you can cd into the site-packages subfolder of your Python distribution (or environment) and git clone from there:\n\n`git clone https://github.com/milcent/benford_py`\n\nFor a quick start, please go to the [Demo notebook](https://github.com/milcent/benford_py/blob/master/Demo.ipynb), in which I show examples of how to run the tests with the SPY (S\u0026P 500 ETF) daily returns.\n\nFor more fine-grained details of the functions and classes, see the [docs](https://benford-py.readthedocs.io/en/latest/index.html).\n\n### Background\n\nThe first digit of a number is [its leftmost digit](https://github.com/milcent/benford_py/blob/master/img/First_Digits.png).\n\nSince the first digit of any number can range from \"1\" to \"9\"\n(not considering \"0\"), it would be intuitively expected that the\nproportion of each occurrence in a set of numerical records would\nbe uniformly distributed at 1/9, i.e., approximately 0.1111,\nor 11.11%.\n\n[Benford's Law](https://en.wikipedia.org/wiki/Benford%27s_law),\nalso known as the Law of First Digits or the Phenomenon of\nSignificant Digits, is the finding that the first digits of the\nnumbers found in series of records of the most varied sources do\nnot display a uniform distribution, but rather are arranged in such\na way that the digit \"1\" is the most frequent, followed by \"2\",\n\"3\", and so on, in decreasing order down to \"9\", \nwhich presents the lowest frequency as the first digit.\n\nThe expected distributions of the First Digits in a\nBenford-compliant data set are the ones shown [here](https://github.com/milcent/benford_py/blob/master/img/First.png).\n\nThe first record on the subject dates from 1881, in the work of\n[Simon 
Newcomb](https://github.com/milcent/benford_py/blob/master/img/Simon_Newcomb_APS.jpg), an American-Canadian astronomer and mathematician,\nwho noted that in the logarithmic tables the first pages, which\ncontained logarithms beginning with the numerals \"1\" and \"2\",\nwere more worn out, that is, more consulted.\n\nIn that same article, Newcomb proposed the [formula](https://github.com/milcent/benford_py/blob/master/img/formula.png) for the probability of a certain digit \"d\" \nbeing the first digit of a number: `P(d) = log10(1 + 1/d)`.\n\nIn 1938, the American physicist [Frank Benford](https://github.com/milcent/benford_py/blob/master/img/2429_Benford-Frank.jpg) revisited the \nphenomenon, which he called the \"Law of Anomalous Numbers,\" in \na survey with more than 20,000 observations of empirical data \ncompiled from various sources, ranging from areas of rivers to\nmolecular weights of chemical compounds, including cost data, \naddress numbers, population sizes and physical constants. All \nof them, to a greater or lesser extent, followed this \ndistribution.\n\nThe extent of Benford's work seems to have been one good reason \nfor the phenomenon to be popularized with his name, though it had been \ndescribed by Newcomb 57 years earlier.\n\nDerivations of the original formula were also applied to find the \nexpected proportions of digits in other \npositions in the number, as in the case of the second digit\n(BENFORD, 1938), as well as combinations, such as the first \ntwo digits of a number (NIGRINI, 2012, p.5).\n\nOnly in 1995, however, was the phenomenon proven by Hill. 
\nHis proof was based on the fact that numbers in data series\nfollowing Benford's Law are, in effect, \"second generation\"\ndistributions, i.e., combinations of other distributions.\nThe union of randomly drawn samples from various distributions\nforms a distribution that respects Benford's Law (HILL, 1995).\n\nWhen grouped in ascending order, data that obey Benford's Law \nmust approximate a geometric sequence (NIGRINI, 2012, p.21).\nFrom this it follows that the logarithms of this ordered series\nmust form a straight line. In addition, the mantissas (decimal\nparts) of the logarithms of these numbers must be uniformly\ndistributed in the interval [0,1] (NIGRINI, 2012, p.10).\n\nIn general, a series of numerical records follows Benford's Law\nwhen (NIGRINI, 2012, p.21):\n* it represents magnitudes of facts or events, such as populations\nof cities, flows of water in rivers or sizes of celestial bodies;\n* it does not have pre-established minimum or maximum limits;\n* it is not made up of numbers used as identifiers, such as \nidentity or social security numbers, bank accounts, telephone numbers; and\n* its mean is greater than the median, and the data is not\nconcentrated around the mean.\n\nIt follows from this expected distribution that, if the set of\nnumbers in a series of records that usually respects the Law\nshows a deviation in the proportions found, there may be\ndistortions, whether intentional or not.\n\nBenford's Law has been used in [several fields](http://www.benfordonline.net/). 
\nAfter asserting that the usual data type is Benford-compliant,\none can study samples from the same data type in search of\ninconsistencies, errors or even [fraud](https://www.amazon.com.br/Benfords-Law-Applications-Accounting-Detection/dp/1118152859).\n\nThis open source module is an attempt to facilitate the \nperformance of Benford's Law-related tests by people using\nPython, whether interactively or in an automated, scripted way.\n\nIt uses the versatility of numpy and pandas, along with\nmatplotlib for visualization, to deliver results like [this one](https://github.com/milcent/benford_py/blob/master/img/SPY-f2d-conf_level-95.png) and much more.\n\n\nIt has been a long time since I last tested it in Python 2. The death clock has stopped ticking, so officially it is for Python 3 now. It should work on Linux, Windows and Mac, but please file a bug report if you run into some trouble.\n\nAlso, if you have some nice data set that we can run these tests on, let us try it.\n\nThanks!\n\nMilcent\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmilcent%2Fbenford_py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmilcent%2Fbenford_py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmilcent%2Fbenford_py/lists"}