{"id":17348005,"url":"https://github.com/jelchison/estimate-randomness","last_synced_at":"2025-08-19T11:10:36.756Z","repository":{"id":80501294,"uuid":"103690866","full_name":"JElchison/estimate-randomness","owner":"JElchison","description":"Estimate the randomness of a sequence of integers meant to model a discrete distribution","archived":false,"fork":false,"pushed_at":"2017-09-15T19:15:25.000Z","size":17,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-27T11:22:00.565Z","etag":null,"topics":["entropy","python","randomness","randomness-testing"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JElchison.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-15T18:50:51.000Z","updated_at":"2022-01-02T23:58:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"5a93f54c-191b-462c-b29f-27c56bdcbb60","html_url":"https://github.com/JElchison/estimate-randomness","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/JElchison/estimate-randomness","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JElchison%2Festimate-randomness","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JElchison%2Festimate-randomness/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JElchison%2Festimate-randomness/releases","manifests_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/JElchison%2Festimate-randomness/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JElchison","download_url":"https://codeload.github.com/JElchison/estimate-randomness/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JElchison%2Festimate-randomness/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271143398,"owners_count":24706346,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-19T02:00:09.176Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["entropy","python","randomness","randomness-testing"],"created_at":"2024-10-15T16:50:48.963Z","updated_at":"2025-08-19T11:10:36.737Z","avatar_url":"https://github.com/JElchison.png","language":"Python","readme":"# estimate-randomness\nEstimate the randomness of a sequence of integers meant to model a discrete distribution.\n\nA perfectly random discrete distribution will model a [discrete uniform distribution](https://en.wikipedia.org/wiki/Discrete_uniform_distribution), having a maximum entropy of `ln(n)`.  This script attempts to estimate the amount of randomness of a sequence of integers in that distribution.\n\nIt does so by first calculating the [entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)) of the data set.  
Then, it repeats the process, but replaces the data set with its differential complement (e.g., item 5 in the list is replaced by the difference between items 5 and 4).  This is to prevent a sequence like `1 2 3 4 5` from being considered random.  Such a sequence is perfectly uniform and therefore has maximal entropy; however, it should not be considered random.\n\nThe value reported as estimated_randomness is the minimum of the entropy of the data set and the entropy of its differential complement, each normalized by the maximum possible entropy.\n\n* 0.0 means not random\n* 1.0 means highly random\n\nWith a large enough data set, the reported estimated_randomness value should approximate 1.0 for truly random data sources.\n\nDisclaimer:  A high estimated_randomness value does not *prove* that the input stream is random.  [Proving randomness](https://en.wikipedia.org/wiki/Randomness_tests) is a difficult task.\n\n\n## Usage\n\n```\n./estimate-randomness.py [input_file [(lower_bound) (upper_bound)]]\n\ninput_file - Place from which to read list of integers.  Can be '-' for stdin.\n             Defaults to '-' for stdin.\n             \nlower_bound - Lowest integer value expected to be seen in input_file (inclusive).\n              Defaults to minimum of data set.\n              Useful for detecting when items at beginning and end of range are not used.\n              \nupper_bound - Highest integer value expected to be seen in input_file (inclusive).\n              Defaults to maximum of data set.\n              Useful for detecting when items at beginning and end of range are not used.\n```\n\n## Examples\nInput sequence = 1 1 1 1 1\n```\n$ for NUM in 1 1 1 1 1; do echo $NUM; done | ./estimate-randomness.py\nNo range provided.  
Not counting any values with frequency of 0.\nUsing total range of 1.\n\nmax_possible_entropy = 0.0\n\noverall_entropy = 0.0\noverall_randomness = 0\n\ndifferential_entropy = 0.0\ndifferential_randomness = 0\n\nestimated_randomness = 0\n```\n\nInput sequence = 1 2 3 4 5\n```\n$ for NUM in 1 2 3 4 5; do echo $NUM; done | ./estimate-randomness.py\nNo range provided.  Not counting any values with frequency of 0.\nUsing total range of 5.\n\nmax_possible_entropy = 1.60943791243\n\noverall_entropy = 1.60943791243\noverall_randomness = 1.0\n\ndifferential_entropy = 0.0\ndifferential_randomness = 0\n\nestimated_randomness = 0\n```\n\nInput sequence = set of 2048 integers (between 0 and 255) from /dev/urandom\n```\n$ for HEX in $(head -c 2048 /dev/urandom | xxd -p | sed -r 's/[0-9a-f]{2}/\\0 /g' | tr -d '\\n'); do echo $((16#$HEX)); done | ./estimate-randomness.py - 0 255\nUsing total range of 256.\n\nmax_possible_entropy = 5.54517744448\n\noverall_entropy = 5.48412677091\noverall_randomness = 0.988990312\n\ndifferential_entropy = 5.47797110335\ndifferential_randomness = 0.987880218118\n\nestimated_randomness = 0.987880218118\n```\n\nInput sequence = set of source port numbers used by your operating system\n```\n$ tcpdump -i eth0 -w mycap.pcap\n$ bro -r mycap.pcap\n$ awk '{print $4}' conn.log | egrep '[0-9]+' | ./estimate-randomness.py - 1024 65535\nUsing total range of 65403.\n\nmax_possible_entropy = 11.088323408\n\noverall_entropy = 3.10649690465\noverall_randomness = 0.280159298241\n\ndifferential_entropy = 3.2913361528\ndifferential_randomness = 0.296829018392\n\nestimated_randomness = 0.280159298241\n```\n\nThis last example shows a low estimated_randomness since the pcap only contains a few connections (thus, the distribution is considered sparse in the range 
1024-65535).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjelchison%2Festimate-randomness","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjelchison%2Festimate-randomness","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjelchison%2Festimate-randomness/lists"}