{"id":25927913,"url":"https://github.com/samirelanduk/inferi","last_synced_at":"2026-06-11T07:31:19.119Z","repository":{"id":69802555,"uuid":"85763050","full_name":"samirelanduk/inferi","owner":"samirelanduk","description":"A Python data processing library.","archived":false,"fork":false,"pushed_at":"2018-05-01T13:25:17.000Z","size":170,"stargazers_count":0,"open_issues_count":6,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-05-05T20:46:02.682Z","etag":null,"topics":["data-science","probability","statistics"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/samirelanduk.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":".github/CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2017-03-21T23:14:38.000Z","updated_at":"2022-01-06T03:15:22.000Z","dependencies_parsed_at":"2023-06-15T07:15:28.629Z","dependency_job_id":null,"html_url":"https://github.com/samirelanduk/inferi","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/samirelanduk/inferi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samirelanduk%2Finferi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samirelanduk%2Finferi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samirelanduk%2Finferi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/samirelanduk%2Finferi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/samirelanduk","download_url":"https://codeload.github.com/samirelanduk/inferi/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samirelanduk%2Finferi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34188272,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-11T02:00:06.485Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","probability","statistics"],"created_at":"2025-03-03T21:11:05.465Z","updated_at":"2026-06-11T07:31:19.113Z","avatar_url":"https://github.com/samirelanduk.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"|travis| |coveralls| |pypi|\n\n.. |travis| image:: https://api.travis-ci.org/samirelanduk/inferi.svg?branch=0.5\n  :target: https://travis-ci.org/samirelanduk/inferi/\n\n.. |coveralls| image:: https://coveralls.io/repos/github/samirelanduk/inferi/badge.svg?branch=0.5\n  :target: https://coveralls.io/github/samirelanduk/inferi/\n\n.. 
|pypi| image:: https://img.shields.io/pypi/pyversions/inferi.svg\n  :target: https://pypi.org/project/inferi/\n\ninferi\n=======\n\ninferi is a statistics and data science Python library.\n\nExample\n-------\n\n  \u003e\u003e\u003e import inferi\n  \u003e\u003e\u003e variable = inferi.Variable(11, 45, 23, 12, 10)\n  \u003e\u003e\u003e variable.mean\n  20.2\n  \u003e\u003e\u003e variable.variance()\n  219.7\n\n\n\n\nInstalling\n----------\n\npip\n~~~\n\ninferi can be installed using pip:\n\n``$ pip3 install inferi``\n\ninferi is written for Python 3, and does not support Python 2. It currently\nsupports Python 3.6.\n\nIf you get permission errors, try using ``sudo``:\n\n``$ sudo pip3 install inferi``\n\n\nDevelopment\n~~~~~~~~~~~\n\nThe repository for inferi, containing the most recent iteration, can be\nfound `here \u003chttps://github.com/samirelanduk/inferi/\u003e`_. To clone the\ninferi repository directly from there, use:\n\n``$ git clone https://github.com/samirelanduk/inferi.git``\n\n\nRequirements\n~~~~~~~~~~~~\n\ninferi currently has no dependencies, compiled or otherwise.\n\n\nOverview\n--------\n\ninferi is a tool for performing basic statistical analysis on datasets. It is\npure-Python, and has no compiled dependencies.\n\nVariables\n~~~~~~~~~\n\nThe fundamental unit of inferi data analysis is the ``Variable``. It\nrepresents a set of measurements, such as heights, or favourite colours. 
It is\n`not` the same as a Python variable - it represents variables in the statistics\nsense of the word.\n\n    \u003e\u003e\u003e import inferi\n    \u003e\u003e\u003e heights = inferi.Variable(178, 156, 181, 175, 178)\n    \u003e\u003e\u003e heights\n    '\u003cVariable (178, 156, 181, 175, 178)\u003e'\n    \u003e\u003e\u003e heights.length\n    5\n\nIf you like, you can give the variable a name as an appropriate label:\n\n    \u003e\u003e\u003e heights = inferi.Variable(178, 156, 181, 175, 178, name=\"heights\")\n    \u003e\u003e\u003e heights.name\n    'heights'\n\nYou can also give an existing sequence, such as a list, and the result will be\nthe same:\n\n  \u003e\u003e\u003e heights = inferi.Variable([178, 156, 181, 175, 178])\n  \u003e\u003e\u003e heights\n  '\u003cVariable (178, 156, 181, 175, 178)\u003e'\n\nValues can be accessed by indexing:\n\n  \u003e\u003e\u003e weights = inferi.Variable(12, 19, 11)\n  \u003e\u003e\u003e weights.values\n  (12, 19, 11)\n  \u003e\u003e\u003e weights[0]\n  12\n  \u003e\u003e\u003e weights[-1]\n  11\n  \u003e\u003e\u003e weights.max\n  19\n  \u003e\u003e\u003e weights.min\n  11\n\nMeasures of Centrality\n######################\n\nVariables have the basic measures of centrality - mean, median and mode.\n\n    \u003e\u003e\u003e heights = inferi.Variable(178, 156, 181, 175, 178)\n    \u003e\u003e\u003e heights.mean\n    173.6\n    \u003e\u003e\u003e heights.median\n    178\n    \u003e\u003e\u003e heights.mode()\n    178\n\nSee the full documentation for details on ``Variable.mean``,\n``Variable.median``, and ``Variable.mode``. 
Note that if the\nvariable has more than one mode, ``None`` will be returned.\n\n\nMeasures of Dispersion\n######################\n\nVariables can also calculate various measures of dispersion, the simplest being\nthe range:\n\n    \u003e\u003e\u003e heights = inferi.Variable(178, 156, 181, 175, 178)\n    \u003e\u003e\u003e heights.range\n    25\n\nYou can also calculate the variance and the standard deviation - measures of\nhow far individual measurements tend to be from the mean:\n\n    \u003e\u003e\u003e heights.variance()\n    101.3\n    \u003e\u003e\u003e heights.st_dev()\n    10.064790112068906\n\nBy default the Variables will be treated as samples rather than populations,\nwhich affects the value of both the variance and the standard\ndeviation. To get the population values for each, simply set this when you\ncall the method:\n\n  \u003e\u003e\u003e heights.variance(population=True)\n  81.04\n  \u003e\u003e\u003e heights.st_dev(population=True)\n  9.00222194794152\n\nAgain, see the full documentation of ``Variable.range``,\n``Variable.variance``, and ``Variable.st_dev`` for\nmore details.\n\n\nComparing Variables\n###################\n\nIt is often useful to compare how two variables are related - whether there is a\ncorrelation between them or if they are independent.\n\nA simple way of doing this is to find the covariance between them, using the\n``Variable.covariance_with`` method:\n\n    \u003e\u003e\u003e variable1 = inferi.Variable(2.1, 2.5, 4.0, 3.6)\n    \u003e\u003e\u003e variable2 = inferi.Variable(8, 12, 14, 10)\n    \u003e\u003e\u003e variable1.covariance_with(variable2)\n    0.8033333333333333\n\nThe sign of this value tells you the relationship - if it is positive they are\npositively correlated, negative and they are negatively correlated, and the\ncloser to zero it is, the more independent the variables are.\n\nHowever the actual value of the covariance doesn't tell you much because it\ndepends on the magnitude of the values in the 
variable. The correlation metric,\nhowever, is normalised to be between -1 and 1, so it is easier to quantify how\nrelated the two variables are. ``Variable.correlation_with`` is used to\ncalculate this:\n\n    \u003e\u003e\u003e variable1 = inferi.Variable(2.1, 2.5, 4.0, 3.6)\n    \u003e\u003e\u003e variable2 = inferi.Variable(8, 12, 14, 10)\n    \u003e\u003e\u003e variable1.correlation_with(variable2)\n    0.662573882203029\n\nDatasets\n~~~~~~~~\n\nUsually, more than one thing is measured in an experiment, and so you would have\nmore than one variable. For example, you might ask someone's name, their age,\ntheir height, and whether or not they smoke. Each of these four metrics is a\nvariable:\n\n  \u003e\u003e\u003e variable1 = inferi.Variable(\"Jon\", \"Sue\", \"Bob\", name=\"Names\")\n  \u003e\u003e\u003e variable2 = inferi.Variable(19, 34, 38, name=\"Ages\")\n  \u003e\u003e\u003e variable3 = inferi.Variable(1.87, 1.67, 1.73, name=\"Heights\")\n  \u003e\u003e\u003e variable4 = inferi.Variable(False, True, True, name=\"Smokes\")\n\nThese can be combined into a single ``Dataset`` as follows:\n\n  \u003e\u003e\u003e dataset = inferi.Dataset(variable1, variable2, variable3, variable4)\n  \u003e\u003e\u003e dataset.variables\n  (\u003cVariable 'Names' ('Jon', 'Sue', 'Bob')\u003e, \u003cVariable 'Ages' (19, 34, 38)\u003e, \u003cVa\n  riable 'Heights' (1.87, 1.67, 1.73)\u003e, \u003cVariable 'Smokes' (False, True, True)\u003e)\n\nA dataset can be thought of as representing a table of data, where each variable\nis a column. 
This dataset represents a table like this::\n\n    Names Ages Heights Smokes\n\n    Jon   19   1.87    No\n    Sue   34   1.67    Yes\n    Bob   38   1.73    Yes\n\nYou can get the rows of a dataset too:\n\n  \u003e\u003e\u003e dataset.rows\n  (('Jon', 19, 1.87, False), ('Sue', 34, 1.67, True), ('Bob', 38, 1.73, True))\n\nA Dataset can be sorted, by default by the first column, though another\nvariable can be given instead:\n\n  \u003e\u003e\u003e dataset.sort()\n  \u003e\u003e\u003e dataset.rows\n  (('Bob', 38, 1.73, True), ('Jon', 19, 1.87, False), ('Sue', 34, 1.67, True))\n  \u003e\u003e\u003e dataset.sort(variable3)\n  \u003e\u003e\u003e dataset.rows\n  (('Sue', 34, 1.67, True), ('Bob', 38, 1.73, True), ('Jon', 19, 1.87, False))\n\nProbability\n~~~~~~~~~~~\n\nProbability is a way of looking at all the ways something *can* happen and assessing\nhow likely the outcomes are.\n\nEveryone's favourite example is rolling a die - there are six possible outcomes\nin the Sample Space:\n\n  \u003e\u003e\u003e space = inferi.SampleSpace(1, 2, 3, 4, 5, 6)\n\nThis defines a sample space with six outcomes. Each of these is a simple event:\n\n  \u003e\u003e\u003e space.simple_events\n  {\u003cSimpleEvent: 1\u003e, \u003cSimpleEvent: 2\u003e, \u003cSimpleEvent: 3\u003e, \u003cSimpleEvent: 4\u003e, \u003cSimp\n  leEvent: 5\u003e, \u003cSimpleEvent: 6\u003e}\n  \u003e\u003e\u003e space.event(5)\n  \u003cSimpleEvent: 5\u003e\n  \u003e\u003e\u003e space.event(5).probability()\n  0.16666666666666666\n  \u003e\u003e\u003e space.event(5).probability(fraction=True)\n  Fraction(1, 6)\n  \u003e\u003e\u003e space.chances_of(5)\n  0.16666666666666666\n\nEvents are some combination of simple events. 
For example, to define the event\nthat a rolled die produces an even number:\n\n  \u003e\u003e\u003e even_event = space.event(lambda o: o % 2 == 0, name=\"even\")\n  \u003e\u003e\u003e even_event\n  \u003cEvent: even\u003e\n  \u003e\u003e\u003e even_event.name\n  'even'\n  \u003e\u003e\u003e even_event.probability()\n  0.5\n  \u003e\u003e\u003e even_event.outcomes()\n  {2, 4, 6}\n  \u003e\u003e\u003e even_event.outcomes(p=True)\n  {2: 0.16666666666666666, 4: 0.16666666666666666, 6: 0.16666666666666666}\n  \u003e\u003e\u003e even_event in space\n  True\n\nTwo events can be compared. Here we create two more events:\n\n  \u003e\u003e\u003e odd_event = space.event(lambda o: o % 2 != 0, name=\"odd\")\n  \u003e\u003e\u003e large_event = space.event(lambda o: o \u003e 4)\n  \u003e\u003e\u003e odd_event.mutually_exclusive_with(even_event)\n  True\n  \u003e\u003e\u003e large_event.mutually_exclusive_with(even_event)\n  False\n  # Does knowing number is even affect chances of being odd? (Obviously...)\n  \u003e\u003e\u003e odd_event.dependent_on(even_event)\n  True\n  # Does knowing number is even affect chances of being greater than 4?\n  \u003e\u003e\u003e large_event.dependent_on(even_event)\n  False\n\nYou can even make new events from them...\n\n  \u003e\u003e\u003e small_and_even = large_event.complement \u0026 even_event\n  \u003e\u003e\u003e small_and_even.probability()\n  0.3333333333333333\n  \u003e\u003e\u003e small_and_even.outcomes()\n  {2, 4}\n\n\nChangelog\n---------\n\nRelease 0.5.0\n~~~~~~~~~~~~~\n\n`1 May 2018`\n\n* Implemented combinatorics and permutations.\n\n* Added basic probability tools:\n\n  * Events, simple events and event spaces.\n\n  * Conditional probability.\n\n  * Concept of 'and' and 'or'.\n\n* Turned certain property methods into actual properties.\n\n\nRelease 0.4.0\n~~~~~~~~~~~~~\n\n`6 October 2017`\n\n* Added Dataset class for collating Variables.\n\n\nRelease 0.3.0\n~~~~~~~~~~~~~\n\n`27 August 2017`\n\n* Renamed Series to 'Variable'\n\n* 
Added error handling.\n\n* Added Variable averaging and adding/subtracting.\n\n* Added z-score.\n\n* Generally overhauled everything.\n\n\nRelease 0.2.0\n~~~~~~~~~~~~~\n\n`26 March 2017`\n\n* Added option to make a Series a population rather than a sample.\n\n* Added covariance and correlation measures.\n\nRelease 0.1.0\n~~~~~~~~~~~~~\n\n`21 March 2017`\n\n* Added basic Series class.\n\n* Added methods for measures of centrality and basic measures of dispersion.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamirelanduk%2Finferi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamirelanduk%2Finferi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamirelanduk%2Finferi/lists"}