{"id":19669180,"url":"https://github.com/tommyod/efficient-apriori","last_synced_at":"2025-05-15T23:05:00.995Z","repository":{"id":32596823,"uuid":"136807475","full_name":"tommyod/Efficient-Apriori","owner":"tommyod","description":"An efficient Python implementation of the Apriori algorithm.","archived":false,"fork":false,"pushed_at":"2024-09-11T15:06:39.000Z","size":475,"stargazers_count":332,"open_issues_count":1,"forks_count":60,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-05-15T23:04:42.679Z","etag":null,"topics":["apriori-algorithm","association-rules","data-mining","data-science","machinelearning"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tommyod.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-06-10T12:35:16.000Z","updated_at":"2025-05-13T12:46:47.000Z","dependencies_parsed_at":"2023-02-17T15:46:24.297Z","dependency_job_id":"e10aec50-8d11-4d0a-9194-7783a60392aa","html_url":"https://github.com/tommyod/Efficient-Apriori","commit_stats":{"total_commits":107,"total_committers":8,"mean_commits":13.375,"dds":"0.36448598130841126","last_synced_commit":"7a8b00a2b2c95930f5c60ed386d0b4248b11de96"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tommyod%2FEfficient-Apriori","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tommyod%2FEfficient-Apriori/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/tommyod%2FEfficient-Apriori/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tommyod%2FEfficient-Apriori/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tommyod","download_url":"https://codeload.github.com/tommyod/Efficient-Apriori/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254436944,"owners_count":22070946,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apriori-algorithm","association-rules","data-mining","data-science","machinelearning"],"created_at":"2024-11-11T16:39:17.042Z","updated_at":"2025-05-15T23:05:00.968Z","avatar_url":"https://github.com/tommyod.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Efficient-Apriori ![Build Status](https://github.com/tommyod/Efficient-Apriori/workflows/Python%20CI/badge.svg?branch=master) [![PyPI version](https://badge.fury.io/py/efficient-apriori.svg)](https://pypi.org/project/efficient-apriori/) [![Documentation Status](https://readthedocs.org/projects/efficient-apriori/badge/?version=latest)](https://efficient-apriori.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/efficient-apriori)](https://pepy.tech/project/efficient-apriori) [![Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)\n\nAn efficient pure Python implementation of the Apriori algorithm.\n\nThe apriori algorithm uncovers hidden structures in categorical data.\nThe classical example is a database 
containing purchases from a supermarket.\nEvery purchase has a number of items associated with it.\nWe would like to uncover association rules such as `{bread, eggs} -\u003e {bacon}` from the data.\nThis is the goal of [association rule learning](https://en.wikipedia.org/wiki/Association_rule_learning), and the [Apriori algorithm](https://en.wikipedia.org/wiki/Apriori_algorithm) is arguably the most famous algorithm for this problem.\nThis repository contains an efficient, well-tested implementation of the apriori algorithm as described in the [original paper](https://www.macs.hw.ac.uk/~dwcorne/Teaching/agrawal94fast.pdf) by Agrawal et al, published in 1994.\n\n**The code is stable and in widespread use.** It's cited in the book \"*Mastering Machine Learning Algorithms*\" by Bonaccorso.\n\n**The code is fast.** See timings in [this PR](https://github.com/tommyod/Efficient-Apriori/pull/40).\n\n\n## Example\n\nHere's a minimal working example.\nNotice that in every transaction with `eggs` present, `bacon` is present too.\nTherefore, the rule `{eggs} -\u003e {bacon}` is returned with 100 % confidence.\n\n```python\nfrom efficient_apriori import apriori\ntransactions = [('eggs', 'bacon', 'soup'),\n                ('eggs', 'bacon', 'apple'),\n                ('soup', 'bacon', 'banana')]\nitemsets, rules = apriori(transactions, min_support=0.5, min_confidence=1)\nprint(rules)  # [{eggs} -\u003e {bacon}, {soup} -\u003e {bacon}]\n```\nIf your data is in a pandas DataFrame, you must [convert it to a list of tuples](https://github.com/tommyod/Efficient-Apriori/issues/12).\nDo you have **missing values**, or does the algorithm **run for a long time**? 
See [this comment](https://github.com/tommyod/Efficient-Apriori/issues/30#issuecomment-626129085).\n**More examples are included below.**\n\n## Installation\n\nThe software is available through GitHub, and through [PyPI](https://pypi.org/project/efficient-apriori/).\nYou may install the software using `pip`.\n\n```bash\npip install efficient-apriori\n```\n\n## Contributing\n\nYou are very welcome to scrutinize the code and make pull requests if you have suggestions or improvements.\nYour submitted code must be PEP8 compliant, and all tests must pass.\nSee the list of contributors [here](https://github.com/tommyod/Efficient-Apriori/graphs/contributors).\n\n## More examples\n\n### Filtering and sorting association rules\n\nIt's possible to filter and sort the returned list of association rules.\n\n```python\nfrom efficient_apriori import apriori\ntransactions = [('eggs', 'bacon', 'soup'),\n                ('eggs', 'bacon', 'apple'),\n                ('soup', 'bacon', 'banana')]\nitemsets, rules = apriori(transactions, min_support=0.2, min_confidence=1)\n\n# Print out every rule with 2 items on the left-hand side,\n# 1 item on the right-hand side, sorted by lift\nrules_rhs = filter(lambda rule: len(rule.lhs) == 2 and len(rule.rhs) == 1, rules)\nfor rule in sorted(rules_rhs, key=lambda rule: rule.lift):\n    print(rule)  # Prints the rule and its confidence, support, lift, ...\n```\n\n### Transactions with IDs\n\nIf you need to know which transactions contain each frequent itemset, set the `output_transaction_ids` parameter to `True`.\nThis changes the output to contain an `ItemsetCount` object for each itemset.\nEach object has a `members` property, the set of ids of the transactions that contain the itemset, and an `itemset_count` property.\nThe ids enumerate the transactions in the order they appear.
\n\n```python\nfrom efficient_apriori import apriori\ntransactions = [('eggs', 'bacon', 'soup'),\n                ('eggs', 'bacon', 'apple'),\n                ('soup', 'bacon', 'banana')]\nitemsets, rules = apriori(transactions, output_transaction_ids=True)\nprint(itemsets)\n# {1: {('bacon',): ItemsetCount(itemset_count=3, members={0, 1, 2}), ...\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftommyod%2Fefficient-apriori","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftommyod%2Fefficient-apriori","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftommyod%2Fefficient-apriori/lists"}