{"id":19469702,"url":"https://github.com/niloth-p/apriori-implementation-in-python","last_synced_at":"2025-04-25T11:33:25.200Z","repository":{"id":130153874,"uuid":"125255422","full_name":"Niloth-p/Apriori-Implementation-in-Python","owner":"Niloth-p","description":"Implementation of the Apriori algorithm in python, to generate frequent itemsets and association rules. Experimentation with different values of confidence and support values.","archived":false,"fork":false,"pushed_at":"2020-08-08T22:26:28.000Z","size":79,"stargazers_count":6,"open_issues_count":1,"forks_count":10,"subscribers_count":1,"default_branch":"master","last_synced_at":"2023-10-19T20:28:05.564Z","etag":null,"topics":["apriori","association-rules","confidence","frequent-itemsets","transaction"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Niloth-p.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-03-14T18:17:11.000Z","updated_at":"2023-10-16T21:56:49.000Z","dependencies_parsed_at":"2023-07-20T23:30:42.405Z","dependency_job_id":null,"html_url":"https://github.com/Niloth-p/Apriori-Implementation-in-Python","commit_stats":null,"previous_names":["niloth-p/apriori-implementation-in-python","nilothpal-pillai/apriori-implementation-in-python"],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Niloth-p%2FApriori-Implementation-in-Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Niloth-p%2FApriori-Implementation-in-Python/tags","releases_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/Niloth-p%2FApriori-Implementation-in-Python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Niloth-p%2FApriori-Implementation-in-Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Niloth-p","download_url":"https://codeload.github.com/Niloth-p/Apriori-Implementation-in-Python/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224000813,"owners_count":17239000,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apriori","association-rules","confidence","frequent-itemsets","transaction"],"created_at":"2024-11-10T18:53:35.013Z","updated_at":"2024-11-10T18:53:36.467Z","avatar_url":"https://github.com/Niloth-p.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Apriori-Implementation-in-Python\nThis Python program generates frequent item sets and association rules from given datasets using Apriori algorithm.\n\n## Different support and confidence values - outputs:\n  \n``` Support, confidence - #rules generated\n0.02, 0.35 - 26\n0.02, 0.42 - 10\n0.03, 0.39 - 6\n0.04, 0.35 - 5\n0.04, 0.42 - 2\n0.05, 0.35 - 0\n```\n\nThe output is stored in 2 files:\n\n### File1:\n\u003cpre\u003eFormat: Freq Itemset \u003e\u003e\u003e count\nGlobal variable f1\nDefault value of f1 : FItems.txt\n\u003c/pre\u003e\n\n### File2:\n\u003cpre\u003eFormat: LHS itemset (count) -\u003e RHS itemset (count) [confidence] \nGlobal variable f2\nDefault value of f2 : Rules.txt\n\nThe outputs are appended to 
the files\nSo if you want to run the program multiple times, remember that the data will be written multiple times\n\u003c/pre\u003e\n### Dataset used:\ngroceries.csv\n\n### Rules for using other datasets:\nChange the global variable DataFile to the filename\n\n## Pre-processing of data\n### Sorter.py - to sort the transactions data in lexicographical order Stripped off whitespaces and newlines.\n\u003cpre\u003eAnd converted the data into a more comfortable format for running the program,\nwith each line representing a single transaction, with the items being comma separated.\nGot each transaction as a list from the csv and sorted each list and wrote the sorted transactions into a new csv.\n\u003c/pre\u003e\n## Formulae used and pseudo code of algorithm:\n\n## Apriori:-\n\u003cpre\u003eGenerate frequent 1-itemsets - L1()\nGenerate Ck from Lk-1 - generateCk()\nGenerate Lk from Ck - generateLk()\nGenerate rules from frequent itemsets - rulegenerator()\n\nEach of these are written in detail below.\n\u003c/pre\u003e\n### L1():\tFind frequent 1-itemsets\n\u003cpre\u003eRead data from the csv file and store it into a list.\nSort the data if necessary.\nGo through all the elements in each transaction and store their counts in a dictionary.\nThreshold them i.e create a new dictionary with old dictionary values that had a support greater than the support threshold.\nThe final list is made into a set, to avoid repetition.\n\u003c/pre\u003e\n### generateCk(Lk_1, flag, data): Generate Ck by joining 2 Lk-1 \n\u003cpre\u003eTraverse through all the itemsets of Lk_1 and on finding 2 itemsets that are identical,\nexcept for the last element, merge them (i.e their union)in a sorted manner and insert into Ck.\nThe final list Ck is made into a set, to avoid repetition.\n\u003c/pre\u003e\n### generateLk(Ck, data):\tCk -\u003e Ct -\u003e L\n\u003cpre\u003eIf itemset in Ck belongs to a transaction, it makes it into list Ct, and its support is updated by 1,\neach time a transaction 
contains the itemset. Then Ct is thresholded to form L,\nusing the support calculated during creation of Ct. L is stored in a new dictionary,\nby choosing itemsets above threshold from the old dictionary.\n\u003c/pre\u003e\n### rulegenerator(fitems): Generates association rules from the frequent itemsets\n\u003cpre\u003eFor each itemset in the frequent items list, compute its total support.\nThen get a list of all possible combinations of splitting the itemset into LHS and RHS, with min of 1 element.\nCalculate support for each of these combinations from the dictionary, \nand if total_support/combination_support is greater than the min confidence value,\nit is added as a rule, and written to f2.\n\u003c/pre\u003e\n\nA lot of conversion of lists to tuples would be required, since lists are unhashable and cannot be used as dictionary keys.\n\nLists should also be converted into sets to avoid repetition, which could otherwise affect the count values significantly.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fniloth-p%2Fapriori-implementation-in-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fniloth-p%2Fapriori-implementation-in-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fniloth-p%2Fapriori-implementation-in-python/lists"}