https://github.com/hannahgsimon/apriorialgorithm

Developed code for Apriori's Algorithm, an association rule data mining technique to identify underlying relations between different items
apriori-algorithm association-rule data-mining flask frequent-itemsets html python

Host: GitHub
URL: https://github.com/hannahgsimon/apriorialgorithm
Owner: hannahgsimon
Created: 2024-09-27T02:43:38.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-12-26T07:51:56.000Z (11 months ago)
Last Synced: 2024-12-26T08:27:27.406Z (11 months ago)
Topics: apriori-algorithm, association-rule, data-mining, flask, frequent-itemsets, html, python
Language: Python
Homepage:
Size: 43.9 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Apriori Algorithm Implementation

## Overview
- The Apriori algorithm is a classic data mining algorithm for extracting frequent itemsets from large datasets. It works bottom-up: candidate itemsets of size k are generated from the frequent itemsets of size k-1 found in the previous iteration. This code analyzes a CSV file and identifies the itemsets that meet a specified minimum support threshold.
- This README focuses on the command-line version from the last commit on 11/03/2024.
- The latest commit introduces a Flask web application that implements the same functionality. It includes a `static` folder storing web assets such as images, a `templates` folder for HTML files, a `Procfile` for deployment, and `requirements.txt` to manage dependencies.
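
The bottom-up generation described above can be sketched as follows. This is a minimal illustration and not the repository's actual code: the function name `apriori` is hypothetical, and `min_sup` is assumed here to be an absolute transaction count, which this README does not specify.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset: support count} for every frequent itemset.

    Assumption: min_sup is an absolute count of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_sup}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_sup}
        frequent.update(current)
        k += 1
    return frequent
```

The prune step relies on the property that gives the algorithm its name: every subset of a frequent itemset must itself be frequent, so any candidate with an infrequent subset can be discarded before its support is counted.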

## Requirements
- Python 3.x
- Standard Python libraries: `itertools`, `sys`

## Usage
1. Prepare your dataset as a CSV file. Each row represents one data entry made up of numerical items, with each item in a separate cell. The first column is ignored; only the subsequent columns are processed as items. The program assumes that items appear in ascending numerical order within each row and ignores duplicates within a single row.
2. Run the script with the CSV `file_name` and the desired minimum support threshold `min_sup` passed as command-line arguments.
```bash
python3 apriori-algorithm.py 1000-out1.csv 20
```

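The input format described in step 1 could be read along these lines. This is a sketch under stated assumptions, not the script's actual parsing code: the helper name `load_transactions` is hypothetical, items are assumed to be integers, and the standard `csv` module is used for convenience even though it is not listed under Requirements.

```python
import csv


def load_transactions(file_name):
    """Read each CSV row as a set of integer items, skipping the first column."""
    transactions = []
    with open(file_name, newline="") as f:
        for row in csv.reader(f):
            # row[1:] drops the ignored first column; empty cells are skipped,
            # and building a set removes duplicates within the row.
            items = {int(cell) for cell in row[1:] if cell.strip()}
            if items:
                transactions.append(items)
    return transactions
```

For example, a row such as `1,3,7,7,9` would yield the item set `{3, 7, 9}`: the leading `1` is dropped as the first column and the repeated `7` is counted once.
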
## Output
The program prints the following to the terminal:
- File Run: `file_name`
- Minimum Support: `min_sup`
- List of frequent itemsets
- Number of frequent itemsets

## Author
Hannah G. Simon is the sole developer of this Apriori algorithm implementation.
