https://github.com/urvee1810/market_basket_analysis

A data mining project analyzing Instacart's 3 million grocery orders to uncover customer shopping patterns and product associations. Using market basket analysis and the Apriori algorithm, the project reveals insights about shopping behavior, product combinations, and temporal patterns, and turns them into recommendations for retail strategy.

# Instacart Market Basket Analysis

*This project was completed as coursework for the PG Level Advanced Certification Programme in Computational Data Science at the Centre for Continuing Education, Indian Institute of Science, in collaboration with Talent Sprint.*

Special thanks to Prof. Shashi Jain and mentor Mr. Sachin Sharma.

Problem Statement: Extract association rules and find groups of frequently purchased items from a large-scale grocery orders dataset.

Module: Business Analytics

Project Type: Team

## Project Overview
A comprehensive analysis of Instacart's customer purchase patterns using market basket analysis techniques. The project analyzes over 3 million grocery orders from 200,000+ Instacart users to uncover shopping patterns, product associations, and temporal trends.

## Dataset
- Over 3 million grocery orders
- 200,000+ Instacart users
- Data spread across multiple files:
  - orders.csv
  - products.csv
  - aisles.csv
  - departments.csv
  - order_products_train.csv

## Analysis Components

### 1. Data Integration & Preprocessing
- Merged multiple data sources into a unified dataset
- Handled missing values and data transformations
- Created purchase frequency matrices
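
The integration steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the tiny in-memory frames stand in for the CSV files, and the column names (`order_id`, `product_id`, `product_name`, `department_id`, `order_dow`, `order_hour_of_day`, `reordered`) are assumed to follow the public Instacart schema.

```python
import pandas as pd

# Tiny in-memory stand-ins for the CSV files (assumed column names).
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "user_id": [10, 10, 11],
    "order_dow": [0, 3, 6],
    "order_hour_of_day": [10, 14, 9],
})
products = pd.DataFrame({
    "product_id": [100, 101, 102],
    "product_name": ["Banana", "Organic Strawberries", "Whole Milk"],
    "department_id": [4, 4, 16],
})
departments = pd.DataFrame({
    "department_id": [4, 16],
    "department": ["produce", "dairy eggs"],
})
order_products = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 3],
    "product_id": [100, 101, 100, 100, 101, 102],
    "reordered": [0, 0, 1, 0, 0, 0],
})

# Join each order line to its product, department, and order metadata.
merged = (
    order_products
    .merge(products, on="product_id", how="left")
    .merge(departments, on="department_id", how="left")
    .merge(orders, on="order_id", how="left")
)

# Left joins can introduce NaN when a key has no match; check before analysis.
missing_per_column = merged.isna().sum()

# Purchase matrix: one row per order, one boolean column per product.
basket = pd.crosstab(merged["order_id"], merged["product_name"]) > 0
```

A boolean order-by-product matrix like `basket` is the input shape that frequent-itemset routines typically expect. On the full dataset it would be very wide, so restricting it to the most frequently purchased products first is a common, though here assumed, simplification.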

### 2. Exploratory Data Analysis
- Product frequency analysis
- Department-wise purchase patterns
- Temporal analysis (day of week, hour of day)
- Reorder behavior analysis
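
As a rough sketch of how these summaries might be computed with pandas, the example below works on a small hypothetical table of order lines that has already been joined to order metadata. The column names are assumptions based on the public Instacart schema, and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical order lines already joined to order metadata.
lines = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 3],
    "product_name": ["Banana", "Whole Milk", "Banana",
                     "Banana", "Avocado", "Whole Milk"],
    "order_dow": [0, 0, 3, 6, 6, 6],
    "order_hour_of_day": [10, 10, 14, 9, 9, 9],
    "reordered": [0, 0, 1, 1, 0, 1],
})

# Product frequency: number of order lines per product, most frequent first.
top_products = lines["product_name"].value_counts()

# Temporal analysis: count distinct orders (not order lines) per hour and day.
orders_by_hour = lines.groupby("order_hour_of_day")["order_id"].nunique()
orders_by_dow = lines.groupby("order_dow")["order_id"].nunique()

# Reorder behavior: share of a product's order lines flagged as reorders.
reorder_rate = lines.groupby("product_name")["reordered"].mean()
```

Counting distinct `order_id` values in the temporal summaries avoids inflating busy hours simply because those orders contain more items.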

### 3. Market Basket Analysis
- Implemented Apriori algorithm
- Generated association rules
- Analyzed product co-occurrence patterns
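
To make the Apriori step concrete, here is a small pure-Python sketch of the level-wise search. It is not MLxtend's implementation and not the project's code: the function name `apriori_frequent_itemsets` and the toy transactions are hypothetical, chosen only to show the join and prune steps.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    ["Banana", "Organic Strawberries"],
    ["Banana", "Whole Milk"],
    ["Banana", "Organic Strawberries", "Whole Milk"],
    ["Organic Avocado"],
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.5)
```

The prune step is what keeps Apriori tractable: an itemset can only be frequent if all of its subsets are, so most candidates are discarded before their support is ever counted. For a dataset of this size a library implementation is the more practical choice.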

## Key Findings

### Shopping Patterns
- Most popular product: Bananas
- Peak ordering hours: 10 AM - 4 PM
- Higher order frequencies during weekends
- Clear patterns in reorder behavior

### Product Associations
- Generated frequent itemsets with minimum support of 0.01
- Identified strong product associations using lift metric
- Discovered valuable product grouping patterns
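
For readers unfamiliar with the metrics mentioned above, the sketch below computes support, confidence, and lift for a single rule from raw transactions. The helper `rule_metrics` and the sample baskets are hypothetical. It assumes the antecedent and consequent each appear in at least one basket, otherwise the divisions are undefined.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    a, c = set(antecedent), set(consequent)

    support_a = sum(a <= b for b in baskets) / n
    support_c = sum(c <= b for b in baskets) / n
    support_ac = sum((a | c) <= b for b in baskets) / n

    # Confidence: how often the consequent appears when the antecedent does.
    confidence = support_ac / support_a
    # Lift: confidence relative to the consequent's baseline frequency.
    lift = confidence / support_c
    return {"support": support_ac, "confidence": confidence, "lift": lift}


transactions = [
    ["Banana", "Organic Strawberries"],
    ["Banana", "Organic Strawberries", "Whole Milk"],
    ["Banana", "Whole Milk"],
    ["Whole Milk"],
    ["Organic Strawberries", "Banana"],
]
metrics = rule_metrics(transactions, ["Organic Strawberries"], ["Banana"])
```

In this toy data the rule has a lift of 1.25, meaning bananas appear 25% more often in baskets with strawberries than in baskets overall. A lift above 1 indicates a positive association, a lift near 1 indicates independence, and a lift below 1 indicates that the items tend not to be bought together.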

## Technologies Used
- Python
- Pandas
- NumPy
- Seaborn
- Matplotlib
- MLxtend (for the Apriori algorithm)
- SciPy

## Business Applications
- Inventory optimization
- Store layout recommendations
- Targeted marketing strategies
- Product recommendation systems
- Staffing optimization
- Reorder prediction

## Acknowledgments
- Instacart for providing the dataset

---
**Note**: This project is for educational purposes and uses a public dataset from Instacart, which you need to download yourself.