https://github.com/scarlet-enlight/retailrocket-recommender-system

E-commerce recommendation system simulation using the Retailrocket dataset, Apriori algorithm, ASP.NET Core, FastAPI, DuckDB, and PostgreSQL.

# Retailrocket Recommendation System & Shop Simulation

An end-to-end e-commerce application and analytical pipeline built as an engineering thesis project at Silesian University of Technology (Politechnika Śląska). The system processes historical e-commerce logs to generate market basket insights using the Apriori algorithm and serves real-time product recommendations within a simulated online store.
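
To make the market basket idea concrete, here is a minimal pure-Python sketch of Apriori-style mining: frequent itemsets are grown level by level, and any candidate with an infrequent subset is pruned. It is an illustration only, not the project's implementation (which relies on mlxtend); the function names `apriori` and `derive_rules`, the thresholds, and the item labels are all hypothetical.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)
    if n == 0:
        return {}
    frequent = {}
    # Level 1 candidates: every single item seen in any basket.
    current = {frozenset([item]) for b in baskets for item in b}
    k = 1
    while current:
        # Support = share of baskets that contain the whole candidate.
        level = {}
        for candidate in current:
            support = sum(1 for b in baskets if candidate <= b) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates...
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # ...and prune candidates that have an infrequent (k-1)-subset.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def derive_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                antecedent = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


# Hypothetical baskets; real ones would come from the prepared dataset.
example_baskets = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B", "C"}, {"A", "B"}]
example_rules = derive_rules(apriori(example_baskets, 0.5), 0.7)
```

Each resulting rule reads as "customers who bought the antecedent also tend to buy the consequent", which is the kind of signal a recommendation endpoint can serve.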

---

## Architecture & Team Roles

The system uses a single **PostgreSQL** instance isolated into three database schemas to maintain a strict separation of concerns:

| Schema | Ownership | Description | Tech Stack |
| :--- | :--- | :--- | :--- |
| `historical` | **Data Engineering**<br>[@LonelyLake](https://github.com/LonelyLake) | Raw data ingestion, cleaning, and preparation of the "basket" format. | DuckDB, Python |
| `shop` | **Web Development**<br>[@ElPollaco](https://github.com/ElPollaco) | Operational shop data (users, carts, simulated checkout transactions). | ASP.NET Core, EF Core |
| `ml` | **Data Science**<br>[@Blazejost](https://github.com/Blazejost) | Association rules generated by the Apriori algorithm, exposed via API. | FastAPI, mlxtend |
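
As a rough illustration of the "basket" format mentioned for the `historical` schema, the sketch below groups purchased items by transaction. It assumes the column layout of the public Retailrocket `events.csv` (`timestamp`, `visitorid`, `event`, `itemid`, `transactionid`) and uses plain Python rather than the project's DuckDB pipeline; the helper name `build_baskets` and the sample rows are invented for the example.

```python
import csv
import io
from collections import defaultdict


def build_baskets(rows):
    """Group item IDs by transaction ID, keeping only purchase events."""
    baskets = defaultdict(set)
    for row in rows:
        # Views and add-to-cart events carry no transaction ID, so skip them.
        if row["event"] == "transaction" and row["transactionid"]:
            baskets[row["transactionid"]].add(row["itemid"])
    return dict(baskets)


# Made-up rows in the assumed events.csv layout, for illustration only.
SAMPLE_EVENTS = """timestamp,visitorid,event,itemid,transactionid
1433221332117,257597,view,355908,
1433224214164,599528,transaction,356475,4000
1433224214164,599528,transaction,15335,4000
1433223236124,121688,addtocart,15335,
1433222276276,552148,transaction,81345,5216
"""

sample_baskets = build_baskets(csv.DictReader(io.StringIO(SAMPLE_EVENTS)))
```

Each value in the resulting dictionary is one basket, that is, the set of items bought together in a single transaction, which is the input shape association rule mining expects.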

---

## Project Structure

```text
├── data/ # Raw and processed datasets (Local only, Git ignored)
│ ├── raw/ # Place downloaded Retailrocket CSVs here
│ └── processed/ # DuckDB analytical storage files
├── database/ # Docker Compose & Database Initialization scripts
├── data-pipeline/ # ETL processes (DuckDB analytical engine)
├── backend/ # E-commerce web backend (C# / .NET)
└── ml-service/ # Association rule mining engine & REST API (Python)
```

## Quick Start (Local Database Deployment)

1. **Configure Environment Variables:**

Navigate to the database directory and copy the template environment file:

```bash
cd database
cp .env.example .env
```

(Optional: open the newly created `.env` file and change the password if needed.)

2. **Start the Database:**

Spin up the PostgreSQL instance with all predefined schemas and tables:

```bash
docker compose up -d
```

The database is initialized automatically from `init.sql`. You can connect with DBeaver or any other client using the credentials defined in your local `.env` file. The defaults are:

- Host: `localhost`
- Port: `5559` (or whatever you set as `DB_HOST_PORT` in `database/.env`)
- Database: `retailrocket`
- User: `admin`
- Password: `admin` (or your custom password from `.env`)

3. **Dataset Setup (Local Only):**
- Ensure the `data/raw/` and `data/processed/` folders exist in the project root.
- Download the raw Retailrocket CSV files from the team's Google Drive (link shared in chat) and place them inside `data/raw/`.
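
For scripts that need to reach the database started in step 2, the connection defaults listed there can be assembled into a PostgreSQL URL. This is a minimal sketch: `DB_HOST_PORT` comes from `database/.env`, while `DB_USER`, `DB_PASSWORD`, `DB_HOST`, and `DB_NAME` are assumed variable names that may not match the actual template, and `build_postgres_url` is a hypothetical helper.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=None):
    """Build a PostgreSQL connection URL from environment-style settings."""
    env = os.environ if env is None else env
    # Fallbacks mirror the documented defaults; variable names other than
    # DB_HOST_PORT are assumptions made for this sketch.
    user = env.get("DB_USER", "admin")
    password = env.get("DB_PASSWORD", "admin")
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_HOST_PORT", "5559")
    database = env.get("DB_NAME", "retailrocket")
    # Percent-encode credentials so characters like "@" or "/" stay valid.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


default_url = build_postgres_url({})
```

Passing a plain dictionary instead of reading `os.environ` keeps the helper easy to test; with an empty dictionary it falls back to the documented defaults.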

## Git & Development Workflow

To ensure clean collaboration, the team follows GitHub Flow:

- The `main` branch is protected. No direct commits are allowed.
- Create a feature branch for your work: `feature/your-feature-name`.
- Open a **Pull Request (PR)** to merge into `main`. At least one team member must review it.
- Keep code and comments strictly in **English**.