https://github.com/mtholahan/kafka-mini-project

Built a streaming fraud detection system with Apache Kafka and Python. Deployed a Kafka cluster via Docker Compose, implemented a transaction generator and fraud detector using kafka-python, and routed suspicious transactions to separate topics for real-time monitoring. Demonstrates event streaming, producers, consumers, and containerization.

bootcamp consumers data-engineering docker docker-compose event-driven fraud-detection kafka producers python springboard streaming


# Kafka Mini Project

## 📖 Abstract
This project implements a real-time fraud detection pipeline using Apache Kafka and Python. The system simulates financial transactions, streams them through Kafka, and applies rule-based filtering to flag suspicious activity. The goal is to gain practical experience with streaming architectures, producers, consumers, and containerized deployments.

The workflow includes:

* Running a local Kafka cluster using Docker Compose with broker and ZooKeeper services.

* Building a transaction generator that continuously produces randomized account transfers into a Kafka topic.

* Creating a fraud detector application that consumes transactions, evaluates them against business rules, and branches outputs into "legit" or "fraud" topics.

* Packaging all components with Dockerfiles, requirements.txt, and docker-compose.yml for reproducibility.

* Verifying results by consuming messages from output topics, confirming that transactions over $900 are correctly flagged as fraudulent.

Through this project, I gained hands-on skills in stream processing, Kafka topic design, producer/consumer APIs, and containerized workflow orchestration, while also exploring real-world challenges in fraud detection systems.

## 🛠 Requirements
- Docker Engine 20.x or later

- Docker Compose v2

- Ubuntu 22.04 LTS environment (tested)

- `docker-compose.yml` defining all services:

  - zookeeper (Confluent cp-zookeeper)

  - kafka broker (Confluent cp-kafka)

  - generator (Python producer app)

  - detector (Python consumer/producer app)

- Python dependency (inside app containers):

  - kafka-python

## 🧰 Setup
- Clone the repository and navigate to the `kafka-docker/` directory

- Build images: `docker-compose build --no-cache`

- Start cluster + apps: `docker-compose up -d`

- Verify broker startup logs (Kafka ready)

- Verify that the generator and detector services are running

- Inspect Kafka topics via `kafka-console-consumer` from the broker container
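
As a complement to reading the broker logs, the sketch below polls a TCP port until it accepts connections. It is only a rough readiness probe: an open port shows that something is listening, not that Kafka has finished starting. The function name `wait_for_broker` and the default address `localhost:9092` are assumptions, not taken from this project's `docker-compose.yml`.

```python
import socket
import time


def wait_for_broker(host: str = "localhost", port: int = 9092,
                    timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout.

    Assumption: the broker listener is reachable at host:port from where
    this script runs. Adjust both to match your compose file.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means a listener is up on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_broker()` before starting a client avoids connection errors during the first seconds after `docker-compose up -d`; the broker logs remain the authoritative check.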

## 📊 Dataset
- Streaming data consists of synthetic transactions generated by the producer app

- Transaction schema includes: transaction_id, account_id, timestamp, amount, merchant, location
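
To make the schema concrete, here is a minimal sketch of a generator that emits records with the six fields listed above. The field types, value ranges, merchant and location pools, the JSON encoding, and the broker address are all assumptions for illustration; the actual logic lives in `generator_app.py`.

```python
import json
import random
import time
import uuid
from datetime import datetime, timezone

# Hypothetical value pools; the real generator's choices are not documented here.
MERCHANTS = ["grocer", "airline", "electronics", "cafe"]
LOCATIONS = ["US", "CA", "GB", "DE"]


def make_transaction(rng: random.Random) -> dict:
    """Build one synthetic transaction with the six documented fields."""
    return {
        "transaction_id": str(uuid.uuid4()),
        "account_id": f"ACC-{rng.randint(1000, 9999)}",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "amount": round(rng.uniform(1.0, 1000.0), 2),
        "merchant": rng.choice(MERCHANTS),
        "location": rng.choice(LOCATIONS),
    }


def encode(txn: dict) -> bytes:
    """Serialize a transaction as UTF-8 JSON (assumed wire format)."""
    return json.dumps(txn).encode("utf-8")


def run_generator(bootstrap_servers: str = "localhost:9092",
                  interval_s: float = 1.0) -> None:
    """Continuously publish transactions to queueing.transactions."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers,
                             value_serializer=encode)
    rng = random.Random()
    while True:
        producer.send("queueing.transactions", value=make_transaction(rng))
        time.sleep(interval_s)
```

`make_transaction` and `encode` can be exercised without a broker; `run_generator()` needs a reachable Kafka listener and runs until interrupted.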

## ⏱️ Run Steps
- Start services with: `docker-compose up -d`

- Producer (generator) writes messages into topic: `queueing.transactions`

- Consumer (detector) reads `queueing.transactions`, applies fraud detection rules, and branches to:

  - `streaming.transactions.legit`

  - `streaming.transactions.fraud`

- Verify output using `kafka-console-consumer` inside the broker container
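
The consume, evaluate, and branch step can be sketched with kafka-python as follows. This is not a copy of `detector_app.py`: the JSON message format, the consumer group name, the broker address, and treating exactly $900 as legitimate are assumptions. Only the topic names and the "over $900" rule come from this README.

```python
import json

INPUT_TOPIC = "queueing.transactions"
LEGIT_TOPIC = "streaming.transactions.legit"
FRAUD_TOPIC = "streaming.transactions.fraud"
FRAUD_THRESHOLD = 900.0  # amounts over $900 are flagged


def route(txn: dict) -> str:
    """Pick the output topic: strictly above the threshold means fraud."""
    return FRAUD_TOPIC if txn["amount"] > FRAUD_THRESHOLD else LEGIT_TOPIC


def run_detector(bootstrap_servers: str = "localhost:9092") -> None:
    """Consume transactions and republish each one to its routed topic."""
    # Imported lazily so route() can be tested without kafka-python installed.
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        INPUT_TOPIC,
        bootstrap_servers=bootstrap_servers,
        group_id="fraud-detector",  # hypothetical group name
        auto_offset_reset="earliest",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=lambda value: json.dumps(value).encode("utf-8"),
    )
    for record in consumer:
        # record.value is the decoded transaction dict.
        producer.send(route(record.value), value=record.value)
```

Keeping `route` separate from the Kafka wiring lets the business rule be unit-tested on plain dictionaries, for example `route({"amount": 950.0})`.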

## 📈 Outputs
- Two Kafka topics with processed messages:

  - `streaming.transactions.legit` (valid transactions)

  - `streaming.transactions.fraud` (flagged transactions)

- Console logs showing consumed/produced records

- Demonstration of near real-time fraud detection pipeline

## 📸 Evidence

![01_docker_running.png](./evidence/01_docker_running.png)
Screenshot of Dockerized Kafka running

![02_code_being_executed.png](./evidence/02_code_being_executed.png)
Screenshot of code execution

![03_legit_transactions.png](./evidence/03_legit_transactions.png)
Screenshot of legitimate transactions

![04_fraudulent_transactions.png](./evidence/04_fraudulent_transactions.png)
Screenshot of fraudulent transactions

## 📎 Deliverables

- [`docker-compose.yml`](./deliverables/docker-compose.yml)

- [`detector_requirements.txt`](./deliverables/detector_requirements.txt)

- [`detector_app.py`](./deliverables/detector_app.py)

- [`generator_requirements.txt`](./deliverables/generator_requirements.txt)

- [`generator_app.py`](./deliverables/generator_app.py)

## 🛠️ Architecture
- Multi-container Docker environment

- Services:

  - Producer app → Kafka broker

  - Detector app (consumer + branching producer)

  - ZooKeeper for coordination

- Data flow:

  generator → `queueing.transactions` → detector → (fraud or legit topics)

## 🔍 Monitoring
- Kafka CLI tools (kafka-console-consumer) to inspect topics

- Docker logs for generator and detector services

- Broker logs for message flow validation
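
For a scripted alternative to eyeballing `kafka-console-consumer` output, the sketch below samples an output topic with kafka-python and reports any record whose amount contradicts the topic it landed in. The helper names, the JSON message format, the broker address, and the 5-second idle timeout are assumptions.

```python
import json


def misrouted(records: list, topic: str, threshold: float = 900.0) -> list:
    """Return records whose amount contradicts the topic they were read from."""
    expect_fraud = topic.endswith(".fraud")
    return [r for r in records if (r["amount"] > threshold) != expect_fraud]


def sample_topic(topic: str, bootstrap_servers: str = "localhost:9092",
                 limit: int = 100) -> list:
    """Read up to `limit` decoded messages from the start of a topic."""
    # Imported lazily so misrouted() works without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",
        consumer_timeout_ms=5000,  # stop iterating after 5 s without messages
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    records = []
    for record in consumer:
        records.append(record.value)
        if len(records) >= limit:
            break
    consumer.close()
    return records
```

With the cluster running, `misrouted(sample_topic("streaming.transactions.fraud"), "streaming.transactions.fraud")` should return an empty list if every flagged transaction exceeds $900.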

## ♻️ Cleanup
- Stop services: `docker-compose down`

- Remove local Docker volumes for Kafka logs/state if re-running

- Delete external Docker network if created manually

*Generated automatically via Python + Jinja2 + SQL Server table `tblMiniProjectProgress` on 11-11-2025 15:31:05*