{"id":49874331,"url":"https://github.com/mtholahan/kafka-mini-project","last_synced_at":"2026-05-15T11:41:30.273Z","repository":{"id":314153192,"uuid":"1054317029","full_name":"mtholahan/kafka-mini-project","owner":"mtholahan","description":"Built a streaming fraud detection system with Apache Kafka and Python. Deployed a Kafka cluster via Docker Compose, implemented a transaction generator and fraud detector using kafka-python, and routed suspicious transactions to separate topics for real-time monitoring. Demonstrates event streaming, producers, consumers, and containerization.","archived":false,"fork":false,"pushed_at":"2025-11-11T20:31:08.000Z","size":513,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-11T22:23:05.925Z","etag":null,"topics":["bootcamp","consumers","data-engineering","docker","docker-compose","event-driven","fraud-detection","kafka","producers","python","springboard","streaming"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mtholahan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-10T17:04:10.000Z","updated_at":"2025-11-11T20:31:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"d4392662-c9db-4631-96c0-16df70dfef2d","html_url":"https://github.com/mtholahan/kafka-mini-project","commit_stats":null,"previous_names":["mtholahan/kafka-mini-project"],"tags_count":0
,"template":false,"template_full_name":null,"purl":"pkg:github/mtholahan/kafka-mini-project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtholahan%2Fkafka-mini-project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtholahan%2Fkafka-mini-project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtholahan%2Fkafka-mini-project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtholahan%2Fkafka-mini-project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mtholahan","download_url":"https://codeload.github.com/mtholahan/kafka-mini-project/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtholahan%2Fkafka-mini-project/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33066031,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-15T11:35:32.926Z","status":"ssl_error","status_checked_at":"2026-05-15T11:35:31.362Z","response_time":103,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bootcamp","consumers","data-engineering","docker","docker-compose","event-driven","fraud-detection","kafka","producers","python","springboard","streaming"],"created_at":"2026-05-15T11:41:29.788Z","updated_at":"2026-05-15T11:41:30.268Z","avatar_url":"https://github.com/mtholahan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Kafka Mini Project\r\n\r\n\r\n## 📖 Abstract\r\nThis project implements a real-time fraud detection pipeline using Apache Kafka and Python. The system simulates financial transactions, streams them through Kafka, and applies rule-based filtering to flag suspicious activity. 
The goal is to gain practical experience with streaming architectures, producers, consumers, and containerized deployments.\r\n\r\nThe workflow includes:\r\n\r\n* Running a local Kafka cluster using Docker Compose with broker and ZooKeeper services.\r\n\r\n* Building a transaction generator that continuously produces randomized account transfers into a Kafka topic.\r\n\r\n* Creating a fraud detector application that consumes transactions, evaluates them against business rules, and branches outputs into \"legit\" or \"fraud\" topics.\r\n\r\n* Packaging all components with Dockerfiles, requirements.txt, and docker-compose.yml for reproducibility.\r\n\r\n* Verifying results by consuming messages from output topics, confirming that transactions over $900 are correctly flagged as fraudulent.\r\n\r\nThrough this project, I gained hands-on skills in stream processing, Kafka topic design, producer/consumer APIs, and containerized workflow orchestration, while also exploring real-world challenges in fraud detection systems.\r\n\r\n\r\n\r\n## 🛠 Requirements\r\n- Docker Engine 20.x or later\r\n- Docker Compose v2\r\n- Ubuntu 22.04 LTS environment (tested)\r\n- docker-compose.yml defining all services:\r\n  - zookeeper (Confluent cp-zookeeper)\r\n  - kafka broker (Confluent cp-kafka)\r\n  - generator (Python producer app)\r\n  - detector (Python consumer/producer app)\r\n- Python dependency (inside app containers):\r\n  - kafka-python\r\n\r\n\r\n\r\n## 🧰 Setup\r\n- Clone the repository and navigate to the kafka-docker/ directory\r\n- Build images: docker-compose build --no-cache\r\n- Start cluster + apps: docker-compose up -d\r\n- Verify broker startup logs (Kafka ready)\r\n- Verify that the generator and detector services are running\r\n- Inspect Kafka topics via kafka-console-consumer from the broker container\r\n\r\n\r\n\r\n## 📊 Dataset\r\n- Streaming data consists of synthetic transactions generated by the producer app\r\n- Transaction 
schema includes: transaction_id, account_id, timestamp, amount, merchant, location\r\n\r\n\r\n\r\n## ⏱️ Run Steps\r\n- Start services with: docker-compose up -d\r\n- Producer (generator) writes messages into topic: queueing.transactions\r\n- Consumer (detector) reads queueing.transactions, applies fraud detection rules, and branches to:\r\n  - streaming.transactions.legit\r\n  - streaming.transactions.fraud\r\n- Verify output using kafka-console-consumer inside the broker container\r\n\r\n\r\n\r\n## 📈 Outputs\r\n- Two Kafka topics with processed messages:\r\n  - streaming.transactions.legit (valid transactions)\r\n  - streaming.transactions.fraud (flagged transactions)\r\n- Console logs showing consumed/produced records\r\n- Demonstration of near real-time fraud detection pipeline\r\n\r\n\r\n\r\n## 📸 Evidence\r\n\r\n![01_docker_running.png](./evidence/01_docker_running.png)  \r\nScreenshot of Dockerized Kafka running\r\n\r\n![02_code_being_executed.png](./evidence/02_code_being_executed.png)  \r\nScreenshot of code execution\r\n\r\n![03_legit_transactions.png](./evidence/03_legit_transactions.png)  \r\nScreenshot of legitimate transactions\r\n\r\n![04_fraudulent_transactions.png](./evidence/04_fraudulent_transactions.png)  \r\nScreenshot of fraudulent transactions\r\n\r\n\r\n\r\n\r\n## 📎 Deliverables\r\n\r\n- [`docker-compose.yml`](./deliverables/docker-compose.yml)\r\n\r\n- [`detector_requirements.txt`](./deliverables/detector_requirements.txt)\r\n\r\n- [`detector_app.py`](./deliverables/detector_app.py)\r\n\r\n- [`generator_requirements.txt`](./deliverables/generator_requirements.txt)\r\n\r\n- [`generator_app.py`](./deliverables/generator_app.py)\r\n\r\n\r\n\r\n\r\n## 🛠️ Architecture\r\n- Multi-container Docker environment\r\n- Services:\r\n  - Producer app → Kafka broker\r\n  - Detector app (consumer + branching producer)\r\n  - ZooKeeper for coordination\r\n- Data flow:\r\n  generator → queueing.transactions → detector → (fraud or 
legit topics)\r\n\r\n\r\n\r\n## 🔍 Monitoring\r\n- Kafka CLI tools (kafka-console-consumer) to inspect topics\r\n- Docker logs for generator and detector services\r\n- Broker logs for message flow validation\r\n\r\n\r\n\r\n## ♻️ Cleanup\r\n- Stop services: docker-compose down\r\n- Remove local Docker volumes for Kafka logs/state if re-running\r\n- Delete external Docker network if created manually\r\n\r\n\r\n*Generated automatically via Python + Jinja2 + SQL Server table `tblMiniProjectProgress` on 11-11-2025 15:31:05*","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtholahan%2Fkafka-mini-project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmtholahan%2Fkafka-mini-project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtholahan%2Fkafka-mini-project/lists"}