https://github.com/gamussa/parkflow
Real-time parking management system with stream processing and analytics capabilities.
https://github.com/gamussa/parkflow
duckdb flink-stream-processing kafka kafka-streams python uv
Last synced: about 2 months ago
JSON representation
Real-time parking management system with stream processing and analytics capabilities.
- Host: GitHub
- URL: https://github.com/gamussa/parkflow
- Owner: gAmUssA
- License: mit
- Created: 2024-12-17T18:34:28.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-23T23:16:53.000Z (over 1 year ago)
- Last Synced: 2025-01-03T04:41:47.212Z (over 1 year ago)
- Topics: duckdb, flink-stream-processing, kafka, kafka-streams, python, uv
- Language: Kotlin
- Homepage:
- Size: 558 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 2
-
Metadata Files:
- Readme: README.adoc
Awesome Lists containing this project
README
= ParkFlow
:toc: macro
:toc-title: Table of Contents
:toclevels: 3
[.lead]
PARKFLOWβ’ (Parking Analytics & Real-time Knowledge Framework for Live Operations Window).
Real-time parking management system with stream processing and analytics capabilities.
WARNING: Not an actual system, just a demo of integrating a bunch of technologies of Modern Data Stack β Kafka for Streaming, DuckDB for Analytics, Plotly/Dash for visualisation.
== Quick Start
[source,bash]
----
# Clone and start the project
git clone https://github.com/yourusername/parkflow.git
cd parkflow
make install # Install dependencies
make start # Start all services
# Send some test events
make simulate EVENTS=20 DELAY=500
# Open dashboard
open http://localhost:8050
----
toc::[]
== About ParkFlow
ParkFlow is a modern parking management system designed to handle real-time vehicle entry/exit events, process payments, and provide instant analytics.
Key features:
* Real-time vehicle detection and entry/exit management
* Stream processing of parking events using Kafka Streams and Apache Flink
* Analytics dashboard with DuckDB for quick insights
* Scalable microservices architecture using Kotlin and Python
* Modern web interface built with Plotly Dash
== Project Structure
Each component is designed to be independent and scalable.
The system follows modern microservices architecture principles.
All services communicate through well-defined APIs and event streams.
image::img.png[]
[cols="3"]
|===
|Module |Status |Description
|`parkflow-common`
|π’ Ready
|Shared models and utilities.
This module contains code used across multiple services.
|`parkflow-entry-exit`
|π’ Ready
|Gate control service.
Manages vehicle entry and exit points.
|`parkflow-dashboard`
|π’ Ready
|Plotly Dash UI.
Provides real-time visualization and monitoring.
|`parkflow_cli`
|π’ Ready
|Command Line Interface.
Provides development tools and utilities.
|`parkflow-stream`
|π‘ In Progress
|Kafka Streams processor.
Handles real-time event processing and state updates.
|`parkflow-analytics`
|π‘ In Progress
|DuckDB analytics engine.
Provides fast analytical queries and insights.
|`parkflow-flink`
|π΄ Not Started
|Flink processor implementation.
Provides advanced stream processing capabilities.
|`parkflow-payment`
|π΄ Not Started
|Payment processing service.
Handles all financial transactions.
|`parkflow-gateway`
|π΄ Not Started
|API Gateway.
Routes requests and handles service discovery.
|===
== Technology Stack
=== JVM Services
* Kotlin with Gradle (Kotlin DSL)
=== Python Services
* Python 3.11+ with UV package manager
=== Stream Processing
* Kafka Streams 3.6.1
* Apache Flink 1.19
=== Services & Analytics
* REST Services: FastAPI
* Analytics: DuckDB
* Dashboards: Plotly/Dash
* Event Schemas: Apache Avro
=== Testing
* JVM: Kotest with TestContainers
* Python: pytest with pytest-asyncio
* Coverage minimum: 80%
== Event Schema and Flow
=== Event Types
==== VehicleEntryEvent
Records vehicle entry into parking facility:
* Unique event ID and timestamp
* License plate and recognition confidence
* Gate and lane identifiers
* Optional vehicle image URL
* Vehicle type (CAR, MOTORCYCLE, TRUCK)
==== PaymentEvent
Records parking payment transactions:
* Unique event and transaction IDs
* License plate reference
* Amount and currency
* Payment method (CREDIT_CARD, DEBIT_CARD, MOBILE_PAYMENT, CASH)
* Payment status (PENDING, COMPLETED, FAILED, REFUNDED)
* Parking duration
==== VehicleExitEvent
Records vehicle exit from parking facility:
* Unique event ID and timestamp
* License plate and recognition confidence
* Gate and lane identifiers
* Reference to entry event
* Optional vehicle image URL
=== Event Flow Logic
The system generates events following a strict sequence:
. *Vehicle Entry*
** Generated when capacity allows
** More frequent during peak hours (8-9 AM, 4-5 PM)
** Less frequent during quiet periods (11 PM - 5 AM)
. *Payment Processing*
** Occurs after minimum 5-minute stay
** 85% of sessions require payment
** Payment amount based on duration (base $2 + $3/hour, max $25)
** Payment methods distribution:
*** Credit Card: 70%
*** Debit Card: 25%
*** Cash: 5%
. *Vehicle Exit*
** Generated after payment completion (if required)
** Must reference original entry event
** Completes the parking session
=== Stay Duration Patterns
* Quick stops: 5-15 minutes (10% of sessions)
* Shopping: 1-3 hours (60% of sessions)
* Work/Long-term: 8-10 hours (30% of sessions)
=== Traffic Patterns
* Peak Hours (8-9 AM, 4-5 PM)
** 80% chance of new entry
** 6-12 second intervals
* Normal Hours
** 30% chance of new entry
** 20-60 second intervals
* Quiet Period (11 PM - 5 AM)
** 10% chance of new entry
** 10-15 minute intervals
== Analytics Service (DuckDB)
The analytics service provides a REST API for querying and analyzing parking data using DuckDB.
=== API Endpoints
[cols="2,1,2"]
|===
|Endpoint |Method |Description
|`/health`
|GET
|Check service health
|`/query`
|POST
|Execute SQL queries
|`/upload`
|POST
|Upload CSV files to DuckDB tables
|`/tables`
|GET
|List available tables
|`/schema/{table_name}`
|GET
|Get schema for a specific table
|`/analyze/{table_name}`
|POST
|Get basic statistics for a table
|===
=== Example Usage
.Check service health
[source,bash]
----
curl http://localhost:3000/health
----
.Execute a SQL query
[source,bash]
----
curl -X POST http://localhost:3000/query \
-H "Content-Type: application/json" \
-d '{"query": "SELECT * FROM parking_events LIMIT 5"}'
----
.Upload a CSV file
[source,bash]
----
curl -X POST http://localhost:3000/upload \
-F "file=@data.csv" \
-F "table_name=parking_events"
----
.Get table schema
[source,bash]
----
curl http://localhost:3000/schema/parking_events
----
.Get table statistics
[source,bash]
----
curl -X POST http://localhost:3000/analyze/parking_events \
-H "Content-Type: application/json" \
-d '{"columns": ["duration", "amount"]}'
----
=== Docker Configuration
The DuckDB service is containerized using Docker with the following features:
* Uses official DuckDB binary (v1.1.0)
* FastAPI-based REST interface
* Persistent storage in `/data/analytics.db`
* Health monitoring
* CORS support for web clients
=== Environment Variables
[cols="1,2"]
|===
|Variable |Description
|`DUCKDB_DATABASE`
|Path to DuckDB database file (default: `/data/analytics.db`)
|`PYTHONUNBUFFERED`
|Python output buffering (set to 1 for immediate logs)
|===
== Development Setup
=== Prerequisites
* JDK 17+
* Python 3.11+
* Docker and Docker Compose
* UV package manager
=== Building the Project
.JVM services
[source,bash]
----
./gradlew build
----
.Python services
[source,bash]
----
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
----
=== Running Infrastructure
[source,bash]
----
docker-compose up -d
----
== Testing
.JVM Services
[source,bash]
----
./gradlew test
----
.Python Services
[source,bash]
----
pytest
----
[NOTE]
====
Coverage reports will be generated in:
* JVM services: `build/reports`
* Python services: `.coverage`
====
== Running the Applications
=== Prerequisites
* Java 17 or higher
* Docker and Docker Compose
* Gradle 8.x
=== Running parkflow-entry-exit
==== Local Development
To run the application locally with default settings:
[source,bash]
----
./gradlew parkflow-entry-exit:run
----
This will use the following default configuration:
* Kafka Bootstrap Servers: localhost:29092
* Schema Registry URL: http://localhost:8081
* Application Port: 8085
* Host: 0.0.0.0
==== Environment Variables
You can customize the application behavior using environment variables:
[cols="1,1,1"]
|===
|Variable |Description |Default Value
|KAFKA_BOOTSTRAP_SERVERS
|Comma-separated list of Kafka brokers
|localhost:29092
|KAFKA_TOPIC
|Name of the Kafka topic
|parking.entry.events
|SCHEMA_REGISTRY_URL
|URL of the Schema Registry
|http://localhost:8081
|PORT
|Application port
|8085
|HOST
|Application host
|0.0.0.0
|===
==== Running with Different Profiles
Use the provided script to set environment variables for different profiles:
[source,bash]
----
# For local development
source ./scripts/set-profile.sh local
# For cloud deployment
source ./scripts/set-profile.sh cloud
----
==== Testing the Application
===== Single Event
[source,bash]
----
curl -X POST http://localhost:8085/api/v1/entry/event
----
===== Simulation
[source,bash]
----
curl -X POST http://localhost:8085/api/v1/entry/simulate \
-H "Content-Type: application/json" \
-d '{"numberOfEvents": 10, "delayBetweenEventsMs": 1000}'
----
=== Infrastructure Services
The application requires the following services that are defined in docker-compose.yml:
* Kafka (apache/kafka:3.8.0)
** Running in Kraft mode
** External port: 29092
** Internal port: 9092
* Schema Registry (confluentinc/cp-schema-registry:7.8.0)
** Port: 8081
** Depends on Kafka
* DuckDB Analytics
** Port: 3000
** Mounted volume: ./data
=== Troubleshooting
==== Common Issues
1. Port Already in Use
+
[source,bash]
----
lsof -i :8085 # Check if port 8085 is in use
kill -9 # Kill the process if needed
----
2. Kafka Connection Issues
+
[source,bash]
----
# Check if Kafka is running
docker compose ps
# Check Kafka logs
docker compose logs kafka
# Restart Kafka
docker compose restart kafka
----
3. Schema Registry Issues
+
[source,bash]
----
# Check Schema Registry status
curl -X GET http://localhost:8081
# Check Schema Registry logs
docker compose logs schema-registry
----
=== Health Checks
The application provides health check endpoints:
* Kafka: Check via producer metrics
* Schema Registry: Available at http://localhost:8081/subjects
* Application: Main endpoint at http://localhost:8085/api/v1/entry/event
== Contributing
We welcome contributions!
Here's how you can help:
1. *Fork the Repository*
* Fork the repo on GitHub
* Clone your fork locally
2. *Create a Branch*
* Create a new branch for your feature
* Make your changes
* Write or update tests as needed
3. *Submit a Pull Request*
* Push your changes to your fork
* Submit a pull request to the main repository
* Describe your changes in detail
* Link any relevant issues
4. *Code Review*
* Wait for review from maintainers
* Make any requested changes
* Once approved, your PR will be merged
=== Development Guidelines
* Follow existing code style and conventions
* Write meaningful commit messages
* Add tests for new features
* Update documentation as needed
* Keep PRs focused and atomic
== License
This project is licensed under the MIT License.
See the link:LICENSE[LICENSE] file for details.
Copyright (c) 2024 Viktor Gamov