https://github.com/konh/analyticspipeline
The opinionated approach combines a client-side SDK, operative + historical data storage, and a data mart
https://github.com/konh/analyticspipeline
analytics business-intelligence csharp data-driven data-storage dotnet unity
Last synced: 4 months ago
JSON representation
The opinionated approach combines a client-side SDK, operative + historical data storage, and a data mart
- Host: GitHub
- URL: https://github.com/konh/analyticspipeline
- Owner: KonH
- License: mit
- Created: 2024-03-04T09:06:46.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-06T11:08:55.000Z (about 1 year ago)
- Last Synced: 2025-02-01T16:26:09.178Z (5 months ago)
- Topics: analytics, business-intelligence, csharp, data-driven, data-storage, dotnet, unity
- Language: ShaderLab
- Homepage:
- Size: 1.36 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# AnalyticsPipeline
## Summary
As organizations increasingly rely on data-driven insights, creating a robust analytics system becomes paramount.
My project aims to deliver an opinionated approach to building such a system, combining efficiency, scalability, and practicality.
Let’s dive into the key aspects:### Objective and Scope:
- Analytics system serves as a bridge between raw data and actionable insights. It empowers decision-makers by providing real-time and historical data analysis.
- That system focus on a medium-scale solution, striking a balance between agility and depth.### Components:
- **Client-Side SDK**: our SDK facilitates seamless data collection from various sources (Unity game engine: standalone, web, mobile apps). It ensures minimal impact on user experience while capturing essential events.
- **Operative and Historical Data Storage:**
- **Operative Data:** real-time data storage for immediate processing. Minimal latency, but limited volume / time range.
- **Historical Data:** long-term storage for trend analysis, reporting, and compliance. Enables retrospective insights and predictive modeling. Full data retention, but higher latency.
- **Data mart** - both data storages allows to query data using SQL and set up useful views for business users.### Technology Stack:
I’ve chosen the .NET ecosystem for several reasons:
- **Unity Game Developers:** .NET is familiar to Unity developers, allowing seamless transitions between game development and analytics.
- **Language Consistency:** Using C# for both client and server-side development streamlines code maintenance and collaboration.
- **Rich Libraries:** Leverage .NET libraries for data manipulation, security, and scalability.### Business Impact:
That analytics system empowers stakeholders to:
- Optimize user experiences based on real-time data.
- Identify bottlenecks, anomalies, and growth opportunities.
- Make informed decisions backed by historical trends.
- Enhance operational efficiency and resource allocation.In summary, my opinionated analytics system combines technical excellence with practicality, enabling data-driven success.
Let’s turn insights into action! 🚀📊## Architecture chart legend
```mermaid
flowchart TD
ICA(Internal Component A: tech descripotion) --> |Sync operation| EC>External Component]
EC -.-> |Async operation| ICB(Internal Component B: tech description)
ICB ==> |Sync write operation| S[(Storage)]
```## Abstract architecture
```mermaid
flowchart TD
C(Client SDK) --> |HTTP POST| R(Data Receiver)
R --> Q>Event Queue]
Q -.-> |Poll| W(Data Writer)
W ==> OD[(Operative Database)]
OD -.-> |Retention period| M(Data Migrator)
M ==> HD[(Historical Database)]
OD --> DM>Data Mart]
HD --> DM
```## Self-hosted architecture
```mermaid
flowchart TD
C(Client SDK: C#, Unity) --> |HTTP POST| LB
subgraph docker compose
LB>nginx]
R(Data Receiver: C#, ASP.NET Core)
Q>redpanda]
W(Data Writer: C#, ASP.NET Core, EF.Core)
M(Data Migrator: C#, ASP.NET Core, EF.Core)
P>Apache Presto]
DM>Data Mart: Apache Superset]
end
LB --> R
R --> Q
Q -.-> |Poll| W
subgraph Local Database
OD[(Operative Database: Postgres)]
end
W ==> OD
OD -.-> |Retention period| M
subgraph Local MiniIO
HD[(Historical database bucket)] --> P
end
M ==> HD
OD --> DM
P --> DM
```## Cloud architecture (AWS-based)
```mermaid
flowchart TD
C(Client SDK: C#, Unity) --> |HTTP POST| ALB>Load balancer: AWS ALB]
ALB --> R
subgraph ECS [AWS ECS]
R(Data Receiver: C#, ASP.NET Core)
W(Data Writer: C#, ASP.NET Core, EF.Core)
M(Data Migrator: C#, ASP.NET Core, EF.Core)
P>Apache Presto]
DM>Data Mart: Apache Superset]
end
R --> Q>Event Queue: AWS SQS]
Q -.-> |Poll| W
subgraph RDS [AWS RDS]
OD[(Operative Database: Postgres)]
end
W ==> OD
OD -.-> |Retention period| M
subgraph S3 [AWS S3]
HD[(Historical database bucket)] --> P
end
M ==> HD
OD --> DM
P --> DM
```## Self-hosted implementation
### Dependencies
- Docker
### Usage
To run whole system:
```
docker-compose up -d
```To run with full services rebuild:
```
docker-compose up -d --build
```#### Superset configuration
To setup Superset, you need to run the following commands:
```
docker exec -it analyticspipeline-superset sh
export FLASK_APP=superset; flask fab create-admin
superset db upgrade
superset init
```Then you can access Superset at `http://localhost:8088`.
To configure access to operative storage, you need to add a new database connection using web UI:
```
Plus Sign -> Data -> Connect database:
> Host: postgres
> Post: 5432
> Database name: analytics_database
> Username: analytics_r_user
> Password: analytics_r_user_password
```### Examples
#### Unity Sample Project

#### Superset Dashboard
