https://github.com/iulian-sandu/football-data-analytics-frf-datacamp
FRF Datacamp - End to end flow with Google Cloud Platform
- Host: GitHub
- URL: https://github.com/iulian-sandu/football-data-analytics-frf-datacamp
- Owner: iulian-sandu
- Created: 2025-07-31T10:00:49.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-08-12T08:28:17.000Z (about 2 months ago)
- Last Synced: 2025-08-12T10:20:00.238Z (about 2 months ago)
- Topics: automation, bigquery, football-analytics, football-data, gcp
- Language: Python
- Homepage:
- Size: 245 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
# Football Data Analytics - Superliga Datacamp - FRF
## Overview
End-to-end football data analytics flow built on Google Cloud services. The current implementation uses dummy data and some manually created resources. A future implementation will fully automate infrastructure provisioning and manage it through a GitOps process.
## Features and functionality
1. Trigger a Python Cloud Run service using Cloud Scheduler.
2. Simulate API scraping and store the data in a Cloud Storage bucket as a JSONL file for backup.
3. Load the data into BigQuery, apply data transformations, and save the transformed data in a new table.
4. Use Looker Studio to create a dashboard based on the transformed data.
## Architecture
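Steps 2 and 3 of the feature list can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names, the object naming scheme, the dummy records, and the use of schema autodetection are all assumptions. The two Google Cloud helpers import their client libraries lazily, so the serialization helpers run without them.

```python
import json
from datetime import datetime, timezone


def to_jsonl(records):
    """Serialize an iterable of dicts as newline-delimited JSON (one object per line)."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def backup_object_name(run_time, prefix="raw/matches"):
    """Build a dated object name so each run keeps its own backup file (hypothetical scheme)."""
    return f"{prefix}/{run_time:%Y-%m-%d}/matches_{run_time:%H%M%S}.jsonl"


def upload_backup(bucket_name, object_name, payload):
    """Upload the JSONL payload to Cloud Storage (requires google-cloud-storage)."""
    from google.cloud import storage  # lazy import: only needed when uploading

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(object_name)
    blob.upload_from_string(payload, content_type="application/x-ndjson")
    return f"gs://{bucket_name}/{object_name}"


def load_into_bigquery(gcs_uri, table_id):
    """Load the JSONL file from Cloud Storage into a BigQuery table (requires google-cloud-bigquery)."""
    from google.cloud import bigquery  # lazy import: only needed when loading

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,  # assumption: let BigQuery infer the schema
    )
    client.load_table_from_uri(gcs_uri, table_id, job_config=job_config).result()


# Dummy records standing in for the simulated API scrape.
dummy_matches = [
    {"match_id": 1, "home": "Team A", "away": "Team B", "home_goals": 2, "away_goals": 1},
    {"match_id": 2, "home": "Team C", "away": "Team D", "home_goals": 0, "away_goals": 0},
]
payload = to_jsonl(dummy_matches)
name = backup_object_name(datetime(2025, 8, 12, 8, 0, 0, tzinfo=timezone.utc))
```

Keeping serialization separate from the upload and load calls means the JSONL output can be checked locally before any cloud resources are involved.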

### Components
1. **Cloud Scheduler** - Runs on a defined schedule and publishes a message to a Pub/Sub topic to start the automated process.
2. **Pub/Sub** - Receives the scheduled message and triggers the main serverless Python service.
3. **Cloud Run** - Main Python service responsible for data ingestion and upload.
4. **BigQuery** - Managed, serverless data warehouse for storing raw and transformed data.
5. **Looker Studio** - Visualization layer using BigQuery as the data source.
## Future Enhancements
- Automate infrastructure deployment using Terraform.
- Automate dashboard creation and sharing.
- Implement a GitOps process for CI/CD.
- Integrate an AI layer (Vertex AI, OpenAI) to create dashboards on demand from user prompts and to generate full pre-match and post-match reports.
- Add alerting, cost control, and monitoring.
- Split the main Python function into multiple Cloud Run services.
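
If the main function is split into several Cloud Run services, each one would need to read the Pub/Sub message that triggers it. The following is a minimal sketch of a shared helper, assuming the topic is connected to Cloud Run through a Pub/Sub push subscription (the README does not say how the trigger is wired). The function name, payload, and subscription path are hypothetical.

```python
import base64
import json


def decode_pubsub_push(body):
    """Return (payload_text, attributes) from a Pub/Sub push request body."""
    envelope = json.loads(body)
    message = envelope.get("message") if isinstance(envelope, dict) else None
    if not isinstance(message, dict):
        raise ValueError("not a Pub/Sub push envelope: missing 'message'")
    # Push deliveries carry the message payload base64-encoded in 'data'.
    data = message.get("data", "")
    text = base64.b64decode(data).decode("utf-8") if data else ""
    return text, message.get("attributes") or {}


# Example body shaped like a push delivery; all values are made up.
example_body = json.dumps({
    "message": {
        "data": base64.b64encode(b'{"job": "ingest"}').decode("ascii"),
        "attributes": {"source": "cloud-scheduler"},
        "messageId": "1",
    },
    "subscription": "projects/example-project/subscriptions/example-sub",
})
text, attrs = decode_pubsub_push(example_body)
```

A web framework handler would pass the raw request body to this helper and return an HTTP 2xx status once processing succeeds, so that Pub/Sub treats the message as acknowledged.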