https://github.com/m77rahman/uk-transit-weather
TfL + Open-Meteo ETL → DuckDB → Streamlit. Hourly GitHub Actions.
- Host: GitHub
- URL: https://github.com/m77rahman/uk-transit-weather
- Owner: M77Rahman
- License: other
- Created: 2025-09-17T23:20:31.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-09-20T12:31:49.000Z (9 months ago)
- Last Synced: 2025-09-20T13:05:49.455Z (9 months ago)
- Topics: data-engineering, duckdb, etl, open-meteo, python, streamlit, tfl
- Language: Python
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# UK Transit + Weather
## Overview
This project is a small end-to-end data workflow that combines live transit and weather data into a structured reporting pipeline.
It pulls data from:
- **TfL API** for transit information
- **Open-Meteo API** for hourly weather data
The workflow then:
- extracts and cleans the data
- applies basic validation and reliability checks
- stores the data in **DuckDB**
- presents the outputs through a **Streamlit dashboard**
- runs on a schedule using **GitHub Actions**
The project was built to strengthen practical skills in ETL-style thinking, API integration, structured data handling, dashboard delivery, and workflow reliability.
---
## Project Goal
The goal of this project is to show how live external data can be collected, transformed, stored, and presented in a way that is structured and repeatable.
Rather than a one-off script, the project is designed as a lightweight data workflow with:
- clear project structure
- scheduled execution
- validation and error handling
- dashboard output
- tests and documentation
---
## Data Sources
### TfL API
Used to retrieve transit-related data.
### Open-Meteo API
Used to retrieve hourly weather data.
---
## Workflow
### 1. Extract
The pipeline requests data from the TfL and Open-Meteo APIs.
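
As a rough illustration of this step, the sketch below builds an Open-Meteo request and fetches JSON using only the standard library. The function names, the coordinates in the usage note, the choice of the `temperature_2m` variable, and the TfL tube status endpoint are assumptions made for the example; the repository's own extract code may differ.

```python
import json
import urllib.parse
import urllib.request

# Public endpoints. Which TfL endpoint the project actually calls is an assumption.
OPEN_METEO_URL = "https://api.open-meteo.com/v1/forecast"
TFL_STATUS_URL = "https://api.tfl.gov.uk/Line/Mode/tube/Status"


def build_weather_url(latitude: float, longitude: float) -> str:
    """Build an Open-Meteo URL requesting hourly temperature for one location."""
    query = urllib.parse.urlencode(
        {"latitude": latitude, "longitude": longitude, "hourly": "temperature_2m"}
    )
    return f"{OPEN_METEO_URL}?{query}"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON body; raises on HTTP or network errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(build_weather_url(51.5, -0.12))` would request hourly weather for central London, and `fetch_json(TFL_STATUS_URL)` would request the current tube line statuses.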
### 2. Transform
The raw API responses are cleaned and reshaped into a consistent tabular format for loading and querying.
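
One way to picture this step: Open-Meteo returns hourly data as parallel arrays under an `hourly` key, which can be reshaped into one record per hour. The function name `hourly_to_rows` and the output field `temperature_c` are hypothetical, and the sketch assumes the request asked for `temperature_2m` in the default Celsius unit.

```python
def hourly_to_rows(payload: dict) -> list[dict]:
    """Turn Open-Meteo's column-oriented hourly block into one dict per hour."""
    hourly = payload["hourly"]
    return [
        {"time": time, "temperature_c": temperature}
        for time, temperature in zip(hourly["time"], hourly["temperature_2m"])
    ]
```

Row-oriented records like these are easier to validate one at a time and to insert into a table.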
### 3. Validate
Basic checks are applied before loading so that incomplete or inconsistent records do not reach the stored data or the dashboard.
Examples include:
- missing value checks
- consistency checks
- handling failed or incomplete API responses
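
The checks listed above could look something like the following minimal sketch. The function names, the row fields (`time`, `temperature_c`), and the plausible temperature range are all assumptions for illustration, not the project's actual rules.

```python
def payload_is_complete(payload: dict) -> bool:
    """Reject responses with a missing, empty, or ragged hourly block."""
    hourly = payload.get("hourly") or {}
    times = hourly.get("time")
    temps = hourly.get("temperature_2m")
    return bool(times) and temps is not None and len(times) == len(temps)


def validate_rows(rows: list[dict]) -> tuple[list[dict], list[tuple[dict, str]]]:
    """Split rows into valid rows and (row, reason) pairs for rejected rows."""
    valid, problems = [], []
    seen_times = set()
    for row in rows:
        time, temp = row.get("time"), row.get("temperature_c")
        if time is None or temp is None:
            problems.append((row, "missing value"))
        elif time in seen_times:
            problems.append((row, "duplicate timestamp"))
        elif not -60.0 <= temp <= 60.0:  # hypothetical sanity range in Celsius
            problems.append((row, "temperature out of range"))
        else:
            seen_times.add(time)
            valid.append(row)
    return valid, problems
```

Returning the rejected rows with a reason, rather than silently dropping them, makes it possible to log or count data quality problems on each run.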
### 4. Load
The cleaned data is stored in **DuckDB** for structured querying and dashboard use.
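
A minimal sketch of the load step follows. It takes an already open connection, such as one returned by `duckdb.connect()`, and relies only on the `execute`, `executemany`, and `fetchone` methods. The table name `weather_hourly`, its columns, and the function name are hypothetical.

```python
def load_rows(con, rows: list[dict]) -> int:
    """Append rows to a weather table and return the table's total row count."""
    con.execute(
        "CREATE TABLE IF NOT EXISTS weather_hourly "
        "(time VARCHAR, temperature_c DOUBLE)"
    )
    if rows:  # skip the insert entirely when there is nothing to load
        con.executemany(
            "INSERT INTO weather_hourly VALUES (?, ?)",
            [(row["time"], row["temperature_c"]) for row in rows],
        )
    return con.execute("SELECT COUNT(*) FROM weather_hourly").fetchone()[0]
```

With DuckDB installed, `load_rows(duckdb.connect("transit_weather.duckdb"), rows)` would persist the rows to a local database file (the file name is an assumption), which the dashboard could then query.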
### 5. Present
A **Streamlit dashboard** displays the processed outputs in a user-friendly way.
### 6. Automate
The workflow is scheduled through **GitHub Actions** to run hourly, so the stored data and dashboard stay current without manual runs.
---
## Tech Stack
- **Python**
- **TfL API**
- **Open-Meteo API**
- **DuckDB**
- **Streamlit**
- **GitHub Actions**
- **Pytest** (for testing)
---
## Repository Structure
```text
uk-transit-weather/
│
├── .github/workflows/ # Scheduled GitHub Actions workflow
├── src/ # Pipeline and application source code
├── tests/ # Test files
├── .env.example # Example environment variables
├── requirements.txt # Project dependencies
└── README.md # Project documentation