{"id":15176590,"url":"https://github.com/ajschofield/de-project-bentley","last_synced_at":"2025-10-01T14:32:21.607Z","repository":{"id":253394832,"uuid":"841353608","full_name":"ajschofield/de-project-bentley","owner":"ajschofield","description":"A project demonstrating an ETL pipeline primarily using AWS infrastructure into a data warehouse","archived":true,"fork":false,"pushed_at":"2024-09-03T15:25:03.000Z","size":381,"stargazers_count":2,"open_issues_count":4,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-09-28T13:21:31.142Z","etag":null,"topics":["aws","eventbridge","lambda","northcoders","python","rds-postgres","step-functions","terraform"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ajschofield.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-12T08:36:45.000Z","updated_at":"2024-09-03T15:26:09.000Z","dependencies_parsed_at":"2024-08-27T09:45:39.597Z","dependency_job_id":"28caa73c-4587-4e91-9b76-445bcb19e7ec","html_url":"https://github.com/ajschofield/de-project-bentley","commit_stats":null,"previous_names":["ajschofield/de-project-bentley"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ajschofield%2Fde-project-bentley","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ajschofield%2Fde-project-bentley/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ajschofield%2Fde-project-bentley/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ajschofield%2Fde-project-bentley/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ajschofield","download_url":"https://codeload.github.com/ajschofield/de-project-bentley/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234878315,"owners_count":18900676,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","eventbridge","lambda","northcoders","python","rds-postgres","step-functions","terraform"],"created_at":"2024-09-27T13:21:36.608Z","updated_at":"2025-10-01T14:32:21.235Z","avatar_url":"https://github.com/ajschofield.png","language":"Python","readme":"\u003e [!NOTE]\n\u003e Since my team and I have graduated from the Northcoders Data Engineering course, this project will be archived and made read-only.\n\u003e I am continuing this project solo [here](https://github.com/ajschofield/ETL-Project), where I will be adding more features\n\u003e over time.\n\n# ToteSys - Data Engineering 
Project\n[![Python](https://img.shields.io/badge/Python-FFD43B?style=for-the-badge\u0026logo=python\u0026logoColor=blue)](https://www.python.org/)\n[![AWS](https://img.shields.io/badge/Amazon_AWS-FF9900?style=for-the-badge\u0026logo=amazonaws\u0026logoColor=white)](https://aws.amazon.com/)\n[![Terraform](https://img.shields.io/badge/Terraform-7B42BC?style=for-the-badge\u0026logo=terraform\u0026logoColor=white)](https://www.terraform.io/)\n[![Postgresql](https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white)](https://www.postgresql.org/)\n[![GitHub Actions](https://img.shields.io/badge/GitHub_Actions-2088FF?style=for-the-badge\u0026logo=github-actions\u0026logoColor=white)](https://github.com/features/actions)\n\n[![Terraform Main Deployment Workflow Status](https://img.shields.io/github/actions/workflow/status/ajschofield/de-project-bentley/deploy.yml?branch=main\u0026style=flat-square\u0026label=deploy)](https://github.com/ajschofield/de-project-bentley/actions/workflows/deploy.yml?query=branch%3Amain)\n[![Production Environment Status](https://img.shields.io/github/deployments/ajschofield/de-project-bentley/production?style=flat-square\u0026label=env)](https://github.com/ajschofield/de-project-bentley/deployments/production)\n\n# Contributors\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://github.com/ellsymonds\"\u003e\n        \u003cimg src=\"https://github.com/ellsymonds.png\" width=\"100px;\" alt=\"ellsymonds\"/\u003e\n        \u003cbr /\u003e\n        \u003csub\u003e\u003cb\u003eEllie Symonds\u003c/b\u003e\u003c/sub\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://github.com/lian-manonog\"\u003e\n        \u003cimg src=\"https://github.com/lian-manonog.png\" width=\"100px;\" alt=\"lian-manonog\"/\u003e\n        \u003cbr /\u003e\n        \u003csub\u003e\u003cb\u003eLianmei 
Manon-og\u003c/b\u003e\u003c/sub\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://github.com/T-Aji\"\u003e\n        \u003cimg src=\"https://github.com/T-Aji.png\" width=\"100px;\" alt=\"T-Aji\"/\u003e\n        \u003cbr /\u003e\n        \u003csub\u003e\u003cb\u003eTolu Ajibade\u003c/b\u003e\u003c/sub\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://github.com/HastarTara\"\u003e\n        \u003cimg src=\"https://github.com/HastarTara.png\" width=\"100px;\" alt=\"HastarTara\"/\u003e\n        \u003cbr /\u003e\n        \u003csub\u003e\u003cb\u003eJoslin Rashleigh\u003c/b\u003e\u003c/sub\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://github.com/bulve-ad\"\u003e\n        \u003cimg src=\"https://github.com/bulve-ad.png\" width=\"100px;\" alt=\"bulve-ad\"/\u003e\n        \u003cbr /\u003e\n        \u003csub\u003e\u003cb\u003eAnzelika Belotelova\u003c/b\u003e\u003c/sub\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003ca href=\"https://github.com/ajschofield\"\u003e\n        \u003cimg src=\"https://github.com/ajschofield.png\" width=\"100px;\" alt=\"ajschofield\"/\u003e\n        \u003cbr /\u003e\n        \u003csub\u003e\u003cb\u003eAlex Schofield\u003c/b\u003e\u003c/sub\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n# Summary\nThe project aims to implement a data platform that can extract data from an\noperational database, archive it in a data lake, and make it easily accessible\nwithin a remodelled OLAP data warehouse.\n\nThe solution showcases our skills in:\n\n- Python\n- PostgreSQL\n- Database modelling\n- Amazon Web Services (AWS)\n- Agile methodologies\n\n# Main Objectives\n\nOur goal is to create a reliable ETL (Extract, Transform, Load) 
pipeline that\ncan:\n\n1. Extract the data from the `totesys` operational database\n2. Store the data in AWS S3 buckets, which will form our data lake\n3. Transform the data into a suitable schema for the data warehouse\n4. Load the transformed data into the data warehouse hosted on AWS\n\n# Key Features\n\nWe aim for the project to have the following features, some of which have\nhigher priority than others.\n\n- Automated data ingestion from the `totesys` database\n- Data storage for ingested and processed data in S3 buckets\n- Data transformation into the data warehouse schema\n- Automated data loading into the data warehouse schema\n- Logging and monitoring with CloudWatch\n- Notifications for errors and successful runs (e.g. successful ingestion)\n- Visualisation of warehouse data\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fajschofield%2Fde-project-bentley","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fajschofield%2Fde-project-bentley","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fajschofield%2Fde-project-bentley/lists"}