{"id":24808503,"url":"https://github.com/madhurimarawat/data-warehousing","last_synced_at":"2026-04-28T17:34:27.936Z","repository":{"id":273844004,"uuid":"921060332","full_name":"madhurimarawat/Data-Warehousing","owner":"madhurimarawat","description":"This repository contains practical examples of data warehousing concepts, including star schema and ETL processes, all implemented using MySQL.","archived":false,"fork":false,"pushed_at":"2026-01-15T06:50:43.000Z","size":11021,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-15T13:04:08.172Z","etag":null,"topics":["data-aggregation","data-cleaning","data-cleaning-and-preprocessing","data-warehousing","detailed-documentation","etl","etl-pipeline","mysql","normalization","olap-cube","olap-data","olap-database","query-optimization","snowflake-schema","star-schema"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/madhurimarawat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-23T09:07:09.000Z","updated_at":"2026-01-15T06:50:47.000Z","dependencies_parsed_at":"2025-01-23T10:24:16.323Z","dependency_job_id":"da19b1ab-0bfe-47f8-972a-8a70532fcc91","html_url":"https://github.com/madhurimarawat/Data-Warehousing","commit_stats":null,"previous_names":["madhurimarawat/data-warehousing"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/madhurimarawat/Data-Warehousing","repository_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/madhurimarawat%2FData-Warehousing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madhurimarawat%2FData-Warehousing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madhurimarawat%2FData-Warehousing/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madhurimarawat%2FData-Warehousing/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/madhurimarawat","download_url":"https://codeload.github.com/madhurimarawat/Data-Warehousing/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madhurimarawat%2FData-Warehousing/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32392299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T14:34:11.604Z","status":"ssl_error","status_checked_at":"2026-04-28T14:32:37.009Z","response_time":56,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-aggregation","data-cleaning","data-cleaning-and-preprocessing","data-warehousing","detailed-documentation","etl","etl-pipeline","mysql","normalization","olap-cube","olap-data","olap-database","query-optimization","snowflake-schema","star-schema"],"created_at":"2025-01-30T10:17:47.598Z","updated_at":"2026-04-28T17:34:27.924Z","avatar_url":"https://github.com/madhurimarawat.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data-Warehousing\nThis repository contains practical examples of data warehousing concepts, including star schema and ETL processes, all implemented using MySQL.\n\n\u003cp align=\"center\"\u003e\n  \u003c!-- Repository Size --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/repo-size/madhurimarawat/Data-Warehousing?color=%23E6A8D7\u0026label=Repo%20Size\u0026labelColor=%23D89AC5\u0026style=for-the-badge\u0026logo=github\" alt=\"Repo Size\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Stars --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/stargazers\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/stars/madhurimarawat/Data-Warehousing?color=%23FFB6C1\u0026label=Stars\u0026labelColor=%23F3A6B2\u0026style=for-the-badge\u0026logo=star\" alt=\"GitHub Stars\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Forks --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/network/members\"\u003e\n 
   \u003cimg src=\"https://img.shields.io/github/forks/madhurimarawat/Data-Warehousing?color=%23B3D9D9\u0026label=Forks\u0026labelColor=%23A1D8D8\u0026style=for-the-badge\u0026logo=git\" alt=\"GitHub Forks\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Issues (Open + Closed) --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/issues\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/issues/madhurimarawat/Data-Warehousing?color=%23FFF5C3\u0026label=Open%20Issues\u0026labelColor=%23F9E5A4\u0026style=for-the-badge\u0026logo=bug\" alt=\"GitHub Issues\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/issues?q=is%3Aissue+is%3Aclosed\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/issues-closed/madhurimarawat/Data-Warehousing?color=%23F1D1A1\u0026label=Closed%20Issues\u0026labelColor=%23FFB88D\u0026style=for-the-badge\u0026logo=bug\" alt=\"Closed Issues\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Pull Requests (Open + Closed) --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/pulls\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/issues-pr/madhurimarawat/Data-Warehousing?color=%23F7CAC9\u0026label=Open%20PRs\u0026labelColor=%23F1A7B8\u0026style=for-the-badge\u0026logo=git\" alt=\"Open Pull Requests\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/pulls?q=is%3Apr+is%3Aclosed\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/issues-pr-closed/madhurimarawat/Data-Warehousing?color=%23D6E2E9\u0026label=Closed%20PRs\u0026labelColor=%23C4D4DF\u0026style=for-the-badge\u0026logo=git\" alt=\"Closed Pull Requests\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Discussions --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/discussions\"\u003e\n    \u003cimg 
src=\"https://img.shields.io/github/discussions/madhurimarawat/Data-Warehousing?color=%23F5B7B1\u0026label=Discussions\u0026labelColor=%23F2A5A1\u0026style=for-the-badge\u0026logo=discourse\" alt=\"GitHub Discussions\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Contributors --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/graphs/contributors\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/contributors/madhurimarawat/Data-Warehousing?color=%232A9D8F\u0026label=Contributors\u0026labelColor=%231C6A61\u0026style=for-the-badge\u0026logo=github\" alt=\"GitHub Contributors\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- License --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/blob/main/LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/license/madhurimarawat/Data-Warehousing?color=%23FF6B8B\u0026label=License\u0026labelColor=%23E65F73\u0026style=for-the-badge\u0026logo=open-source-initiative\" alt=\"License\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Last Commit --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/commits/main\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/last-commit/madhurimarawat/Data-Warehousing?color=%23F39C12\u0026label=Last%20Commit\u0026labelColor=%23D68910\u0026style=for-the-badge\u0026logo=github\" alt=\"Last Commit\"\u003e\n  \u003c/a\u003e\n\n  \u003c!-- Watchers --\u003e\n  \u003ca href=\"https://github.com/madhurimarawat/Data-Warehousing/watchers\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/watchers/madhurimarawat/Data-Warehousing?color=%23F5B041\u0026label=Watchers\u0026labelColor=%23D68910\u0026style=for-the-badge\u0026logo=github\" alt=\"GitHub Watchers\"\u003e\n  \u003c/a\u003e\n\n\u003c/p\u003e\n\n\u003cimg src = \"https://keyit.com.au/wp-content/uploads/2023/05/data-wearhousing-copy.webp\" width= \"100%\" height= \"530px\"\u003e\n\n---\n\n## Tools and Technologies ⚙️💻\n\n1. 
[MySQL](https://dev.mysql.com/doc/):  \u0026nbsp; An open-source relational database management system for managing and organizing structured data using SQL. \n2. [Python](https://www.python.org/doc/):  \u0026nbsp; A high-level, interpreted programming language known for its readability and versatility. It supports multiple programming paradigms and is widely used for web development, data analysis, automation, and scientific computing.  \n3. [Pandas](https://pandas.pydata.org/docs/):  \u0026nbsp; An open-source data analysis and manipulation library for Python. It provides data structures like DataFrames and Series, enabling efficient handling and analysis of structured data.  \n4. [NumPy](https://numpy.org/doc/):  \u0026nbsp; A fundamental package for numerical computing in Python. It offers support for multi-dimensional arrays and matrices, along with a collection of mathematical functions for performing efficient operations on these data structures.   \n5. [MySQL Connector](https://dev.mysql.com/doc/connector-python/en/):  \u0026nbsp; A Python library that enables connecting to a MySQL database server. 
It allows developers to execute SQL queries, manage database connections, and interact with MySQL databases directly from Python applications.\n\n---\n\n## Directory Structure 📂\n\n```\nData-Warehousing/\n│\n├── Experiment 1/\n│   ├── Documentation/ 📝\n|   │   ├── Explanation of methods and key observations from Experiment 1.\n│\n├── Experiment 2/\n│   ├── Codes/ 💻\n│   │   └── Contains the MySQL script for input and output in Experiment 2.\n│   ├── Documentation/ 📝\n│   │   ├── Detailed documentation explaining the methodology and analysis for Experiment 2.\n│   ├── Output/ 📊\n│   │   └── Contains the results and analysis of Experiment 2.\n├── Experiment 3/\n│   ├── Codes/ 💻\n│   │   └── Contains the MySQL script for input and output in Experiment 3.\n│   ├── Documentation/ 📝\n│   │   ├── Detailed documentation explaining the methodology and analysis for Experiment 3.\n│   ├── Output/ 📊\n│   │   └── Contains the results and analysis of Experiment 3.\n.....\n```\n\n### **Project Folder Structure**  \n\n- **Codes** 💻 (If applicable)  \nContains the source code files used for data processing and analysis in each experiment. These scripts are essential for executing tasks within the experiment. Additionally, the following files are included:\n  - **MySQL Commands and Output (TXT)**: This text file contains the specific MySQL command-line operations used in the experiment, documenting both the input commands and their corresponding outputs. A detailed explanation of these commands and their results can be found in the **Documentation** folder, available in both **MD** and **PDF** formats.\n\n- **Dataset** 📁 (If applicable)  \n  Stores datasets used in experiments, ensuring easy access and organization.  \n  - e.g., `data.csv`, `stream_data.json`  \n\n- **Output** 📊  \n  Stores results generated from experiments, including visualizations, processed data, logs, and analysis reports. Each experiment's output is stored separately with a relevant name.  
\n  - e.g., `Experiment_X_Output` (where \"X\" refers to the relevant experiment number)  \n\n- **Documentation** 📝  \n  Contains detailed documentation for each experiment, covering methodology, analysis, and insights. Documentation is provided in both Markdown (`.md`) and PDF formats for easy reference.  \n  - `documentation.md` (Markdown version)  \n  - `documentation.pdf` (PDF version, converted from Markdown)  \n\n- **Commands File (📋)**  \n  A text file stored in the **Codes** folder, documenting specific commands, steps, and MySQL output used in the experiment. This is especially useful for tracking command-line operations and database interactions.  \n  - `MySQL_Commands_Output.txt`\n\n---\n\n## Table Of Contents 📔 🔖 📑\n\n### 1. [Introduction to Data Warehousing Concepts](Experiment%201)\n\nThis experiment introduces the fundamental concepts and architecture of data warehousing, including ETL processes, data modeling techniques, and OLAP functionalities.  \n\n### 2. [Creating Star Schema in Data Warehouse](Experiment%202)\n\nThis experiment focuses on designing and implementing a star schema data model for a specified business scenario, emphasizing the creation of fact and dimension tables. \n\n### 3. [Implementing Snowflake Schema in Data Warehouse](Experiment%203)\n\nIn this experiment, the Snowflake Schema was implemented to achieve a more \nnormalized data structure than the Star Schema.  \n\n### 4. [Designing ETL Process for Data Warehousing](Experiment%204)\n\nIn this experiment, an ETL process was designed and implemented to migrate \ndata from operational databases to a data warehouse.\n\n### 5. [OLAP Operations in Data Warehousing](Experiment%205)  \n\nIn this experiment, OLAP operations such as **slicing, dicing, drill-down, drill-up, and pivoting** were applied to analyze predefined data in a data warehouse.  \n\n### 6. 
[Data Cleansing and Transformation](Experiment%206)  \n\nThis experiment involved **cleaning and transforming raw data** before loading it into the data warehouse, ensuring **consistency, accuracy, and completeness**.  \n\n### 7. [Query Optimization in Data Warehousing](Experiment%207)  \n\nSQL queries were **optimized for large-scale data warehouse applications** using techniques like **indexing, partitioning, and query tuning** to improve performance.  \n\n### 8. [Data Aggregation for Reporting](Experiment%208)  \n\nThis experiment implemented **data aggregation techniques** to generate **summarized views of large datasets**, enhancing **reporting and analytical efficiency**.\n\n### 9. [Designing and Implementing a Data Warehouse Report](Experiment%209)  \nThis experiment involves generating business reports from a **MySQL data warehouse** using **SQL queries** and **Python** for data extraction and processing.\n\n### 10. [Real-time Data Warehousing using Streaming Data](Experiment%2010)  \nA **real-time data pipeline** is implemented with **Python**, continuously ingesting streaming data into a **MySQL data warehouse** for immediate analysis.\n\n### 11. 
[Implementing Slowly Changing Dimensions (SCD) in Data Warehousing](Experiment%2011)  \nThis experiment applies **Slowly Changing Dimensions (SCD)** techniques in a **MySQL data warehouse**, using **Python** to preserve historical data accurately.\n\n---\n\n## Thanks for Visiting 😄\n\n- Drop a 🌟 if you find this repository useful.\u003cbr\u003e\u003cbr\u003e\n- If you have any questions or suggestions, feel free to reach out to me.\u003cbr\u003e\u003cbr\u003e\n📫 How to reach me:  \u0026nbsp; [![Linkedin Badge](https://img.shields.io/badge/-madhurima-blue?style=flat\u0026logo=Linkedin\u0026logoColor=white)](https://www.linkedin.com/in/madhurima-rawat/) \u0026nbsp; \u0026nbsp;\n\u003ca href =\"mailto:rawatmadhurima4@gmail.com\"\u003e\u003cimg src=\"https://github.com/madhurimarawat/Machine-Learning-Using-Python/assets/105432776/b6a0873a-e961-42c0-8fbf-ab65828c961a\" height=35 width=30 title=\"Mail Illustration\" alt=\"Mail Illustration📫\" \u003e \u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n- **Contribute and Discuss:** Feel free to open \u003ca href= \"https://github.com/madhurimarawat/Data-Warehousing/issues\"\u003eissues 🐛\u003c/a\u003e, submit \u003ca href = \"https://github.com/madhurimarawat/Data-Warehousing/pulls\"\u003epull requests 🛠️\u003c/a\u003e, or start \u003ca href = \"https://github.com/madhurimarawat/Data-Warehousing/discussions\"\u003ediscussions 💬\u003c/a\u003e to help improve this repository!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmadhurimarawat%2Fdata-warehousing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmadhurimarawat%2Fdata-warehousing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmadhurimarawat%2Fdata-warehousing/lists"}