https://github.com/mehassanhmood/dataengineering-project
A data engineering project.
cognos-dashboard datawarehousing etl-pipeline mongodb tableau
- Host: GitHub
- URL: https://github.com/mehassanhmood/dataengineering-project
- Owner: mehassanhmood
- Created: 2023-12-19T09:18:01.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-22T08:54:05.000Z (over 2 years ago)
- Last Synced: 2025-06-14T20:14:59.425Z (about 1 year ago)
- Topics: cognos-dashboard, datawarehousing, etl-pipeline, mongodb, tableau
- Language: Shell
- Homepage:
- Size: 3.23 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# DataEngineering-Project
## Tools and Technologies:
1. OLTP database - MySQL
2. NoSQL database - MongoDB
3. Staging data warehouse - PostgreSQL
4. Big data platform - Hadoop
5. Big data analytics platform - Spark
6. Business Intelligence Dashboard - IBM Cognos Analytics and Tableau
7. Data Pipelines - Apache Airflow
### OLTP Database:
- [OLTP](https://github.com/mehassanhmood/DataEngineering-Project/tree/main/OLTP)
- We import the data into a MySQL database.
- The loaded data, along with the SQL queries, is then retrieved and stored as a '.sql' file.
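The import and export round trip described above could be sketched as follows. This is only an illustration, not the repository's actual script: the `sales` database name, the `root` user, and the file names are hypothetical placeholders, and the `run` helper is introduced here so the commands can be previewed before anything touches a real server.

```shell
#!/bin/sh
# Sketch under assumptions: database name, user, and file names are
# placeholders and do not come from the repository.
DB_NAME="${DB_NAME:-sales}"
MYSQL_USER="${MYSQL_USER:-root}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = also execute it

# Print a command line; execute it only when DRY_RUN=0.
run() {
  echo "$1"
  if [ "$DRY_RUN" = "0" ]; then
    sh -c "$1"
  fi
}

# Load a SQL script (schema and INSERT statements) into the MySQL database.
# --password without a value makes the client prompt for it.
import_sql() {
  run "mysql --user=$MYSQL_USER --password $DB_NAME < $1"
}

# Dump the loaded tables back out as a .sql file.
export_sql() {
  run "mysqldump --user=$MYSQL_USER --password $DB_NAME > $1"
}
```

Calling `import_sql data.sql` followed by `export_sql dump.sql` prints the two commands; setting `DRY_RUN=0` runs them against the configured server. File paths are assumed to contain no spaces.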
### MongoDB:
- [NoSQL-MongoDB](https://github.com/mehassanhmood/DataEngineering-Project/tree/main/NoSQL-MongoDB)
- The data source is a .json file.
- The first step is to connect to the database and create a collection named 'electronics'.
- This is done by establishing a connection to MongoDB Atlas and then running the *db-conn.js* file from the mongo shell:
```load('path/to/db-conn.js')```
- The next step is to load the data into the collection and save a copy of selected fields using the *import-export.sh* file:
```./import-export.sh```
- Finally, we explore the loaded data by running the *mongodb-queries.js* file from the mongo shell:
```load('mongodb-queries.js')```
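A script in the spirit of *import-export.sh* might look like the sketch below. It is not the repository's file: the connection string, the `catalog` database name, the file names, and the exported field list are all hypothetical placeholders. Only the `electronics` collection name comes from the steps above. The database name is placed inside the URI rather than passed through `--db`, since some tool versions reject the two together.

```shell
#!/bin/sh
# Sketch under assumptions: URI, database name, file names, and field list
# are placeholders. An Atlas connection string would start with mongodb+srv://
MONGO_URI="${MONGO_URI:-mongodb://localhost:27017}"
DB_NAME="${DB_NAME:-catalog}"
COLLECTION="${COLLECTION:-electronics}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = also execute it

# Print a command line; execute it only when DRY_RUN=0.
run() {
  echo "$1"
  if [ "$DRY_RUN" = "0" ]; then
    sh -c "$1"
  fi
}

# Load a JSON file into the collection. mongoimport expects one document
# per line by default; add --jsonArray if the file is a single JSON array.
mongo_import() {
  run "mongoimport --uri '$MONGO_URI/$DB_NAME' --collection $COLLECTION --file $1"
}

# Export selected fields to CSV. $1 = output file, $2 = comma-separated fields.
mongo_export() {
  run "mongoexport --uri '$MONGO_URI/$DB_NAME' --collection $COLLECTION --type=csv --fields $2 --out $1"
}
```

For example, `mongo_import catalog.json` followed by `mongo_export electronics.csv _id,type,model` prints both commands, and `DRY_RUN=0` executes them.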
### Staging Data Warehouse:
- [DataWarehousing-PostgreSQL](https://github.com/mehassanhmood/DataEngineering-Project/tree/main/DataWarehousing-PostgreSQL)
- This is for educational purposes only, as a real data warehouse requires reference data sources and metadata.
### Analytics:
- [Dashboards-BI](https://github.com/mehassanhmood/DataEngineering-Project/tree/main/Analytics)
1. Data Visualization for Understanding:
   - Dashboards present complex data in visually intuitive formats.
   - Visualization enhances comprehension, making data accessible to diverse stakeholders.
2. Real-Time Monitoring and Proactive Analysis:
   - Dashboards offer real-time updates on key metrics for prompt decision-making.
   - BI tools enable proactive problem identification by highlighting deviations and trends.
3. Informed Decision-Making with Actionable Insights:
   - Dashboards provide a centralized source for actionable insights.
   - BI facilitates data-driven decisions, fostering accountability and optimizing processes.
### ETL:
- [ETL](https://github.com/mehassanhmood/DataEngineering-Project/tree/main/ETL)
- ETL serves as the backbone for integrating diverse data sources, ensuring a unified and consistent view across the organization.
- ETL enhances operational efficiency, facilitating timely and accurate data delivery to support informed decision-making in business processes.