Projects in Awesome Lists by vaxdata22
A curated list of projects in awesome lists by vaxdata22.
https://github.com/vaxdata22/water-quality-dw-on-sql-server
This is an MSSQL Data Warehouse and ETL implementation on a specially formatted Water Quality dataset from DEFRA, UK.
advanced-sql data-cleaning data-transformation data-warehousing database-schema dimension-tables etl extract-transform-load fact-table jupyter-notebook microsoft-sql-server pandas-dataframe pyodbc python sql-server-management-studio staging-area t-sql
Last synced: 17 Mar 2026
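As a rough illustration of the staging-area, dimension-table, and fact-table pattern named in the topics above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and every table name, column name, and sample row is hypothetical rather than taken from the repository or the DEFRA dataset.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical staging, dimension, and fact tables.
cur.executescript("""
CREATE TABLE stg_measurement (site_name TEXT, determinand TEXT, sample_date TEXT, result REAL);
CREATE TABLE dim_site (site_id INTEGER PRIMARY KEY, site_name TEXT UNIQUE);
CREATE TABLE dim_determinand (determinand_id INTEGER PRIMARY KEY, determinand TEXT UNIQUE);
CREATE TABLE fact_measurement (
    site_id INTEGER REFERENCES dim_site(site_id),
    determinand_id INTEGER REFERENCES dim_determinand(determinand_id),
    sample_date TEXT,
    result REAL
);
""")

# Made-up rows landed in the staging area.
staged = [
    ("River A", "pH", "2020-01-05", 7.4),
    ("River A", "Nitrate", "2020-01-05", 2.1),
    ("River B", "pH", "2020-01-06", 6.9),
]
cur.executemany("INSERT INTO stg_measurement VALUES (?, ?, ?, ?)", staged)

# Populate each dimension with the distinct values found in staging.
cur.execute("INSERT INTO dim_site (site_name) SELECT DISTINCT site_name FROM stg_measurement")
cur.execute("INSERT INTO dim_determinand (determinand) SELECT DISTINCT determinand FROM stg_measurement")

# Load the fact table by swapping natural keys for surrogate keys.
cur.execute("""
INSERT INTO fact_measurement (site_id, determinand_id, sample_date, result)
SELECT s.site_id, d.determinand_id, m.sample_date, m.result
FROM stg_measurement AS m
JOIN dim_site AS s ON s.site_name = m.site_name
JOIN dim_determinand AS d ON d.determinand = m.determinand
""")
conn.commit()

site_count = cur.execute("SELECT COUNT(*) FROM dim_site").fetchone()[0]
determinand_count = cur.execute("SELECT COUNT(*) FROM dim_determinand").fetchone()[0]
fact_count = cur.execute("SELECT COUNT(*) FROM fact_measurement").fetchone()[0]
print(site_count, determinand_count, fact_count)
```

The fact table stores only surrogate keys and measures, so descriptive attributes live once in the dimensions. The actual project may model the data differently.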
https://github.com/vaxdata22/redfin-analytics-etl-data-engg-pipeline-by-airflow-on-ec2
This is an end-to-end AWS Cloud ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2 and also uses Snowpipe. It demonstrates how to build an ETL data pipeline that performs data transformation using Python on Apache Airflow and ingests the results automatically into a Snowflake data warehouse via Snowpipe. It also features Power BI.
apache-airflow aws-ec2 aws-s3 business-intelligence dags data-visualization etl-pipeline orchestration power-bi python3 redfin snowflake snowpipe sqs-queue
Last synced: 29 Apr 2026
https://github.com/vaxdata22/image-background-remover-demo-with-python
This is a simple but fun exercise that was done to demonstrate the power of Python in image manipulation using libraries like Pillow (PIL) and Rembg as well as leveraging ONNX Runtime for faster processing on GPU.
jupyter-notebook pillow python rembg
Last synced: 18 May 2026
https://github.com/vaxdata22/zillow-rapid-api-end-to-end-etl-data-pipeline-by-airflow-on-ec2
This is an end-to-end AWS Cloud ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2 and also uses AWS Lambda. It demonstrates how to build an ETL data pipeline that performs data transformation using a Lambda function and loads the results into a Redshift cluster table. The data is then visualized using Amazon QuickSight.
amazon-quicksight amazon-redshift apache-airflow aws-ec2 aws-lambda aws-s3 business-intelligence dags data-visualization etl-pipeline orchestration python3 rapid-api zillow-house-listings
Last synced: 19 May 2026
https://github.com/vaxdata22/customer-churn-data-analytics-etl-pipeline-by-airflow-on-ec2
This is an end-to-end AWS Cloud ETL project. The orchestration uses Apache Airflow on AWS EC2 as well as AWS Glue. It demonstrates how to build an ETL pipeline that performs data transformation using a Glue job/crawler and loads the results into a Redshift table. It also shows how to connect Amazon Athena to the Glue Data Catalog, and Power BI to Redshift.
amazon-athena amazon-redshift apache-airflow aws-ec2 aws-glue aws-s3 business-intelligence customer-churn-analytics dags data-visualization etl-pipeline orchestration power-bi python3
Last synced: 18 Jun 2025
https://github.com/vaxdata22/about-me
Data Analyst || BI Analyst || Spreadsheets || SQL/Python/R || Tableau || DBA (Oracle/MySQL/MSSQL)
business-intelligence data-analytics database-engineering
Last synced: 05 Mar 2026
https://github.com/vaxdata22/foresight-pharmaceutical
This is a Data Analysis case study done on the Foresight Pharmaceutical Company dataset.
actionable-insights business-analytics business-intelligence data data-analytics data-cleaning data-mining data-visualization data-wrangling exploratory-data-analysis spreadsheets sql sql-server sql-server-management-studio statistical-analysis t-sql transact-sql
Last synced: 05 Mar 2025
https://github.com/vaxdata22/cities-weather-s3-snowflake-slack-notif-etl-by-airflow-on-ec2
This is my second industry-level ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data (JSON) from the OpenWeatherMap API, transforms it, dumps it as CSV in an S3 bucket, then copies it to destination tables in Snowflake DW and sends a Slack notification.
apache-airflow aws-ec2 business-intelligence dags data-warehousing etl-pipeline openweathermap-api orchestration python3 slack-webhook snowflake sql
Last synced: 26 May 2026
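The transform step in the entry above (JSON from the OpenWeatherMap API flattened into CSV) could look something like the sketch below. It is not the repository's code. The sample payload is made up, its field layout follows the OpenWeatherMap current-weather response, and the chosen output columns and function names are assumptions. Uploading to S3 and copying into Snowflake are left out.

```python
import csv
import io
from datetime import datetime, timezone


def kelvin_to_celsius(kelvin):
    """Convert a Kelvin temperature to Celsius, rounded to 2 decimals."""
    return round(kelvin - 273.15, 2)


def flatten_weather(payload):
    """Pick a few fields from a nested weather payload into one flat row."""
    return {
        "city": payload["name"],
        "description": payload["weather"][0]["description"],
        "temp_c": kelvin_to_celsius(payload["main"]["temp"]),
        "humidity_pct": payload["main"]["humidity"],
        "wind_speed": payload["wind"]["speed"],
        "recorded_at_utc": datetime.fromtimestamp(
            payload["dt"], tz=timezone.utc
        ).isoformat(),
    }


def rows_to_csv(rows):
    """Serialize a list of flat dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample payload (hypothetical values).
sample = {
    "name": "Lagos",
    "weather": [{"description": "scattered clouds"}],
    "main": {"temp": 300.15, "humidity": 78},
    "wind": {"speed": 3.6},
    "dt": 1700000000,
}

row = flatten_weather(sample)
csv_text = rows_to_csv([row])
print(csv_text)
```

In a real pipeline the CSV text would be written to an S3 object rather than printed.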
https://github.com/vaxdata22/foresight-institution
This is a Data Analysis case study done on the Foresight Institution dataset.
actionable-insights business-analytics business-intelligence data data-analytics data-cleaning data-mining data-processing data-visualization data-wrangling exploratory-data-analysis spreadsheets sql sql-server sql-server-management-studio statistical-analysis t-sql transact-sql
Last synced: 28 May 2026
https://github.com/vaxdata22/countries-population
This is an Exploratory Data Analysis done on a Countries dataset from Kaggle.
anaconda data-analysis-python data-analytics data-cleaning data-exploration data-mining data-visualization data-wrangling exploratory-data-analysis jupyter-notebook matplotlib-pyplot pandas pandas-dataframe python statistical-analysis
Last synced: 20 Apr 2026
https://github.com/vaxdata22/weatherapi-to-s3-bucket-to-snowflake-etl-by-aiflow-on-ec2
This is my first ever industry-level ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data (JSON) from the OpenWeatherMap API, transforms it, dumps it as CSV in an S3 bucket, then copies it to a destination table in Snowflake DW and sends an email notification.
apache-airflow aws-ec2 aws-s3 business-intelligence etl-pipeline openweathermap-api orchestration python3 sql
Last synced: 07 Mar 2025
https://github.com/vaxdata22/little-lemon-booking-system-db
Database project for managing the table booking system of the Little Lemon restaurant. This is a capstone project I undertook in order to earn the Meta Database Engineer Professional Certificate credential.
database database-management er-diagram mysql python stored-procedures tableau
Last synced: 13 Apr 2026
https://github.com/vaxdata22/assignment-on-data-scraping
Analyzing Historical Stock/Revenue Data and Building a Dashboard for my IBM Data Analyst Certificate Programme
Last synced: 29 May 2026
https://github.com/vaxdata22/lagos-weather-s3-snowflake-email-notif-etl-by-airflow-on-ec2
This is my first ever AWS Cloud ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data (JSON) from the OpenWeatherMap API, transforms it, dumps it as CSV in an S3 bucket, then copies it to a destination table in Snowflake DW and sends an email notification.
apache-airflow aws-ec2 aws-s3 business-intelligence etl-pipeline openweathermap-api orchestration python3 sql
Last synced: 29 May 2026
https://github.com/vaxdata22/city-weather-and-s3file-rds-s3-bigquery-etl-by-airflow-on-ec2
This is my third AWS Cloud ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data into a database in parallel with a loading process into the same database, joins the tables, copies the joined data to S3, and finally copies the S3 file to BigQuery DW.
apache-airflow aws-ec2 aws-rds-postgres aws-s3 bigquery business-intelligence dags data-warehousing etl-pipeline openweathermap-api orchestration python3 sql
Last synced: 21 May 2026
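The "two branches in parallel, then join" shape described in the entry above can be sketched with the standard library alone. This is only an illustration under stated assumptions: a thread pool stands in for parallel Airflow tasks, an in-memory SQLite database stands in for the RDS Postgres database, and the function names, table names, and rows are all hypothetical. As a simplification, the two branches only produce rows in parallel here, and the main thread does the inserts.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def extract_weather():
    """Stand-in for the API extraction branch (made-up rows)."""
    return [("Houston", 29.5), ("Portland", 17.0)]


def load_city_lookup():
    """Stand-in for the branch that reads a file from S3 (made-up rows)."""
    return [("Houston", "Texas"), ("Portland", "Oregon")]


# Run both branches concurrently and wait for both results (fan-out, fan-in).
with ThreadPoolExecutor(max_workers=2) as pool:
    weather_future = pool.submit(extract_weather)
    lookup_future = pool.submit(load_city_lookup)
    weather_rows = weather_future.result()
    lookup_rows = lookup_future.result()

# Load both result sets into the same database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (city TEXT, temp_c REAL)")
conn.execute("CREATE TABLE city_lookup (city TEXT, state TEXT)")
conn.executemany("INSERT INTO weather VALUES (?, ?)", weather_rows)
conn.executemany("INSERT INTO city_lookup VALUES (?, ?)", lookup_rows)

# Join the two tables; the joined rows are what would be exported downstream.
joined = conn.execute("""
SELECT w.city, l.state, w.temp_c
FROM weather AS w
JOIN city_lookup AS l ON l.city = w.city
ORDER BY w.city
""").fetchall()
print(joined)
```

The join step depends on both branches, so it can only start once both have finished. In Airflow that dependency would be expressed between tasks rather than with futures.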
https://github.com/vaxdata22/nosql-and-big-data-demonstration
This is a fun assignment task I undertook to explore the world of NoSQL and Big Data technologies.
apache-hive cassandra-cql cypher-query-language data-warehouse hadoop-hdfs json mongodb neo4j nosql-databases redis
Last synced: 13 Feb 2026
https://github.com/vaxdata22/salifort-motors-and-waze-churn
Employee retention predictive model development for Salifort Motors and Waze. This is a terminal project I did to earn the Google Advanced Data Analytics Professional Certificate.
data-analytics data-visualization model-development predictive-analytics python statistical-analysis
Last synced: 16 Apr 2026
https://github.com/vaxdata22/automobile-data
This is an Exploratory Data Analysis done on an Automobile dataset from Kaggle.
actionable-insights anaconda data-analysis-python data-analytics data-cleaning data-exploration data-mining data-visualization data-wrangling exploratory-data-analysis jupyter-notebook matplotlib pandas pandas-dataframe python statistical-analysis
Last synced: 21 Apr 2026
https://github.com/vaxdata22/water-quality-dw-on-oracle-database
This is an Oracle DB Data Warehouse and ETL implementation on a specially formatted Water Quality dataset from DEFRA, UK.
advanced-sql data-cleansing data-transformation data-warehouse database-schema dimension-tables etl extract-transform-load fact-table jupyter-notebook oracle-21c oracle-database oracle-sql-developer pandas-dataframe pl-sql pl-sql-cursors pyodbc python staging-area
Last synced: 30 Apr 2026
https://github.com/vaxdata22/redfin-analytics-etl-using-amazon-emr-by-airflow-on-ec2
This is an end-to-end AWS Cloud ETL project. The data pipeline uses an Amazon EMR cluster managed by Apache Airflow running on an AWS EC2 instance. It demonstrates how to build an orchestration that performs data transformation using Amazon EMR and ingests the data automatically into Snowflake via Snowpipe. It also features Power BI.
amazon-emr-cluster apache-airflow apache-spark aws-ec2 aws-s3 business-intelligence dags data-visualization etl-pipeline google-colab-notebook orchestration power-bi pyspark redfin snowflake snowpipe sqs-queue
Last synced: 02 May 2026
https://github.com/vaxdata22/istanbul-shopping
This is an Exploratory Data Analysis done on the Istanbul Shopping dataset from Kaggle.
actionable-insights actionable-recommendations anaconda business-analytics business-intelligence data-analysis-python data-analytics data-cleaning data-exploration data-mining data-visualization data-wrangling exploratory-data-analysis jupyter-notebook matplotlib-pyplot pandas pandas-dataframe python seaborn statistical-analysis
Last synced: 08 May 2026
https://github.com/vaxdata22/cyclistic-ride-sharing-company
This is my Google Data Analytics Certificate case study for the Cyclistic ride-sharing company.
actionable-insights business-analytics business-intelligence data data-analytics data-cleaning data-mining data-visualization data-wrangling exploratory-data-analysis google-data-analytics spreadsheets sql sql-server sql-server-management-studio statistical-analysis t-sql tableau transact-sql
Last synced: 10 Jun 2026
https://github.com/vaxdata22/city-weather-and-s3file-rds-s3-bigquery-by-airflow-on-ec2
This is my third industry-level ETL project. The data pipeline is orchestrated by Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data into a database in parallel with a loading process into the same database, joins the tables, copies the joined data to S3, and finally copies the S3 file to BigQuery DW.
apache-airflow aws-ec2 aws-rds-postgres aws-s3 bigquery business-intelligence dags data-warehousing etl-pipeline openweathermap-api orchestration python3 sql
Last synced: 18 Mar 2025
https://github.com/vaxdata22/amazon-product-sales
This is an Exploratory Data Analysis done on the Amazon Product Sales dataset from Kaggle.
actionable-insights actionable-recommendations anaconda business-intelligence data-analysis-python data-cleaning data-exploration data-mining data-visualization data-wrangling exploratory-data-analysis jupyter-notebook matplotlib-pyplot pandas pandas-dataframe python seaborn statistical-analysis
Last synced: 11 May 2026