https://github.com/jrili/data-engineer-portfolio
Jessa Rili-Migriño's Data Engineer Portfolio
beautifulsoup4 data-cleaning-and-preprocessing etl pandas python webscraping
- Host: GitHub
- URL: https://github.com/jrili/data-engineer-portfolio
- Owner: jrili
- Created: 2025-03-24T06:46:57.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-05-26T05:44:47.000Z (about 1 year ago)
- Last Synced: 2025-06-01T17:03:29.716Z (about 1 year ago)
- Topics: beautifulsoup4, data-cleaning-and-preprocessing, etl, pandas, python, webscraping
- Homepage: https://www.linkedin.com/in/jessa-rili-migrino/
- Size: 24.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Data Engineer Portfolio
=======================
Hi, I'm Jessa Rili-Migriño - an experienced Software Engineer transitioning into Data Engineering!
This portfolio showcases hands-on projects that demonstrate my skills in data extraction, transformation, and loading (ETL), web scraping, and data pipelines.
# Projects
| Project | Description | Tools|
|---------|-------------|------|
| [**ETL Pipeline** - Bank Marketing Campaign](https://github.com/jrili/datacamp-cleaning-bank-marketing) | Extracted, cleaned, and derived the required data from bank marketing data, then split it into three separate CSV files |  |
| [**Web Scraping**, **ETL Pipeline** - Top 10 Largest Banks in the World](https://github.com/jrili/ibm-project-world-largest-banks) | Built a web scraping and ETL pipeline to extract financial data on the world's largest banks from [Wikipedia](https://web.archive.org/web/20230908091635%20/https://en.wikipedia.org/wiki/List_of_largest_banks) and store it in a file and in a database. |  BeautifulSoup4 |
| [**Exploratory Analysis** - Analyzing Students' Mental Health](https://github.com/jrili/datacamp-analyzing-students-mental-health) | Deployed a local Postgres database, loaded student mental health data, and performed exploratory analysis with SQL queries in a Jupyter notebook|  |
| [**Web Scraping**, **ETL Pipeline** - Top 50 Films](https://github.com/jrili/ibm-webscraping-films) | Developed a web scraping and ETL pipeline to extract film data collated on [EverybodyWiki](https://web.archive.org/web/20230902185655/https://en.everybodywiki.com/100_Most_Highly-Ranked_Films), clean and transform the information, and store it in a file and in a database |  BeautifulSoup4 |
| [**ETL Pipeline** - Car Dealership Data](https://github.com/jrili/ibm-etl-car-dealership) | Built an ETL pipeline to extract car dealership data from multiple files of different formats (CSV, JSON, XML), transform it into a uniform format, and load it into a single file. | |
| [**ETL Pipeline** - Body Measurements](https://github.com/jrili/ibm-etl-heights-weights) | Built an ETL pipeline to extract height and weight data from multiple files of different formats (CSV, JSON, XML), transform the data into the required units, and load it into a single file. | |
| [**Web Scraping** - 2025 PH Election Results](https://github.com/jrili/ph-election-results-2025-scraper) | Built a web scraping tool to extract the results of the 2025 PH elections and load them into hierarchically organized JSON files. |  |
| Coming soon: ETL Pipeline, Visualization - Weather Data ETL | Build an ETL pipeline to extract weather data using the [VisualCrossing Weather API](https://www.visualcrossing.com/), transform the data, and load it into a PostgreSQL database | |
| Coming soon: ETL Pipeline - Health and Supplements Usage Data | Build an ETL pipeline to extract health data from wearable devices and health apps, transform the data as required, and load it into a single file. | |
| Coming soon: ETL Pipeline - Retail Data | Build an ETL pipeline to extract grocery data from a retail company, augment it with extra data in Parquet format, transform and combine the data, and load it into a single file. | |
| Coming soon: Data Transformation - Insurance Policy Data | Load insurance policy data into a locally deployed Postgres database, and produce the required data views using efficient SQL queries in a Jupyter Notebook | |
| Coming soon: Web Scraping, ETL Pipeline - All Generations Pokemon Data | Scrape the latest Pokemon data from the public domain, transform and normalize it, then load it into a single file and possibly into a PostgreSQL database. |  BeautifulSoup4 |
| [Other Projects Coming Soon] | Expanding into automated ETL pipelines, ETL of real-time data, and cloud-based storage loading.| |
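
The multi-format pattern described in the car dealership and body measurements rows above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not code taken from the linked repositories: the field names (`car_model`, `year_of_manufacture`, `price`), the JSON Lines layout, the XML shape, and the sample values are all assumptions made for the example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical field names, chosen only for this illustration.
FIELDS = ["car_model", "year_of_manufacture", "price"]


def extract_csv(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Parse JSON Lines text (one JSON object per line) into dict rows."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def extract_xml(text):
    """Parse XML text where each child of the root element is one record."""
    root = ET.fromstring(text)
    return [{field: row.findtext(field) for field in FIELDS} for row in root]


def transform(rows):
    """Coerce every row to the same types and round prices to 2 decimals."""
    return [
        {
            "car_model": str(row["car_model"]).strip(),
            "year_of_manufacture": int(row["year_of_manufacture"]),
            "price": round(float(row["price"]), 2),
        }
        for row in rows
    ]


def load(rows):
    """Serialize the unified rows to CSV text (a real pipeline would write a file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample inputs, one per source format.
csv_src = "car_model,year_of_manufacture,price\nritz,2014,5000.456\n"
json_src = '{"car_model": "sx4", "year_of_manufacture": 2013, "price": 7089.5}\n'
xml_src = (
    "<data><row><car_model>ciaz</car_model>"
    "<year_of_manufacture>2017</year_of_manufacture>"
    "<price>10820.1</price></row></data>"
)

rows = transform(extract_csv(csv_src) + extract_json(json_src) + extract_xml(xml_src))
unified_csv = load(rows)
print(unified_csv)
```

Keeping one small extract function per format and a single shared `transform` step means a new source format only needs a new extractor, which is one reasonable way to structure this kind of pipeline.
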
# Skills Practiced
* **Data Extraction**: APIs, web scraping, file systems
* **Data Transformation**: Data cleaning, normalization, deduplication
* **Data Loading**: CSV exports, database readiness
* **Tools**: Python, Pandas, BeautifulSoup, Bash Scripting, SQL, PostgreSQL, Snowflake, Microsoft Fabric, Apache Airflow, Kubernetes basics, AWS basics, Microsoft Azure basics
* **Learning in Progress**: Spark, AWS Data Engineering services, Microsoft Azure Data Engineering services, Google Cloud Platform
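
The cleaning, normalization, and deduplication steps listed under Data Transformation might look something like the sketch below. It uses plain Python for brevity, and the record layout (`name`, `email`), the choice of email as the deduplication key, and the sample rows are assumptions made for illustration only.

```python
def clean_records(records):
    """Trim whitespace, normalize case, and drop incomplete or duplicate rows."""
    seen_emails = set()
    cleaned = []
    for record in records:
        # Normalization: collapse repeated whitespace and standardize casing.
        name = " ".join(record.get("name", "").split()).title()
        email = record.get("email", "").strip().lower()
        # Cleaning: skip rows that are missing a required field.
        if not name or not email:
            continue
        # Deduplication: keep only the first row seen for each email.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Made-up sample rows with messy spacing, mixed case, a duplicate, and a gap.
raw = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "", "email": "nobody@example.com"},
    {"name": "grace hopper", "email": "grace@example.com"},
]
cleaned = clean_records(raw)
print(cleaned)
```

Normalizing before deduplicating matters here: the first two sample rows only compare equal once whitespace and case have been standardized.
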
# About Me
* More than 10 years of experience designing, developing, and maintaining backend systems and microservices deployed on cloud services
* Certifications:
  * DataCamp Associate Data Engineer ([Track](https://www.datacamp.com/completed/statement-of-accomplishment/track/5dac6f85d32d86a8dccba020cbbeacd8f3c9ed11) | [Certification](https://www.datacamp.com/certificate/DEA0014963158934))
  * DataCamp Data Engineer ([Track](https://www.datacamp.com/completed/statement-of-accomplishment/track/9ecdd3624b20f72960dd2c95a33273f05d8ae0ed) | [Certification](https://www.datacamp.com/certificate/DE0013679986474))
  * IBM Data Engineering Foundations Specialization ([Certificate](https://www.coursera.org/account/accomplishments/specialization/HKLY7QWR6IVT))
* Actively building scalable, reliable data workflows and pipelines
* Background in AI, Machine Learning, and Deep Learning (Master's degree)
# Connect with Me
* ***LinkedIn profile: [jessa-rili-migrino](https://www.linkedin.com/in/jessa-rili-migrino/)***