{"id":26644936,"url":"https://github.com/jrili/data-engineer-portfolio","last_synced_at":"2026-04-24T12:34:05.182Z","repository":{"id":284106019,"uuid":"953837113","full_name":"jrili/data-engineer-portfolio","owner":"jrili","description":"Jessa Rili-Migriño's Data Engineer Portfolio","archived":false,"fork":false,"pushed_at":"2025-05-26T05:44:47.000Z","size":25,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-01T17:03:29.716Z","etag":null,"topics":["beautifulsoup4","data-cleaning-and-preprocessing","etl","pandas","python","webscraping"],"latest_commit_sha":null,"homepage":"https://www.linkedin.com/in/jessa-rili-migrino/","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jrili.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-24T06:46:57.000Z","updated_at":"2025-05-26T05:44:51.000Z","dependencies_parsed_at":"2025-03-24T07:43:15.651Z","dependency_job_id":"6bec0ede-43ca-42e4-b831-f74be5ab15bc","html_url":"https://github.com/jrili/data-engineer-portfolio","commit_stats":null,"previous_names":["jrili/data-engineer-portfolio"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jrili/data-engineer-portfolio","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrili%2Fdata-engineer-portfolio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrili%2Fdata-engineer-portfolio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrili%
2Fdata-engineer-portfolio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrili%2Fdata-engineer-portfolio/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jrili","download_url":"https://codeload.github.com/jrili/data-engineer-portfolio/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrili%2Fdata-engineer-portfolio/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32224195,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T10:26:35.452Z","status":"ssl_error","status_checked_at":"2026-04-24T10:25:27.643Z","response_time":64,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup4","data-cleaning-and-preprocessing","etl","pandas","python","webscraping"],"created_at":"2025-03-24T21:21:00.613Z","updated_at":"2026-04-24T12:34:05.170Z","avatar_url":"https://github.com/jrili.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"Data Engineer Portfolio\n=======================\n\nHi, I'm Jessa Rili-Migriño - an experienced Software Engineer transitioning into Data Engineering!\n\nThis portfolio showcases hands-on projects that demonstrate my skills in data extraction, transformation, and loading (ETL), web scraping, and data pipelines.\n\n# 
Projects\n| Project | Description | Tools |\n|---------|-------------|------|\n| [**ETL Pipeline** - Bank Marketing Campaign](https://github.com/jrili/datacamp-cleaning-bank-marketing) | Extracted, cleaned, and derived the required fields from bank marketing data, then split the result into three separate CSV files | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white) |\n| [**Web Scraping**, **ETL Pipeline** - Top 10 Largest Banks in the World](https://github.com/jrili/ibm-project-world-largest-banks) | Built a web scraping and ETL pipeline to extract financial data on the world's largest banks from [Wikipedia](https://web.archive.org/web/20230908091635%20/https://en.wikipedia.org/wiki/List_of_largest_banks) and store it in a file and in a database. 
| ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white)![SQLite](https://img.shields.io/badge/sqlite-%2307405e.svg?style=for-the-badge\u0026logo=sqlite\u0026logoColor=white) BeautifulSoup4 |\n| [**Exploratory Analysis** - Analyzing Students' Mental Health](https://github.com/jrili/datacamp-analyzing-students-mental-health) | Deployed a local Postgres database, loaded student mental health data, and performed exploratory analysis with SQL queries in a Jupyter notebook | ![Postgres](https://img.shields.io/badge/postgres-%23316192.svg?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Bash Script](https://img.shields.io/badge/bash_script-%23121011.svg?style=for-the-badge\u0026logo=gnu-bash\u0026logoColor=white) |\n| [**Web Scraping**, **ETL Pipeline** - Top 50 Films](https://github.com/jrili/ibm-webscraping-films) | Developed a web scraping and ETL pipeline to extract film data collated on [EverybodyWiki](https://web.archive.org/web/20230902185655/https://en.everybodywiki.com/100_Most_Highly-Ranked_Films), clean and transform the information, and store it in a file and in a database | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter 
Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white)![SQLite](https://img.shields.io/badge/sqlite-%2307405e.svg?style=for-the-badge\u0026logo=sqlite\u0026logoColor=white) BeautifulSoup4 |\n| [**ETL Pipeline** - Car Dealership Data](https://github.com/jrili/ibm-etl-car-dealership)| Built an ETL pipeline to extract car dealership data from multiple files of different formats (CSV, JSON, XML), transform them to be uniform, and load into a single file. | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)|\n| [**ETL Pipeline** - Body Measurements](https://github.com/jrili/ibm-etl-heights-weights) | Built an ETL pipeline to extract height and weight data from multiple files of different formats (CSV, JSON, XML), transform the data into the required units, and load into a single file. 
| ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)|\n| [**Web Scraping** - 2025 PH Election Results](https://github.com/jrili/ph-election-results-2025-scraper) | Built a web scraping tool to extract the results of the 2025 PH Elections and load them into hierarchically organized JSON files. | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54) |\n| Coming soon:\u003cbr\u003eETL Pipeline, Visualization - Weather Data ETL | Build an ETL pipeline to extract weather data using the [VisualCrossing Weather API](https://www.visualcrossing.com/), transform the data, and load it into a Postgres database | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white)![Postgres](https://img.shields.io/badge/postgres-%23316192.svg?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white)![Bash Script](https://img.shields.io/badge/bash_script-%23121011.svg?style=for-the-badge\u0026logo=gnu-bash\u0026logoColor=white)| \n| Coming soon:\u003cbr\u003eETL Pipeline - Health and Supplements Usage Data | Build an ETL pipeline to extract health data from wearable devices and health apps, transform the data as required, and load it into a single file. 
| ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white) |\n| Coming soon:\u003cbr\u003eETL Pipeline - Retail Data| Build an ETL pipeline to extract grocery data from a retail company to be augmented with extra data in parquet format, transform and combine the data, and load into a single file. | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white)|\n| Coming soon:\u003cbr\u003eData Transformation - Insurance Policy Data | Load insurance policy data into a locally deployed Postgres database, and produce the required data views using efficient SQL queries in a Jupyter Notebook | ![Postgres](https://img.shields.io/badge/postgres-%23316192.svg?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Bash Script](https://img.shields.io/badge/bash_script-%23121011.svg?style=for-the-badge\u0026logo=gnu-bash\u0026logoColor=white)|\n| Coming soon:\u003cbr\u003eWeb Scraping, ETL Pipeline - All Generations Pokemon Data| Scrape the latest pokemon data from the public domain, transform and normalize them, then 
load into a single file and perhaps into a PostgreSQL database.| ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![Bash Script](https://img.shields.io/badge/bash_script-%23121011.svg?style=for-the-badge\u0026logo=gnu-bash\u0026logoColor=white)![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge\u0026logo=jupyter\u0026logoColor=white)![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white)![Postgres](https://img.shields.io/badge/postgres-%23316192.svg?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white) BeautifulSoup4|\n| [Other Projects Coming Soon] | Expanding into automated ETL pipelines, ETL of real-time data, and cloud-based storage loading.| ![Apache Airflow](https://img.shields.io/badge/Apache%20Airflow-017CEE?style=for-the-badge\u0026logo=Apache%20Airflow\u0026logoColor=white)![Apache Spark](https://img.shields.io/badge/Apache%20Spark-FDEE21?style=flat-square\u0026logo=apachespark\u0026logoColor=black)![AWS](https://img.shields.io/badge/AWS-%23FF9900.svg?style=for-the-badge\u0026logo=amazon-aws\u0026logoColor=white)![Azure](https://img.shields.io/badge/azure-%230072C6.svg?style=for-the-badge\u0026logo=microsoftazure\u0026logoColor=white)![Google Cloud](https://img.shields.io/badge/GoogleCloud-%234285F4.svg?style=for-the-badge\u0026logo=google-cloud\u0026logoColor=white)|\n\n# Skills Practiced\n* **Data Extraction**: APIs, web scraping, file systems\n* **Data Transformation**: Data cleaning, normalization, deduplication\n* **Data Loading**: CSV exports, database readiness\n* **Tools**: Python, Pandas, BeautifulSoup, Bash Scripting, SQL, PostgreSQL, Snowflake, Microsoft Fabric, Apache Airflow, Kubernetes basics, AWS basics, Microsoft Azure basics\n* **Learning in Progress**: 
Spark, AWS Data Engineering services, Microsoft Azure Data Engineering services, Google Cloud Platform\n\n# About Me\n* More than 10 years of experience designing, developing, and maintaining backend systems and microservices deployed on cloud services\n* Certifications:\n * DataCamp Associate Data Engineer ([Track](https://www.datacamp.com/completed/statement-of-accomplishment/track/5dac6f85d32d86a8dccba020cbbeacd8f3c9ed11) | [Certification](https://www.datacamp.com/certificate/DEA0014963158934))\n * DataCamp Data Engineer ([Track](https://www.datacamp.com/completed/statement-of-accomplishment/track/9ecdd3624b20f72960dd2c95a33273f05d8ae0ed) | [Certification](https://www.datacamp.com/certificate/DE0013679986474))\n * IBM Data Engineering Foundations Specialization ([Certificate](https://www.coursera.org/account/accomplishments/specialization/HKLY7QWR6IVT))\n* Actively building scalable, reliable data workflows and pipelines\n* Background in AI, Machine Learning, and Deep Learning (Master's degree)\n\n# Connect with Me\n* ***LinkedIn profile: [jessa-rili-migrino](https://www.linkedin.com/in/jessa-rili-migrino/)***\n\n\n\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjrili%2Fdata-engineer-portfolio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjrili%2Fdata-engineer-portfolio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjrili%2Fdata-engineer-portfolio/lists"}