{"id":28964736,"url":"https://github.com/rifkyiqbal52/data-analytics-projects","last_synced_at":"2026-05-05T09:31:57.691Z","repository":{"id":300767283,"uuid":"1007090480","full_name":"rifkyiqbal52/data-analytics-projects","owner":"rifkyiqbal52","description":"web scraping online store lazada.co.id, search running shoes","archived":false,"fork":false,"pushed_at":"2025-06-23T14:01:14.000Z","size":57,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-23T14:28:25.128Z","etag":null,"topics":["beautifulsoup","pandas","phyton","postgresql","scraping","scraping-websites","selenium"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rifkyiqbal52.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-23T12:56:27.000Z","updated_at":"2025-06-23T14:05:23.000Z","dependencies_parsed_at":"2025-06-23T14:30:53.275Z","dependency_job_id":"4547f893-c7e6-48cd-9320-72762c26e894","html_url":"https://github.com/rifkyiqbal52/data-analytics-projects","commit_stats":null,"previous_names":["rifkyiqbal52/data-analytics-projects"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rifkyiqbal52/data-analytics-projects","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifkyiqbal52%2Fdata-analytics-projects","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifkyiqbal52%2Fdata-analytics-projects/tags","releases_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/rifkyiqbal52%2Fdata-analytics-projects/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifkyiqbal52%2Fdata-analytics-projects/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rifkyiqbal52","download_url":"https://codeload.github.com/rifkyiqbal52/data-analytics-projects/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifkyiqbal52%2Fdata-analytics-projects/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32643553,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"online","status_checked_at":"2026-05-05T02:00:06.033Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","pandas","phyton","postgresql","scraping","scraping-websites","selenium"],"created_at":"2025-06-24T06:00:42.062Z","updated_at":"2026-05-05T09:31:57.682Z","avatar_url":"https://github.com/rifkyiqbal52.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Web Scraping \u0026 Analysis: Running Shoes on Lazada\nAs part of my training, I was assigned the role of a Data Engineer working on a data pipeline/ETL project. 
My main task was to extract data from a website, process it, and store it in a PostgreSQL database.\n\nFor this project, I built a web scraping tool to gather product data from Lazada, specifically focusing on running shoes, which are currently trending due to the growing interest in running and fitness.\n\nThis project helped me understand the real-world workflow of a Data Engineer, from data extraction and cleaning to storage and analysis.\n\n---\n\n## 🎯 Objectives\n- Scrape product data related to running shoes from Lazada.\n- Clean and process the collected data.\n- Store the structured data in a PostgreSQL database using pgAdmin4.\n- Perform basic analysis to understand product distribution and popularity.\n\n## 🛠️ Tools\n- Python: Main programming language\n- Pandas: Data manipulation and analysis\n- BeautifulSoup: HTML parsing for scraping static content\n- Selenium: Automating browser actions and scraping dynamic content\n- PostgreSQL: Database for storing the cleaned data\n- pgAdmin4: GUI for PostgreSQL database management\n\n## 📈 Collected Data Includes:\nI scraped up to 10 pages of search results, yielding 400 rows and 6 columns:\n- Product_Name\n- Price\n- Seller Location\n- Sold\n- Rating\n- Review\n\n## 🚀 Outcome\nBy the end of this project, I was able to simulate a real-world ETL (Extract, Transform, Load) process and gain hands-on experience in:\n1. Building web scrapers with Selenium \u0026 BeautifulSoup\n2. Structuring and cleaning data with Pandas\n3. Using PostgreSQL for data storage\n4. Understanding the workflow of a data engineering project\n\n📁 Check the [notebooks folder](notebooks/) for the Jupyter Notebook.\n\n📂 View the [data folder](data/) for raw and cleaned datasets.\n\n## 📌 Note\nThis project is for educational purposes only. 
It complies with Lazada’s terms of use and was not used for commercial purposes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frifkyiqbal52%2Fdata-analytics-projects","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frifkyiqbal52%2Fdata-analytics-projects","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frifkyiqbal52%2Fdata-analytics-projects/lists"}