Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/donotknowwhy/scrape-by-puppeteer
puppeteer scraping
Last synced: 10 days ago
- Host: GitHub
- URL: https://github.com/donotknowwhy/scrape-by-puppeteer
- Owner: Donotknowwhy
- Created: 2024-08-17T09:45:32.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-08-18T01:16:45.000Z (3 months ago)
- Last Synced: 2024-10-17T22:26:48.318Z (29 days ago)
- Topics: puppeteer, scraping
- Language: JavaScript
- Homepage:
- Size: 5.14 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Web Scraping Project: thegioididong.com Product Data Extraction
## Overview
This project scrapes product data from [thegioididong.com](https://www.thegioididong.com/), a popular electronics retailer in Vietnam. The script uses Puppeteer, a Node.js library for controlling a browser programmatically, to navigate the site's product listing pages, handle pagination, and extract product information. The extracted data is formatted as a JSON array in which each entry holds the product name, original price, sale price, and discount rate.
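As an illustration of the output described above, here is a minimal sketch of how scraped text might be shaped into such records. The helper names (`parsePrice`, `discountRate`, `toProductRecord`), the field names, and the `10.000.000₫` price format are assumptions made for this example and are not taken from the repository. The README does not say whether the discount rate is read from the page or computed; this sketch computes it from the two prices.

```javascript
// Hypothetical helpers: names, field names, and price format are assumptions.

// Convert a displayed price such as "25.990.000₫" into an integer amount.
// Returns null when the text contains no digits.
function parsePrice(text) {
  const digits = String(text).replace(/[^\d]/g, "");
  return digits === "" ? null : Number(digits);
}

// Discount as a whole-number percentage of the original price.
// Returns null when either price is missing or the original price is zero.
function discountRate(originalPrice, salePrice) {
  if (!originalPrice || salePrice == null) return null;
  return Math.round(((originalPrice - salePrice) / originalPrice) * 100);
}

// Turn one raw scraped item (all strings) into a clean record.
function toProductRecord(raw) {
  const originalPrice = parsePrice(raw.originalPrice);
  const salePrice = parsePrice(raw.salePrice);
  return {
    name: raw.name.trim(),
    originalPrice,
    salePrice,
    discountRate: discountRate(originalPrice, salePrice),
  };
}

// Made-up sample input, standing in for text read from the page.
const sample = [
  { name: " Example Phone 128GB ", originalPrice: "10.000.000₫", salePrice: "8.500.000₫" },
];

const products = sample.map(toProductRecord);
console.log(JSON.stringify(products, null, 2));
```

Keeping the prices as numbers rather than display strings makes the JSON easier to sort and filter in later processing.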
## Features
- **Automated Navigation**: The script steps through the product listing pages and clicks the "Load More" button to load additional products.
- **Data Extraction**: Extracts each product's name, original price, sale price, and discount rate.
- **JSON Output**: Converts the extracted data into a JSON array for easy consumption and further processing.
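The "Load More" handling listed above could be sketched roughly as follows. This is not the repository's code: the function name `loadAllProducts`, its options, and the selector passed to it are hypothetical. The function expects a Puppeteer `Page` and relies only on `page.$()` and `ElementHandle.click()`.

```javascript
// Click a "Load More" button until it is gone, cannot be clicked,
// or maxClicks is reached. Returns the number of successful clicks.
// `page` is assumed to be a Puppeteer Page; `buttonSelector` is a
// placeholder that must be replaced with the site's real selector.
async function loadAllProducts(page, buttonSelector, { maxClicks = 50, delayMs = 1000 } = {}) {
  let clicks = 0;
  while (clicks < maxClicks) {
    // page.$ resolves to null when no element matches the selector.
    const button = await page.$(buttonSelector);
    if (!button) break;
    try {
      await button.click();
    } catch {
      // Puppeteer throws if the element is not clickable (for example hidden).
      break;
    }
    clicks += 1;
    // Give the page time to append the next batch of products.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return clicks;
}
```

In a real run, the `page` would come from `puppeteer.launch()` followed by `browser.newPage()` and `page.goto()` on a listing URL. The `maxClicks` cap is a design choice that stops the loop from running forever if the button never disappears, and a fixed delay is the simplest way to wait; waiting for the product count to change would be more robust.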