https://github.com/gabrielianfr/web-scraping-project
A Python-based web scraping tool that extracts and stores data in JSON format using BeautifulSoup and Requests.
- Host: GitHub
- URL: https://github.com/gabrielianfr/web-scraping-project
- Owner: gabrielianfr
- Created: 2025-03-20T05:15:44.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-03-20T05:38:18.000Z (about 2 months ago)
- Last Synced: 2025-03-20T06:28:54.670Z (about 2 months ago)
- Topics: beautifulsoup, dataextraction, json, python, requests, webscraping
- Language: Python
- Homepage:
- Size: 2.93 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Web Scraping Project
This project performs web scraping. It uses Python and several libraries to extract data from a target website.
## Requirements
- Python 3.x
- BeautifulSoup4
- Requests
- Pandas (optional, for data analysis)
## Installation
1. Clone this repository:
   `git clone https://github.com/gabrielianfr/web-scraping-project.git`
2. Install the dependencies listed under Requirements (BeautifulSoup4, Requests, and optionally Pandas).
## Usage
1. Edit `config.py` to set the URL to scrape.
2. Run the `scraper.py` script to start scraping.
## Folder Structure
- `scripts/`: Contains the Python files for scraping and related utilities.
- `data/`: Stores the scraping results in JSON format.
- `config.py`: Configuration file for settings such as the URL, headers, and so on.
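
The README does not show the contents of `config.py` or `scraper.py`, so the following is only a minimal sketch of how the pieces described above might fit together: settings such as the URL and headers, a Requests download, BeautifulSoup parsing, and JSON output under `data/`. Every name in it (`TARGET_URL`, `HEADERS`, `OUTPUT_FILE`, `fetch_html`, `extract_links`, `save_json`) is hypothetical, as is the choice to collect links; the actual project may extract entirely different data.

```python
"""Hypothetical sketch: config values and scraper logic in a single file."""
import json
from pathlib import Path

import requests
from bs4 import BeautifulSoup

# Values of the kind config.py is described as holding (all assumed here).
TARGET_URL = "https://example.com"
HEADERS = {"User-Agent": "web-scraping-project-example/0.1"}
OUTPUT_FILE = Path("data") / "result.json"


def fetch_html(url: str, headers: dict) -> str:
    """Download a page and return its HTML, raising on HTTP error statuses."""
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html: str) -> list:
    """Return the text and href of every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"text": a.get_text(strip=True), "href": a["href"]}
        for a in soup.find_all("a", href=True)
    ]


def save_json(records: list, path: Path) -> None:
    """Write records to path as UTF-8 JSON, creating the folder if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    page = fetch_html(TARGET_URL, HEADERS)
    save_json(extract_links(page), OUTPUT_FILE)
```

Keeping the download, parsing, and saving steps in separate functions means the parsing logic can be tried on a static HTML string without touching the network.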