https://github.com/comba92/paginegialle-scraper
Simple scraper which collects business names, addresses and phone numbers from PagineGialle's search results.
business-data scraper
- Host: GitHub
- URL: https://github.com/comba92/paginegialle-scraper
- Owner: Comba92
- Created: 2025-02-27T11:10:45.000Z (4 months ago)
- Default Branch: master
- Last Pushed: 2025-03-21T20:48:44.000Z (3 months ago)
- Last Synced: 2025-03-21T21:26:14.144Z (3 months ago)
- Topics: business-data, scraper
- Language: Rust
- Homepage:
- Size: 14.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Paginegialle Scraper
Simple scraper which collects business names, addresses and phone numbers from PagineGialle's search results.
Given a region, a province, and a business category, the scraper looks for business data in every city of the province.
The parser is fast and parallel; expect a full scraping run (a few thousand HTTP requests on average) to finish in under a minute.
# Usage
> [!NOTE]
> If any parameter contains spaces, replace them with underscores.
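
If the parameters come from another program (for example a list of category names), the underscore rule can be applied automatically. The helper below is a minimal sketch of that idea; `to_cli_param` is a hypothetical name and is not part of the scraper. It assumes that any run of whitespace should become a single underscore.

```rust
/// Hypothetical helper (not part of the scraper): turn a human-readable
/// name into a command-line parameter by replacing whitespace with underscores.
/// Leading and trailing whitespace is dropped, and runs of whitespace
/// collapse into a single underscore.
fn to_cli_param(raw: &str) -> String {
    raw.split_whitespace().collect::<Vec<_>>().join("_")
}
```

With this, `to_cli_param("agenzie immobiliari")` yields `agenzie_immobiliari`, which matches the form used in the usage examples.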
```bash
paginegialle-scraper region_name province_name business_category [pages_scraping_limit] [output_filename]
```
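To make the argument layout concrete, here is a rough sketch of how three required and two optional positional arguments could be read in Rust. This is an illustration only, not the scraper's actual code: `CliArgs` and `parse_args` are hypothetical names, and treating `pages_scraping_limit` as an unsigned integer is an assumption based on the usage line above.

```rust
/// Hypothetical container for the positional arguments in the usage line.
#[derive(Debug, PartialEq)]
struct CliArgs {
    region: String,
    province: String,
    category: String,
    /// Optional fourth argument; assumed to be a non-negative integer.
    page_limit: Option<u32>,
    /// Optional fifth argument.
    output: Option<String>,
}

/// Parse the arguments that follow the program name.
/// Returns an error message if the count is wrong or the limit is not a number.
fn parse_args(args: &[String]) -> Result<CliArgs, String> {
    if args.len() < 3 || args.len() > 5 {
        return Err(format!("expected 3 to 5 arguments, got {}", args.len()));
    }
    let page_limit = match args.get(3) {
        Some(s) => Some(
            s.parse::<u32>()
                .map_err(|e| format!("invalid pages_scraping_limit '{s}': {e}"))?,
        ),
        None => None,
    };
    Ok(CliArgs {
        region: args[0].clone(),
        province: args[1].clone(),
        category: args[2].clone(),
        page_limit,
        output: args.get(4).cloned(),
    })
}
```

In a binary, the input slice would typically come from `std::env::args().skip(1).collect::<Vec<String>>()`.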
Example:
```bash
paginegialle-scraper lombardia milano ristoranti 20 ristoranti-milano
paginegialle-scraper lombardia milano agenzie_immobiliari 20 agenzie-milano
```
# Build
Requires the Rust toolchain. Prefer a release build, as it is much faster.
```bash
cargo build -r
```