https://github.com/philipperemy/japanese-street-addresses-scraper
Scraper for Japanese street addresses (住所).
data dataset-creation dataset-generation scraper scraper-engine
- Host: GitHub
- URL: https://github.com/philipperemy/japanese-street-addresses-scraper
- Owner: philipperemy
- Created: 2017-09-28T02:21:29.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2022-01-22T10:47:54.000Z (almost 3 years ago)
- Last Synced: 2024-10-10T19:01:36.331Z (about 1 month ago)
- Topics: data, dataset-creation, dataset-generation, scraper, scraper-engine
- Language: Python
- Homepage:
- Size: 7.02 MB
- Stars: 6
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Japanese Street Addresses Scraper
*Scraper for Japanese street addresses (住所).* The addresses are collected from [itp.ne.jp](https://itp.ne.jp/?rf=1).
## Some figures
- **`7,225,873`** is the potential number of distinct postal addresses listed on the Japanese yellow pages.
- **`12`** is the number of days it took to retrieve them all, using a VPN and [IP auto switching](https://github.com/philipperemy/expressvpn-python).

## Script Requirements
- **Python 3.5+**
- numpy
- [expressvpn_python](https://github.com/philipperemy/expressvpn-python) - if you plan to use the VPN mode.
- requests
- natsort
- beautifulsoup4
- unicode_slugify

## Usage
```bash
git clone https://github.com/philipperemy/japanese-street-addresses-scraper.git
cd japanese-street-addresses-scraper
pip3 install -r requirements.txt
./main.sh # it starts the scraping!
```
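
Before launching a full run, it can help to know what sustained rate the quoted figures imply. The sketch below only redoes the arithmetic from the figures above (7,225,873 addresses in 12 days). The function name `throughput` is hypothetical and not part of the repository, and the result counts addresses, not HTTP requests, since one page may list several addresses.

```python
# Figures quoted in the README; everything else here is illustrative.
TOTAL_ADDRESSES = 7225873
DAYS = 12


def throughput(total, days):
    """Return the average number of items per day, hour, and second."""
    hours = days * 24
    seconds = hours * 60 * 60
    return {
        "per_day": total / days,
        "per_hour": total / hours,
        "per_second": total / seconds,
    }


rates = throughput(TOTAL_ADDRESSES, DAYS)
for name in ("per_day", "per_hour", "per_second"):
    # Thousands separators make the daily and hourly figures easier to read.
    print("{}: {:,.2f}".format(name, rates[name]))
```

The average works out to roughly 600,000 addresses per day, or about 7 per second, sustained over the whole run.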