Scraping Indonesian Region Data (Data Wilayah Indonesia) from PDF
- Host: GitHub
- URL: https://github.com/mocfaisal/scraping_wilayah_pdf
- Owner: mocfaisal
- Created: 2024-06-06T04:44:12.000Z (7 months ago)
- Default Branch: master
- Last Pushed: 2024-06-20T01:38:10.000Z (7 months ago)
- Last Synced: 2024-11-08T07:53:14.748Z (2 months ago)
- Topics: csv, kemendagri, mysql, node, nodejs, wilayah, wilayah-indonesia
- Language: JavaScript
- Homepage:
- Size: 63.9 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Scraping Region PDF
Scrape Indonesian region data from a PDF with Node.js.
## Features
Export to
- CSV file
- Direct insert into MySQL database

## Installation
Clone the project
```bash
git clone https://github.com/mocfaisal/scraping_wilayah_pdf.git
```

Go to the project directory
```bash
cd scraping_wilayah_pdf
```

Copy `.env` file
```bash
cp .env.example .env
```

### Environment Variables
To run this project, add the following environment variables to your `.env` file:
```bash
DB_USER=YOUR_DB_USERNAME
DB_PASSWORD=YOUR_DB_PASSWORD
DB_NAME=YOUR_DB_DATABASE
DB_TABLE=YOUR_DB_TABLE
```

Install dependencies
```bash
npm install
```

Start the project
```bash
npm run test
```

## Note
- [Raw.pdf](/src/raw.pdf) is an extracted version of the original file; only the pages containing tables are kept.
- If you run into an error, check that the `dist` folder exists.

## Related
Here are some related projects:
[Scraping Permendagri 72/2019 - Python](https://github.com/kodewilayah/permendagri-72-2019)
## Known Bugs
- [x] Missing 1 Kabupaten/Kota (regency/city) - Fixed
- [x] Kelurahan/Desa (village) count does not match: 83,749 extracted, but it should be 83,763; 14 village records from Papua province are missing

## Changelog
- [x] 2024-06-06 | v1.0 | Scraping Data | Kemendagri region data (Data Wilayah Kemendagri) - Kepmen 100.1.1-6117 of 2022
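
### Checking row counts (sketch)

Count mismatches like the ones listed under Known Bugs can be caught by comparing the number of scraped rows per administrative level against the expected totals. The following is a minimal sketch, not the project's actual code. It assumes each scraped row is an object with a dotted Kemendagri-style `code` (for example `11.01.01.2001` for a village), and the helper names `countByLevel` and `findMismatches` are hypothetical.

```javascript
// Hypothetical helpers, not part of this repository.
// Assumption: codes are dotted, and the number of segments gives the level:
//   "11" -> province, "11.01" -> regency/city,
//   "11.01.01" -> district, "11.01.01.2001" -> village.
const LEVELS = ["province", "regency", "district", "village"];

// Count how many rows fall into each administrative level.
function countByLevel(rows) {
  const counts = { province: 0, regency: 0, district: 0, village: 0 };
  for (const row of rows) {
    const depth = row.code.split(".").length;
    const level = LEVELS[depth - 1];
    if (level === undefined) {
      throw new Error(`Unexpected code format: ${row.code}`);
    }
    counts[level] += 1;
  }
  return counts;
}

// Return one entry per level whose actual count differs from the expected one.
function findMismatches(counts, expected) {
  return Object.keys(expected)
    .filter((level) => counts[level] !== expected[level])
    .map((level) => ({ level, actual: counts[level], expected: expected[level] }));
}

// Usage with placeholder rows (names are illustrative, not real data).
const sampleRows = [
  { code: "11", name: "Example Province" },
  { code: "11.01", name: "Example Regency" },
  { code: "11.01.01", name: "Example District" },
  { code: "11.01.01.2001", name: "Example Village" },
];
const mismatches = findMismatches(countByLevel(sampleRows), { village: 2 });
console.log(mismatches);
```

Run against the full scraped output with the expected totals from the source document, a non-empty result would point to the level where rows are missing.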