https://github.com/ford-prefect/ceo-kar-roll-scraper
PDF scraper for Karnataka electoral rolls
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/ford-prefect/ceo-kar-roll-scraper
- Owner: ford-prefect
- Created: 2013-01-21T04:33:08.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2013-01-21T06:03:50.000Z (over 12 years ago)
- Last Synced: 2025-04-12T04:17:02.830Z (2 months ago)
- Size: 559 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
Karnataka electoral roll scraper
----

This script is intended to scrape the electoral roll from
http://ceokarnataka.kar.nic.in and generate a text electoral roll.

This is in a very early draft stage right now. It has some skeleton code to
pick up each entry from the roll of a single polling station, and parse out a
bunch of fields. This works fine for most cases.

Todo
----

Lots! :)

* Deal with names that span multiple lines
* Actually dump output in a meaningful format (CSV? JSON?)
* Scrape out polling station information
* Fix up to run over a dump of the PDFs for all polling stations and build up
  the electoral roll for Karnataka as a whole

Dependencies
----
* Python
* pdfminer
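
To illustrate the first two Todo items (names that span multiple lines, and
dumping output as CSV), here is a minimal sketch in Python 3. It is not the
script's actual code. It assumes the text of one entry has already been
extracted from the PDF, and that fields appear as labelled lines such as
`Name:`, `Age:` and `Sex:`; those labels, the sample entry, and the function
names `parse_entry` and `entries_to_csv` are hypothetical, and the real roll
layout may well differ.

```python
import csv
import io
import re

# Hypothetical field labels; the real roll PDFs may use different ones.
FIELD_RE = re.compile(r"^(Name|Age|Sex)\s*:\s*(.*)$")


def parse_entry(text):
    """Parse the text of one entry into a dict, joining wrapped values."""
    fields = {}
    current = None  # key of the field the previous line belonged to
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1).lower()
            fields[current] = match.group(2).strip()
        elif current is not None:
            # No label on this line: treat it as a continuation of the
            # previous field, e.g. a long name wrapped onto a second line.
            fields[current] = (fields[current] + " " + line).strip()
    return fields


def entries_to_csv(entries, columns=("name", "age", "sex")):
    """Serialise parsed entries to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({col: entry.get(col, "") for col in columns})
    return out.getvalue()


# Made-up sample entry, for illustration only.
sample = """Name: Lakshmi Narayana
Venkataramana Rao
Age: 42
Sex: M"""

entry = parse_entry(sample)
print(entry)
print(entries_to_csv([entry]))
```

Treating any unlabelled line as a continuation of the previous field is the
simplest possible rule; it would misfire if the extracted text interleaves
lines from neighbouring entries, so a real implementation would probably need
layout information from pdfminer as well.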