https://github.com/civicdatalab/ndp_scraper


# Scraper
This code scrapes the catalogs on the [data.gov.in](https://data.gov.in/catalogs) website and dumps the results into a formatted CSV file. Run `scraper_main.py` to start scraping.
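To illustrate the CSV dump step, here is a minimal sketch using only the standard library. The column names (`title`, `url`), the helper `write_catalogs_csv`, and the sample rows are hypothetical and are not taken from this repository; the real scraper's output columns may differ.

```python
import csv
import io

# Hypothetical column names; the real scraper's columns are not documented here.
FIELDNAMES = ["title", "url"]


def write_catalogs_csv(rows, out_file):
    """Write an iterable of catalog dicts to out_file as CSV with a header row.

    Returns the number of data rows written.
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Made-up rows, written to an in-memory buffer instead of a file on disk.
sample_rows = [
    {"title": "Example catalog A", "url": "https://data.gov.in/catalogs"},
    {"title": "Example, with comma", "url": "https://data.gov.in/catalogs"},
]
buffer = io.StringIO()
written = write_catalogs_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file, open it with `newline=""` so the `csv` module controls line endings itself.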

## Assumptions that may affect the code in the future
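* XPaths: all XPaths work as of writing. If the site is updated, they may need to be updated as well.
* Number of pages: the number of pages to traverse is set by `PAGES_TO_TRAVERSE_IN_SITE` in `variables.py` and may need to be adjusted if the number of catalog pages on the site changes.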
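
One way to keep both assumptions easy to revisit is to hold them as named settings and fail loudly when an XPath stops matching. The sketch below is illustrative only: `BASE_URL`, `TITLE_XPATH`, the `page` query parameter, and both helper functions are assumptions, not the values or code used in this repository. It parses a small well-formed snippet with the standard library, whereas real pages are rarely well-formed XML and would typically need an HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings, mirroring the idea of a variables.py module.
PAGES_TO_TRAVERSE_IN_SITE = 3
BASE_URL = "https://data.gov.in/catalogs"
TITLE_XPATH = ".//div[@class='catalog']/a"  # placeholder, not the real XPath


def page_urls(pages=PAGES_TO_TRAVERSE_IN_SITE):
    """Build one URL per page; the `page` query parameter is an assumption."""
    return [f"{BASE_URL}?page={n}" for n in range(pages)]


def extract_titles(markup, xpath=TITLE_XPATH):
    """Return the stripped text of every node matched by xpath.

    Raises ValueError when nothing matches, which usually means the
    site layout changed and the XPath needs updating.
    """
    root = ET.fromstring(markup)
    nodes = root.findall(xpath)
    if not nodes:
        raise ValueError(f"XPath matched nothing: {xpath}")
    return [(node.text or "").strip() for node in nodes]


# Made-up, well-formed markup standing in for a catalog listing page.
sample = (
    "<html><body>"
    "<div class='catalog'><a href='/c/1'> Rainfall </a></div>"
    "<div class='catalog'><a href='/c/2'>Census</a></div>"
    "</body></html>"
)
titles = extract_titles(sample)
```

Raising an error on an empty match makes a silent layout change show up as a failed run rather than as an empty CSV file.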