https://github.com/narpat78/gem-document-scraper
A Selenium-based web scraper that automates contract downloads from the Government e-Marketplace (GeM) portal for a specified category within a given date range. It handles CAPTCHA solving and dynamic content loading to ensure complete data extraction.
pillow pytesseract-ocr selenium
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/narpat78/gem-document-scraper
- Owner: narpat78
- Created: 2025-02-25T07:13:14.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-17T10:53:15.000Z (over 1 year ago)
- Last Synced: 2025-12-27T18:27:46.124Z (6 months ago)
- Topics: pillow, pytesseract-ocr, selenium
- Language: Python
- Homepage: https://gem.gov.in/view_contracts
- Size: 8.79 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# GeM-Document-Scraper
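
The project description mentions CAPTCHA solving with OCR. The following is a minimal sketch of how such a retry loop might look, not the repository's actual code. The names `clean_captcha_text`, `solve_captcha`, `fetch_image`, `ocr`, `submit`, and `expected_length` are all hypothetical and introduced only for illustration. The browser and OCR steps are passed in as callables so the logic can be read and run without Selenium or Tesseract installed.

```python
import re
from typing import Callable, Optional


def clean_captcha_text(raw: str) -> str:
    """Keep only ASCII letters and digits.

    OCR output often carries whitespace and stray punctuation.
    """
    return re.sub(r"[^A-Za-z0-9]", "", raw)


def solve_captcha(
    fetch_image: Callable[[], bytes],   # hypothetical: returns the CAPTCHA image bytes
    ocr: Callable[[bytes], str],        # hypothetical: turns image bytes into raw text
    submit: Callable[[str], bool],      # hypothetical: True if the site accepted the guess
    max_attempts: int = 5,
    expected_length: Optional[int] = None,
) -> Optional[str]:
    """Retry OCR on fresh CAPTCHA images until one guess is accepted.

    Returns the accepted text, or None if every attempt failed.
    """
    for _ in range(max_attempts):
        guess = clean_captcha_text(ocr(fetch_image()))
        # Skip implausible reads instead of spending a submission on them.
        if not guess:
            continue
        if expected_length is not None and len(guess) != expected_length:
            continue
        if submit(guess):
            return guess
    return None
```

In a Selenium-based setup, `fetch_image` could wrap a `WebElement.screenshot_as_png` read and `ocr` could wrap `pytesseract.image_to_string` on a Pillow image opened from those bytes. That wiring is an assumption based on the listed topics (`pillow`, `pytesseract-ocr`, `selenium`) rather than something the description confirms.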