Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/memgonzales/parallel-email-scraper
Multiprocess email address scraper for the De La Salle University website staff directory. Our approach models the scraping task as a multiple-producer, multiple-consumer problem to achieve a 7.22× superlinear speedup compared to serial execution.
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/memgonzales/parallel-email-scraper
- Owner: memgonzales
- Created: 2022-11-11T15:55:00.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2022-12-25T14:59:52.000Z (about 2 years ago)
- Last Synced: 2024-04-14T11:45:16.936Z (9 months ago)
- Topics: email-scraper, multiprocessing, parallel-programming, producer-consumer, python, queue, selenium, selenium-webdriver, synchronized-queue, web-scraper, web-scraping
- Language: Python
- Homepage:
- Size: 15.4 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
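
The multiple-producer, multiple-consumer model named in the description can be illustrated with a minimal sketch. This is not the repository's code: the in-memory `FAKE_PAGES` dictionary stands in for pages that the real project fetches with Selenium, and every name here (`producer`, `consumer`, `run_pipeline`, `STOP`, the `example.edu` addresses) is hypothetical. It only shows how producers and consumers might share a synchronized queue from Python's `multiprocessing` module.

```python
import multiprocessing as mp
import re

# Hypothetical stand-in for the staff directory: page number -> page text.
# A real scraper would fetch these pages over the network instead.
FAKE_PAGES = {
    1: "Contact alice@example.edu or bob@example.edu",
    2: "No address listed here",
    3: "Write to carol@example.edu",
    4: "Duplicate entry: alice@example.edu",
}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
STOP = None  # sentinel marking the end of a queue's stream


def producer(page_ids, work_queue):
    """Fetch each assigned page (simulated here) and enqueue its text."""
    for page_id in page_ids:
        work_queue.put(FAKE_PAGES[page_id])


def consumer(work_queue, result_queue):
    """Pull page text until STOP arrives, emitting every email found."""
    while True:
        text = work_queue.get()
        if text is STOP:
            break
        for email in EMAIL_RE.findall(text):
            result_queue.put(email)
    result_queue.put(STOP)  # tell the parent this consumer is finished


def run_pipeline(n_producers=2, n_consumers=2):
    """Run the producers and consumers; return the sorted unique emails."""
    # Assumption: prefer "fork" where available so the sketch also runs
    # without an `if __name__ == "__main__"` guard; otherwise use the default.
    methods = mp.get_all_start_methods()
    ctx = mp.get_context("fork" if "fork" in methods else None)

    work_queue, result_queue = ctx.Queue(), ctx.Queue()
    page_ids = sorted(FAKE_PAGES)

    # Each producer gets an interleaved slice of the page numbers.
    producers = [
        ctx.Process(target=producer, args=(page_ids[i::n_producers], work_queue))
        for i in range(n_producers)
    ]
    consumers = [
        ctx.Process(target=consumer, args=(work_queue, result_queue))
        for _ in range(n_consumers)
    ]
    for proc in producers + consumers:
        proc.start()

    # Once every producer has exited, send one STOP per consumer.
    for proc in producers:
        proc.join()
    for _ in consumers:
        work_queue.put(STOP)

    # Drain results before joining consumers, so no consumer blocks on exit.
    emails, finished = set(), 0
    while finished < n_consumers:
        item = result_queue.get()
        if item is STOP:
            finished += 1
        else:
            emails.add(item)
    for proc in consumers:
        proc.join()
    return sorted(emails)


if __name__ == "__main__":
    print(run_pipeline())
```

On platforms where only the `spawn` start method is available, `run_pipeline` should be called from inside an `if __name__ == "__main__":` guard, as in the last two lines. The sentinel-per-consumer shutdown is one common design choice; the repository may coordinate termination differently.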