https://github.com/montoyamoraga/scrapers
scrapers for building your own image databases
- Host: GitHub
- URL: https://github.com/montoyamoraga/scrapers
- Owner: montoyamoraga
- License: mit
- Created: 2018-08-31T11:46:29.000Z (over 6 years ago)
- Default Branch: main
- Last Pushed: 2022-04-01T11:07:25.000Z (about 3 years ago)
- Last Synced: 2025-03-17T06:35:32.891Z (about 2 months ago)
- Topics: scrape, scraper, scraping, selenium
- Language: Python
- Homepage:
- Size: 16.6 KB
- Stars: 51
- Watchers: 3
- Forks: 7
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# scrapers
## About
scrapers is a collection of free/libre open-source software written by [Aarón Montoya-Moraga](http://montoyamoraga.io/).
scrapers is both a tool for building databases and an educational resource for learning scraping.
scrapers is educational because it tries to be heavily documented, clean, and easy to follow.
scrapers performs the scraping explicitly: it shows you the browser stepping through the data instead of running in the background. This keeps the way it works open to view, which is useful for both documentation and live performance.
## Technical details
All of these scrapers were written using [Python](https://www.python.org/), [Selenium](https://www.seleniumhq.org/), and [ChromeDriver](http://chromedriver.chromium.org/).
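As an illustration of how these pieces fit together, here is a minimal sketch of a visible (non-headless) scrape with Selenium and ChromeDriver. It is not code from this repository: the function names `collect_image_urls` and `run_visible_scrape`, the `pause_seconds` parameter, and the placeholder address `https://example.com` are all assumptions made for the example, and it assumes ChromeDriver can be found by Selenium.

```python
import time

# "tag name" is the locator strategy string that Selenium exposes as By.TAG_NAME.
# It is spelled out here so this helper does not need Selenium to be imported.
TAG_NAME = "tag name"


def collect_image_urls(driver, page_url, pause_seconds=2.0):
    """Load page_url and return the unique src values of its <img> elements.

    Hypothetical helper: `driver` is expected to behave like a Selenium
    WebDriver (it needs get() and find_elements()).
    """
    driver.get(page_url)
    # Pause so that a person watching the browser window can follow along.
    time.sleep(pause_seconds)

    urls = []
    for image in driver.find_elements(TAG_NAME, "img"):
        src = image.get_attribute("src")
        # Skip images without a src and keep only the first copy of each URL.
        if src and src not in urls:
            urls.append(src)
    return urls


def run_visible_scrape(page_url="https://example.com"):
    """Open a visible Chrome window, print the image URLs found, then close it."""
    # Imported inside the function so collect_image_urls can be tried with a
    # stand-in driver on a machine where Selenium is not installed.
    from selenium import webdriver

    # No headless option is set, so the browser window is shown on screen.
    driver = webdriver.Chrome()
    try:
        for url in collect_image_urls(driver, page_url):
            print(url)
    finally:
        # Always close the browser, even if the scrape raises an error.
        driver.quit()
```

Calling `run_visible_scrape()` opens the browser and prints what it finds. Keeping the page logic in `collect_image_urls` separate from the browser setup is a design choice for this sketch only; it lets the collection step be checked against a fake driver before running it on a real page.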
## Contents
* bing-images
* captcha
* google-images
* mugshots
## Installation and prerequisites
* Install Python 2 and Python 3
* Install Homebrew if you are on a Mac
* Install ChromeDriver
## Acknowledgements
* Thanks to [Sam Lavigne](http://lav.io/) and their class [Detourning the web](https://github.com/antiboredom/detourning-the-web).
* Thanks to my classmates at [School of Machines](http://schoolofma.org/) who helped me test and prototype some of these scrapers.
## License
MIT