https://github.com/montoyamoraga/scrapers

scrapers for building your own image databases

# scrapers

## About

scrapers is a collection of free/libre open-source software written by [Aarón Montoya-Moraga](http://montoyamoraga.io/).

scrapers is both a tool for building databases and an educational resource for learning scraping.

scrapers is educational because it tries to be heavily documented, clean, and easy to follow.

scrapers performs the scraping explicitly: it shows you the browser going through the data instead of running in the background. This openness about how it works makes it useful for both documentation and live performance.

## Technical details

All of these scrapers were written using [Python](https://www.python.org/), [Selenium](https://www.seleniumhq.org/), and [ChromeDriver](http://chromedriver.chromium.org/).
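As a rough illustration of this stack, the sketch below opens a visible Chrome window with Selenium and collects the `src` of every image on a page. It is not taken from the scrapers in this repository: the function names `image_urls` and `scrape` and the example URL are hypothetical, and it assumes a recent Selenium release with ChromeDriver available.

```python
def image_urls(driver):
    """Return the unique, non-empty src values of all <img> elements on the current page."""
    urls = []
    # "tag name" is the locator string that selenium's By.TAG_NAME stands for.
    for img in driver.find_elements("tag name", "img"):
        src = img.get_attribute("src")
        if src and src not in urls:
            urls.append(src)
    return urls


def scrape(page_url):
    """Open page_url in a visible browser window and return its image URLs."""
    # Imported here so image_urls can be read and tried without Selenium installed.
    from selenium import webdriver

    # No headless option is set, so the browser window stays visible while scraping.
    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        return image_urls(driver)
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()
```

Calling `scrape("https://example.com")` (a placeholder address) would show the browser loading the page before the window closes and the list of URLs is returned.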

## Contents

* bing-images
* captcha
* google-images
* instagram
* mugshots

## Installation and prerequisites

* Install Python 2 and Python 3
* Install Homebrew if you are on a Mac
* Install ChromeDriver
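
One way to confirm that the prerequisites above are in place is a small check like the following sketch. The executable names `python2`, `python3`, and `chromedriver` are assumptions and may differ on your system, and the helper names `check_prerequisites` and `report` are hypothetical.

```python
import shutil


def check_prerequisites(which=shutil.which):
    """Map each assumed executable name to its path on PATH, or None if it is missing."""
    tools = ["python2", "python3", "chromedriver"]  # assumed executable names
    return {tool: which(tool) for tool in tools}


def report(found):
    """Format the result of check_prerequisites as one line per tool."""
    lines = []
    for tool, path in found.items():
        status = path if path else "NOT FOUND"
        lines.append(f"{tool}: {status}")
    return "\n".join(lines)
```

Running `print(report(check_prerequisites()))` lists each tool with the path where it was found, or `NOT FOUND` if it is not on your `PATH`.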

## Acknowledgements

* Thanks to [Sam Lavigne](http://lav.io/) and their class [Detourning the web](https://github.com/antiboredom/detourning-the-web).
* Thanks to my classmates at [School of Machines](http://schoolofma.org/) who helped me test and prototype some of these scrapers.

## License

MIT