https://github.com/ruofeidu/ducrawler
An automatic crawler to mine images from Google and Bing Image search (part of SketchyScene at ECCV 2018)
- Host: GitHub
- URL: https://github.com/ruofeidu/ducrawler
- Owner: ruofeidu
- Created: 2017-12-12T18:25:29.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2022-03-08T02:03:29.000Z (almost 4 years ago)
- Last Synced: 2025-03-25T17:47:40.672Z (10 months ago)
- Topics: data, google, image, mining, python, search
- Language: HTML
- Homepage: https://sketchyscene.github.io/SketchyScene
- Size: 33.2 KB
- Stars: 12
- Watchers: 1
- Forks: 7
- Open Issues: 1
Metadata Files:
- Readme: README.md
# DuCrawler
My crawler for mining images from Google and Bing image search.
## Dependencies of crawler_google
* Python 2.7 / 3.6, compatible with Python 3.0+
* pip install bs4
* pip install requests
* pip install opencv-contrib-python
* pip2.7 install configparser (Python 2.7 only; on Python 3, configparser is part of the standard library)
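
The pip package names above do not always match the module names used in `import` statements (for example, `opencv-contrib-python` is imported as `cv2`). The following is a minimal sketch, not part of the repository, of a Python 3 check that the listed dependencies can be found. The names `REQUIRED` and `missing_packages` are hypothetical helpers introduced here for illustration.

```python
import importlib.util

# Map of import name -> pip package name, taken from the list above.
# (Hypothetical helper table; not part of DuCrawler itself.)
REQUIRED = {
    "bs4": "bs4",
    "requests": "requests",
    "cv2": "opencv-contrib-python",
    "configparser": "configparser",
}


def missing_packages(required=REQUIRED):
    """Return the pip package names whose import module cannot be found."""
    return sorted(
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: pip install " + " ".join(missing))
    else:
        print("All crawler_google dependencies found.")
```

Running the script prints either a suggested `pip install` line or a confirmation, without importing the packages themselves.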
## Additional dependencies of crawler_bing
* pip install selenium==2.48.0 (pinned version, rather than `pip install -U selenium`)
* see [Selenium](https://pypi.python.org/pypi/selenium)
* [PhantomJS 2.1.1](http://phantomjs.org/download.html)
* urllib (Python 3.6) or urlparse (Python 2.7); both are part of the standard library, so a separate pip install should not be needed
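
Because the crawlers target both Python 2.7 and 3.x, URL parsing and config parsing live under different module names depending on the interpreter. The sketch below shows one common way to import them compatibly; it is an illustration under that assumption, not the code used in the repository, and `image_host` is a hypothetical helper.

```python
# URL parsing: urllib.parse on Python 3, urlparse on Python 2.7.
try:
    from urllib.parse import urlparse
except ImportError:
    from urlparse import urlparse

# Config parsing: configparser on Python 3 (or the backport on 2.7),
# ConfigParser in the Python 2.7 standard library.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser


def image_host(url):
    """Return the host part of an image URL (hypothetical helper)."""
    return urlparse(url).netloc
```

For example, `image_host("https://example.com/a/b.jpg?x=1")` returns the host name of the URL regardless of which branch of the import was taken.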
## Author
[Ruofei Du](http://duruofei.com)
## References
[Writing Python 2-3 compatible code](http://python-future.org/compatible_idioms.html#unicode)
## License
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License with 996 ICU clause: [996.ICU](https://996.icu/#/en_US)
The above license is only granted to entities that act in concordance with local labor laws. In addition, the following requirements must be observed:
* The licensee must not, explicitly or implicitly, request or schedule their employees to work more than 45 hours in any single week.
* The licensee must not, explicitly or implicitly, request or schedule their employees to be at work consecutively for 10 hours.