python scripts for crawling original image from Google Images
- Host: GitHub
- URL: https://github.com/thaoshibe/crawl-original-google-images
- Owner: thaoshibe
- Created: 2020-06-07T05:50:58.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-05-05T05:09:18.000Z (over 3 years ago)
- Last Synced: 2024-09-29T03:22:40.430Z (about 1 year ago)
- Topics: chrome-extension, crawler, crawling, crawling-python, google, google-images, pafy, scraper, youtube, youtube-dl, youtube-search
- Language: Python
- Homepage:
- Size: 15.6 KB
- Stars: 21
- Watchers: 3
- Forks: 2
- Open Issues: 0
- Metadata Files:
  - Readme: readme.md
 
README
# Crawl Original Google Images & YouTube Videos
---
This repo contains code to crawl images and videos:
- ORIGINAL images from Google Search
- ORIGINAL videos from YouTube
### Requirements
1. **ChromeDriver**
	- [Check your current Google Chrome version](https://www.businessinsider.com/what-version-of-google-chrome-do-i-have)
	- Download the ChromeDriver release that matches your Chrome version from [ChromeDriver](https://chromedriver.chromium.org/downloads) and unzip it.
	For example, I'm using Chrome version `95.0.4638.69` on Linux, so I downloaded [`chromedriver_linux64.zip`](https://chromedriver.storage.googleapis.com/index.html?path=95.0.4638.69/)
1. **Environment**
	`conda env create -f environment.yml`
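A mismatch between Chrome and ChromeDriver is a common reason the browser fails to start, so it can help to compare the two versions before crawling. The snippet below is a minimal sketch and not part of this repo: the helper names are hypothetical, the `google-chrome` binary name is an assumption for Linux, and comparing only the major version is a rule of thumb rather than an official guarantee.

```python
import re
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text, e.g. '95.0.4638.69'."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return match.group(0)


def major(version):
    """Return the major component of a dotted version string as an int."""
    return int(version.split(".")[0])


def versions_compatible(chrome_output, driver_output):
    """Compare the major versions found in the two `--version` outputs."""
    return major(parse_version(chrome_output)) == major(parse_version(driver_output))


def read_version(command):
    """Run a command such as ['chromedriver', '--version'] and parse its output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_version(result.stdout)
```

For example, `read_version(["google-chrome", "--version"])` and `read_version(["./chromedriver", "--version"])` could be compared with `major()` before starting a crawl; adjust both paths to your own setup.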
### Crawl Images from Google Image Search
Download original images (not thumbnails) from Google Image Search with **multi-threading** :D
1. Get URLs by keywords
	```
	python crawl_url.py
	```
1. Download images from URLs
	```
	python crawl_data.py
	```
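To illustrate the multi-threaded download step, here is a minimal sketch that uses only the standard library. It is not the code in `crawl_data.py`: the function names, the zero-padded file naming, the `.jpg` fallback extension, and the default of 8 workers are all assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def fetch_bytes(url, timeout=10):
    """Download one URL and return its raw bytes."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()


def target_path(save_dir, index, url):
    """Build a file name from the URL index; fall back to .jpg if no extension."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return os.path.join(save_dir, f"{index:06d}{ext}")


def download_one(job, fetch):
    """Download a single (index, url, save_dir) job; never raise, report errors."""
    index, url, save_dir = job
    try:
        data = fetch(url)
    except Exception as error:
        return url, None, str(error)
    path = target_path(save_dir, index, url)
    with open(path, "wb") as handle:
        handle.write(data)
    return url, path, None


def download_all(urls, save_dir, fetch=fetch_bytes, workers=8):
    """Download all URLs with a thread pool; results keep the input order."""
    os.makedirs(save_dir, exist_ok=True)
    jobs = [(index, url, save_dir) for index, url in enumerate(urls)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: download_one(job, fetch), jobs))
```

Each result is a `(url, path, error)` tuple, so failed downloads can be logged or retried without stopping the other threads. Threads suit this job because the work is dominated by waiting on the network.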
### Crawl Videos from YouTube
1. Get URLs by keywords
	```
	python crawl_youtube_link.py
	```
1. Download videos from URLs
	```
	python crawl_videos.py
	python crawl_videos.py --metadata --thumbnail # thumbnail and metadata only
	```
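The two flags above could map onto downloader options roughly as follows. This is a sketch under assumptions, not the actual logic of `crawl_videos.py`: it assumes the `youtube_dl` package is the backend (the repo also mentions `pafy`), that passing either flag skips the video file itself, and that files are named by video id. The function names are hypothetical.

```python
import os


def build_ydl_options(save_dir, metadata=False, thumbnail=False):
    """Translate the command-line flags into a youtube-dl options dict."""
    return {
        # Output template: <save_dir>/<video id>.<extension>
        "outtmpl": os.path.join(save_dir, "%(id)s.%(ext)s"),
        # Write the video metadata to a .info.json file
        "writeinfojson": metadata,
        # Write the thumbnail image next to the output file
        "writethumbnail": thumbnail,
        # Assumption: asking for metadata or thumbnail means "no video file"
        "skip_download": metadata or thumbnail,
    }


def download_videos(urls, options):
    """Download the given URLs with youtube-dl (third-party, imported lazily)."""
    import youtube_dl

    with youtube_dl.YoutubeDL(options) as ydl:
        ydl.download(urls)
```

With no flags set, `build_ydl_options("videos")` leaves `skip_download` as `False`, so the full video is fetched.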
##### To-do
- [x] Init
- [x] Multithreading
- [x] Requirements
- [x] Write Guideline
- [ ] Add an argument parser for `save_dirs`, `chromedriver`, etc.
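One possible shape for the remaining to-do item, using `argparse` from the standard library. The flag names and default values here are hypothetical and are not options the scripts currently accept.

```python
import argparse


def build_parser():
    """Build a command-line parser for the crawler settings (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description="Crawl original images from Google Images"
    )
    parser.add_argument("--save_dir", default="./data",
                        help="directory where downloaded files are stored")
    parser.add_argument("--chromedriver", default="./chromedriver",
                        help="path to the unzipped ChromeDriver binary")
    parser.add_argument("--num_threads", type=int, default=8,
                        help="number of download threads")
    return parser
```

A script would then call `args = build_parser().parse_args()` and read `args.save_dir`, `args.chromedriver`, and `args.num_threads` instead of hard-coded values.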