https://github.com/18520339/web-scraping-with-scrapy
Python web scraping with Scrapy
scrapy web-crawling web-scraping
- Host: GitHub
- URL: https://github.com/18520339/web-scraping-with-scrapy
- Owner: 18520339
- Created: 2020-06-29T16:55:00.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-10-31T17:17:12.000Z (over 5 years ago)
- Last Synced: 2025-03-25T23:47:10.343Z (about 1 year ago)
- Topics: scrapy, web-crawling, web-scraping
- Language: Python
- Homepage: https://www.youtube.com/playlist?list=PLhTjy8cBISEqkN-5Ku_kXG4QW33sxQo0t
- Size: 479 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Python web scraping with Scrapy
> Demo: https://www.youtube.com/watch?v=ysyskgjsPI0&t=1m45s
## Features:
+ Download Images
+ Store data in several databases: SQLite, MySQL, MongoDB
+ Export data to several formats: CSV, JSON, XML
+ Use a custom User-Agent and proxy
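
As a rough illustration of the database feature, here is a minimal sketch of a Scrapy-style item pipeline that writes items to SQLite. It is not the repository's actual pipeline: the class name `SQLitePipeline`, the `products` table, the default `products.db` path, and the `title`, `price`, and `image_url` fields are all assumptions made for the example. Only the `open_spider` / `process_item` / `close_spider` method names follow Scrapy's documented pipeline interface.

```python
import sqlite3


class SQLitePipeline:
    """Scrapy-style item pipeline that stores each scraped item in SQLite."""

    def __init__(self, db_path="products.db"):
        # db_path is a hypothetical default, not taken from the repository.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the database and
        # create the (assumed) products table if it does not exist yet.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS products ("
            "title TEXT, price TEXT, image_url TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every scraped item. Parameterized SQL avoids
        # injection problems with scraped text.
        self.conn.execute(
            "INSERT INTO products (title, price, image_url) VALUES (?, ?, ?)",
            (item.get("title"), item.get("price"), item.get("image_url")),
        )
        self.conn.commit()
        # Returning the item lets any later pipelines process it too.
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: release the connection.
        if self.conn is not None:
            self.conn.close()
            self.conn = None
```

A pipeline like this would typically be enabled through the `ITEM_PIPELINES` setting in `settings.py`, for example `ITEM_PIPELINES = {"myproject.pipelines.SQLitePipeline": 300}`, where the module path is a placeholder.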
## Installation:
1. Run `pip install -r requirements.txt`
2. Install SQLite, MySQL, and MongoDB, and configure the connection to each
## Usage:
+ for .csv: `scrapy crawl amazon -o "store by formats"/products.csv`
+ for .json: `scrapy crawl amazon -o "store by formats"/products.json`
+ for .xml: `scrapy crawl amazon -o "store by formats"/products.xml`
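
The three commands above each write one output file per run. As a possible alternative, Scrapy 2.1 and later can export to several files in a single crawl through the `FEEDS` setting. The sketch below builds such a setting; the helper name `build_feeds` and its parameters are hypothetical, and the repository itself may rely only on the `-o` flag.

```python
from pathlib import PurePosixPath

# Formats listed in this README; Scrapy supports others as well.
SUPPORTED_FORMATS = ("csv", "json", "xml")


def build_feeds(directory="store by formats", basename="products",
                formats=SUPPORTED_FORMATS):
    """Return a dict shaped like Scrapy's FEEDS setting, one entry per format."""
    feeds = {}
    for fmt in formats:
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported feed format: {fmt}")
        # Keys are output paths; values carry per-feed options.
        path = str(PurePosixPath(directory) / f"{basename}.{fmt}")
        feeds[path] = {"format": fmt}
    return feeds


# In settings.py this would be assigned to the FEEDS setting.
FEEDS = build_feeds()
```

With this in place, a plain `scrapy crawl amazon` would be expected to write all three files at once, assuming the installed Scrapy version supports `FEEDS`.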