https://github.com/accordbox/scrapy-spider-example
Scrapy spider example for Scrapy Tutorial Series
python3 scrapy-spider scrapy-tutorial
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/accordbox/scrapy-spider-example
- Owner: AccordBox
- Created: 2018-01-06T02:45:51.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2018-01-06T02:46:44.000Z (over 8 years ago)
- Last Synced: 2025-03-26T07:04:57.095Z (over 1 year ago)
- Topics: python3, scrapy-spider, scrapy-tutorial
- Language: Python
- Homepage: https://blog.michaelyin.info/scrapy-tutorial-series-web-scraping-using-python/
- Size: 7.81 KB
- Stars: 77
- Watchers: 1
- Forks: 17
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
This project is a collection of Scrapy spider examples. [Michael Yin](https://blog.michaelyin.info/) created it to host the source code of [Scrapy Tutorial Series: Web Scraping Using Python](https://blog.michaelyin.info/scrapy-tutorial-series-web-scraping-using-python/).
In this repository you will find:
1. A simple Scrapy spider that shows how to extract data from a web page.
2. An example of handling pagination in a Scrapy spider.
3. A simple script that makes your Scrapy shell more powerful.
4. An example of defining a Scrapy item and creating a custom Item Pipeline that saves item data to a database such as MySQL or PostgreSQL.
5. Code that runs on both Python 2 and Python 3.
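
As a rough illustration of item 4, here is a minimal sketch of a custom Item Pipeline. It is not the code from this repository: it swaps in the standard-library `sqlite3` module in place of MySQL or PostgreSQL so it runs without a database server, and the class name `SQLitePipeline`, the `quotes` table, and the `text`/`author` fields are all hypothetical. Scrapy item pipelines are plain Python classes, and Scrapy calls `open_spider`, `process_item`, and `close_spider` on them by name.

```python
import sqlite3


class SQLitePipeline(object):
    """Hypothetical pipeline: stores each scraped item as one row in SQLite."""

    def __init__(self, db_path=":memory:"):
        # Assumption: an in-memory database, used here only for demonstration.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called once when the spider starts: connect and create the table.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS quotes (text TEXT, author TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every item the spider yields; must return the item
        # so that later pipelines can process it too.
        self.conn.execute(
            "INSERT INTO quotes (text, author) VALUES (?, ?)",
            (item["text"], item["author"]),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: release the connection.
        self.conn.close()


# Drive the pipeline by hand, the way Scrapy would during a crawl.
pipeline = SQLitePipeline()
pipeline.open_spider(spider=None)
pipeline.process_item({"text": "Hello", "author": "Anonymous"}, spider=None)
rows = pipeline.conn.execute("SELECT text, author FROM quotes").fetchall()
pipeline.close_spider(spider=None)
```

In a real project, a pipeline like this is enabled through the `ITEM_PIPELINES` setting in `settings.py`, which maps the class's import path to an integer priority.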
## Warning
Before you run the code in your local environment, make sure to edit `CONNECTION_STRING` in `scrapy_spider/settings.py` so that it points to your own database. For more detail about this project, see the [Scrapy Tutorial Series: Web Scraping Using Python](https://blog.michaelyin.info/scrapy-tutorial-series-web-scraping-using-python/).
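
For orientation, the snippet below sketches what such a setting might look like. It assumes the pipeline expects a database URL of the form `driver://user:password@host:port/database`; check `scrapy_spider/settings.py` for the format the project actually uses. Every value in `DB_SETTINGS`, and the `DB_SETTINGS` name itself, is a placeholder rather than something taken from this repository.

```python
# Hypothetical values: replace each one with your own database details.
DB_SETTINGS = {
    "drivername": "mysql",
    "user": "scrapy_user",
    "password": "change-me",
    "host": "localhost",
    "port": 3306,
    "db_name": "scrapy_db",
}

# Assumed URL layout: driver://user:password@host:port/database
CONNECTION_STRING = (
    "{drivername}://{user}:{password}@{host}:{port}/{db_name}".format(**DB_SETTINGS)
)
```

Building the string from a dictionary keeps the credentials in one place, so switching to PostgreSQL would only mean changing `drivername` and `port`.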
## Feedback
If you have any problems, feel free to open an issue on GitHub; I will reply ASAP.
## Contact
You can contact me at `admin#michaelyin.info`.