Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/scrapy/quotesbot
This is a sample Scrapy project for educational purposes
- Host: GitHub
- URL: https://github.com/scrapy/quotesbot
- Owner: scrapy
- License: MIT
- Created: 2016-09-27T13:55:40.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2023-11-29T22:31:31.000Z (12 months ago)
- Last Synced: 2024-10-01T14:21:31.447Z (about 1 month ago)
- Language: Python
- Homepage: http://doc.scrapy.org/en/latest/intro/tutorial.html
- Size: 5.86 KB
- Stars: 1,296
- Watchers: 71
- Forks: 778
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome - quotesbot - This is a sample Scrapy project for educational purposes (Python)
- awesome-security-collection - **813** stars
README
# QuotesBot
This is a Scrapy project to scrape quotes from famous people from http://quotes.toscrape.com ([github repo](https://github.com/scrapinghub/spidyquotes)). This project is only meant for educational purposes.
## Extracted data
This project extracts quotes, combined with the respective author names and tags.
The extracted data looks like this sample:

```
{
    'author': 'Douglas Adams',
    'text': '“I may not have gone where I intended to go, but I think I ...”',
    'tags': ['life', 'navigation']
}
```

## Spiders
This project contains two spiders, and you can list them using the `list`
command:

```
$ scrapy list
toscrape-css
toscrape-xpath
```

Both spiders extract the same data from the same website, but `toscrape-css`
employs CSS selectors, while `toscrape-xpath` employs XPath expressions. You can
learn more about the spiders by going through the
[Scrapy Tutorial](http://doc.scrapy.org/en/latest/intro/tutorial.html).
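To make the difference between the two selector styles concrete, here is a minimal sketch of a spider that extracts the same fields both ways. It is not code from this repository: the spider name is hypothetical, and the markup assumptions (`div.quote`, `span.text`, `small.author`, `a.tag`) are illustrative guesses about the page structure of http://quotes.toscrape.com.

```python
# Illustrative sketch only -- not this project's actual spiders. The CSS and
# XPath expressions below assume the quote markup on quotes.toscrape.com uses
# div.quote / span.text / small.author / a.tag elements.
import scrapy


class QuotesSketchSpider(scrapy.Spider):
    name = "toscrape-sketch"  # hypothetical name, not one of the two real spiders
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        for quote in response.css("div.quote"):
            yield {
                # CSS selector style, as in toscrape-css
                "text": quote.css("span.text::text").get(),
                # Equivalent XPath style, as in toscrape-xpath
                "author": quote.xpath(".//small[@class='author']/text()").get(),
                # Lists of values use getall() instead of get()
                "tags": quote.css("div.tags a.tag::text").getall(),
            }
```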
## Running the spiders
You can run a spider using the `scrapy crawl` command, such as:

```
$ scrapy crawl toscrape-css
```

If you want to save the scraped data to a file, you can pass the `-o` option:

```
$ scrapy crawl toscrape-css -o quotes.json
```
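If you prefer to drive the crawl from Python instead of the command line, a sketch along the following lines should work, assuming it is run from the project root so the project settings and the `toscrape-css` spider can be located; the output file name `quotes.json` is just an example, and the `FEEDS` setting shown requires a reasonably recent Scrapy release.

```python
# Sketch of running the crawl programmatically instead of via `scrapy crawl`.
# Assumes this script sits in the project root so get_project_settings() can
# find scrapy.cfg and the project's spiders.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

settings = get_project_settings()
# Roughly equivalent to the -o option: export scraped items to quotes.json.
settings.set("FEEDS", {"quotes.json": {"format": "json"}})

process = CrawlerProcess(settings)
process.crawl("toscrape-css")  # spider name, same as on the command line
process.start()                # blocks until the crawl finishes
```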