https://github.com/apify/actor-scrapy-books-example
Example of a Python Scrapy project. It scrapes book data from https://books.toscrape.com/.
- Host: GitHub
- URL: https://github.com/apify/actor-scrapy-books-example
- Owner: apify
- Created: 2023-12-12T13:29:43.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-04-04T10:52:26.000Z (about 2 years ago)
- Last Synced: 2025-02-16T16:19:53.033Z (over 1 year ago)
- Topics: apify, scrapy
- Language: Python
- Homepage: https://apify.com/vdusek/scrapy-books-example
- Size: 26.4 KB
- Stars: 1
- Watchers: 5
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Actor Scrapy Books Example
This project serves as an example of a Python Scrapy project. It scrapes book data from [books.toscrape.com](https://books.toscrape.com/).
## Getting Started
### Install Apify CLI
To use this scraper, you need to install the Apify CLI. Follow the instructions [here](https://docs.apify.com/cli/docs/installation).
### Install Python and uv
Make sure you have Python installed. If not, download it [here](https://www.python.org/). Any version supported by [Apify SDK](https://pypi.org/project/apify/) and [Scrapy](https://pypi.org/project/Scrapy/) should be fine.
Additionally, install the [uv](https://docs.astral.sh/uv/) package manager.
```bash
pip install uv
```
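Before moving on, it can help to confirm that the tools installed above are actually reachable from your shell. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of this project.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the names in `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    # `apify` and `uv` are the commands installed in the steps above.
    missing = missing_tools(["apify", "uv"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

If a tool is reported as missing, revisit the corresponding installation step or check your `PATH`.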
## Run the Actor locally
### Prepare Python environment
Install Python dependencies:
```bash
make install-dev
```
Activate the virtual environment:
```bash
source .venv/bin/activate
```
### Run the scraper as a Scrapy project
The project remains runnable as a plain Scrapy project. The following command runs the `book_spider` spider and writes the scraped items to `books.json`:
```bash
scrapy crawl book_spider -o books.json
```
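To sanity-check the crawl output, you can load `books.json` and look at how many items were scraped and which fields they carry. This is a minimal sketch, assuming the file is the JSON array that Scrapy's feed export produces for a `.json` target; the helper names `load_items` and `summarize` are hypothetical, and no particular field names are assumed.

```python
import json
from pathlib import Path


def load_items(path):
    """Load the JSON array written by the crawl command above."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def summarize(items):
    """Return the item count and the sorted set of field names seen."""
    fields = sorted({key for item in items for key in item})
    return {"count": len(items), "fields": fields}


if __name__ == "__main__":
    path = Path("books.json")
    if path.exists():
        print(summarize(load_items(path)))
    else:
        print("books.json not found; run the crawl first.")
```

Note that Scrapy's `-o` option appends to an existing file, which leaves a `.json` file invalid after a second run. Delete `books.json` between runs, or use `-O` to overwrite it.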
### Run the scraper as an Apify Actor
Run it locally with the Apify CLI. The `--purge` flag clears the local storage left over from previous runs before the Actor starts:
```bash
apify run --purge
```
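After a local run, the scraped items can be inspected from Python. The sketch below rests on assumptions that are not stated in this README: that the local dataset lives in `storage/datasets/default` and holds one JSON file per item, possibly alongside bookkeeping files whose names start with `__`. The helper name `read_dataset` is hypothetical; adjust the path if your setup differs.

```python
import json
from pathlib import Path


def read_dataset(directory):
    """Read every item file in `directory`, in file-name order."""
    items = []
    for file in sorted(Path(directory).glob("*.json")):
        # Skip bookkeeping files (assumed to be prefixed with "__").
        if file.name.startswith("__"):
            continue
        items.append(json.loads(file.read_text(encoding="utf-8")))
    return items


if __name__ == "__main__":
    # Assumed default location of the local dataset; adjust if yours differs.
    items = read_dataset("storage/datasets/default")
    print(f"{len(items)} items read")
```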
## Deploy on Apify
### Log in to Apify
You will need to provide your [Apify API Token](https://console.apify.com/account/integrations) to complete this action.
```bash
apify login
```
### Deploy your Actor
This command will deploy and build the Actor on the Apify Platform. You can find your newly created Actor under [Actors -> My Actors](https://console.apify.com/actors?tab=my).
```bash
apify push
```
## Documentation reference
To learn more about Apify and Actors, take a look at the following resources:
- [Integrating Scrapy projects](https://docs.apify.com/cli/docs/integrating-scrapy)
- [Apify SDK for Python](https://docs.apify.com/sdk/python)
- [Apify Platform](https://docs.apify.com/platform)
- [Join our developer community on Discord](https://discord.com/invite/jyEM2PRvMU)