https://github.com/manuelandersen/padel-scrapy
🕸️ Collects data from the padelfip website
- Host: GitHub
- URL: https://github.com/manuelandersen/padel-scrapy
- Owner: manuelandersen
- License: gpl-3.0
- Created: 2024-06-27T00:08:54.000Z (7 months ago)
- Default Branch: master
- Last Pushed: 2024-08-30T00:59:05.000Z (5 months ago)
- Last Synced: 2024-11-14T01:08:17.612Z (2 months ago)
- Language: Python
- Homepage:
- Size: 70.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
## padel-scrapy
This project scrapes data from the [International Padel Federation](https://www.padelfip.com/es/) website. It collects player, tournament and game information into JSON files.
## Installation
1) Clone the repository:
``` bash
git clone https://github.com/manuelandersen/padel-scrapy.git
cd padel-scrapy
```

2) Create a virtual environment (optional but recommended):
``` bash
python3 -m venv venv
source venv/bin/activate
```

3) Install the dependencies:
``` bash
pip install -r requirements.txt
```

## Running the spiders
``` console
# you need to be inside the padelscraper directory
cd padelscraper

# to run the player spider
scrapy crawl playerspider

# to run the tournament spider
scrapy crawl tournamentspider

# to run the games spider, pass a start URL and the number of days played
# (both can be obtained from the tournamentspider results)
scrapy crawl gamespider -a start_url="the_start_url" -a days_played=days_played

# to store the results in a JSON file
scrapy crawl playerspider -O path_to_file.json
```

If you prefer not to create a virtual environment, you can use Docker instead.
``` console
# to build the container
docker build -t scrapy-project .

# to run one of the spiders
docker run scrapy-project scrapy crawl tournamentspider
```

## Examples
Sample output for each spider can be found in the `examples` folder.
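
To get a quick feel for a spider's output without opening the file by hand, a small script along these lines may help. It is only a sketch: it assumes the file was written with `-O` and therefore holds a single JSON array of items, and it makes no assumptions about field names. `load_items` and `field_counts` are hypothetical helpers, not part of this repository.

``` python
import json
from collections import Counter
from pathlib import Path


def load_items(path):
    """Load the items a spider wrote to a JSON file (assumed to be a JSON array)."""
    with Path(path).open(encoding="utf-8") as fh:
        items = json.load(fh)
    if not isinstance(items, list):
        raise ValueError(
            f"expected a JSON array of items, got {type(items).__name__}"
        )
    return items


def field_counts(items):
    """Count how many items carry each field name.

    Non-dict entries are skipped, so the counts only describe object items.
    """
    counts = Counter()
    for item in items:
        if isinstance(item, dict):
            counts.update(item.keys())
    return counts
```

For example, `field_counts(load_items("path_to_file.json"))` returns a `Counter` mapping each field name to the number of items that contain it, which makes it easy to spot fields that are missing from some records.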
## Contributions
We welcome contributions to improve and expand this project! Whether you're fixing a bug, adding a new feature, or improving documentation, your help is appreciated.