Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gabrielagodek/webscraper
The project was developed during master's studies. It is based on the Python library Scrapy.
- Host: GitHub
- URL: https://github.com/gabrielagodek/webscraper
- Owner: GabrielaGodek
- Created: 2022-12-12T21:04:28.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-11-23T16:46:29.000Z (about 1 year ago)
- Last Synced: 2023-11-23T17:41:31.173Z (about 1 year ago)
- Topics: data-analysis, python, scraper, scrapy
- Language: Python
- Homepage:
- Size: 271 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Web Scraper
This repository contains a web scraper and data analysis script for extracting and visualizing currency exchange rates from the website [https://notowania.pb.pl](https://notowania.pb.pl). The web scraper is implemented using Scrapy, a popular web crawling and scraping framework for Python.

## Prerequisites
- Python 3.x
- Scrapy
- json_lines
- matplotlib
- numpy

## Installation
1. Install the required Python packages using the following command:
```bash
pip install scrapy json_lines matplotlib numpy
```

## Usage
### 1. Web Scraper (SpiderSpider)
The web scraper (`SpiderSpider`) is designed to crawl the specified URLs on [https://notowania.pb.pl](https://notowania.pb.pl) and extract relevant information. To run the scraper, use the following command:
```bash
scrapy runspider src/spider.py -o ../data/currency.jl
```

This command runs the scraper and stores the extracted data in JSON Lines format (`currency.jl`).
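In a JSON Lines file, each line is one standalone JSON object. The sketch below shows one way such a file could be read using only the standard library. It is an illustration, not the repository's own code: the project lists the `json_lines` package for this step, and the field names `pair` and `rate` mentioned in the usage note are hypothetical, since the fields emitted by the spider are not documented here.

```python
import json
from pathlib import Path


def read_json_lines(path):
    """Return a list of dicts, one per non-empty line of a JSON Lines file."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records
```

Assuming the scraper output contains records such as `{"pair": "EUR/PLN", "rate": 4.35}` (hypothetical fields), `read_json_lines("currency.jl")` would return them as a list of dictionaries ready for filtering or plotting.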
### 2. Data Analysis Script
To run the data analysis script, use the following command:
```bash
python data.py
```
This command generates three line charts (`eur_pln.png`, `usd_pln.png`, and `eur_usd.png`) in the `public` directory.

## EUR/PLN
![euro to pln chart](public/eur_pln.png)
## EUR/USD
![euro to usd chart](public/eur_usd.png)
## USD/PLN
![usd to pln chart](public/usd_pln.png)