https://github.com/mert-byrktr/scrape-epl-data
Scrape English Premier League data from fbref and save it as CSV for future work.
- Host: GitHub
- URL: https://github.com/mert-byrktr/scrape-epl-data
- Owner: mert-byrktr
- Created: 2024-01-23T14:59:03.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-02-10T11:36:48.000Z (over 1 year ago)
- Last Synced: 2025-05-07T03:01:51.120Z (about 1 year ago)
- Topics: beautifulsoup, beautifulsoup4, data-science, epl, football, football-data, pandas, python, python3, requests, scraping, webscraping
- Language: Python
- Homepage:
- Size: 44.9 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Premier League Data Scraping
This Python script scrapes English Premier League data from fbref using requests, BeautifulSoup, and pandas. It collects match and shooting statistics for each Premier League team over multiple seasons and stores the data in a CSV file.
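As a rough illustration of the parse-and-combine step described above, the sketch below reads two tables from an inline HTML snippet with BeautifulSoup and joins them with pandas. The HTML, the table ids (`matches`, `shooting`), the column names, and the helper `table_to_frame` are all made up for this example; the real pages and the actual `scrape.py` may be organized differently.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded team page; the real markup may differ.
SAMPLE_HTML = """
<table id="matches">
  <tr><th>Date</th><th>Opponent</th><th>GF</th></tr>
  <tr><td>2023-08-12</td><td>Team B</td><td>2</td></tr>
  <tr><td>2023-08-19</td><td>Team C</td><td>0</td></tr>
</table>
<table id="shooting">
  <tr><th>Date</th><th>Sh</th></tr>
  <tr><td>2023-08-12</td><td>14</td></tr>
  <tr><td>2023-08-19</td><td>9</td></tr>
</table>
"""


def table_to_frame(soup, table_id):
    """Convert the <table> with the given id into a DataFrame of strings (hypothetical helper)."""
    rows = soup.find("table", id=table_id).find_all("tr")
    header = [cell.get_text(strip=True) for cell in rows[0].find_all("th")]
    body = [[cell.get_text(strip=True) for cell in row.find_all("td")] for row in rows[1:]]
    return pd.DataFrame(body, columns=header)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
matches = table_to_frame(soup, "matches")
shooting = table_to_frame(soup, "shooting")

# Join the two tables on the match date, then serialize the result as CSV text.
combined = matches.merge(shooting, on="Date")
csv_text = combined.to_csv(index=False)
```

Joining on a shared date column is only one plausible way to line up match and shooting rows; it is an assumption of this sketch, not a statement about how the script does it.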
## Prerequisites
Before running the script, make sure you have the following Python libraries installed:
- requests
- BeautifulSoup (bs4)
- pandas
You can install them using pip:
`pip install requests beautifulsoup4 pandas`
## Usage
1. Clone or download this repository to your local machine.
2. Open a terminal or command prompt and navigate to the project directory.
3. Run the script by executing the following command:
`python scrape.py`
4. The script scrapes the data and saves it to a file named `matches.csv` in the same directory.
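Once the script has finished, the output can be inspected with pandas. The following is a minimal sketch; `load_matches` is a hypothetical helper that is not part of the repository, and nothing is assumed about the columns in the file.

```python
from pathlib import Path

import pandas as pd


def load_matches(path="matches.csv"):
    """Read the scraper's CSV output into a DataFrame (hypothetical helper)."""
    csv_path = Path(path)
    if not csv_path.exists():
        # Fail with a hint instead of pandas' generic error message.
        raise FileNotFoundError(f"{csv_path} not found; run scrape.py first")
    return pd.read_csv(csv_path)
```

Calling `load_matches().head()` from the project directory shows the first few rows, and `load_matches().shape` gives the number of rows and columns.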
## Configuration
You can adjust the scraping behavior by modifying the following constant in the script:
- `DATA_DELAY`: The delay (in seconds) between web requests, used to avoid overloading the website's server.
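To show how a constant like `DATA_DELAY` is typically applied, here is a minimal sketch of a loop that pauses between requests. The function `fetch_all`, its parameters, and the value `3` are illustrative assumptions; the script's own loop and default value may differ.

```python
import time
from typing import Callable, Iterable, List

DATA_DELAY = 3  # seconds; illustrative value, not necessarily the script's default


def fetch_all(urls: Iterable[str],
              fetch: Callable[[str], str],
              delay: float = DATA_DELAY,
              sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Fetch each URL in turn, pausing `delay` seconds between requests."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # pause before every request except the first
        pages.append(fetch(url))
    return pages
```

With requests installed, `fetch` could be something like `lambda url: requests.get(url, timeout=30).text`. Passing `fetch` and `sleep` as arguments keeps the loop easy to try out without touching the network.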