https://github.com/liamarguedas/steam-scraping-selenium
Selenium-based scraping algorithm for top Steam store seller games.
- Host: GitHub
- URL: https://github.com/liamarguedas/steam-scraping-selenium
- Owner: liamarguedas
- License: mit
- Created: 2023-04-07T22:57:40.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-04-17T11:29:03.000Z (about 3 years ago)
- Last Synced: 2025-12-28T20:54:28.343Z (6 months ago)
- Topics: python, scraper, selenium
- Language: Python
- Homepage: https://github.com/liamarguedas/steam-games-price
- Size: 6.7 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🎮 Steam Scraping Selenium
Welcome to my open source Steam Scraping Python project!
As a PC gamer and data geek, I have been working on this labor of love for some time, and I am excited to share it with the world. Because it is open source, anyone can access and contribute to the code, making it a collaborative effort with the potential to reach a wide audience.
Using Python and Selenium, I have created a scraper that collects information about specific games and writes it to a CSV file for data analysis. This project is designed to be user-friendly, efficient, and scalable, allowing for easy implementation and customization for a variety of use cases.
Whether you're a seasoned analyst or just starting out, I believe this project will be a valuable tool in your toolkit. I encourage you to explore the code, contribute your own ideas and improvements, and help make this project the best it can be.
Thank you for your interest and support!
## 📖 Prerequisites
To run the scraper you will need `Python >= 3.9.0`, `Selenium >= 4.8.3`, `pandas >= 1.5.0` and `numpy >= 1.10.0` installed in your environment. You can install the dependencies listed in `requirements.txt` with:
```shell
pip install -r requirements.txt
```
You will also need ChromeDriver to scrape with Selenium. You can download the current browser driver [here](https://chromedriver.chromium.org/downloads) and set it as your webdriver path.
An easier alternative is to install `webdriver-manager >= 3.8.5`, which is the method used in the scraper:
```shell
pip install webdriver-manager
```
Source code: [github](https://github.com/SergeyPirogov/webdriver_manager)
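As a rough illustration of the `webdriver-manager` route, the sketch below starts a Chrome session without setting a driver path by hand. The function name `make_chrome_driver` is hypothetical and is not taken from the scraper's source. The imports sit inside the function so the snippet can be loaded before the packages are installed.

```python
def make_chrome_driver():
    """Start a Chrome session using a driver fetched by webdriver-manager.

    Hypothetical helper for illustration; the scraper's own setup may differ.
    """
    # Imported lazily so this file loads even if the packages are missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    # install() downloads a matching ChromeDriver if needed and returns
    # the path to the executable, which Service then launches.
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service)


# Example usage (opens a real browser window, so it is left commented out):
# driver = make_chrome_driver()
# try:
#     driver.get("https://store.steampowered.com/")
#     print(driver.title)
# finally:
#     driver.quit()
```

Calling `driver.quit()` in a `finally` block closes the browser even if the page load fails.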
## 📃 Instructions
To import the scraper into your project, use the following (the source code needs to be in the same folder):
```python
from scraper import GetSteamGames
```
`GetSteamGames()` accepts 4 parameters:
```python
GetSteamGames(ToScrape=10, ToWait=0.5, verbose=True, Scroll=5)
```
**ToScrape :** ***int, default = 10***
Number of games to scrape.
**ToWait :** ***float, default = 0.5***
Time to wait for each scrape to finish. Must be greater than or equal to 0.5.
**verbose :** ***bool, default = True***
Prints progress and information about the scraping.
**Scroll :** ***int, default = 5***
Number of times to scroll the Steam games page. Each scroll loads about 20 games, and `ToScrape` then selects how many of them to scrape. For example:
If you want a dataset with about 1000 samples, set `Scroll = 50` and `ToScrape = 1000`.
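The relationship between `Scroll` and `ToScrape` can be worked out with a small helper. This is only a sketch: `scrape_settings` and the two constants are hypothetical names, and the figure of 20 games per scroll is the approximate value quoted above, so the real number of games loaded may differ.

```python
import math

GAMES_PER_SCROLL = 20  # approximate figure from this README, not an exact value
MIN_WAIT = 0.5         # documented lower bound for ToWait


def scrape_settings(target_games, wait=0.5):
    """Return keyword arguments for GetSteamGames sized for target_games."""
    if target_games < 1:
        raise ValueError("target_games must be at least 1")
    if wait < MIN_WAIT:
        raise ValueError("ToWait must be greater than or equal to 0.5")
    # Round up so the scrolled page holds at least target_games entries.
    scrolls = math.ceil(target_games / GAMES_PER_SCROLL)
    return {"ToScrape": target_games, "ToWait": wait,
            "verbose": True, "Scroll": scrolls}


settings = scrape_settings(1000)
print(settings["Scroll"])  # 50
```

The resulting dictionary can then be unpacked into the constructor, for example `GetSteamGames(**settings)`.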
## 🗳️ Output
The scraper outputs a `games.csv` file in the same folder containing the following data:
| Column | Description |
| ------------- |:-------------:|
| **GameName** | The name of the game |
| **AgeRestriction** | Whether or not the game has an age restriction |
| **GameDescription** | Steam description of the game |
| **Reviews** | Total reviews of the game |
| **ReleaseDate** | Indicates the release day/month/year of the game |
| **Developer** | Developer of the game |
| **FullPrice** | Indicates the full price of the game without any discounts |
| **DiscountedPrice** | Indicates the discounted price if there was a sale at the time of the scrape; otherwise the value is 0. |
| **PEGI** | URL of an image with the PEGI rank; the image URL ends with */DEJUS/{rank}.png* |
| **MetacriticScore** | Metacritic Score of the game if available |
| **Type** | Indicates the tagged category of the game |
| **LastUpdate** | Last time it was updated |
| **GamesLanguages** | Number of languages available |
| **GameFeatures** | A list containing Steam Features present in the game |
| **DRM Notice** | Whether or not the game carries a DRM notice the user needs to accept |
| **GameAchievements** | Number of Achievements the game has |
| **CuratorReviews** | Number of Curator Reviews the game has |
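Since the **PEGI** column stores an image URL rather than the rank itself, a small post-processing step can recover the rank. The sketch below uses only the standard library. The helper names `pegi_rank` and `load_games`, the added `PEGIRank` key, and the UTF-8 encoding are assumptions for illustration, not part of the scraper.

```python
import csv
import re

# Matches the documented URL ending: /DEJUS/{rank}.png
PEGI_RANK = re.compile(r"/DEJUS/([^/]+)\.png$")


def pegi_rank(url):
    """Return the rank from a PEGI image URL, or None if it does not match."""
    match = PEGI_RANK.search(url or "")
    return match.group(1) if match else None


def load_games(path="games.csv"):
    """Read the scraper output and add a PEGIRank key to every row."""
    # Assumes the file is UTF-8 encoded; adjust if your output differs.
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    for row in rows:
        row["PEGIRank"] = pegi_rank(row.get("PEGI"))
    return rows


# Hypothetical URL, used only to show the expected shape of the input.
print(pegi_rank("https://example.com/images/DEJUS/16.png"))  # 16
```

Rows without a matching URL get `None`, so missing ratings stay distinguishable from real ranks.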
## ⚖️ License
MIT © [Steam-Scraping-Selenium](LICENSE)