https://github.com/wkwan/perch
Mining Game Analytics from Steam and Social Media
data-mining datamining game-analytics gameanalytics python social-media social-media-analytics steam steam-api tiktok tiktok-api twitch twitch-api youtube youtube-api youtube-api-v3
- Host: GitHub
- URL: https://github.com/wkwan/perch
- Owner: wkwan
- License: mit
- Created: 2025-01-04T01:38:15.000Z (22 days ago)
- Default Branch: main
- Last Pushed: 2025-01-04T01:54:44.000Z (22 days ago)
- Last Synced: 2025-01-04T02:38:43.866Z (22 days ago)
- Topics: data-mining, datamining, game-analytics, gameanalytics, python, social-media, social-media-analytics, steam, steam-api, tiktok, tiktok-api, twitch, twitch-api, youtube, youtube-api, youtube-api-v3
- Language: Python
- Homepage: https://perch.gg
- Size: 13.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Perch
> 📈 Mining Game Analytics from Steam and Social Media

Perch aggregates game data from Steam, Twitch, YouTube, and TikTok to help you do market research on video games.
This repo contains all the backend code to retrieve the data (by querying APIs and web scraping), organize it with algorithms and AI, and save it to a PostgreSQL database.
### If you just want to see the data, you can use the website: [Perch.gg](https://perch.gg)
## Why Perch?
Perch has 2 big advantages over other game market research tools:
#### 1. Not platform-specific
Other tools like SteamDB, VG Insights, TwitchTracker, etc. have more in-depth metrics for specific platforms, while Perch's goal is to show a simple holistic overview of game performance across every platform. Aggregating the data in a useful way is hard, especially because most social media APIs besides Twitch lack metadata for what game a piece of content is about.
#### 2. You can read the code to understand what the data actually means
Steam and social media recommendation algorithms are mysterious and constantly changing, and many important metrics offered by closed-source market research tools, such as game sales and social media revenue, can only be approximated very roughly. The point of market research is to help you understand the market, so it helps to understand how the numbers are calculated.
## Installation
Tested with Python 3.12.8. Don't use Python 3.13: _psycopg2-binary_ doesn't currently support it.
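If you want a script to fail early on an unsupported interpreter, a guard along these lines is one option. This is a minimal sketch and not part of the Perch code: the function name `python_version_supported` is hypothetical, and it only encodes the warning above (anything from 3.13 up is rejected; older versions are allowed but untested).

```python
import sys


def python_version_supported(version=None) -> bool:
    """Return True for versions below 3.13, following the warning above."""
    # Compare only (major, minor); fall back to the running interpreter.
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor < (3, 13)


tested = python_version_supported((3, 12, 8))   # the version the README was tested with
too_new = python_version_supported((3, 13, 0))  # the version the README warns about
```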
1. Clone this repo
```shell
git clone https://github.com/wkwan/perch.git
cd perch
```
2. Create a new Python virtual environment named _venv_
```shell
python -m venv venv
```
3. Activate the environment
#### Windows PowerShell
```shell
.\venv\Scripts\activate
```
#### macOS/Linux
```shell
source venv/bin/activate
```
4. Install the Python packages
You might get an error saying you need _Microsoft C++ Build Tools_ which you can get here: https://visualstudio.microsoft.com/visual-cpp-build-tools/
```shell
pip install -r requirements.txt
```
[requirements.txt](requirements.txt) is generated with pipreqs. If you add Python packages, you can regenerate it:
```shell
pip install pipreqs
pipreqs . --force --ignore venv
```
## Authentication Credentials and Configuration
Required credentials and other configuration variables are defined as environment variables in [.env](.env). Replace the placeholders in [.env](.env) or set the environment variables on your machine yourself ([.env](.env) won't override your local environment variables).
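The "won't override" behaviour can be illustrated with the standard library alone. This is a minimal sketch, not how Perch actually loads its configuration: the function name `load_env_file` and the `EXAMPLE_*` variable names are hypothetical, and only simple `KEY=value` lines are handled.

```python
import os
import tempfile


def load_env_file(path: str) -> None:
    """Load KEY=value pairs from a file without overriding existing variables."""
    with open(path, encoding="utf-8") as f:
        for raw_line in f:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without "=".
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault keeps any value that is already set on the machine.
            os.environ.setdefault(key.strip(), value.strip())


# Demo with a throwaway file; the variable names are hypothetical.
os.environ["EXAMPLE_DB_HOST"] = "from-machine"
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write("EXAMPLE_DB_HOST=from-file\nEXAMPLE_DB_PORT=5432\n")
load_env_file(tmp.name)
os.remove(tmp.name)
```

After this runs, `EXAMPLE_DB_HOST` keeps the value set on the machine, while `EXAMPLE_DB_PORT`, which was not set beforehand, takes its value from the file.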
You can set up a free local PostgreSQL database using the official documentation at https://www.postgresql.org/
The unofficial TikTok API from RapidAPI is a paid service: https://rapidapi.com/Lundehund/api/tiktok-api23
Download ChromeDriver here: https://developer.chrome.com/docs/chromedriver/downloads
## Mining Data
With your venv activated:
```shell
python scheduler.py
```
This fetches the data and saves it to your database. It fetches immediately and then waits until the start of the next hour (1pm, 2pm, 3pm, etc.).
For faster debugging with fewer requests:
```shell
python scheduler.py --fast
```
Steam and Twitch data will be fetched every hour. YouTube and TikTok data will be fetched every 25 hours because of rate limits.
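To picture the cadence described above, here is a rough sketch of the timing logic. It is an illustration under assumptions, not the code in scheduler.py: the names `FETCH_INTERVALS`, `seconds_until_next_hour`, and `sources_due` are hypothetical, and only the intervals come from this README.

```python
from datetime import datetime, timedelta

# Intervals taken from the README; the keys are labels for this sketch only.
FETCH_INTERVALS = {
    "steam": timedelta(hours=1),
    "twitch": timedelta(hours=1),
    "youtube": timedelta(hours=25),
    "tiktok": timedelta(hours=25),
}


def seconds_until_next_hour(now: datetime) -> float:
    """Return how long to sleep so the next run starts on the hour."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return (next_hour - now).total_seconds()


def sources_due(now: datetime, last_fetched: dict) -> list:
    """Return the sources whose interval has elapsed, or that never ran."""
    due = []
    for source, interval in FETCH_INTERVALS.items():
        last = last_fetched.get(source)
        if last is None or now - last >= interval:
            due.append(source)
    return due


now = datetime(2025, 1, 4, 13, 20, 0)
wait = seconds_until_next_hour(now)  # 40 minutes until 2pm
last = {
    "steam": now - timedelta(hours=1),
    "twitch": now - timedelta(hours=1),
    "youtube": now - timedelta(hours=2),   # fetched recently, not due
    "tiktok": now - timedelta(hours=26),   # past the 25-hour interval
}
due = sources_due(now, last)
```

In this example the scheduler would fetch Steam, Twitch, and TikTok, skip YouTube, and then sleep for 2400 seconds.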
[scheduler.py](scheduler.py) uses a lock file to prevent multiple processes running it simultaneously. This means you can schedule a cron job to make sure it restarts when it fails, and it won't lead to multiple processes mining data redundantly.
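One common way to get this kind of single-instance guarantee is an advisory lock on a file. The sketch below is an assumption about the general technique, not a copy of what scheduler.py does: `try_acquire_lock` and the lock file name are hypothetical, and `fcntl` is only available on POSIX systems such as Linux and macOS.

```python
import fcntl
import os
import tempfile


def try_acquire_lock(path: str):
    """Return an open lock file if we got the lock, or None if it is held."""
    lock_file = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lock_file.close()
        return None
    return lock_file  # keep this object open for the life of the process


# Demo in a fresh temporary directory; the file name is hypothetical.
lock_path = os.path.join(tempfile.mkdtemp(), "example_scheduler.lock")
first = try_acquire_lock(lock_path)
second = try_acquire_lock(lock_path)  # attempted while the first lock is held
got_first = first is not None
got_second = second is not None
first.close()  # closing the file releases the lock
third = try_acquire_lock(lock_path)
got_third = third is not None
third.close()
```

With this approach the operating system releases the lock when the process exits, including after a crash, so a cron-triggered restart is not blocked by a stale lock file.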
For example, to set up a cron job on a Linux server that tries to start [scheduler.py](scheduler.py) every minute, open your crontab file with:
```shell
crontab -e
```
And add this:
```
* * * * * //venv/bin/python //scheduler.py
```
## Contributing
So far I've written all the code for Perch myself. AMA about Perch when I'm working on it live at: https://twitch.tv/willkwan
Schedule: Sunday and Monday, 11am-9pm PT
### 3 ways to contribute:
1. Buy a subscription to [Perch.gg](https://perch.gg)
2. Pull requests
3. Twitch subscriptions (you can subscribe to 1 Twitch channel for free every month if you have Amazon Prime, and I get up to $2.25 for each sub depending on what country you're in)
## Master Plan
### Phase 1
Do whatever it takes to maximize active Perch.gg paid subscriptions and community code contributions.
### Phase 2
Expand to other big data problems in games.
## Licensing
This project is licensed under the MIT License. See [LICENSE](LICENSE).