{"id":24087980,"url":"https://github.com/tinybirdco/games-analytics","last_synced_at":"2026-06-15T05:32:33.095Z","repository":{"id":113972300,"uuid":"375366961","full_name":"tinybirdco/games-analytics","owner":"tinybirdco","description":"A demo to show how to build real-time leaderboards for a video games platform, supporting billions of daily game plays and more ","archived":false,"fork":false,"pushed_at":"2021-06-16T11:39:30.000Z","size":336,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-27T05:25:14.491Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tinybirdco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-09T13:34:06.000Z","updated_at":"2021-06-16T11:39:33.000Z","dependencies_parsed_at":null,"dependency_job_id":"68734979-defb-4586-bcc5-90c9d2fd78f3","html_url":"https://github.com/tinybirdco/games-analytics","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tinybirdco/games-analytics","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tinybirdco%2Fgames-analytics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tinybirdco%2Fgames-analytics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tinybirdco%2Fgames-analytics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ti
nybirdco%2Fgames-analytics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tinybirdco","download_url":"https://codeload.github.com/tinybirdco/games-analytics/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tinybirdco%2Fgames-analytics/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34349925,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-10T03:56:46.981Z","updated_at":"2026-06-15T05:32:33.090Z","avatar_url":"https://github.com/tinybirdco.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# About\n\nThis is a POC to show how a video-game company could use Tinybird to do real-time analytics and keep up-to-date leaderboards for teams and players on their platform.\n\n## Installation and running the project\n\nCreate a virtual environment and install the necessary packages:\n\n```bash\npython3 -m venv .e # creates the environment in .e\nsource .e/bin/activate # activates the environment\npip install -r requirements.txt\n```\n\nTo replicate the project on your Tinybird account, run:\n```bash\ntb auth # will ask you for your auth token\ntb push # will create all the Data Sources, Materialized Views and 
endpoints in your account\ntb datasource append gameplays_string https://storage.googleapis.com/tinybird-assets/datasets/demos/games-analytics/gameplays.csv\n```\nAfter this, a token named 'games_demo_poc' will also be created in your account. It gives you access to two dynamic API endpoints: one for a players leaderboard and one for a teams leaderboard. Both will be accessible from a [URL like this](https://api.tinybird.co/endpoints?token=p.eyJ1IjogImU3NWNmMjUxLThlNjctNGRlOC1iM2FlLTdmMzhlZGIwODdmOSIsICJpZCI6ICJhYjY4OWQ0OS0zODFhLTQzNmYtOTZjZS0zNGFmMWI0MGE4OTQifQ.xNAZcDBP-M_fnOcyw7J3QkpOZEzB5IJAWTqyJqrx8pM).\n\nThis project also contains a pipe called `speedup_analysis`, where you can see how we make a query for one of the rankings 400X faster by using Materialized Views. Check it out to see how to transform the original query to read data from MVs.\n\n## Generating data by yourself\n\nIf you want to use the data we've already generated, which is available in [this bucket](https://console.cloud.google.com/storage/browser/tinybird-assets/datasets/demos/games-analytics), skip this section.\n\nYou can generate data with the `generate_data.py` script:\n\n```\n\u003e p generate_data.py --help\nUsage: generate_data.py [OPTIONS]\n\nOptions:\n  --date-start [%Y-%m-%d]\n  --date-end [%Y-%m-%d]\n  --num-games INTEGER\n  -i, --include-start-date-in-filename BOOLEAN\n  --format [json|csv]\n  --help                          Show this message and exit.\n```\n\nWe'll generate JSON data to show how to work with it on Tinybird. If your data is generated in CSV format already, you can ingest it directly into the `gameplays` Data Source and you don't need the `gameplays_string` Data Source or the `gameplays_mv` pipe.\n\n### Generate NDJSON data\n\nTo generate data for the month of May, run `sh generate_data_may.sh`. 
It'll create 100K gameplays for each day of May and save all data in an NDJSON file. Then, running `sh generate_data_june.sh` will create one independent NDJSON file with 100K gameplays for the first 10 days of June.\n\nAfter doing this I uploaded all the data to this bucket on Google Cloud: https://console.cloud.google.com/storage/browser/tinybird-assets/datasets/demos/games-analytics\n\n## The Tinybird project\n\n### Generating the project structure\nRunning `tb init` will create the `datasources`, `endpoints`, `explorations` and `pipes` folders.\n\n### Authenticating to your Tinybird account\nRunning `tb auth` will ask you for your admin token and will show you a link to https://ui.tinybird.co/tokens to copy it from there.\n\nThen it'll create a `.tinyb` file with that info in the current directory.\n\n### Analyzing a data file to generate a Data Source schema\nThis is done by running `tb datasource generate \u003cfilename/url\u003e`. In this case I passed the path of a local file.\n\n#### If your data is in JSON format\n\nIf your data is in NDJSON format, like `gameplays_sample.ndjson`, you need to convert it to a CSV file with a single String column to ingest it into Tinybird. Then, you'd create a materialized view on Tinybird to extract each value from that String column.\n\nThis would only be necessary if you generated the data manually. The data in Google Cloud is already in a compatible CSV format. However, if it wasn't, converting an NDJSON-formatted file to CSV can be done with a command like this:\n\n```bash\njq -r '[. 
| tostring] | @csv' data/gameplays_sample.ndjson \u003e data/gameplays_sample_string.csv\n```\n\nIf you have created all the JSON files and they are in your `data` folder, you can convert them to CSV with `sh json_to_csv.sh`.\n\nIt escapes every double quote and encloses each line in double quotes.\n\nThen you can generate the schema for the Data Source by running `tb datasource generate gameplays_string.csv`. This is the file it generates:\n\n```\nDESCRIPTION generated from gameplays_string.csv\n\nSCHEMA \u003e\n    `column_00` String\n```\n\nI changed the `column_00` name to `value` before pushing.\n\n### Ingesting data to Tinybird\n\n\nCreate the Data Source in your Tinybird account by running `tb push datasources/gameplays_string.datasource`.\n\nAppend the data for May to it by running `tb datasource append gameplays_string https://storage.googleapis.com/tinybird-assets/datasets/demos/games-analytics/gameplays.csv`\n\n\n### Extracting data from the gameplays_string Data Source into different columns and materializing the result\n\nThe `gameplays_string` Data Source has a String column where we store one JSON object per gameplay. 
A Pipe like this extracts each value from that JSON into a separate column:\n\n```sql\nNODE extract_values\nSQL \u003e\n\n    SELECT\n        CAST(JSONExtractString(value, 'nick'), 'LowCardinality(String)') nick,\n        CAST(JSONExtractString(value, 'team'), 'LowCardinality(String)') team,\n        CAST(JSONExtractString(value, 'game'), 'LowCardinality(String)') game,\n        parseDateTimeBestEffort(JSONExtractString(value, 'datetime')) datetime,\n        JSONExtract(value, 'score', 'UInt64') score\n    FROM gameplays_string\nTYPE materialized\nDATASOURCE gameplays\n```\n\nTo push it to Tinybird along with all the needed dependencies and populate the view, run `tb push pipes/gameplays_mv.pipe --populate --push-deps`\n\nThis is how the Data Flow graph in your account would look after this:\n![](images/data-flow-1.png)\n\n### Create materializations to have real-time rankings of top players and teams per date and game\n\nRunning these two commands creates two new MVs that aggregate total scores by date, game and team, and by date, game and player:\n\n```shell\n\u003e tb push pipes/gameplays_group_date_game_team_mv.pipe --push-deps --populate\n\u003e tb push pipes/gameplays_group_date_game_player_mv.pipe --push-deps --populate\n```\n\nThey're very similar. 
Let's look at `gameplays_group_date_game_team_mv`:\n```sql\nNODE calculate\nSQL \u003e\n\n    SELECT\n        toDate(datetime) date,\n        game,\n        team,\n        sum(score) score\n    FROM gameplays\n    GROUP BY date, game, team\n    ORDER BY date, game, team\nTYPE materialized\nDATASOURCE gameplays_by_date_game_team\n```\n\nAnd this is the schema definition of `gameplays_by_date_game_team`:\n\n```sql\nSCHEMA \u003e\n    `date` Date,\n    `game` LowCardinality(String),\n    `team` LowCardinality(String),\n    `score` UInt64\n\nENGINE \"SummingMergeTree\"\nENGINE_SORTING_KEY \"date,game,team\"\nENGINE_PARTITION_KEY \"toYYYYMM(date)\"\n```\n\nUsing a SummingMergeTree means you don't have to use `sumState` functions in the MV definitions or `sumMerge` functions to read the result of the aggregations. Queries should still aggregate with `sum(score)` and `GROUP BY`, because rows sharing a sorting key are only collapsed when parts merge in the background.\n\n\nAnd finally, to create endpoints to query these two views, you'd run:\n\n```\n\u003e tb push pipes/top_teams_per_day.pipe --force\n\u003e tb push pipes/top_players_per_day.pipe --force\n```\n\nThis is what the `top_teams_per_day` definition looks like. `top_players_per_day` is very similar. As you see, we've added some dynamic parameters that will let you filter the results by date, game and team. 
By default, it returns every team and game with the sum of their scores for the most recent date in the Data Source.\n\n```sql\nNODE results\nSQL \u003e\n\n    %\n    SELECT date, game, team, sum(score) score FROM gameplays_by_date_game_team\n    WHERE 1=1\n    {% if not defined(date) %}\n        AND date = (SELECT max(date) FROM gameplays_by_date_game_team)\n    {% else %}\n        AND date = {{Date(date, '', description=\"Get only the ranking for this date\")}}\n    {% end %}\n    {% if defined(game) %}\n        AND game = {{String(game, '', description=\"Get only the ranking for this game\")}}\n    {% end %}\n    GROUP BY date, game, team\n    ORDER BY date desc, score desc\n\n```\n\nThis is how your final Data Flow graph would look:\n![](images/final-data-flow-graph.png)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftinybirdco%2Fgames-analytics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftinybirdco%2Fgames-analytics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftinybirdco%2Fgames-analytics/lists"}