{"id":21069356,"url":"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow","last_synced_at":"2026-05-12T16:05:15.218Z","repository":{"id":220193104,"uuid":"750780499","full_name":"janaom/gcp-de-project-connect-four-with-python-dataflow","owner":"janaom","description":"Connect Four Data Engineering Project: leveraging GCS for scalable and durable storage, Dataflow for data extraction and transformation, BigQuery as the data repository, Slack Integration for real-time sharing, Looker for insightful reports and visualizations, and Email Scheduler for automated report delivery.","archived":false,"fork":false,"pushed_at":"2024-01-31T19:38:25.000Z","size":185,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-20T21:42:44.993Z","etag":null,"topics":["apache-beam","data-engineering","dataflow","etl","gcp","python","slack-integration"],"latest_commit_sha":null,"homepage":"https://medium.com/google-cloud/%EF%B8%8Fgcp-data-engineering-project-connect-four-game-with-python-and-apache-beam-%EF%B8%8F-21e336c2aa62","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/janaom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-31T09:59:50.000Z","updated_at":"2024-02-02T13:38:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"16bef229-3387-4d90-b250-f1049e721430","html_url":"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow","commit_stats":null,"previous_names":["janaom/gcp-de-project-connect-four-with
-python-dataflow"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janaom%2Fgcp-de-project-connect-four-with-python-dataflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janaom%2Fgcp-de-project-connect-four-with-python-dataflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janaom%2Fgcp-de-project-connect-four-with-python-dataflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/janaom%2Fgcp-de-project-connect-four-with-python-dataflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/janaom","download_url":"https://codeload.github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243510335,"owners_count":20302342,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-beam","data-engineering","dataflow","etl","gcp","python","slack-integration"],"created_at":"2024-11-19T18:34:39.696Z","updated_at":"2025-12-30T16:43:49.454Z","avatar_url":"https://github.com/janaom.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# \u003cimg width=\"40\" alt=\"image\" src=\"https://github.com/janaom/gcp-data-engineering-etl-with-composer-dataflow/assets/83917694/60f8f158-3bdc-4b3d-94ae-27a12441e2a3\"\u003e  GCP Data Engineering project: Connect Four game with Python and Apache Beam 🔴⚫\n\n\nGet ready for an exciting 
adventure in the world of Connect Four!🥳 I have an awesome task for you: we've got 11 players, 88 games, and a text file filled with all their moves. Your mission is to identify the winners and store the results in a database to enable future analytical queries. Sounds like an easy task, right?\n\nBut here's the twist: the winners are unknown!🤨 Participants continue playing until the game grid is completely filled with chips. So your task is to parse the match data (filename: matchdata.txt) and figure out who won each match. Once you have all the data available, the final step is to create a table with the following structure:\n\n```\n+-------------+------------+--------------+-----+------+-----------------+\n| player_rank | player_id  | games_played | won | lost | win_percentage  |\n+-------------+------------+--------------+-----+------+-----------------+\n```\n\nI will guide you through the process of turning the Connect Four algorithm into standalone Python code and then converting it into Apache Beam code to create an ETL Dataflow pipeline. The main focus here is to showcase the flexibility of Apache Beam and demonstrate how you can easily adapt your Python code with any complex logic to create a robust ETL pipeline. ETL, in this context, goes beyond data cleaning and removing invalid characters.\n\nI will provide a step-by-step guide on how to query your data in BigQuery to create the main table. 
Additionally, we will explore creating a Looker report to visualize the Connect Four game analytics, setting up scheduled email deliveries to each player based on their player_id, and even building a Slack App to automatically share the saved CSV file with the game results in a designated Slack channel.\n\nI use GCP services for this project.\n\n![connect-four2 drawio](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/cc5626fa-8d11-449a-874f-2c64910638a9)\n\n\n\u003cimg width=\"20\" alt=\"image\" src=\"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/0887957d-db1b-4938-a9fa-f497fcebbeff\"\u003e Google Cloud Storage: it is used to store the text file, providing scalable and durable object storage.\n\n\n\u003cimg width=\"20\" alt=\"image\" src=\"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/5df502de-1936-42fb-a3d5-2fa7cf0c5723\"\u003e Dataflow: it is utilized to extract data from the storage bucket, perform data transformations such as counting winners and losers, and load the processed data into BigQuery.\n\n\n\u003cimg width=\"20\" alt=\"image\" src=\"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/7c95b56f-a1fc-49b3-8e96-ed607f2094ea\"\u003e BigQuery: it serves as the data repository for the Connect Four project, enabling efficient querying and analysis of the dataset. \n\n\n\u003cimg width=\"20\" alt=\"image\" src=\"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/1d47b6a1-a957-4156-89f5-1cbbe5d72451\"\u003e Slack Integration: it is used to send the report results via Slack. 
This integration facilitates real-time sharing and collaboration with team members, ensuring efficient communication and discussion of the Connect Four game results.\n\n\n\u003cimg width=\"20\" alt=\"image\" src=\"https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/512b60f2-f910-46f8-b785-563c92ca70c4\"\u003e Looker: it is used to generate insightful reports and visualizations based on the data in BigQuery, facilitating data exploration and analysis.\n\n\n✉️Email Scheduler: it is used to schedule the delivery of Looker reports via email. This feature ensures regular and automated delivery of the Connect Four game reports to the intended recipients, keeping them updated on the latest analysis and insights.\n\n# 📢 Connect Four rules\n\nBefore we get to the task, let's review the rules of Connect Four. In this game, two players take turns strategically placing red and black chips on a 6x7 grid. Each player is assigned a color, and red always goes first. The goal is to be the first player to connect four of their chips in a horizontal, vertical, or diagonal line on the grid.\n\nThe data is in this format:\n\n```\nplayer_0, player_1\nR1,B1,R2,B2,R3,B3,R4,B6,...\n\nplayer_2, player_3\nR1,B2,R3,B1,R4,...\n```\n\nIn the Connect Four matches, the first player listed always plays as red, while the second player always plays as black. The moves are represented using a combination of the color and the column number: `\u003ccolor\u003e\u003ccolumn\u003e`. In the first match above, player_0 makes the move \"R1\" which denotes that they place their chip in the first column. Since there are no chips in that column, it falls to the bottom. Player_1 (black) responds by placing their chip in the first column as well. 
However, since there's already a red chip in that column, the black chip ends up on top of the red chip.\n\nEach game contains two rows:\n\n```\nRow 1: player names \nRow 2: moves played in the game\n```\n\nKeep in mind that the game could be over before the final move recorded in the file is made. Once you have identified the winning move, there is no need to continue reading the data. It's important to consider that there are no draw cases (if the board becomes completely filled without either player achieving a four-in-a-row connection, the game is considered a draw).\n\nHere is an example from the second game in the text file with R1,B2,R5 moves. Player_1 secured the win with the winning move on R5. Here, R (red) moves, B (yellow) moves.\n\n\nplayer_1,player_6 R3,B3,R4,B6,R1,B5,R6,B3,R4,B4,R6,B1,R5,B7,R6,B2,R2,B7,`R1,B2,R5`,B1,R4,B3,R2,B2,R1,B4,R3,B5,R5,B5,R3,B7,R2,B1,R7,B6,R7,B6,R7,B4\n\n![20240126_185845](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/ac44a70e-987f-4079-8356-9d9168da8de6)\n\n# 🕹️ Connect Four algorithm\n\nFirst, let's discuss the Connect Four algorithm.\n\n```python\n#Find the next empty row in the specified column of the grid\ndef find_next_empty_row(grid, col):\n    for row in range(5, -1, -1):\n        if grid[row][col] == ' ':\n            return row\n    return None\n\n#Check all cells in the grid for a win condition\ndef check_winner(grid):\n    for row in range(6):\n        for col in range(7):\n            player = grid[row][col]\n            if player == ' ':\n                continue\n\n            #Check horizontal\n            if col \u003c= 3 and all(grid[row][c] == player for c in range(col, col + 4)):\n                return True\n\n            #Check vertical\n            if row \u003c= 2 and all(grid[r][col] == player for r in range(row, row + 4)):\n                return True\n\n            #Check diagonal (top-left to bottom-right)\n            if row \u003c= 2 and col \u003c= 3 
and all(grid[row + d][col + d] == player for d in range(4)):\n                return True\n\n            #Check diagonal (bottom-left to top-right)\n            if row \u003e= 3 and col \u003c= 3 and all(grid[row - d][col + d] == player for d in range(4)):\n                return True\n\n    return False\n\ndef determine_winner(moves_list):\n    game_results = []\n\n    for moves in moves_list:\n        grid = [[' ' for _ in range(7)] for _ in range(6)]\n        player_ids = moves[0].split(\",\")  #Extract player IDs\n\n        for i, move in enumerate(moves[1].split(\",\"), start=1):\n            if len(move) \u003c 2:\n                continue  #Skip invalid moves\n\n            try:\n                col = int(move[1]) - 1\n                row = find_next_empty_row(grid, col)\n                if row is None:\n                    continue  #Skip moves into a full column\n\n                #Odd moves belong to the first player (red), even moves to the second (black)\n                if i % 2 == 0:\n                    player_id = player_ids[1]\n                    token = 'B'\n                else:\n                    player_id = player_ids[0]\n                    token = 'R'\n\n                grid[row][col] = token\n\n                if check_winner(grid):\n                    game_results.append((player_id, player_ids[0] if i % 2 == 0 else player_ids[1]))\n                    break\n            except ValueError:\n                continue  #Skip invalid moves\n\n    return game_results\n```\n- The `find_next_empty_row` function takes a grid and a column as input and searches for the next empty row in that column. It iterates through the rows from the bottom (index 5) up to the top (index 0) and returns the index of the first empty row found. If the column is full, it returns `None`.\n  \n- The `check_winner` function scans every cell of the grid for a win condition. It checks for four consecutive cells with the same player value in the horizontal, vertical, and diagonal directions. 
If any win condition is encountered, the function returns `True`; otherwise, it returns `False`.\n  \n- The `determine_winner` function takes a list of games as input, where each game is a pair of lines (player names, then moves), and determines the winner of each game. It initializes an empty grid and extracts the player IDs from the first element of each game. It then iterates through the moves, updating the grid with the corresponding player's token. After each move, it checks if a winning condition is met using the `check_winner` function. If a winner is found, it appends a tuple of the winning player's ID and the opponent's ID to the `game_results` list. Finally, it returns the `game_results` list, which contains the winner and loser of each game in the input list.\n\n# 🐍 Python code\n\nUpload the matchdata.txt file to the bucket.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/38bc6d6a-c33a-445f-8f95-b9164eaf196e)\n\nReview the Python code `python-to-bq.py` and run it with the command `python python-to-bq.py`\n\nIn addition to the Connect Four algorithm, this code has these elements:\n\n✔️ the `determine_winner_from_file(file_name)` function takes a `file_name` as input and returns a list of game results\n\n✔️ the code reads the txt file from the bucket\n\n✔️ it iterates over the `game_results` list and prints the game number, winner ID, and loser ID in a tabular format\n  \n✔️ the code creates a dataset and table in BigQuery if they do not exist\n  \n✔️ the game results are then loaded into the table\n\n![20240126_210128](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/b8a13b2d-3ffb-4cba-896c-8945d0f74129)\n\nThe game results are loaded into the BigQuery table.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/6f1571f8-0cf1-4707-ab19-e72b0c6da60d)\n\n\nTo order the data by the `game_number` column, you can use the `ORDER BY` clause.\n\n```SQL\nSELECT game_number, winner_id, 
loser_id\nFROM your_table_name\nORDER BY game_number;\n```\n\nYou'll be able to view the results of all 88 games.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/e99e79c9-c1aa-447d-85b8-df2df84ca78f)\n\n\n\n# \u003cimg width=\"40\" alt=\"image\" src=\"https://beam.apache.org/images/mascot/beam_mascot_500x500.png\"\u003e Beam code\n\nLet's highlight the primary distinction between Beam and Python code.\n\n```python\np = beam.Pipeline()\n\n(\n    p\n    | \"Create game results\" \u003e\u003e beam.Create(game_results)\n    | \"Write to BigQuery\" \u003e\u003e WriteToBigQuery(\n        table=\"project_id:dataset.table\",\n        schema=\"game_number:INTEGER, winner_id:STRING, loser_id:STRING\",\n        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,\n        custom_gcs_temp_location=\"gs://your-bucket/temp\"\n    )\n)\n```\n\n✔️ a Beam pipeline (`p`) is created.\n\n✔️ the `game_results` list is passed to the pipeline using the `beam.Create` transform, which creates a PCollection containing the game results.\n\n✔️ the `WriteToBigQuery` transform is  applied to the PCollection of game results. It specifies the BigQuery  table to write to, the schema of the table, the write disposition  (appending to the existing table), and the temporary location in Google  Cloud Storage.\n\nIn this case the code assumes that the dataset already exists in BigQuery. \n\nTo execute the code, simply run the command `python beam-to-bq.py`. Executing this command will yield identical results to running the Python code. To verify all game results in a sequential order, run the corresponding SQL query.\n\n```SQL\nSELECT game_number, winner_id, loser_id\nFROM your_table_name\nORDER BY game_number;\n```\n\n# 🌊 Dataflow job\n\nThe last step is to adjust Beam code to the Dataflow job. In this code, the assumption is that the dataset already exists in BigQuery. 
The code has specific imports and details.\n\n```python\nimport apache_beam as beam\nfrom apache_beam.io.gcp.bigquery import WriteToBigQuery\nfrom google.cloud import storage\nimport argparse\n\n#Function to find the next empty row in a column of a grid\ndef find_next_empty_row(grid, col):\n    for row in range(5, -1, -1):\n        if grid[row][col] == ' ':\n            return row\n    return None\n\n#Function to check if there is a winner in the grid\ndef check_winner(grid):\n    for row in range(6):\n        for col in range(7):\n            player = grid[row][col]\n            if player == ' ':\n                continue\n\n            if col \u003c= 3 and all(grid[row][c] == player for c in range(col, col + 4)):\n                return True\n\n            if row \u003c= 2 and all(grid[r][col] == player for r in range(row, row + 4)):\n                return True\n\n            if row \u003c= 2 and col \u003c= 3 and all(grid[row + d][col + d] == player for d in range(4)):\n                return True\n\n            if row \u003e= 3 and col \u003c= 3 and all(grid[row - d][col + d] == player for d in range(4)):\n                return True\n\n    return False\n\n#Function to determine the winner(s) of each game based on the moves\ndef determine_winner(moves_list):\n    game_results = []\n\n    for moves in moves_list:\n        grid = [[' ' for _ in range(7)] for _ in range(6)]\n        player_ids = moves[0].split(\",\")\n\n        for i, move in enumerate(moves[1].split(\",\"), start=1):\n            if len(move) \u003c 2:\n                continue\n\n            try:\n                col = int(move[1]) - 1\n                row = find_next_empty_row(grid, col)\n\n                if i % 2 == 0:\n                    player_id = player_ids[1]\n                    token = 'R'\n                else:\n                    player_id = player_ids[0]\n                    token = 'B'\n\n                grid[row][col] = token\n\n                if check_winner(grid):\n          
          if i % 2 == 0:\n                        winner = player_ids[1]\n                        loser = player_ids[0]\n                    else:\n                        winner = player_ids[0]\n                        loser = player_ids[1]\n\n                    game_results.append((winner, loser))\n                    break\n            except ValueError:\n                continue\n\n    return game_results\n\n#Function to determine the winner(s) from a file\ndef determine_winner_from_file(file_name):\n    game_results = []\n\n    try:\n        #Create a GCS client\n        storage_client = storage.Client()\n\n        #Access the GCS bucket and file\n        bucket_name = \"connect-four-us\"\n        bucket = storage_client.get_bucket(bucket_name)\n        blob = bucket.blob(file_name)\n\n        #Download and read the file contents\n        data = blob.download_as_text()\n\n        matches = data.strip().split(\"\\n\\n\")\n\n        for game_number, match in enumerate(matches, start=1):\n            match_lines = match.strip().split(\"\\n\")\n            if len(match_lines) \u003e= 2:\n                moves = match_lines[1:]\n                winners = determine_winner([match_lines])\n\n                if winners:\n                    for winner, loser in winners:\n                        game_results.append({\"game_number\": game_number, \"winner_id\": winner, \"loser_id\": loser})\n\n    except Exception as e:\n        print(f\"Error accessing GCS file: {e}\")\n\n    return game_results\n\n#Specify the file name\nfile_name = \"matchdata.txt\"\n\n#Get the game results from the file\ngame_results = determine_winner_from_file(file_name)\n\n#Function to run the Beam pipeline\ndef run_pipeline(project, region, bucket, file_name, table):\n    game_results = determine_winner_from_file(file_name)\n\n    options = {\n        \"project\": project,\n        \"region\": region,\n        \"staging_location\": f\"gs://{bucket}/staging\",\n        \"temp_location\": 
f\"gs://{bucket}/temp\",\n        \"job_name\": \"connect-four-job\",\n        \"runner\": \"DataflowRunner\",\n        \"save_main_session\": True,\n    }\n\n    pipeline_options = beam.pipeline.PipelineOptions(flags=[], **options)\n\n    p = beam.Pipeline(options=pipeline_options)\n\n    (\n        p\n        | \"Create game results\" \u003e\u003e beam.Create(game_results)\n        | \"Write to BigQuery\" \u003e\u003e WriteToBigQuery(\n            table=table,\n            schema=\"game_number:INTEGER, winner_id:STRING, loser_id:STRING\",\n            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND\n        )\n    )\n\n    p.run().wait_until_finish()\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--project\", help=\"Google Cloud project ID\")\n    parser.add_argument(\"--region\", help=\"Dataflow region\")\n    parser.add_argument(\"--bucket\", help=\"Google Cloud Storage bucket\")\n    parser.add_argument(\"--file\", help=\"Input file name\")\n    parser.add_argument(\"--table\", help=\"BigQuery table\")\n\n    args = parser.parse_args()\n\n    run_pipeline(args.project, args.region, args.bucket, args.file, args.table)\n```\n\nCreate the Dataflow job with this command:\n\n```bash\npython dataflow.py \\\n  --project project_id \\\n  --region region \\\n  --bucket your-bucket \\\n  --file matchdata.txt \\\n  --table project_id:dataset.table\n```\nHere is an example: `python dataflow-job.py --project connect-four-408317 --region us-central1 --bucket connect-four-us --file matchdata.txt --table connect-four-408317:connect_four.dataflow_results`\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/c01a49d3-7cce-4f4e-b211-b719b5d3b361)\n\n\n❗️ My first job failed with the error: 'The zone 'projects/connect-four-408317/zones/us-central1-b' does not have enough resources available to fulfill the request. Try a different zone, or try again later.' 
So I changed us-central1 to us-east1.\n\nYou will see a simple pipeline, and then a table in BigQuery.\n\n![20240128_182806](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/5a955dbc-744a-4f5f-ac06-06cc66bb02cc)\n\nThe `help` parameter in `argparse.ArgumentParser().add_argument()` is used to provide a brief description or help message for the command-line argument. It serves as a helpful guide for users who may not be familiar with the script or its command-line interface.\n\nWhen users run a script with the `--help` flag, argparse will automatically generate a help message that includes the descriptions specified by the `help` parameter for each argument. This helps users understand the purpose and expected values of each argument.\n\nTo see an example, run: `python dataflow.py --help`\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/b497f157-10c5-48ee-a193-1f8967a71dbf)\n\n\n# 📃 A step-by-step guide to data querying\n\nNow that we have all the results stored in BigQuery, let's return to our primary objective and proceed with creating the main table.\n\n```\n+-------------+------------+--------------+-----+------+-----------------+\n| player_rank | player_id  | games_played | won | lost | win_percentage  |\n+-------------+------------+--------------+-----+------+-----------------+\n```\n\nOur current schema has the fields `game_number`, `winner_id`, and `loser_id`.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/f13c8b20-e1ed-4059-a359-781a96219d2c)\n\n\nLet's combine the `winner_id` and `loser_id` columns into a single column called `player_id`. 
Then, the main query selects the `player_id` column from the subquery and uses the `COUNT(*)` function to calculate the number of `games_played` by each player.\n\n```SQL\nSELECT player_id, COUNT(*) AS games_played\nFROM (\n  SELECT winner_id AS player_id\n  FROM your_table_name\n  UNION ALL\n  SELECT loser_id AS player_id\n  FROM your_table_name\n) AS subquery\nGROUP BY player_id\n```\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/cc1d06e0-794c-4604-97db-39734f08073b)\n\n\nThen, let's calculate the number of games `won` and games `lost` for each player.\n\n```SQL\nWITH all_players AS (\n  SELECT winner_id AS player_id, 'won' AS Result\n  FROM your_table_name\n  UNION ALL\n  SELECT loser_id AS player_id, 'lost' AS Result\n  FROM your_table_name\n)\nSELECT player_id, \n       COUNT(*) AS games_played,\n       COUNT(CASE WHEN Result = 'won' THEN 1 END) AS won,\n       COUNT(CASE WHEN Result = 'lost' THEN 1 END) AS lost\nFROM all_players\nGROUP BY player_id;\n```\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/697e0ed8-ae1f-441a-8165-2fbb42e8a36e)\n\nWe will calculate the win percentage by dividing the count of games won by the total count of games, multiplying it by 100, rounding it to two decimal places, and assigning it to the column alias `win_percentage`.\n\n```SQL\nWITH all_players AS (\n  SELECT winner_id AS player_id, 'won' AS Result\n  FROM your_table_name\n  UNION ALL\n  SELECT loser_id AS player_id, 'lost' AS Result\n  FROM your_table_name\n)\nSELECT player_id, \n       COUNT(*) AS games_played,\n       COUNT(CASE WHEN Result = 'won' THEN 1 END) AS won,\n       COUNT(CASE WHEN Result = 'lost' THEN 1 END) AS lost,\n       ROUND((COUNT(CASE WHEN Result = 'won' THEN 1 END) / COUNT(*)) * 100, 2) AS win_percentage\nFROM all_players\nGROUP BY 
player_id;\n```\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/1a050f03-82d0-4212-98f4-f42b2f8e9261)\n\nHere, the `player_rank` column is added, showing the rank of each player based on win percentage, along with the other player statistics.\n\n```SQL\nWITH all_players AS (\n  SELECT winner_id AS player_id, 'won' AS Result\n  FROM your_table_name\n  UNION ALL\n  SELECT loser_id AS player_id, 'lost' AS Result\n  FROM your_table_name\n), player_stats AS (\n  SELECT player_id, \n         COUNT(*) AS games_played,\n         COUNT(CASE WHEN Result = 'won' THEN 1 END) AS won,\n         COUNT(CASE WHEN Result = 'lost' THEN 1 END) AS lost,\n         ROUND((COUNT(CASE WHEN Result = 'won' THEN 1 END) / COUNT(*)) * 100, 2) AS win_percentage\n  FROM all_players\n  GROUP BY player_id\n)\nSELECT RANK() OVER (ORDER BY win_percentage DESC) AS player_rank,\n       player_id,\n       games_played,\n       won,\n       lost,\n       win_percentage\nFROM player_stats\nORDER BY player_rank;\n```\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/af20de7f-fd41-4835-bff1-9dba818a858a)\n\n\nFor comparison, the `ROW_NUMBER()` function gives different results when players tie on win percentage: it assigns every row a unique number, while `RANK()` gives tied rows the same rank.\n```SQL\n\u003c...\u003e\nSELECT ROW_NUMBER() OVER (ORDER BY win_percentage DESC) AS player_rank,\n       player_id,\n       games_played,\n       won,\n       lost,\n       win_percentage\nFROM player_stats;\n```\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/f02786dd-3a95-4dff-951f-34da86f2466c)\n\nNow, let's create a new table `connect_four_performance_summary`.\n\n```SQL\nCREATE TABLE connect_four.connect_four_performance_summary AS (\nWITH all_players AS (\n  SELECT winner_id AS player_id, 'won' AS Result\n  FROM your_table_name\n  UNION ALL\n  SELECT loser_id AS player_id, 'lost' AS Result\n  FROM your_table_name\n), player_stats AS (\n  SELECT player_id,\n         
COUNT(*) AS games_played,\n         COUNT(CASE WHEN Result = 'won' THEN 1 END) AS won,\n         COUNT(CASE WHEN Result = 'lost' THEN 1 END) AS lost,\n         ROUND((COUNT(CASE WHEN Result = 'won' THEN 1 END) / COUNT(*)) * 100, 2) AS win_percentage\n  FROM all_players\n  GROUP BY player_id\n)\nSELECT RANK() OVER (ORDER BY win_percentage DESC) AS player_rank,\n       player_id,\n       games_played,\n       won,\n       lost,\n       win_percentage\nFROM player_stats\nORDER BY player_rank\n);\n```\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/c1ca3bcd-fc4c-4bee-83aa-e45b9425d866)\n\n\n\nGreat job! With the `connect_four_performance_summary`, we now have all the information about the players' performance in a Connect Four game.\n\n\n# 📊 Looker\n\nIn BigQuery export your data to the Looker Studio.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/34b54fda-088e-4329-bf52-4709758b25c3)\n\n\nCreate a dashboard. Here is my example.\n\n![20240128_200142](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/e3956434-cf9f-4873-ba36-4b58aafca133)\n\n# 📧 Send the results via email\n\nClick on Share in Looker and Schedule delivery. 
You can filter the results (use Filters) and send them to each player based on the player_id.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/1074c957-ad5e-4bb8-8dc0-a3c024b4a163)\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/9c14dbff-436e-44a7-8372-7456971a3082)\n\n\nThe participants will receive the PDF version of the report and the link to the interactive report.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/6f4be8e0-6f1f-4bd6-a0ba-2c09630ff434)\n\n\n\n# 💬 Send the results to Slack\n\nGo to https://api.slack.com\n\nCreate a new App.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/dcf054e2-c507-4e02-bc8b-4dcb83f06f28)\n\n\nOpen OAuth \u0026 Permissions and click on 'Add an OAuth Scope'.\n\nAdd these permissions:\n\n```\nchannels:read\nchannels:join\nusers:read\nfiles:write\ngroups:read\nim:read\nmpim:read\nchat:write\n```\nCopy the Bot User OAuth Token: xoxb-\u003c…\u003e\n\nConnect your App to the workspace. You will see your App in Slack.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/50e025be-6001-469c-a28f-64ffbf5c979c)\n\n\nCreate a channel, e.g. game-results. 
Run the code to get the channel ID. You will see output similar to this: Channel ID: CXXXXXX\n\n```python\nfrom slack_sdk import WebClient\n\n#Create a client\ntoken = \"xoxb-\u003c...\u003e\"\nslack_client = WebClient(token=token)\n\n#Get list of channels\nchannels = slack_client.conversations_list()\nchannel_id = None\n\n#Find the channel ID based on the channel name\nfor channel in channels['channels']:\n    if channel['name'] == 'game-results':\n        channel_id = channel['id']\n        break\n\nif channel_id:\n    print(\"Channel ID: \", channel_id)\nelse:\n    print(\"Channel not found.\")\n```\n\nAdjust this code to your needs: add your `channel_id`, token, link to your CSV file, and `initial_comment`.\n\n```python\nimport requests\nfrom slack_sdk import WebClient\n\n#Create a client\ntoken = \"xoxb-\u003c...\u003e\"\nslack_client = WebClient(token=token)\n\n#Join the channel\nchannel_id = \"\u003c...\u003e\"  #Replace with the actual channel ID\nslack_client.conversations_join(channel=channel_id)\n\n#Download the file from Google Cloud Storage\nfile_url = \"https://storage.googleapis.com/your-bucket/connect-four-summary.csv\"\nresponse = requests.get(file_url)\nresponse.raise_for_status()  #Stop if the download failed\nfile_contents = response.content\n\n#Send a message and file (files_upload_v2 replaces the deprecated files_upload)\nslack_client.files_upload_v2(\n    file=file_contents,\n    filename='connect-four-summary.csv',\n    channel=channel_id,\n    title='Connect Four Performance Summary',\n    initial_comment='Thank you for your participation and enthusiasm throughout the games. 
Enjoy reviewing your performance and congratulations on your achievements!'\n)\n```\n\nRun the code and you will see the message in the channel. You can also download the CSV file with the results.\n\n![image](https://github.com/janaom/gcp-de-project-connect-four-with-python-dataflow/assets/83917694/f2eb78ee-891e-4da8-805d-18cc3bf2d510)\n\nCongratulations 👏 \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjanaom%2Fgcp-de-project-connect-four-with-python-dataflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjanaom%2Fgcp-de-project-connect-four-with-python-dataflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjanaom%2Fgcp-de-project-connect-four-with-python-dataflow/lists"}