{"id":15107668,"url":"https://github.com/bergel/reinforcementlearning","last_synced_at":"2026-02-10T23:31:54.343Z","repository":{"id":140631869,"uuid":"370941400","full_name":"bergel/ReinforcementLearning","owner":"bergel","description":"Implementation of Reinforcement Learning  (Q-Learning) in Pharo Smalltalk. The code is described in a chapter of the Agile Visualization book.","archived":false,"fork":false,"pushed_at":"2021-08-31T06:28:57.000Z","size":1384,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-18T11:14:19.223Z","etag":null,"topics":["machine-learning","pharo","pharo-smalltalk","reinforcement-learning","visualization"],"latest_commit_sha":null,"homepage":"","language":"Smalltalk","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bergel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-26T07:15:41.000Z","updated_at":"2021-08-31T06:28:59.000Z","dependencies_parsed_at":null,"dependency_job_id":"81f33a2c-09b6-40d4-873f-a8f32090ac9c","html_url":"https://github.com/bergel/ReinforcementLearning","commit_stats":{"total_commits":130,"total_committers":3,"mean_commits":"43.333333333333336","dds":0.4769230769230769,"last_synced_commit":"87d65bd317143938c026edb97ba41e7509f54e0d"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bergel%2FReinforcementLearning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bergel%2FReinf
orcementLearning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bergel%2FReinforcementLearning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bergel%2FReinforcementLearning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bergel","download_url":"https://codeload.github.com/bergel/ReinforcementLearning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247353679,"owners_count":20925324,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","pharo","pharo-smalltalk","reinforcement-learning","visualization"],"created_at":"2024-09-25T21:40:51.680Z","updated_at":"2026-02-10T23:31:53.862Z","avatar_url":"https://github.com/bergel.png","language":"Smalltalk","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reinforcement Learning in Pharo\n\n[![Tests](https://github.com/bergel/ReinforcementLearning/actions/workflows/runTests.yml/badge.svg)](https://github.com/bergel/ReinforcementLearning/actions/workflows/runTests.yml)\n[![UML Class diagram](https://github.com/bergel/ReinforcementLearning/actions/workflows/visualizeClassDiagram.yml/badge.svg)](https://github.com/bergel/ReinforcementLearning/blob/main/ci_data/uml.png)\n[![Coverage](https://raw.githubusercontent.com/bergel/ReinforcementLearning/main/ci_data/coverageBadge.svg)](https://github.com/bergel/ReinforcementLearning/blob/main/ci_data/coverage.png)\n\nReinforcement learning is a machine learning algorithm that (i) explores a graph made of 
states and actions, and (ii) identifies an optimal route in this graph based on a reward mechanism. The code contained in this repository implements the Q-Learning algorithm. Its implementation is simple and provides some visualizations. The UML Class diagram of this project is [available online](https://github.com/bergel/ReinforcementLearning/blob/main/ci_data/uml.png).\n\nThe content of this repository is designed to run on the [Pharo programming language](https://pharo.org). The code provided in this repository is part of the book titled _Agile Visualization with Pharo -- Crafting Interactive Visual Support Using Roassal_, published by APress.\n\n-----\n## Installation\n\nThe project is known to work on Pharo 9 and Pharo 10. Open a workspace and run the following:\n\n```Smalltalk\n[ Metacello new\n    baseline: 'ReinforcementLearning';\n    repository: 'github://bergel/ReinforcementLearning:main';\n    load ] on: MCMergeOrLoadWarning do: [ :warning | warning load ]\n```\n------\n## Screenshots\n\nHere are some screenshots of the visualization that illustrate the execution of the Q-Learning algorithm. The example describes the classical scenario where a knight needs to exit a map while avoiding monsters. 
Here is the solution that shows the path from the starting point (large blue dot) toward the exit (yellow cell), while avoiding monsters (light red cells):\n\n\u003cimg width=\"595\" alt=\"image\" src=\"https://user-images.githubusercontent.com/393742/131110454-6b1e3313-795c-4459-9a2d-446ba4f6d4d8.png\"\u003e\n\nHere is the Q-Table that indicates the reward of each action for every state:\n\n\u003cimg width=\"593\" alt=\"image\" src=\"https://user-images.githubusercontent.com/393742/131110732-3b88f579-f4f4-4680-aa54-24c7488d673e.png\"\u003e\n\nThe cumulative reward across the episodes:\n\n\u003cimg width=\"591\" alt=\"image\" src=\"https://user-images.githubusercontent.com/393742/131110812-af535c4f-062f-4de0-aaf4-b20b00a901c2.png\"\u003e\n\nQ-Learning, like any Reinforcement Learning algorithm, explores a graph composed of nodes (states) and edges (actions leading to a state transition). The graph is also represented visually:\n\n\u003cimg width=\"590\" alt=\"image\" src=\"https://user-images.githubusercontent.com/393742/131110990-86298e0d-26d8-4eb3-8e9b-96fef6d2edb0.png\"\u003e\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbergel%2Freinforcementlearning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbergel%2Freinforcementlearning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbergel%2Freinforcementlearning/lists"}