{"id":15904857,"url":"https://github.com/ashdtu/menu_hotkey_rl","last_synced_at":"2025-04-02T20:42:58.987Z","repository":{"id":109669611,"uuid":"160256946","full_name":"ashdtu/Menu_Hotkey_RL","owner":"ashdtu","description":"A reinforcement learning package to model Menu learning behavior in users.","archived":false,"fork":false,"pushed_at":"2018-12-22T12:39:01.000Z","size":28,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-07T12:42:37.924Z","etag":null,"topics":["bayesian-statistics","gaussian-processes","human-computer-interaction","py-gpsarsa","py-softmax","reinforcement-learning","reinforcement-learning-algorithms"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ashdtu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-03T21:34:39.000Z","updated_at":"2024-04-12T12:03:36.000Z","dependencies_parsed_at":"2023-03-13T14:05:55.263Z","dependency_job_id":null,"html_url":"https://github.com/ashdtu/Menu_Hotkey_RL","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashdtu%2FMenu_Hotkey_RL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashdtu%2FMenu_Hotkey_RL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashdtu%2FMenu_Hotkey_RL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashdtu
%2FMenu_Hotkey_RL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ashdtu","download_url":"https://codeload.github.com/ashdtu/Menu_Hotkey_RL/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246892747,"owners_count":20850845,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-statistics","gaussian-processes","human-computer-interaction","py-gpsarsa","py-softmax","reinforcement-learning","reinforcement-learning-algorithms"],"created_at":"2024-10-06T12:42:33.113Z","updated_at":"2025-04-02T20:42:58.959Z","avatar_url":"https://github.com/ashdtu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"A **reinforcement learning** package to determine the computationally rational strategy for navigating **Menu-Hotkey interfaces**.  \n\u003e The package is intended to be used as a playground to test HCI models with general reinforcement learning algorithms. Therefore, the package has been modularised into **4 main classes** to quickly experiment with model definitions and user strategies.  \n\n**Agent**: Policy and decision making (strategy)\n\n**Environment**:\n- Utility (reward structure)\n- Ecology (observation definition)\n- Mechanism (observation model, transition dynamics)\n\n**Learner**: Learning algorithms\n\n**Experiment**: Model training/testing and overall structure\n\n---\n\n**Learning Algorithms:**\n\nA. 
Tabular model-free methods (discrete space):\n\n- Q-learning\n- SARSA\n- N-step TD backup\n- Eligibility traces with Q-learning\n\nB. Bayesian RL:\n\n- Gaussian Process-SARSA (non-sparse version / computational constraints)\n- Episodic GP-SARSA (sparsified dictionary / fast)\n\nThe original GP-TD algorithm is discussed in [Engel et al., 2005](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.81.6420\u0026rep=rep1\u0026type=pdf), Reinforcement learning with Gaussian processes.\n\nEpisodic GP-SARSA for dialogue managers: [Gasic et al.](http://mi.eng.cam.ac.uk/~sjy/papers/gayo14.pdf), Gaussian processes for POMDP-based dialogue manager optimisation.\n\n\n**Policy (Agent):**\n- Epsilon-greedy exploration\n- Softmax\n- Covariance-based exploration for Gaussian processes (active learning)\n- Stochastic exploration for Gaussian processes\n\n**Sample Environment**: A sample **continuous maze** environment is provided for testing newly implemented RL algorithms; the goal is to navigate a maze with obstacles and reach the goal position. The environment can be used as a testing playground.\n\n---\n\n### Getting started\n\nTo test a new HCI model with the above RL algorithms and policies, only a few lines of code are needed. Modify the proposed reward structure, transition model and episode terminal condition in the **Environment Class** functions.  
Then simply import the required policy and learner as below:\n\n```\nenv = Environment()\nlearner = GP_SARSA()\nagent = Agent(env, learner)\n\n# state is assumed to hold the initial state of the episode\nwhile not TerminalCondition:\n  action = agent.getAction(state)\n  reward = env.getReward(state, action)\n  next_State = env.transition(state, action)\n  learner.learn(state, action, reward, next_State)\n  state = next_State  # advance to the next state\n```\n\nFour sample experiments in the **Experiment directory** demonstrate the above.\n\n---\n\n**Further updates:**\n- Adding multi-kernel support from the Gaussian process library [GPy](https://sheffieldml.github.io/GPy/)\n- Implementation of Deep-Q with a POMDP model\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fashdtu%2Fmenu_hotkey_rl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fashdtu%2Fmenu_hotkey_rl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fashdtu%2Fmenu_hotkey_rl/lists"}