{"id":31021361,"url":"https://github.com/sisl/action_suggestions","last_synced_at":"2025-10-08T07:12:00.246Z","repository":{"id":60234184,"uuid":"494358439","full_name":"sisl/action_suggestions","owner":"sisl","description":"Code supporting the paper Collaborative Decision Making Using Action Suggestions. ","archived":false,"fork":false,"pushed_at":"2024-02-14T19:04:58.000Z","size":33991,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-06T02:41:10.199Z","etag":null,"topics":["collaboration","decision-making-under-uncertainty","pomdp","state-estimation"],"latest_commit_sha":null,"homepage":"","language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sisl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-05-20T07:07:41.000Z","updated_at":"2024-12-16T01:14:37.000Z","dependencies_parsed_at":"2024-02-14T20:38:59.660Z","dependency_job_id":null,"html_url":"https://github.com/sisl/action_suggestions","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/sisl/action_suggestions","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sisl%2Faction_suggestions","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sisl%2Faction_suggestions/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sisl%2Faction_suggestions/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sisl%2Faction_suggestions/manifests","owner_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/owners/sisl","download_url":"https://codeload.github.com/sisl/action_suggestions/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sisl%2Faction_suggestions/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278903844,"owners_count":26065968,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["collaboration","decision-making-under-uncertainty","pomdp","state-estimation"],"created_at":"2025-09-13T11:20:50.050Z","updated_at":"2025-10-08T07:12:00.220Z","avatar_url":"https://github.com/sisl.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Repository for Collaborative Decision Making Using Action Suggestions\nThis repository contains the code used for the experiments in the paper Collaborative Decision Making Using Action Suggestions. You can find the paper [here](https://arxiv.org/abs/2209.13160). \u003c!-- on ArXiv here and the NeurIPS proceedings here. --\u003e\n\n# Setting up the environment\nThe development occurred using Julia v1.7 and v1.8. We recommend using the latest version of Julia. 
\n\nFirst, clone the repo, change to the main folder, and run Julia.\n```\ngit clone git@github.com:sisl/action_suggestions.git\ncd action_suggestions\njulia\n```\n\nWe first need to activate the environment and include the supporting scripts. This process is scripted in [`setup.jl`](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/setup.jl). You can run this file with:\n```julia\njulia\u003e include(\"setup.jl\")\n```\n\nThis repo contains policies and action value functions for RockSample(8, 4, 10, -1), Tag with the modified transition function, and the original implementation of Tag. You can start running those simulations immediately. See the Running Simulations section. To simulate the RockSample(7, 8, 20, 0) environment, you will need to generate the policy and action value matrix. See the [Generating Policies](#generating-policies) section for directions on completing that process. The problems are referenced using the `:rs84`, `:tag`, and `:tag_orig_tx` Symbols. The RockSample(7, 8, 20, 0) problem has the `:rs78` Symbol defined and ready for use after a policy and action value matrix are generated.\n\n# Running Simulations\nThe simulation function, `run_sim`, is defined in [`run_sims.jl`](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/run_sims.jl). See the docstring for detailed information about the arguments for this function. This file should be included when running the `setup.jl` script. 
However, if it was not, we can include this file by\n```julia\njulia\u003e include(\"src/run_sims.jl\")\n```\n\n## Single Simulation\nWe can run a single simulation of the RockSample(8, 4, 10, -1) scenario by\n```julia\njulia\u003e run_sim(:rs84)\n```\nThis command should produce an output similar to\n```\nLoading problem and policy...complete!\nAgent: normal\n         Metric |            Mean |    Standard Dev |  Standard Error |       +/- 95 CI\n--------------- | --------------- | --------------- | --------------- | ---------------\n         Reward |        11.07420 |             NaN |             NaN |             NaN\n          Steps |        15.00000 |             NaN |             NaN |             NaN\n  # Suggestions |         0.00000 |             NaN |             NaN |             NaN\n  # Sugg / Step |         0.00000 |             NaN |             NaN |             NaN\n\n```\n\n### Verbose\nDetails at each step can be output by setting the `verbose` keyword to `true`. A summary of key parameters will be output to the REPL at each time step. We recommend using `verbose` for single runs only (i.e., `num_sims = 1`).\n\n### Visualize\nA visual depiction of the scenario can be output by setting the `visualize` keyword to `true`. We recommend using `visualize` for single runs only. In each scenario, a visualization of the belief is shown along with images before and after the suggestion is used to update the belief. 
The actions depicted on the bottom are in reference to the action selected with the displayed belief.\n\n## Multiple Simulations\nWe can run multiple simulations by using the `num_sims` keyword argument\n```julia\njulia\u003e run_sim(:tag; num_sims=10)\n```\nThis command should produce an output similar to\n```\nLoading problem and policy...complete!\nRunning Simulations 100%|██████████████████████████████████████████████████| Time: 0:00:16 ( 1.61  s/it)\nAgent: normal\n         Metric |            Mean |    Standard Dev |  Standard Error |       +/- 95 CI\n--------------- | --------------- | --------------- | --------------- | ---------------\n         Reward |       -10.08101 |         6.93397 |         2.19272 |         4.29772\n          Steps |        27.70000 |        16.22789 |         5.13171 |        10.05815\n  # Suggestions |         0.00000 |         0.00000 |         0.00000 |         0.00000\n  # Sugg / Step |         0.00000 |         0.00000 |         0.00000 |         0.00000\n```\n\n## More Examples\n\n```\njulia\u003e run_sim(:rs84; num_sims=50, agent=:noisy, λ=1.0)\nLoading problem and policy...complete!\nRunning Simulations 100%|██████████████████████████████████████████████████| Time: 0:00:00 ( 4.73 ms/it)\nAgent: noisy, λ = 1.00\n         Metric |            Mean |    Standard Dev |  Standard Error |       +/- 95 CI\n--------------- | --------------- | --------------- | --------------- | ---------------\n         Reward |        16.39763 |         3.55900 |         0.50332 |         0.98650\n          Steps |        17.44000 |         5.24564 |         0.74185 |         1.45402\n  # Suggestions |         5.42000 |         1.57907 |         0.22331 |         0.43770\n  # Sugg / Step |         0.32376 |         0.08556 |         0.01210 |         0.02372\n```\n\n```\njulia\u003e run_sim(:tag; num_sims=50, agent=:scaled, τ=0.75)\nLoading problem and policy...complete!\nRunning Simulations 100%|██████████████████████████████████████████████████| 
Time: 0:00:31 ( 0.63  s/it)\nAgent: scaled, τ = 0.75\n         Metric |            Mean |    Standard Dev |  Standard Error |       +/- 95 CI\n--------------- | --------------- | --------------- | --------------- | ---------------\n         Reward |        -1.61092 |         4.45021 |         0.62936 |         1.23354\n          Steps |        11.34000 |         6.36191 |         0.89971 |         1.76343\n  # Suggestions |         2.84000 |         2.15103 |         0.30420 |         0.59624\n  # Sugg / Step |         0.24639 |         0.10154 |         0.01436 |         0.02815\n```\n\n```\njulia\u003e run_sim(:rs84; num_sims=50, agent=:scaled, τ=0.75, msg_reception_rate=0.75)\nLoading problem and policy...complete!\nRunning Simulations 100%|██████████████████████████████████████████████████| Time: 0:00:00 (11.81 ms/it)\nAgent: scaled, τ = 0.75\n         Metric |            Mean |    Standard Dev |  Standard Error |       +/- 95 CI\n--------------- | --------------- | --------------- | --------------- | ---------------\n         Reward |        15.58007 |         3.58591 |         0.50712 |         0.99396\n          Steps |        17.70000 |         6.02122 |         0.85153 |         1.66900\n  # Suggestions |         2.24000 |         0.79693 |         0.11270 |         0.22090\n  # Sugg / Step |         0.14602 |         0.09124 |         0.01290 |         0.02529\n  ```\n\n\n## Function [`run_sim`](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/run_sims.jl#L57)\nRuns simulations and reports key metrics.\n\n### Arguments\n- `problem::Symbol`: Problem to simulate (see [RS_PROBS](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/constants.jl#L1) and [TG_PROBS](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/constants.jl#L2) in 
[`constants.jl`](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/constants.jl) for options)\n\n### Keyword Arguments\n- `num_steps::Int=50`: number of steps in each simulation\n- `num_sims::Int=1`: number of simulations to run\n- `verbose::Bool=false`: print out details of each step\n- `visualize::Bool=false`: render the environment at each step (2x per step)\n- `agent::Symbol=:normal`: Which agent to simulate (see AGENTS for options)\n- `ν=1.0`: hyperparameter for the naive agent (percent of suggestions to follow)\n- `τ=1.0`: hyperparameter for the scaled agent\n- `λ=1.0`: hyperparameter for the noisy agent\n- `max_suggestions=Inf`: Limit of the number of suggestions the agent can receive\n- `msg_reception_rate=1.0`: Reception rate of the agent for suggestions\n- `perfect_v_random=1.0`: Rate of perfect vs random suggestions (1.0=perfect, 0.0=random)\n- `init_rocks=nothing`: For RockSamplePOMDP only. Designate the state of initial rocks. Must\nbe a vector with length equal to the number of rocks (e.g. [1, 0, 0, 1])\n- `suggester_belief=[1.0, 0.0]`: RockSamplePOMDP only. Designate the initial belief over\ngood rocks and bad rocks respectively. [1.0, 0.0] = perfect knowledge suggester,\n[0.75, 0.5] would represent a suggester with a bit more knowledge over good rocks but no\nadditional information for the bad rocks.\n- `init_pos=nothing`: TagPOMDP only. Set the initial positions of the agent and opponent.\nThe form is Vector{Tuple{Int, Int}}. E.g. 
[(1, 1), (5, 2)].\n- `rng=Random.GLOBAL_RNG`: Provide a random number generator\n\n\n# Generating Policies\n\nThe function to generate and save policies is in [`pol_generator.jl`](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/pol_generator.jl) and the function to generate and save the action value function as a matrix is contained in [`generate_q.jl`](https://github.com/sisl/action_suggestions/blob/4b3834319cd86301a118fea0c62a66612e650c86/src/generate_q.jl). Both of these files are included by the `setup.jl` script but can be included manually if needed.\n\nTo generate and save a policy, call `generate_problem_and_policy` with the problem of interest. Parameters can be passed to the SARSOP solver by keywords. For the RockSample(7, 8, 20, 0) results contained in the paper, a timeout value of `10800` was used.\n\n## Example Policy Generation\n\n```\njulia\u003e generate_problem_and_policy(:rs78; timeout=300)\nGenerating a pomdpx file: model.pomdpx\n\nLoading the model ...\n  input file   : model.pomdpx\n  loading time : 301.06s \n\nSARSOP initializing ...\n  initialization time : 0.88s\n\n-------------------------------------------------------------------------------\n Time   |#Trial |#Backup |LBound    |UBound    |Precision  |#Alphas |#Beliefs  \n-------------------------------------------------------------------------------\n 0.88    0       0        7.35092    28.5048    21.1539     13       1        \n 0.95    2       50       7.35092    27.1536    19.8027     10       24       \n 1.01    5       101      11.7638    25.7925    14.0287     22       38       \n 1.1     7       150      12.3727    25.6247    13.2519     52       63       \n 1.22    9       203      12.3727    25.5254    13.1526     78       84       \n \n ...\n \n ...\n \n 263.85  389     9057     15.3982    22.4768    7.07857     2242     3236     \n 267.16  391     9100     15.3982    22.4738    7.0756      2285     3250     \n 269.68  393     9157    
 15.3982    22.4723    7.07412     2342     3269     \n 272.37  395     9213     15.3982    22.4701    7.07186     2398     3287     \n 275.73  397     9259     15.3982    22.4665    7.06831     2271     3305     \n 277.82  399     9301     15.3982    22.4628    7.06459     2313     3318     \n 281.28  401     9350     15.3982    22.4527    7.05447     2362     3336     \n 284.72  403     9400     15.3982    22.4422    7.04402     2412     3356     \n 286.41  405     9455     15.3982    22.4306    7.03245     2467     3376     \n 289.55  407     9500     15.3982    22.4241    7.0259      2340     3391     \n 293.48  410     9550     15.3982    22.4148    7.0166      2390     3406     \n 297.99  413     9607     15.3982    22.4066    7.00837     2447     3424     \n 301.29  415     9657     15.3982    22.4044    7.00621     2497     3441     \n-------------------------------------------------------------------------------\n\nSARSOP finishing ...\n  Preset timeout reached\n  Timeout     : 300.000000s\n  Actual Time : 301.290000s\n\n-------------------------------------------------------------------------------\n Time   |#Trial |#Backup |LBound    |UBound    |Precision  |#Alphas |#Beliefs  \n-------------------------------------------------------------------------------\n 301.51  415     9657     15.3982    22.4044    7.00621     2356     3441     \n-------------------------------------------------------------------------------\n\nWriting out policy ...\n  output file : policy.out\n\nComplete! Saved as: policies/rs_7-8-20-0_pol.jld2\n```\n\n\n\n## Example Q Matrix Generation\n```\ngenerate_and_save_Q(:rs78)\nLoading problem and policy...complete!\nCalculating action value matrix 100%|████████████████████████████████████████| Time: 1:13:19 (23.60 ms/it)\n```\n\nAfter generating the policy and the Q matrix, you should have two more files saved in the `policies` folder (`rs_7-8-20-0_pol.jld2` and `rs_7-8-20-0_Q.jld2`). 
Now you can run simulations with the `:rs78` symbol as described in the Running Simulations section.\n\n# Citation\n```\n@inproceedings{Asmar2022,\ntitle = {Collaborative Decision Making Using Action Suggestions},\nauthor = {Dylan M. Asmar and Mykel J. Kochenderfer},\nbooktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\nyear = {2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsisl%2Faction_suggestions","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsisl%2Faction_suggestions","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsisl%2Faction_suggestions/lists"}