{"id":31754879,"url":"https://github.com/sahel13/particle-pomdp","last_synced_at":"2026-05-15T08:32:29.049Z","repository":{"id":318632309,"uuid":"1067172254","full_name":"Sahel13/particle-pomdp","owner":"Sahel13","description":"Code accompanying the NeurIPS 2025 paper \"Sequential Monte Carlo for Policy Optimization in Continuous POMDPs\".","archived":false,"fork":false,"pushed_at":"2025-10-08T10:26:16.000Z","size":97,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-08T11:19:16.743Z","etag":null,"topics":["policy-optimization","pomdps","reinforcement-learning","sequential-monte-carlo"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2505.16732","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Sahel13.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.bib","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-30T13:30:38.000Z","updated_at":"2025-10-08T10:26:19.000Z","dependencies_parsed_at":"2025-10-08T11:19:19.805Z","dependency_job_id":"a2278574-4eef-4690-bedb-dc5b24252c3c","html_url":"https://github.com/Sahel13/particle-pomdp","commit_stats":null,"previous_names":["sahel13/particle-pomdp"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Sahel13/particle-pomdp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahel13%2Fparticle-pomdp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sah
el13%2Fparticle-pomdp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahel13%2Fparticle-pomdp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahel13%2Fparticle-pomdp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Sahel13","download_url":"https://codeload.github.com/Sahel13/particle-pomdp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahel13%2Fparticle-pomdp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001941,"owners_count":26083226,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["policy-optimization","pomdps","reinforcement-learning","sequential-monte-carlo"],"created_at":"2025-10-09T18:25:32.987Z","updated_at":"2025-10-09T18:25:39.062Z","avatar_url":"https://github.com/Sahel13.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Particle POMDP Policy Optimization (P3O)\n\nImplements the P3O algorithm from the NeurIPS 2025 paper [Sequential Monte\nCarlo for Policy Optimization in Continuous POMDPs](https://arxiv.org/abs/2505.16732).\nThis code was written by [Sahel Iqbal](https://github.com/Sahel13) and [Hany\nAbdulsamad](https://github.com/hanyas).\n\nP3O 
is a policy optimization algorithm for partially observable Markov decision processes (POMDPs) with continuous state, action and observation spaces. See the scripts in `examples/` for demonstrations of how to train policies using P3O.\n\n## Installation\n\nInstall [JAX](https://github.com/jax-ml/jax?tab=readme-ov-file#installation) for the available hardware. Then run\n\n```bash\n$ pip install -e .\n```\n\nfor an editable install.\n\n## Examples\n\nWe provide multiple environments to test P3O's optimal information-gathering behavior:\n\n- `pendulum`: A pendulum swing-up task, where only the angular position is observable.\n- `cartpole`: A cart-pole swing-up task, where only the angular and Cartesian positions are observable.\n- `light-dark-2d`: A 2D navigation task with location-dependent noise.\n- `triangulation`: A 2D navigation task with heading-only observations.\n\nEach environment can be run with two policies:\n\n- a policy with history inputs - `recurrent`\n- a policy with belief state inputs - `attention`\n\nFor example, for the light-dark environment run:\n\n```bash\npython examples/lightdark2d/p3o_recurrent.py\n```\n\nor\n\n```bash\npython examples/lightdark2d/p3o_attention.py\n```\n\n## Baselines\n\nWe provide the following baselines for comparison:\n\n1. [Deep Variational Reinforcement Learning for POMDPs (DVRL)](https://proceedings.mlr.press/v80/igl18a/igl18a.pdf) - See `baselines/dvrl`.\n2. [Stochastic Latent Actor-Critic (SLAC)](https://arxiv.org/pdf/1907.00953) - See `baselines/slac`.\n3. 
[DualSMC](https://www.ijcai.org/Proceedings/2020/0579.pdf) - See `baselines/dsmc`.\n\nSee `baselines/README.md` for details.\n\n## Citation\n\nIf you find the code useful, please cite our paper\n\n```bib\n@inproceedings{abdulsamad2025sequential,\n  title = {Sequential {Monte Carlo} for policy optimization in continuous {POMDPs}},\n  author = {Hany Abdulsamad and Sahel Iqbal and Simo S{\\\"a}rkk{\\\"a}},\n  booktitle = {Advances in Neural Information Processing Systems},\n  year = {2025},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsahel13%2Fparticle-pomdp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsahel13%2Fparticle-pomdp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsahel13%2Fparticle-pomdp/lists"}