{"id":30957266,"url":"https://github.com/flairox/kinetix","last_synced_at":"2025-09-11T13:45:11.115Z","repository":{"id":261692942,"uuid":"884356580","full_name":"FLAIROx/Kinetix","owner":"FLAIROx","description":"Reinforcement learning on general 2D physics environments in JAX. ICLR 2025 Oral.","archived":false,"fork":false,"pushed_at":"2025-07-24T02:14:15.000Z","size":19162,"stargazers_count":206,"open_issues_count":2,"forks_count":9,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-08-31T12:29:04.472Z","etag":null,"topics":["machine-learning","physics-engine","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://kinetix-env.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FLAIROx.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-06T15:51:31.000Z","updated_at":"2025-08-27T10:28:49.000Z","dependencies_parsed_at":null,"dependency_job_id":"7be4a35d-0c2e-4267-b18c-2906e63f4eba","html_url":"https://github.com/FLAIROx/Kinetix","commit_stats":null,"previous_names":["flairox/kinetix"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/FLAIROx/Kinetix","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FLAIROx%2FKinetix","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FLAIROx%2FKinetix/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FLAIROx%2FKinetix/releases","ma
nifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FLAIROx%2FKinetix/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FLAIROx","download_url":"https://codeload.github.com/FLAIROx/Kinetix/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FLAIROx%2FKinetix/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274648319,"owners_count":25324299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-11T02:00:13.660Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","physics-engine","reinforcement-learning"],"created_at":"2025-09-11T13:45:07.010Z","updated_at":"2025-09-11T13:45:11.103Z","avatar_url":"https://github.com/FLAIROx.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/kinetix_logo.gif\" width=\"500\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n        \u003ca href= \"https://pypi.org/project/jax2d/\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12-blue\" /\u003e\u003c/a\u003e\n        \u003ca href= \"https://pypi.org/project/kinetix-env/\"\u003e\n        \u003cimg 
src=\"https://img.shields.io/badge/pypi-1.0.0-green\" /\u003e\u003c/a\u003e\n       \u003ca href= \"https://github.com/FLAIROx/Kinetix/blob/main/LICENSE\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/License-MIT-yellow\" /\u003e\u003c/a\u003e\n       \u003ca href= \"https://github.com/psf/black\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/code%20style-black-000000.svg\" /\u003e\u003c/a\u003e\n       \u003ca href= \"https://kinetix-env.github.io/\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/online-editor-purple\" /\u003e\u003c/a\u003e\n       \u003ca href= \"https://arxiv.org/abs/2410.23208\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/arxiv-2410.23208-b31b1b\" /\u003e\u003c/a\u003e\n        \u003ca href= \"./docs/README.md\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/docs-green\" /\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n### \u003cb\u003eUpdate: Kinetix was accepted at ICLR 2025 as an oral!\u003c/b\u003e\n\n# Kinetix\n\nKinetix is a framework for reinforcement learning in a 2D rigid-body physics world, written entirely in [JAX](https://github.com/jax-ml/jax).\nKinetix can represent a huge array of physics-based tasks within a unified framework.\nWe use Kinetix to investigate the training of large, general reinforcement learning agents by procedurally generating millions of tasks for training.\nYou can play with Kinetix in our [online editor](https://kinetix-env.github.io/gallery.html?editor=true), or have a look at the JAX [physics engine](https://github.com/MichaelTMatthews/Jax2D) and [graphics library](https://github.com/FLAIROx/JaxGL) we made for Kinetix. 
Finally, see our [docs](./docs/README.md) for more information and more in-depth examples.\n\n\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/bb.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/cartpole.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/grasper.gif\" width=\"200\" /\u003e\n\u003c/p\u003e\n\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/hc.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/hopper.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/ll.gif\" width=\"200\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"middle\"\u003e\n\u003cb\u003eThe above shows specialist agents trained on their respective levels.\u003c/b\u003e\n\u003c/p\u003e\n\n# 📊 Paper TL; DR\n\n\n\nWe train a general agent on millions of procedurally generated physics tasks.\nEvery task has the same goal: make the \u003cspan style=\"color:green\"\u003egreen\u003c/span\u003e and \u003cspan style=\"color:blue\"\u003eblue\u003c/span\u003e touch, without \u003cspan style=\"color:green\"\u003egreen\u003c/span\u003e touching \u003cspan style=\"color:red\"\u003ered\u003c/span\u003e.\nThe agent can act through applying torque via motors and force via thrusters.\n\n\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/random_1.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/random_5.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/random_3.gif\" width=\"200\" /\u003e\n\u003c/p\u003e\n\u003cp align=\"middle\"\u003e\n  \u003cimg 
src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/random_4.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/random_6.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/random_7.gif\" width=\"200\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"middle\"\u003e\n\u003cb\u003eThe above shows a general agent zero-shotting unseen randomly generated levels.\u003c/b\u003e\n\u003c/p\u003e\n\nWe then investigate the transfer capabilities of this agent to unseen handmade levels.\nWe find that the agent can zero-shot simple physics problems, but still struggles with harder tasks.\n\n\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/general_1.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/general_2.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/general_3.gif\" width=\"200\" /\u003e\n\u003c/p\u003e\n\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/general_4.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/general_5.gif\" width=\"200\" /\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/FlairOX/Kinetix/main/images/general_6.gif\" width=\"200\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"middle\"\u003e\n\u003cb\u003eThe above shows a general agent zero-shotting unseen handmade levels.\u003c/b\u003e\n\u003c/p\u003e\n\n\n# 📜 Basic Usage\n\nKinetix follows the interface established in [gymnax](https://github.com/RobertTLange/gymnax):\n\n```python\n# Use default parameters\nenv_params = EnvParams()\nstatic_env_params = StaticEnvParams()\n\n# Create the environment\nenv = make_kinetix_env(\n  observation_type=ObservationType.PIXELS,\n  
action_type=ActionType.CONTINUOUS,\n  reset_fn=make_reset_fn_sample_kinetix_level(env_params, static_env_params),\n  env_params=env_params,\n  static_env_params=static_env_params,\n)\n\n# Reset the environment state (this resets to a random level)\n_rngs = jax.random.split(jax.random.PRNGKey(0), 3)\n\nobs, env_state = env.reset(_rngs[0], env_params)\n\n# Take a step in the environment\naction = env.action_space(env_params).sample(_rngs[1])\nobs, env_state, reward, done, info = env.step(_rngs[2], env_state, action, env_params)\n\n# Render environment\nrenderer = make_render_pixels(env_params, env.static_env_params)\n\npixels = renderer(env_state)\n\nplt.imshow(pixels.astype(jnp.uint8).transpose(1, 0, 2)[::-1])\nplt.show()\n\n```\n\n\n# ⬇️ Installation\nTo install Kinetix (tested with python3.10):\n```commandline\ngit clone https://github.com/FlairOx/Kinetix.git\ncd Kinetix\npip install -e \".[dev]\"\npre-commit install\n```\n\nPlease see [here](https://docs.jax.dev/en/latest/installation.html) to install jax for your accelerator.\n\n\u003e [!TIP]\n\u003e Setting `export JAX_COMPILATION_CACHE_DIR=\"$HOME/.jax_cache\"` in your `~/.bashrc` helps improve usability by caching the jax compiles.\n\nKinetix is also available on [PyPi](https://pypi.org/project/kinetix-env/), and can be installed using `pip install kinetix-env`\n\n# 🎯 Editor\nWe recommend using the [KinetixJS editor](https://kinetix-env.github.io/gallery.html?editor=true), but also provide a native (less polished) Kinetix editor.\n\nTo open this editor run the following command.\n```commandline\npython3 kinetix/editor.py\n```\n\nThe controls in the editor are:\n- Move between `edit` and `play` modes using `spacebar`\n- In `edit` mode, the type of edit is shown by the icon at the top and is changed by scrolling the mouse wheel.  
For instance, by navigating to the rectangle editing function you can click to place a rectangle.\n  - You can also press the number keys to cycle between modes.\n- To open handmade levels press ctrl-O and navigate to the ones in the L folder.  \n- **When playing a level use the arrow keys to control motors and the numeric keys (1, 2) to control thrusters.**\n\n# 📈 Experiments\n\nWe have three primary experiment files,\n1. [**SFL**](https://github.com/amacrutherford/sampling-for-learnability?tab=readme-ov-file): Training on levels with high learnability, this is how we trained our best general agents.\n2. **PLR** PLR/DR/ACCEL in the [JAXUED](https://github.com/DramaCow/jaxued) style.\n3. **PPO** Normal PPO in the [PureJaxRL](https://github.com/luchris429/purejaxrl/) style.\n\nTo run experiments with default parameters run any of the following:\n```commandline\npython3 experiments/sfl.py\npython3 experiments/plr.py\npython3 experiments/ppo.py\n```\n\nWe use [hydra](https://hydra.cc/) for managing our configs.  See the `configs/` folder for all the hydra configs that will be used by default, or the [docs](./docs/configs.md).\nIf you want to run experiments with different configurations, you can either edit these configs or pass command line arguments as follows:\n\n```commandline\npython3 experiments/sfl.py model.transformer_depth=8\n```\n\nThese experiments use [wandb](https://wandb.ai/home) for logging by default.\n\n## 🏋️ Training RL Agents\nWe provide several different ways to train RL agents, with the three most common options being, (a) [Training an agent on random levels](#training-on-random-levels), (b) [Training an agent on a single, hand-designed level](#training-on-a-single-hand-designed-level) or (c) [Training an agent on a set of hand-designed levels](#training-on-a-set-of-hand-designed-levels).\n\n\u003e [!WARNING]\n\u003e Kinetix has three different environment sizes, `s`, `m` and `l`. 
When running any of the scripts, you have to set the `env_size` option accordingly, for instance, `python3 experiments/ppo.py train_levels=random env_size=m` would train on random `m` levels.\n\u003e It will give an error if you try and load large levels into a small env size, for instance `python3 experiments/ppo.py train_levels=m env_size=s` would error.\n\n### Training on random levels\nThis is the default option, but we give the explicit command for completeness\n```commandline\npython3 experiments/ppo.py train_levels=random\n```\n### Training on a single hand-designed level\n\n\u003e [!NOTE]\n\u003e Check the `kinetix/levels/` folder for handmade levels for each size category. By default, the loading functions require a relative path to the `kinetix/levels/` directory\n\n```commandline\npython3 experiments/ppo.py train_levels=s train_levels.train_levels_list='[\"s/h4_thrust_aim.json\"]'\n```\n### Training on a set of hand-designed levels\n```commandline\npython3 experiments/ppo.py train_levels=s env_size=s eval=eval_auto\n# python3 experiments/ppo.py train_levels=m env_size=m eval=eval_auto\n# python3 experiments/ppo.py train_levels=l env_size=l eval=eval_auto\n```\n\nOr, on a custom set:\n```commandline\npython3 experiments/ppo.py eval=eval_auto train_levels=l env_size=l train_levels.train_levels_list='[\"s/h2_one_wheel_car\",\"l/h11_obstacle_avoidance\"]'\n```\n\n# ❌ Errata\n- The left wall was erroneously misplaced 5cm to the left in all levels and all experiments in the paper (each level is a square with side lengths of 5 metres). 
This error has been fixed in the latest version of Jax2D, but we have pinned Kinetix to the old version for consistency and reproducibility with the original paper.\nFurther improvements have been made, so if you wish to reproduce the paper's results, please use kinetix version 0.1.0, which is tagged on GitHub.\n\n# 🔎 See Also\n- 🌐 [Kinetix.js](https://github.com/Michael-Beukman/Kinetix.js) Kinetix reimplemented in JavaScript, with a live demo [here](https://kinetix-env.github.io/gallery.html?editor=true).\n- 🍎 [Jax2D](https://github.com/MichaelTMatthews/Jax2D) The physics engine we made for Kinetix.\n- 👨‍💻 [JaxGL](https://github.com/FLAIROx/JaxGL) The graphics library we made for Kinetix.\n- 📋 [Our Paper](https://arxiv.org/abs/2410.23208) for more details and empirical results.\n\n# 🙏 Acknowledgements\nThe permutation invariant MLP model that is now default was added by [Anya Sims](https://github.com/anyasims).\nThanks to [Thomas Foster](https://github.com/thomfoster) for fixing some macOS specific issues.\nWe'd also like to thank Thomas Foster, Alex Goldie, Matthew Jackson, Sebastian Towers and Andrei Lupu for useful feedback.\n\n# 📚 Citation\nIf you use Kinetix in your work, please cite it as follows:\n```\n@article{matthews2024kinetix,\n      title={Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks}, \n      author={Michael Matthews and Michael Beukman and Chris Lu and Jakob Foerster},\n      booktitle={The Thirteenth International Conference on Learning Representations},\n      year={2025},\n      url={https://arxiv.org/abs/2410.23208}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflairox%2Fkinetix","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflairox%2Fkinetix","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflairox%2Fkinetix/lists"}