{"id":19467456,"url":"https://github.com/thunlp/legent","last_synced_at":"2025-07-13T13:40:23.644Z","repository":{"id":229218172,"uuid":"771364938","full_name":"thunlp/LEGENT","owner":"thunlp","description":"Open Platform for Embodied Agents","archived":false,"fork":false,"pushed_at":"2024-05-22T14:04:01.000Z","size":1802,"stargazers_count":128,"open_issues_count":1,"forks_count":6,"subscribers_count":8,"default_branch":"main","last_synced_at":"2024-05-22T15:50:22.665Z","etag":null,"topics":["embodied-ai","language-grounding","large-multimodal-models","physics-engine","robot-simulator"],"latest_commit_sha":null,"homepage":"https://docs.legent.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-13T06:57:25.000Z","updated_at":"2024-05-27T20:25:23.919Z","dependencies_parsed_at":"2024-05-27T20:25:22.899Z","dependency_job_id":"1491b85f-9c14-490a-91a3-7f23cf334e7e","html_url":"https://github.com/thunlp/LEGENT","commit_stats":null,"previous_names":["thunlp/legent"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FLEGENT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FLEGENT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FLEGENT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FLEGENT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/owners/thunlp","download_url":"https://codeload.github.com/thunlp/LEGENT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230462907,"owners_count":18229864,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embodied-ai","language-grounding","large-multimodal-models","physics-engine","robot-simulator"],"created_at":"2024-11-10T18:35:15.434Z","updated_at":"2024-12-19T16:12:02.543Z","avatar_url":"https://github.com/thunlp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\u003cimg src=\"misc/LEGENT-logo.webp\" alt=\"LEGENT\" width=\"300\" height=\"300\"/\u003e\u003c/div\u003e\n    \n\u003ch3 align=\"center\"\u003e\n    \u003cp\u003eOpen Platform for Embodied Agents\u003c/p\u003e\n\u003c/h3\u003e\n\n\u003ch4 align=\"center\"\u003e\n    \u003cp\u003e\n    【\n        \u003c!-- \u003ca href=\"https://github.com/thunlp/LEGENT/blob/main/docs/README.md\"\u003eDocumentation\u003c/a\u003e | --\u003e\n        \u003ca href=\"https://docs.legent.ai/\"\u003eDocumentation\u003c/a\u003e |\n        \u003ca href=\"https://arxiv.org/pdf/2404.18243\"\u003ePaper\u003c/a\u003e |\n        \u003ca href=\"https://huggingface.co/spaces/LEGENT/LEGENT\"\u003eDemo\u003c/a\u003e |\n        \u003ca href=\"https://docs.legent.ai/blog/introduction\"\u003eQuick Start\u003c/a\u003e |\n        \u003ca href=\"https://discord.gg/FenHQRyFN7\"\u003eDiscord\u003c/a\u003e\n    】\n    \u003c/p\u003e\n\u003c/h4\u003e\n\n---\n\n### Introduction\n\nIn the future, robots will 
perceive the environment as we do, communicate with us through natural language and help us with our tasks. LEGENT is dedicated to developing robots that can chat, see, and act from virtual worlds to the real world. Designed to integrate large models with embodied agents, this platform prioritizes ease of use and scalability, focusing on developing:\n\n* An easy-to-use environment that simulates a physical world, where an agent can interact with humans through language, receive egocentric vision, and perform physical actions.\n\n* Automated generation of training data, including the generation of scenes, tasks, and agent trajectories. The platform is tailored to train large multimodal models as embodied models, using generated data from simulated worlds at scale. LEGENT serves as the data engine for embodied models in robotics and games, as well as for world models.\n\n### Demonstration\n\nA simple [online demo](https://huggingface.co/spaces/LEGENT/LEGENT) is accessible on HuggingFace Space🤗.\nLet's dive into the immersive interactive world and interact with the agent!\n\nExamples of interaction with the embodied agent:\n\n\u003chttps://github.com/thunlp/LEGENT/assets/50205889/20657124-e2e6-434f-9315-bcbdce26e1f3\u003e\n\n\n\u003chttps://github.com/thunlp/LEGENT/assets/50205889/e667bf3d-1dc5-4ed7-95b7-b3bf6ab60fdf\u003e\n\n\n\n### Features\n\n* **Language Interaction**. Use natural language as the human-robot interaction interface.\n\n\n* **Fundamental Physics**. The simulation incorporates gravity, friction, and collision dynamics.\n\n* **Diverse Rendering**. By adjusting assets and rendering features, LEGENT can achieve photorealistic rendering and stylized rendering. \nInstructions for trying out these scenes can be found [here](https://docs.legent.ai/documentation/getting_started/play/#default-scenes).\n\n  \u003chttps://github.com/thunlp/LEGENT/assets/50205889/bcce2f73-8e8d-420a-85a2-0d7491840e48\u003e\n\n\n\n* **Interactable Objects**. 
Agents and humans can manipulate various 3D objects.\n\n  \u003chttps://github.com/thunlp/LEGENT/assets/50205889/b2392a4e-0c26-489a-b608-2c11f45c619f\u003e\n  \n* **Scalable Assets**. LEGENT supports importing (1) your own 3D objects, (2) objects from academic datasets, and (3) objects created by generative models. Learn more [here](https://docs.legent.ai/documentation/data/object_assets/). Note that the available adequately annotated 3D objects are very limited and vary a lot in format and quality. We are compiling a unified, open object assets library that can be freely used for embodied agent research.\n\n  \u003chttps://github.com/thunlp/LEGENT/assets/50205889/d5b35c51-4da3-4392-a87e-262ba70a9713\u003e\n\n  \u003chttps://github.com/thunlp/LEGENT/assets/50205889/b90c7ac4-73c6-4dfc-bbd8-9e4cd5051548\u003e\n\n* **Humanoid Animation**. Body movement and nonverbal expression are also important for embodied agents. LEGENT will continue to enhance support in this aspect.\n\n\n* **Scene Generation**. LEGENT integrates advanced scene generation algorithms to support scalable training.\n\n  \u003chttps://github.com/thunlp/LEGENT/assets/50205889/fafaa02e-1050-4dab-a43f-701bca1477b7\u003e\n\n* **Trajectory Generation**. Automatic generation of training data for training multimodal models into language-grounded embodied models. 
A minimal example of a trajectory:\n  \n  \u003cimg src=\"https://github.com/thunlp/LEGENT/assets/50205889/14a58d07-a28b-45c5-b5f8-323d0690d9cc\" width=\"160\" height=\"160\" alt=\"0000\"\u003e\n  \u003cimg src=\"https://github.com/thunlp/LEGENT/assets/50205889/137bacc9-c144-4ab3-a3bf-97ac216ebac1\" width=\"160\" height=\"160\" alt=\"0001\"\u003e\n  \u003cimg src=\"https://github.com/thunlp/LEGENT/assets/50205889/c0dd17d1-1b62-431d-8db3-96b9a90e8f60\" width=\"160\" height=\"160\" alt=\"0002\"\u003e\n  \u003cimg src=\"https://github.com/thunlp/LEGENT/assets/50205889/1a2e20e0-6bd7-4ff4-873f-93e2eef551f5\" width=\"160\" height=\"160\" alt=\"0003\"\u003e\n\n  ```json\n  {\n    \"id\": \"20240509-223825-320898\",\n    \"interactions\": [\n        {\n            \"from\": \"human\",\n            \"text\": \"Where is the orange?\"\n        },\n        {\n            \"from\": \"agent\",\n            \"trajectory\": [\n                {\n                    \"image\": \"20240509-223825-320898/0000.png\",\n                    \"action\": \"rotate_right(18)\"\n                },\n                {\n                    \"image\": \"20240509-223825-320898/0001.png\",\n                    \"action\": \"move_forward(2.0)\"\n                },\n                {\n                    \"image\": \"20240509-223825-320898/0002.png\",\n                    \"action\": \"move_forward(1.8), rotate_right(30)\"\n                },\n                {\n                    \"image\": \"20240509-223825-320898/0003.png\",\n                    \"action\": \"speak(\\\"It's on the sofa.\\\")\"\n                }\n            ]\n        }\n    ]\n  }\n  ```\n\n* **User-friendly**. LEGENT requires no complex installation and can run cross-platform on both PCs and servers. It is as intuitive as a game while also supporting complex research needs.\n\n### Note\n\nLEGENT is currently organizing code and documents and improving existing features. 
It will be more convenient to use once this process is complete. If you want a more stable version, please stay tuned!\n\n### TODO List\n\n- Polish APIs and write complete documentation.\n- Release the first stable version.\n- Develop a more powerful data generation system for training LMM-based embodied agents.\n- Add planning-level action APIs to support text-only research.\n- Add humanoid animation action APIs to support text-to-motion research.\n- Add physics-based character/body control by integrating more dedicated tools such as [MuJoCo](https://github.com/google-deepmind/mujoco?tab=readme-ov-file#bindings).\n- Add multi-agent support.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthunlp%2Flegent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthunlp%2Flegent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthunlp%2Flegent/lists"}