{"id":19529834,"url":"https://github.com/internlm/agentlego","last_synced_at":"2025-04-08T00:39:46.890Z","repository":{"id":211267056,"uuid":"642330525","full_name":"InternLM/agentlego","owner":"InternLM","description":"Enhance LLM agents with rich tool APIs","archived":false,"fork":false,"pushed_at":"2024-09-13T08:00:19.000Z","size":4656,"stargazers_count":384,"open_issues_count":6,"forks_count":32,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-03-31T23:37:36.026Z","etag":null,"topics":["large-language-models","llm","llm-agents"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/InternLM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-18T10:28:29.000Z","updated_at":"2025-03-30T03:46:27.000Z","dependencies_parsed_at":"2023-12-18T08:25:30.960Z","dependency_job_id":"bfb6babf-f50e-41a7-a639-7b7a2bd1d44b","html_url":"https://github.com/InternLM/agentlego","commit_stats":null,"previous_names":["internlm/agentlego"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InternLM%2Fagentlego","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InternLM%2Fagentlego/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InternLM%2Fagentlego/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InternLM%2Fagentlego/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/InternLM","download_url":"https://codeload.github.com/InternLM/agentlego/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247755560,"owners_count":20990620,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["large-language-models","llm","llm-agents"],"created_at":"2024-11-11T01:27:49.644Z","updated_at":"2025-04-08T00:39:46.860Z","avatar_url":"https://github.com/InternLM.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289025203-f05733ff-6bbb-46f0-92aa-8827c59df79c.png\" width=\"450\"/\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg)](https://openxlab.org.cn/apps/detail/mzr1996/AgentLego)\n[![docs](https://img.shields.io/badge/docs-latest-blue)](https://agentlego.readthedocs.io/en/latest/)\n[![PyPI](https://img.shields.io/pypi/v/agentlego)](https://pypi.org/project/agentlego)\n[![license](https://img.shields.io/github/license/InternLM/agentlego.svg)](https://github.com/InternLM/agentlego/tree/main/LICENSE)\n\nEnglish | [简体中文](./README_zh-CN.md)\n\n\u003c/div\u003e\n\n- [Introduction](#introduction)\n- [Quick Starts](#quick-starts)\n  - [Installation](#installation)\n  - [Use tools directly](#use-tools-directly)\n  - [Integrated into agent frameworks](#integrated-into-agent-frameworks)\n- [Supported Tools](#supported-tools)\n- [Licence](#licence)\n\n## Introduction\n\n\u003cspan style=\"color:blue\"\u003e 
*AgentLego* \u003c/span\u003e is an open-source library of versatile tool APIs to extend and enhance large language model (LLM) based agents, with the following highlights:\n\n- **Rich set of tools for multimodal extensions of LLM agents**, including visual perception, image generation and editing, speech processing, and visual-language reasoning.\n- **Flexible tool interface** that allows users to easily add custom tools with arbitrary types of arguments and outputs.\n- **Easy integration with LLM-based agent frameworks** like [LangChain](https://github.com/langchain-ai/langchain), [Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents), and [Lagent](https://github.com/InternLM/lagent).\n- **Support for tool serving and remote access**, which is especially useful for tools with heavy ML models (e.g. ViT) or special environment requirements (e.g. GPU and CUDA).\n\nhttps://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289006700-2140015c-b5e0-4102-bc54-9a1b4e3db9ec.mp4\n\n# Quick Starts\n\n## Installation\n\n**Install the AgentLego package**\n\n```shell\npip install agentlego\n```\n\n**Install tool-specific dependencies**\n\nSome tools require extra packages. Please check the readme file of the tool and confirm that all requirements are\nsatisfied.\n\nFor example, suppose we want to use the `ImageDescription` tool. 
We need to check the **Set up** section of the\n[readme](agentlego/tools/image_text/README.md#ImageDescription) and install the requirements.\n\n```bash\npip install -U openmim\nmim install -U mmpretrain\n```\n\n## Use tools directly\n\n```Python\nfrom agentlego import list_tools, load_tool\n\nprint(list_tools())  # list tools in AgentLego\n\nimage_caption_tool = load_tool('ImageDescription', device='cuda')\nprint(image_caption_tool.description)\nimage = './examples/demo.png'\ncaption = image_caption_tool(image)\n```\n\n## Integrated into agent frameworks\n\n- [**Lagent**](examples/lagent_example.py)\n- [**Transformers Agent**](examples/hf_agent/hf_agent_example.py)\n- [**VisualChatGPT**](examples/visual_chatgpt/visual_chatgpt.py)\n\n# Supported Tools\n\n**General ability**\n\n- [Calculator](agentlego/tools/calculator/README.md): Calculate using a Python interpreter.\n- [GoogleSearch](agentlego/tools/search/README.md): Search on Google.\n\n**Speech related**\n\n- [TextToSpeech](agentlego/tools/speech_text/README.md#TextToSpeech): Convert the input text into spoken audio.\n- [SpeechToText](agentlego/tools/speech_text/README.md#SpeechToText): Transcribe audio into text.\n\n**Image-processing related**\n\n- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.\n- [OCR](agentlego/tools/ocr/README.md#OCR): Recognize the text from a photo.\n- [VQA](agentlego/tools/vqa/README.md#VQA): Answer a question about the image.\n- [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of humans in an image.\n- [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Estimate the landmarks or keypoints of human faces in an image.\n- [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.\n- [ImageToDepth](agentlego/tools/image_depth/README.md#ImageToDepth): Generate the depth image of an image.\n- 
[ImageToScribble](agentlego/tools/image_scribble/README.md#ImageToScribble): Generate a sketch scribble of an image.\n- [ObjectDetection](agentlego/tools/object_detection/README.md#ObjectDetection): Detect all objects in the image.\n- [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect specific objects described by the given text in the image.\n- Segment Anything series\n  - [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all items in the image.\n  - [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment specific objects in the image according to the given object name.\n\n**AIGC related**\n\n- [TextToImage](agentlego/tools/image_text/README.md#TextToImage): Generate an image from the input text.\n- [ImageExpansion](agentlego/tools/image_editing/README.md#ImageExpansion): Expand the peripheral area of an image based on its content.\n- [ObjectRemove](agentlego/tools/image_editing/README.md#ObjectRemove): Remove specific objects from the image.\n- [ObjectReplace](agentlego/tools/image_editing/README.md#ObjectReplace): Replace specific objects in the image.\n- [ImageStylization](agentlego/tools/image_editing/README.md#ImageStylization): Modify an image according to the instructions.\n- ControlNet series\n  - [CannyTextToImage](agentlego/tools/image_canny/README.md#CannyTextToImage): Generate an image from a canny edge image and a description.\n  - [DepthTextToImage](agentlego/tools/image_depth/README.md#DepthTextToImage): Generate an image from a depth image and a description.\n  - [PoseToImage](agentlego/tools/image_pose/README.md#PoseToImage): Generate an image from a human pose image and a description.\n  - [ScribbleTextToImage](agentlego/tools/image_scribble/README.md#ScribbleTextToImage): Generate an image from a sketch scribble image and a description.\n- ImageBind series\n  - [AudioToImage](agentlego/tools/imagebind/README.md#AudioToImage): Generate an image according to 
audio.\n  - [ThermalToImage](agentlego/tools/imagebind/README.md#ThermalToImage): Generate an image according to a thermal image.\n  - [AudioImageToImage](agentlego/tools/imagebind/README.md#AudioImageToImage): Generate an image according to an audio clip and an image.\n  - [AudioTextToImage](agentlego/tools/imagebind/README.md#AudioTextToImage): Generate an image from an audio clip and a text prompt.\n\n# Licence\n\nThis project is released under the [Apache 2.0 license](LICENSE). Users should also ensure compliance with the licenses governing the models used in this project.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finternlm%2Fagentlego","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finternlm%2Fagentlego","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finternlm%2Fagentlego/lists"}