https://github.com/InternLM/agentlego

Enhance LLM agents with rich tool APIs
https://github.com/InternLM/agentlego

large-language-models llm llm-agents

Last synced: 6 months ago
JSON representation

Enhance LLM agents with rich tool APIs

Host: GitHub
URL: https://github.com/InternLM/agentlego
Owner: InternLM
License: apache-2.0
Created: 2023-05-18T10:28:29.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-09-13T08:00:19.000Z (over 1 year ago)
Last Synced: 2025-06-29T04:37:45.757Z (6 months ago)
Topics: large-language-models, llm, llm-agents
Language: Python
Homepage:
Size: 4.44 MB
Stars: 390
Watchers: 10
Forks: 33
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff

Awesome Lists containing this project

Awesome-AI-Agents - agentlego - Enhance LLM agents with versatile tool APIs ![GitHub Repo stars](https://img.shields.io/github/stars/InternLM/agentlego?style=social) (Platforms/API / Advanced Components)

README

          








[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg)](https://openxlab.org.cn/apps/detail/mzr1996/AgentLego)

[![docs](https://img.shields.io/badge/docs-latest-blue)](https://agentlego.readthedocs.io/en/latest/)

[![PyPI](https://img.shields.io/pypi/v/agentlego)](https://pypi.org/project/agentlego)

[![license](https://img.shields.io/github/license/InternLM/agentlego.svg)](https://github.com/InternLM/agentlego/tree/main/LICENSE)

English | [简体中文](./README_zh-CN.md)



- [Introduction](#introduction)

- [Quick Starts](#quick-starts)

  - [Installation](#installation)

  - [Use tools directly](#use-tools-directly)

  - [Integrated into agent frameworks](#integrated-into-agent-frameworks)

- [Supported Tools](#supported-tools)

- [Licence](#licence)

## Introduction

 *AgentLego*  is an open-source library of versatile tool APIs to extend and enhance large language model (LLM) based agents, with the following highlight features:

- **Rich set of tools for multimodal extensions of LLM agents** including visual perception, image generation and editing, speech processing and visual-language reasoning, etc.

- **Flexible tool interface** that allows users to easily extend custom tools with arbitrary types of arguments and outputs.

- **Easy integration with LLM-based agent frameworks** like [LangChain](https://github.com/langchain-ai/langchain), [Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents), [Lagent](https://github.com/InternLM/lagent).

- **Support tool serving and remote accessing**, which is especially useful for tools with heavy ML models (e.g. ViT) or special environment requirements (e.g. GPU and CUDA).

https://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289006700-2140015c-b5e0-4102-bc54-9a1b4e3db9ec.mp4

# Quick Starts

## Installation

**Install the AgentLego package**

```shell

pip install agentlego

```

**Install tool-specific dependencies**

Some tools requires extra packages, please check the readme file of the tool, and confirm all requirements are

satisfied.

For example, if we want to use the `ImageDescription` tool. We need to check the **Set up** section of

[readme](agentlego/tools/image_text/README.md#ImageDescription) and install the requirements.

```bash

pip install -U openmim

mim install -U mmpretrain

```

## Use tools directly

```Python

from agentlego import list_tools, load_tool

print(list_tools())  # list tools in AgentLego

image_caption_tool = load_tool('ImageDescription', device='cuda')

print(image_caption_tool.description)

image = './examples/demo.png'

caption = image_caption_tool(image)

```

## Integrated into agent frameworks

- [**Lagent**](examples/lagent_example.py)

- [**Transformers Agent**](examples/hf_agent/hf_agent_example.py)

- [**VisualChatGPT**](examples/visual_chatgpt/visual_chatgpt.py)

# Supported Tools

**General ability**

- [Calculator](agentlego/tools/calculator/README.md): Calculate by Python interpreter.

- [GoogleSearch](agentlego/tools/search/README.md): Search on Google.

**Speech related**

- [TextToSpeech](agentlego/tools/speech_text/README.md#TextToSpeech): Speak the input text into audio.

- [SpeechToText](agentlego/tools/speech_text/README.md#SpeechToText): Transcribe an audio into text.

**Image-processing related**

- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.

- [OCR](agentlego/tools/ocr/README.md#OCR): Recognize the text from a photo.

- [VQA](agentlego/tools/vqa/README.md#VQA): Answer the question according to the image.

- [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of human in an image.

- [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Estimate the landmark or keypoints of human faces in an image.

- [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.

- [ImageToDepth](agentlego/tools/image_depth/README.md#ImageToDepth): Generate the depth image of an image.

- [ImageToScribble](agentlego/tools/image_scribble/README.md#ImageToScribble): Generate a sketch scribble of an image.

- [ObjectDetection](agentlego/tools/object_detection/README.md#ObjectDetection): Detect all objects in the image.

- [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect specific objects described by the given text in the image.

- Segment Anything series

  - [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all items in the image.

  - [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment the certain objects in the image according to the given object name.

**AIGC related**

- [TextToImage](agentlego/tools/image_text/README.md#TextToImage): Generate an image from the input text.

- [ImageExpansion](agentlego/tools/image_editing/README.md#ImageExpansion): Expand the peripheral area of an image based on its content.

- [ObjectRemove](agentlego/tools/image_editing/README.md#ObjectRemove): Remove the certain objects in the image.

- [ObjectReplace](agentlego/tools/image_editing/README.md#ObjectReplace): Replace the certain objects in the image.

- [ImageStylization](agentlego/tools/image_editing/README.md#ImageStylization): Modify an image according to the instructions.

- ControlNet series

  - [CannyTextToImage](agentlego/tools/image_canny/README.md#CannyTextToImage): Generate an image from a canny edge image and a description.

  - [DepthTextToImage](agentlego/tools/image_depth/README.md#DepthTextToImage): Generate an image from a depth image and a description.

  - [PoseToImage](agentlego/tools/image_pose/README.md#PoseToImage): Generate an image from a human pose image and a description.

  - [ScribbleTextToImage](agentlego/tools/image_scribble/README.md#ScribbleTextToImage): Generate an image from a sketch scribble image and a description.

- ImageBind series

  - [AudioToImage](agentlego/tools/imagebind/README.md#AudioToImage): Generate an image according to audio.

  - [ThermalToImage](agentlego/tools/imagebind/README.md#ThermalToImage): Generate an image according a thermal image.

  - [AudioImageToImage](agentlego/tools/imagebind/README.md#AudioImageToImage): Generate am image according to a audio and image.

  - [AudioTextToImage](agentlego/tools/imagebind/README.md#AudioTextToImage): Generate an image from a audio and text prompt.

# Licence

This project is released under the [Apache 2.0 license](LICENSE). Users should also ensure compliance with the licenses governing the models used in this project.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/InternLM/agentlego

Awesome Lists containing this project

README