An open API service indexing awesome lists of open source software.

https://github.com/andreascansee/ollama-scout

Minimalist local AI agent that uses tool calls to query and reason about Ollama models β€” just pure Python and prompts
https://github.com/andreascansee/ollama-scout

agent function-calling langchain-free llm ollama tool-use

Last synced: about 2 months ago
JSON representation

Minimalist local AI agent that uses tool calls to query and reason about Ollama models β€” just pure Python and prompts

Awesome Lists containing this project

README

          

# πŸ›°οΈ Ollama Scout

`ollama-scout` is a minimalist AI agent built from scratch that uses a local model (here, `qwen2.5:14b`) to explore and answer questions about other models in the Ollama library.

It demonstrates how to extend LLM capabilities through simple tool-use, without relying on any complex frameworks.

## ✨ Motivation

I wanted to understand how tool-using agents work β€” so I built one as **purely** as possible: No LangChain, no plugins, no hidden abstractions. Just structured prompts, simple tool calls, and a local model.

To make the experiment meaningful, I picked a real-world weakness of LLMs: they often give **outdated or hallucinated** answers when asked about other LLMs.

**Example**:

Asking `qwen2.5:14b`:

> *What is DeepSeek?*

Returns something like:

> *As of my current knowledge, there isn't an official product or service called "DeepSeek" from Alibaba Cloud or any other major technology company.*

Which is **wrong** β€” DeepSeek is a real and actively developed family of open-source reasoning models.

> 🎯 So the goal became to give a local model **accurate, up-to-date knowledge** about a focused topic β€” in this case: **an older LLM that can reason about other, current LLMs (from the Ollama library)**.

## πŸ€– What `ollama-scout` Does

Instead of relying on outdated or incomplete training data, `ollama-scout` augments a local model with live tooling:

- 🧭 **`search_ollama_models`** β€” queries the Ollama model library for relevant entries
- πŸ“„ **`fetch_ollama_metadata`** β€” scrapes detailed metadata from a model's info page

The agent runs a multi-step reasoning loop:

1. Starts with a user question (e.g. *What is DeepSeek?*)
2. Chooses tools and arguments on its own β€” no hardcoded sequence
3. Builds a final answer from all collected information

> πŸ” This allows a relatively lightweight modelβ€”such as `qwen2.5:14b` with 14B parameters (~8β€―GB), in contrast to massive models like GPT-4, which may have up to 1T parametersβ€”to perform grounded reasoning using up-to-date external sources.

## 🧾 Example Output

**User input**:
> *What is DeepSeek?*

**Final answer**:
> *DeepSeek is a family of open-source reasoning models developed by the DeepSeek team, known for their strong performance in tasks such as mathematics, programming, and general logic...*

Despite `qwen2.5:14b` having no built-in knowledge of DeepSeek, the agent was able to:
1. Run `search_ollama_models("deepseek")` to discover related models and URLs like `deepseek-r1 β†’ https://ollama.com/library/deepseek-r1`
2. Use `fetch_ollama_metadata("https://ollama.com/library/deepseek-r1")` to retrieve and parse live information
3. Synthesize a coherent answer β€” entirely using local inference

> βš™οΈ This demonstrates how small LLMs can be extended with real-time tools to provide current answers.

## πŸ›  Tech Stack
- **Python 3.13**
- Standard libraries, plus `BeautifulSoup` for lightweight HTML parsing (used by fetch tools)
- **Ollama (Local LLM Runtime)**
- Install via [https://ollama.com/download](https://ollama.com/download) or your system's package manager
- This project uses **Qwen2.5:14B**, a relatively small (β‰ˆ8GB) but capable model with native support for function calling (tool use)
- The model strikes a good balance between reasoning ability and runtime efficiency on consumer hardware

> ⚠️ You must download the model before use:
> ```bash
> ollama pull qwen2.5:14b
> ```

## πŸ—‚οΈ Project Overview

```text
.
β”œβ”€β”€ agent
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ engine.py
β”‚ β”œβ”€β”€ llm
β”‚ β”‚ β”œβ”€β”€ model.py
β”‚ β”‚ └── prompts.py
β”‚ └── tooling
β”‚ β”œβ”€β”€ parser.py
β”‚ └── runner.py
β”œβ”€β”€ configs
β”‚ └── ollama_tools.json
β”œβ”€β”€ tests
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ test_fetch.py
β”‚ └── test_search.py
β”œβ”€β”€ tools
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ base.py
β”‚ β”œβ”€β”€ fetch.py
β”‚ └── search.py
β”œβ”€β”€ README.md
β”œβ”€β”€ requirements.txt
└── run_agent.py
```

### ▢️ Entry Point

- [`run_agent.py`](./run_agent.py) is the entry point of the application
- It loads tool definitions from [`ollama_tools.json`](./configs/ollama_tools.json) β€” which are used **only for prompt construction**, not for tool execution logic β€” and executes the main agent logic via the `run_agent_loop` function in [`engine.py`](./agent/engine.py)

### πŸ” Agent Reasoning Engine

- Defined in [`engine.py`](./agent/engine.py), the loop performs multi-step reasoning
- Prompts are sent to the model step by step, with a **maximum of 4 iterations**:
- One **initial search**
- Up to **three metadata fetches**
> 🧠 The model can stop earlier if it decides it already has enough information to answer confidently. Not all steps are required
- Each model response is inspected for a `{...}` block
- If no such block is found, the response is returned as-is and the loop ends
- If a tool call is found:
- It is **parsed** using `extract_tool_call()` from [`parser.py`](./agent/tooling/parser.py)
- Then **executed** using `dispatch_tool_call()` from [`runner.py`](./agent/tooling/runner.py)
- **Duplicate tool calls** (same tool name + same arguments) are **skipped** to avoid redundancy
- The **result of each tool call** is fed back into the next prompt using `build_followup_prompt()` β€” so the model builds context with every step
- After the loop completes, a **final prompt** is sent to produce a natural-language answer from all collected tool results

### 🧰 Tools

- Tools are located in the [`tools/`](./tools/) directory
- Each tool is implemented as a class
- `SearchOllamaModels` in [`search.py`](./tools/search.py)
- `FetchOllamaMetadata` in [`fetch.py`](./tools/fetch.py)
- All tools must inherit from the `Tool` base class defined in [`base.py`](./tools/base.py) and implement the following:
- A `name` property used for dispatch
- A `run()` method, which encapsulates the tool's logic
- This shared interface allows all tools to be executed generically via `tool.run(args)` without the engine needing to know their internal behavior

### πŸͺͺ Tool Registry

- All tools must be registered in the `TOOL_REGISTRY` dictionary in [`runner.py`](./agent/tooling/runner.py)
- This file is responsible for:
- Instantiating all available tool classes
- Exposing the `dispatch_tool_call()` function, which dispatches tool calls by name and passes in arguments

### πŸ›ŽοΈ Prompt Templates

- Defined in [`prompts.py`](./agent/llm/prompts.py), these templates guide the agent's reasoning across three stages:
- 🟒 **Initial prompt** β€” introduces the model to the user query and available tools
- πŸ”„ **Follow-up prompts** β€” injected after each tool call and include all previous tool outputs
- 🏁 **Final prompt** β€” instructs the model to compose a full, natural-language answer based on all collected information

### βš™οΈ Model Config

- Model configuration is handled in [`model.py`](./agent/llm/model.py)
- By default, the agent queries a locally running Ollama server via HTTP using the `/api/generate` endpoint

## πŸš€ Getting Started

1. First, set up a virtual environment (recommended):

```bash
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Run the agent:

```bash
python run_agent.py
```

## πŸ€Ήβ€β™‚οΈ Experimentation
You can easily extend or modify the agent's behavior:

- **Prompts**: Customize how the agent thinks by editing [`prompts.py`](./agent/llm/prompts.py).

- **Model**: Change the underlying model in [`model.py`](./agent/llm/model.py) by modifying the `MODEL_NAME` variable.

- **Add your own tools**:
1. **Define**: Create a new Python class in [`tools/`](./tools/) that inherits from `Tool` in [`base.py`](./tools/base.py). Put your logic in the `run` method.
2. **Register**: Import and register the tool class in [`runner.py`](./agent/tooling/runner.py), so the engine can call it when needed.
3. **Describe**: Add the new tool definition to [`ollama_tools.json`](./configs/ollama_tools.json), so the model is aware of it and can use it in tool calls.

## πŸ§ͺ Testing
You can test the individual tools independently of the agent loop:

```bash
python -m tests.test_search # Runs the search tool, saves results to `ollama_models/`
python -m tests.test_fetch # Runs the fetch tool, saves pages to `ollama_docs/`
```
> ℹ️ Using `-m` runs the module as a script, which ensures proper resolution of relative imports inside the tests package.

This can be useful for debugging tool logic or collecting fresh data.

## βš™οΈ Alternative: Use subprocess (no HTTP API)
If you want to avoid running the Ollama HTTP server, you can call the model directly via the command line using `subprocess`.
> Note: This avoids the HTTP API, but **Ollama still needs to be installed locally and accessible via CLI**.

To switch to this mode, replace the contents of [`model.py`](./agent/llm/model.py) with the following:

```python
import subprocess

def query_model(prompt: str) -> str:
result = subprocess.run(
["ollama", "run", "qwen2.5:14b"],
input=prompt,
text=True,
capture_output=True
)
return result.stdout.strip()
```