{"id":31175013,"url":"https://github.com/evalops/dspy-micro-agent","last_synced_at":"2026-04-05T23:03:08.134Z","repository":{"id":313840271,"uuid":"1053107455","full_name":"evalops/dspy-micro-agent","owner":"evalops","description":"Minimal agent runtime built with DSPy modules and a thin Python loop. Includes CLI, FastAPI server, and eval harness with OpenAI/Ollama support.","archived":false,"fork":false,"pushed_at":"2025-12-22T19:47:37.000Z","size":195,"stargazers_count":65,"open_issues_count":0,"forks_count":6,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-23T17:56:52.871Z","etag":null,"topics":["agent","agent-runtime","ai","cli","dspy","fastapi","llm","ollama","openai","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/evalops.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-09-09T02:12:20.000Z","updated_at":"2025-12-22T19:47:40.000Z","dependencies_parsed_at":"2025-09-09T04:55:35.505Z","dependency_job_id":"b242157c-f893-449e-a1e6-879a5e163bdf","html_url":"https://github.com/evalops/dspy-micro-agent","commit_stats":null,"previous_names":["evalops/dspy-micro-agent"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/evalops/dspy-micro-agent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evalops%2Fdspy-micro-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/evalops%2Fdspy-micro-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evalops%2Fdspy-micro-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evalops%2Fdspy-micro-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/evalops","download_url":"https://codeload.github.com/evalops/dspy-micro-agent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evalops%2Fdspy-micro-agent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31452901,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T21:22:52.476Z","status":"ssl_error","status_checked_at":"2026-04-05T21:22:51.943Z","response_time":75,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","agent-runtime","ai","cli","dspy","fastapi","llm","ollama","openai","python"],"created_at":"2025-09-19T13:01:42.288Z","updated_at":"2026-04-05T23:03:07.540Z","avatar_url":"https://github.com/evalops.png","language":"Python","readme":"# DSPy Micro Agent\n\n[![CI](https://github.com/evalops/dspy-micro-agent/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/evalops/dspy-micro-agent/actions/workflows/ci.yml)\n\nMinimal agent runtime built with DSPy modules and a thin Python 
loop.\n- Plan/Act/Finalize expressed as DSPy `Signature`s, with OpenAI-native tool-calling when available.\n- Thin runtime (`agent.py`) handles looping, tool routing, and trace persistence.\n- CLI and FastAPI server, plus a tiny eval harness.\n\n## Quickstart\n- Python 3.10+\n- Create a virtualenv and install (using `uv`, or see pip alternative below):\n```bash\nuv venv \u0026\u0026 source .venv/bin/activate\nuv pip install -e .\ncp .env.example .env  # set OPENAI_API_KEY or configure Ollama\n\n# Ask a question (append --utc to nudge UTC use when time is relevant)\nmicro-agent ask --question \"What's 2*(3+5)?\" --utc\n\n# Run the API server\nuvicorn micro_agent.server:app --reload --port 8000\n\n# Run quick evals (repeat small dataset)\npython evals/run_evals.py --n 50\n```\n\nPip alternative:\n```bash\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install -e .\n```\n\n## Configuration\n- `.env` is loaded automatically (via `python-dotenv`).\n- Set one of the following provider configs:\n  - OpenAI (default): `OPENAI_API_KEY`, `OPENAI_MODEL` (default `gpt-4o-mini`)\n  - Ollama: `LLM_PROVIDER=ollama`, `OLLAMA_MODEL` (e.g. 
`llama3.2:1b`), `OLLAMA_HOST` (default `http://localhost:11434`)\n- Optional tuning: `TEMPERATURE` (default `0.2`), `MAX_TOKENS` (default `1024`)\n- Tool plugins: `TOOLS_MODULES=\"your_pkg.tools,other_pkg.tools\"` to load extra tools (see Tools below)\n- Traces location: `TRACES_DIR` (default `traces/`)\n- CORS (API): `MICRO_AGENT_CORS_ORIGINS` (default `*`), `MICRO_AGENT_CORS_CREDENTIALS` (default `0`)\n- Compiled demos (OpenAI planner): `COMPILED_DEMOS_PATH` (default `opt/plan_demos.json`)\n- Concurrency/usage isolation: `MICRO_AGENT_SERIALIZE` (default `1`), `MICRO_AGENT_USE_GLOBAL_TRACE` (default `1`)\n\nExamples:\n```bash\n# OpenAI\nexport OPENAI_API_KEY=...\nexport OPENAI_MODEL=gpt-4o-mini\n\n# Ollama\nexport LLM_PROVIDER=ollama\nexport OLLAMA_MODEL=llama3.2:1b\nexport OLLAMA_HOST=http://localhost:11434\n\n# CORS (allow a specific origin and enable credentials)\nexport MICRO_AGENT_CORS_ORIGINS=\"https://example.com\"\nexport MICRO_AGENT_CORS_CREDENTIALS=1\n\n# Concurrency/usage isolation\nexport MICRO_AGENT_SERIALIZE=0  # allow concurrent requests\nexport MICRO_AGENT_USE_GLOBAL_TRACE=0  # disable shared trace usage\n```\n\n## CLI\n- `micro-agent ask --question \u003ctext\u003e [--utc] [--max-steps N]`\n  - `--utc` appends a hint to prefer UTC when time is used.\n  - Saves a JSONL trace under `traces/\u003cid\u003e.jsonl` and prints the path.\n- `micro-agent replay --path traces/\u003cid\u003e.jsonl [--index -1]`\n  - Pretty-prints a saved record from the JSONL file.\n\nExamples:\n```bash\nmicro-agent ask --question \"Add 12345 and 67890, then show the current date (UTC).\" --utc\nmicro-agent ask --question \"Compute (7**2 + 14)/5 and explain briefly.\" --max-steps 4\nmicro-agent replay --path traces/\u003cid\u003e.jsonl --index -1\n```\n\n## HTTP API\n- Start: `uvicorn micro_agent.server:app --reload --port 8000`\n- Endpoint: `POST /ask`\n  - Request JSON: `{ \"question\": \"...\", \"max_steps\": 6, \"use_tool_calls\": bool? 
}`\n  - Response JSON: `{ \"answer\": str, \"trace_id\": str, \"trace_path\": str, \"steps\": [...], \"usage\": {...}, \"cost_usd\": number }`\n  - Health: `GET /healthz` (ok), `GET /health` (provider/model), `GET /version` (package version)\n\nExample:\n```bash\ncurl -s http://localhost:8000/ask \\\n  -H 'content-type: application/json' \\\n  -d '{\"question\":\"What is 2*(3+5)?\",\"max_steps\":6}' | jq .\n```\n\nOpenAPI:\n- FastAPI publishes `/openapi.json` and interactive docs at `/docs`.\n- Schemas reflect `AskRequest` and `AskResponse` models in `micro_agent/server.py`.\n\n### API Examples\n- Ask, capture `trace_id`, then fetch the full trace by id:\n```bash\nRESP=$(curl -s http://localhost:8000/ask \\\n  -H 'content-type: application/json' \\\n  -d '{\"question\":\"Add 12345 and 67890, then UTC time.\",\"max_steps\":6}')\necho \"$RESP\" | jq .\nTID=$(echo \"$RESP\" | jq -r .trace_id)\ncurl -s http://localhost:8000/trace/$TID | jq .\n```\n\n- Replay the saved JSONL locally using the CLI (last record by default index -1):\n```bash\nmicro-agent replay --path traces/$TID.jsonl --index -1\n```\n\n## Logging\n- Controlled via `MICRO_AGENT_LOG` (debug|info|warning|error). Default: `INFO`.\n- Applies to both CLI and server.\n\n## Tools\n- Built-ins live in `micro_agent/tools.py`:\n  - `calculator`: safe expression evaluator. Supports `+ - * / ** % // ( )` and `!` via rewrite to `fact(n)`.\n  - `now`: current timestamp; `{timezone: \"utc\"|\"local\"}` (default local).\n- Each tool is defined as:\n```\nTool(\n  \"name\",\n  \"description\",\n  {\"type\":\"object\",\"properties\":{...},\"required\":[...]},\n  handler_function,\n)\n```\n- Plugins: set `TOOLS_MODULES` to a comma-separated list of importable modules. 
Each module should expose either a `TOOLS: dict[str, Tool]` or a `get_tools() -\u003e dict[str, Tool]`.\n\n### Runtime validation\n- Tool args are validated against the JSON Schema before execution; invalid args add a `⛔️validation_error` step and the agent requests a correction in the next loop. See `micro_agent/tools.py` (run_tool) and `micro_agent/agent.py` (validation error handling).\n\n### Calculator limits\n- Factorial capped at 12; exponent size bounded; AST node count limited; large magnitudes rejected to prevent runaway compute. Only a small set of arithmetic nodes is allowed.\n\n## Provider Modes\n- OpenAI: uses DSPy `PlanWithTools` with `JSONAdapter` to enable native function-calls. The model may return `tool_calls` or a `final` answer; tool calls are executed via our registry.\n- Others (e.g., Ollama): uses a robust prompt with few-shot JSON decision demos. Decisions are parsed with strict JSON; on failure we try `json_repair` (if installed) and Python-literal parsing.\n- Policy enforcement: if the question implies math, the agent requires a `calculator` step before finalizing; likewise for time/date with the `now` tool. 
Violations are recorded in the trace as `⛔️policy_violation` steps and planning continues.\n\n### Code references (discoverability)\n- Replay subcommand: `micro_agent/cli.py` (subparser `replay`, printing JSONL)\n- Policy enforcement markers: `micro_agent/agent.py` (look for `⛔️policy_violation` and `⛔️validation_error`)\n- Provider fallback and configuration: `micro_agent/config.py` (`configure_lm` tries Ollama → OpenAI → registry fallbacks)\n- JSON repair in decision parsing: `micro_agent/runtime.py` (`parse_decision_text` uses `json_repair` if available)\n\n## Tracing\n- Each run appends a record to `traces/\u003cid\u003e.jsonl` with fields: `id`, `ts`, `question`, `steps`, `answer`.\n- Steps are `{tool, args, observation}` in order of execution.\n- Replay: `micro-agent replay --path traces/\u003cid\u003e.jsonl --index -1`.\n- Fetch by id (HTTP): `GET /trace/{id}` (CORS enabled).\n\n## Evals\n- Dataset: `evals/tasks.yaml` (small, mixed math/time tasks). Rubric: `evals/rubrics.yaml`.\n- Run: `python evals/run_evals.py --n 50`.\n- Metrics printed: `success_rate`, `avg_latency_sec`, `avg_lm_calls`, `avg_tool_calls`, `avg_steps`, `avg_cost_usd`, `n`.\n- Scoring supports both `expect_contains` (answer substring) and `expect_key` (key present in any tool observation). Weights come from `rubrics.yaml` (`contains_weight`, `key_weight`).\n\n### Before/After Compiled Demos (OpenAI)\n- Model: `gpt-4o-mini`, N=30\n- Before (no demos): success_rate 1.00; avg_latency_sec ~0.188; avg_lm_calls 3.33; avg_tool_calls 1.17; avg_steps 3.17\n- After (compiled demos loaded): success_rate 1.00; avg_latency_sec ~0.188; avg_lm_calls 3.33; avg_tool_calls 1.17; avg_steps 3.17\nNotes: For this small dataset, demos neither help nor hurt. For larger flows, compile demos from your real tasks.\n\n### Cost \u0026 Tokens\n- The agent aggregates token counts and cost. 
If provider usage isn't exposed, it estimates tokens from prompts/outputs and computes cost using configured prices.\n- Set env prices for OpenAI models (USD per 1K tokens):\n```bash\nexport OPENAI_INPUT_PRICE_PER_1K=0.005  # example\nexport OPENAI_OUTPUT_PRICE_PER_1K=0.015 # example\n```\nDefaults: for OpenAI models, built-in prices are used if the env vars aren't set (best-effort):\n- gpt-4o-mini: $0.00015 in / $0.0006 out per 1K tokens\n- gpt-4o (and 4.1): $0.005 in / $0.015 out per 1K tokens\nYou can override via the env vars above. Evals print `avg_cost_usd`.\n\n## Optimize (Teleprompting)\n- Compile optimized few-shot demos for the OpenAI `PlanWithTools` planner and save to JSON:\n```bash\nmicro-agent optimize --n 12 --tasks evals/tasks.yaml --save opt/plan_demos.json\n```\n- Apply compiled demos automatically by placing them at the default path or setting:\n```bash\nexport COMPILED_DEMOS_PATH=opt/plan_demos.json\n```\n- Optional: print a DSPy teleprompting template (for notebooks):\n```bash\nmicro-agent optimize --n 12 --template\n```\nThe agent loads these demos on OpenAI providers and attaches them to the `PlanWithTools` predictor to improve tool selection and output consistency.\n\n## Architecture\n- `micro_agent/config.py`: configures DSPy LM. Tries Ollama first if requested, else OpenAI; supports `dspy.Ollama`, `dspy.OpenAI`, and registry fallbacks like `dspy.LM(\"openai/\u003cmodel\u003e\")`.\n- `micro_agent/signatures.py`: DSPy `Signature`s for plan/act/finalize and OpenAI tool-calls.\n- `micro_agent/agent.py`: the runtime loop (~100 LOC). 
Builds a JSON decision prompt, executes tools, enforces policy, and finalizes.\n- `micro_agent/runtime.py`: trace format, persistence, and robust JSON decision parsing utilities.\n- `micro_agent/cli.py`: CLI entry (`micro-agent`).\n- `micro_agent/server.py`: FastAPI app exposing `POST /ask`.\n- `evals/`: tiny harness to sample tasks, capture metrics, and save traces.\n\n## Development\n- Make targets: `make init`, `make run`, `make serve`, `make evals`, `make test`.\n- Tests: `pytest -q` (note: tests are minimal and do not cover all paths).\n\n## Docker\n- Build: `make docker-build`\n- Run (OpenAI): `OPENAI_API_KEY=... make docker-run` (maps `:8000`)\n- Run (Ollama on host): `make docker-run-ollama` (uses `host.docker.internal:11434`)\n- Env (OpenAI): `OPENAI_API_KEY`, `OPENAI_MODEL=gpt-4o-mini`\n- Env (Ollama): `LLM_PROVIDER=ollama`, `OLLAMA_HOST=http://host.docker.internal:11434`, `OLLAMA_MODEL=llama3.1:8b`\n- Service: `POST http://localhost:8000/ask` and `GET /trace/{id}`\n\n## Compatibility Notes\n- DSPy is required at `dspy-ai\u003e=2.5.0` (a minimum version, not an exact pin). 
Some adapters (e.g., `JSONAdapter`, `dspy.Ollama`) may vary across versions; the code tries multiple backends and falls back to generic registry forms when needed.\n- If `json_repair` is installed, it is used opportunistically to salvage slightly malformed JSON decisions.\n  - Optional install: `pip install -e .[repair]`\n\n## Limitations and Next Steps\n- Usage/cost capture is best-effort: exact numbers depend on provider support; otherwise the agent estimates from text.\n- The finalization step often composes the answer from tool results for reliability; you can swap in a DSPy `Finalize` predictor if preferred.\n- Add persistence to a DB instead of JSONL by replacing `dump_trace`.\n- Add human-in-the-loop, budgets, retries, or branching per your needs.\n\n## Objective\nProve: an “agent” can be expressed as DSPy modules plus a thin runtime loop.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevalops%2Fdspy-micro-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevalops%2Fdspy-micro-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevalops%2Fdspy-micro-agent/lists"}