{"id":45867132,"url":"https://github.com/alloy-ex/alloy","last_synced_at":"2026-04-01T22:56:34.975Z","repository":{"id":340701390,"uuid":"1167226411","full_name":"alloy-ex/alloy","owner":"alloy-ex","description":"Model-agnostic agent harness for Elixir","archived":false,"fork":false,"pushed_at":"2026-03-03T09:28:18.000Z","size":306,"stargazers_count":21,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-03-03T10:42:02.921Z","etag":null,"topics":["agent-framework","ai","ai-agents","anthropic","elixir","gemini","genserver","llm","openai","otp"],"latest_commit_sha":null,"homepage":null,"language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alloy-ex.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-26T04:18:38.000Z","updated_at":"2026-03-02T19:19:56.000Z","dependencies_parsed_at":"2026-03-03T09:02:28.342Z","dependency_job_id":null,"html_url":"https://github.com/alloy-ex/alloy","commit_stats":null,"previous_names":["alloy-ex/alloy"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/alloy-ex/alloy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alloy-ex%2Falloy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alloy-ex%2Falloy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alloy-ex%2Falloy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alloy-ex%2Falloy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alloy-ex","download_url":"https://codeload.github.com/alloy-ex/alloy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alloy-ex%2Falloy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30071903,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T03:25:38.285Z","status":"ssl_error","status_checked_at":"2026-03-04T03:25:05.086Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-framework","ai","ai-agents","anthropic","elixir","gemini","genserver","llm","openai","otp"],"created_at":"2026-02-27T09:00:12.448Z","updated_at":"2026-04-01T22:56:34.962Z","avatar_url":"https://github.com/alloy-ex.png","language":"Elixir","funding_links":[],"categories":["Agent Frameworks"],"sub_categories":[],"readme":"# Alloy\n\n[![Hex.pm](https://img.shields.io/hexpm/v/alloy.svg)](https://hex.pm/packages/alloy)\n[![CI](https://github.com/alloy-ex/alloy/actions/workflows/ci.yml/badge.svg)](https://github.com/alloy-ex/alloy/actions/workflows/ci.yml)\n[![Docs](https://img.shields.io/badge/hex-docs-blue.svg)](https://hexdocs.pm/alloy)\n[![License](https://img.shields.io/hexpm/l/alloy.svg)](LICENSE)\n\n**Minimal, OTP-native agent loop for Elixir.**\n\nAlloy is the completion-tool-call loop and nothing else. Send messages to any LLM, execute tool calls, loop until done. Swap providers with one line. Run agents as supervised GenServers. No opinions on sessions, persistence, memory, scheduling, or UI — those belong in your application.\n\n```elixir\n{:ok, result} = Alloy.run(\"Read mix.exs and tell me the version\",\n  provider: {Alloy.Provider.OpenAI, api_key: System.get_env(\"OPENAI_API_KEY\"), model: \"gpt-5.4\"},\n  tools: [Alloy.Tool.Core.Read]\n)\n\nresult.text #=\u003e \"The version is 0.9.0\"\n```\n\n## Why Alloy?\n\nMost agent frameworks try to be everything — sessions, memory, RAG, multi-agent orchestration, scheduling, UI. Alloy does one thing well: the agent loop. Inspired by [Pi Agent](https://github.com/badlogic/pi-mono)'s minimalism, Alloy brings the same philosophy to the BEAM with OTP's natural advantages: supervision, fault isolation, parallel tool execution, and real concurrency.\n\n- **3 providers** — Anthropic, OpenAI, and OpenAICompat (works with any OpenAI-compatible API: Ollama, OpenRouter, xAI, DeepSeek, Mistral, Groq, Together, etc.)\n- **4 built-in tools** — read, write, edit, bash\n- **GenServer agents** — supervised, stateful, message-passing\n- **Streaming** — token-by-token from any provider, unified interface\n- **Async dispatch** — `send_message/2` fires non-blocking, result arrives via PubSub\n- **Middleware** — custom hooks, tool blocking\n- **Context compaction** — summary-based compaction when approaching token limits, with configurable reserve and fallback to truncation\n- **Prompt caching** — Anthropic `cache: true` adds cache breakpoints for 60-90% input token savings\n- **Reasoning blocks** — DeepSeek/xAI `reasoning_content` parsed as first-class thinking blocks\n- **Provider passthrough** — `extra_body` injects arbitrary provider-specific params (response_format, temperature, reasoning_effort)\n- **Telemetry** — run, turn, provider, and compaction lifecycle events for OTEL/logging/metrics\n- **Cost guard** — `max_budget_cents` halts the loop before overspending\n- **OTP-native** — supervision trees, hot code reloading, real parallel tool execution\n- **~5,000 lines** — small enough to read, understand, and extend\n\n## Design Boundary\n\nAlloy stays minimal by owning protocol and loop concerns, not application\nworkflows.\n\nWhat belongs in Alloy:\n- Provider wire-format translation\n- Tool-call / completion loop mechanics\n- Normalized message blocks\n- Opaque provider-owned state such as stored response IDs\n- Provider response metadata such as citations or server-side tool telemetry\n\nWhat does not belong in Alloy:\n- Sessions and persistence policy\n- File storage, indexing, or retrieval workflows\n- UI rendering for citations, search, or artifacts\n- Scheduling, background job orchestration, or dashboards\n- Tenant plans, quotas, billing, or hosted infrastructure policy\n\nRule of thumb: if the feature is required to speak a provider API correctly,\nand could help any Alloy consumer, it likely belongs here. If it needs a\ndatabase table, product defaults, UI decisions, or tenancy logic, it belongs in\nyour application layer.\n\n## Installation\n\nAdd `alloy` to your dependencies in `mix.exs`:\n\n```elixir\ndef deps do\n  [\n    {:alloy, \"~\u003e 0.9\"}\n  ]\nend\n```\n\n## Quick Start\n\n### Simple completion\n\n```elixir\n{:ok, result} = Alloy.run(\"What is 2+2?\",\n  provider: {Alloy.Provider.Anthropic, api_key: \"sk-ant-...\", model: \"claude-sonnet-4-6\"}\n)\n\nresult.text #=\u003e \"4\"\n```\n\n### Agent with tools\n\n```elixir\n{:ok, result} = Alloy.run(\"Read mix.exs and summarize the dependencies\",\n  provider: {Alloy.Provider.OpenAICompat,\n    api_url: \"https://generativelanguage.googleapis.com\",\n    chat_path: \"/v1beta/openai/chat/completions\",\n    api_key: \"...\", model: \"gemini-2.5-flash-lite\"},\n  tools: [Alloy.Tool.Core.Read, Alloy.Tool.Core.Bash],\n  max_turns: 10\n)\n```\n\nGemini model IDs Alloy now budgets for include `gemini-2.5-pro`,\n`gemini-2.5-flash`, `gemini-2.5-flash-lite`, `gemini-3-pro-preview`, and\n`gemini-3-flash-preview`.\n\n### Swap providers in one line\n\n```elixir\n# The same tools and conversation work with any provider\nopts = [tools: [Alloy.Tool.Core.Read], max_turns: 10]\n\n# Anthropic\nAlloy.run(\"Read mix.exs\", [{:provider, {Alloy.Provider.Anthropic, api_key: \"...\", model: \"claude-sonnet-4-6\"}} | opts])\n\n# OpenAI\nAlloy.run(\"Read mix.exs\", [{:provider, {Alloy.Provider.OpenAI, api_key: \"...\", model: \"gpt-5.4\"}} | opts])\n\n# xAI via Responses-compatible API\nAlloy.run(\"Read mix.exs\", [{:provider, {Alloy.Provider.OpenAI, api_key: \"...\", api_url: \"https://api.x.ai\", model: \"grok-4\"}} | opts])\n\n# xAI via chat completions (reasoning models, extra_body)\nAlloy.run(\"Read mix.exs\", [{:provider, {Alloy.Provider.OpenAICompat, api_key: \"...\", api_url: \"https://api.x.ai\", model: \"grok-4.1-fast-reasoning\"}} | opts])\n\n# Any OpenAI-compatible API (Ollama, OpenRouter, DeepSeek, Mistral, Groq, etc.)\nAlloy.run(\"Read mix.exs\", [{:provider, {Alloy.Provider.OpenAICompat, api_url: \"http://localhost:11434\", model: \"llama4\"}} | opts])\n```\n\n### Streaming\n\nFor a one-shot run, use `Alloy.stream/3`:\n\n```elixir\n{:ok, result} =\n  Alloy.stream(\"Explain OTP\", fn chunk -\u003e\n    IO.write(chunk)\n  end,\n    provider: {Alloy.Provider.OpenAI, api_key: \"...\", model: \"gpt-5.4\"}\n  )\n```\n\nFor a persistent agent process with conversation state, use `Alloy.Agent.Server.stream_chat/4`:\n\n```elixir\n{:ok, agent} = Alloy.Agent.Server.start_link(\n  provider: {Alloy.Provider.OpenAI, api_key: \"...\", model: \"gpt-5.4\"},\n  tools: [Alloy.Tool.Core.Read]\n)\n\n{:ok, result} = Alloy.Agent.Server.stream_chat(agent, \"Explain OTP\", fn chunk -\u003e\n  IO.write(chunk)  # Print each token as it arrives\nend)\n```\n\nAll providers support streaming. If a custom provider doesn't implement\n`stream/4`, the turn loop falls back to `complete/3` automatically.\n\n`Alloy.run/2` remains the buffered convenience API. Use `Alloy.stream/3`\nwhen you want the same one-shot flow with token streaming.\n\n### Provider-owned state\n\nSome provider APIs expose server-side state such as stored response IDs.\nThat transport concern lives in Alloy; your app decides whether and how to\npersist it.\n\nResults expose provider-owned state in `result.metadata.provider_state`:\n\n```elixir\n{:ok, result} =\n  Alloy.run(\"Read the repo\",\n    provider: {Alloy.Provider.OpenAI,\n      api_key: System.get_env(\"XAI_API_KEY\"),\n      api_url: \"https://api.x.ai\",\n      model: \"grok-4\",\n      store: true\n    }\n  )\n\nprovider_state = result.metadata.provider_state\n```\n\nPass that state back to the same provider on the next turn to continue a\nprovider-native conversation:\n\n```elixir\n{:ok, next_result} =\n  Alloy.run(\"Keep going\",\n    messages: result.messages,\n    provider: {Alloy.Provider.OpenAI,\n      api_key: System.get_env(\"XAI_API_KEY\"),\n      api_url: \"https://api.x.ai\",\n      model: \"grok-4\",\n      provider_state: provider_state\n    }\n  )\n```\n\n### Provider-native tools and citations\n\nResponses-compatible providers can expose built-in server-side tools without\nleaking those wire details into your app layer.\n\nFor xAI search tools:\n\n```elixir\n{:ok, result} =\n  Alloy.run(\"Summarise the latest xAI docs updates\",\n    provider: {Alloy.Provider.OpenAI,\n      api_key: System.get_env(\"XAI_API_KEY\"),\n      api_url: \"https://api.x.ai\",\n      model: \"grok-4\",\n      web_search: %{allowed_domains: [\"docs.x.ai\"]},\n      include: [\"inline_citations\"]\n    }\n  )\n```\n\nCitation metadata is exposed in two places:\n- `result.metadata.provider_response.citations` for provider-level citation data\n- assistant text blocks may include `:annotations` for inline citation spans\n\n### Overriding model metadata\n\nAlloy derives the compaction budget from the configured provider model when it\nknows that model's context window. If you need to support a just-released model\nbefore Alloy ships a catalog update, override it in config:\n\n```elixir\n{:ok, result} = Alloy.run(\"Summarise this repository\",\n  provider: {Alloy.Provider.OpenAI, api_key: \"...\", model: \"gpt-5.4-2026-03-05\"},\n  model_metadata_overrides: %{\n    \"gpt-5.4\" =\u003e 900_000,\n    \"acme-reasoner\" =\u003e %{limit: 640_000, suffix_patterns: [\"\", ~r/^-\\d{4}\\.\\d{2}$/]}\n  }\n)\n```\n\nSet `max_tokens` explicitly when you want a fixed compaction budget. Otherwise\nAlloy derives it from the current model, including after\n`Alloy.Agent.Server.set_model/2` switches to a different provider model.\n\nUse `compaction:` when you want to tune how much room Alloy reserves before it\nsummarizes older context:\n\n```elixir\n{:ok, result} = Alloy.run(\"Summarise this repository\",\n  provider: {Alloy.Provider.OpenAI, api_key: \"...\", model: \"gpt-5.4\"},\n  compaction: [\n    reserve_tokens: 12_000,\n    keep_recent_tokens: 8_000,\n    fallback: :truncate\n  ]\n)\n```\n\n### Cost guard\n\nCap how much an agent run can spend:\n\n```elixir\n{:ok, result} = Alloy.run(\"Research this codebase thoroughly\",\n  provider: {Alloy.Provider.Anthropic, api_key: \"...\", model: \"claude-sonnet-4-6\"},\n  tools: [Alloy.Tool.Core.Read, Alloy.Tool.Core.Bash],\n  max_budget_cents: 50\n)\n\ncase result.status do\n  :completed -\u003e IO.puts(result.text)\n  :budget_exceeded -\u003e IO.puts(\"Stopped: spent #{result.usage.estimated_cost_cents}¢\")\nend\n```\n\nSet `max_budget_cents: nil` (default) for no limit.\n\n### Anthropic prompt caching\n\nEnable prompt caching to save 60-90% on input tokens. Alloy automatically adds\n`cache_control` breakpoints to the system prompt and last tool definition:\n\n```elixir\n{:ok, result} = Alloy.run(\"Explain this codebase\",\n  provider: {Alloy.Provider.Anthropic,\n    api_key: \"...\", model: \"claude-sonnet-4-6\",\n    cache: true\n  },\n  tools: [Alloy.Tool.Core.Read, Alloy.Tool.Core.Bash],\n  system_prompt: \"You are a senior Elixir developer.\"\n)\n\n# Cache usage is reported in result.usage\nresult.usage.cache_creation_input_tokens  #=\u003e 1500\nresult.usage.cache_read_input_tokens      #=\u003e 1500  (on subsequent calls)\n```\n\n### Reasoning model support (DeepSeek, xAI)\n\nOpenAI-compatible reasoning models that return `reasoning_content` (DeepSeek-R1,\nxAI Grok reasoning variants) are automatically parsed into thinking blocks:\n\n```elixir\n{:ok, result} = Alloy.run(\"Solve this step by step\",\n  provider: {Alloy.Provider.OpenAICompat,\n    api_url: \"https://api.x.ai\",\n    api_key: \"...\", model: \"grok-4.1-fast-reasoning\"\n  }\n)\n\n# Thinking blocks are preserved in message content\n[thinking, text] = hd(result.messages).content\nthinking.type     #=\u003e \"thinking\"\nthinking.thinking #=\u003e \"Step 1: Let me consider...\"\ntext.type         #=\u003e \"text\"\ntext.text         #=\u003e \"The answer is 42.\"\n```\n\n### Provider-specific parameters (extra_body)\n\nPass arbitrary provider-specific parameters via `extra_body`. It merges last,\nso it can override any default field:\n\n```elixir\n{:ok, result} = Alloy.run(\"Return JSON\",\n  provider: {Alloy.Provider.OpenAICompat,\n    api_url: \"https://api.deepseek.com\",\n    api_key: \"...\", model: \"deepseek-chat\",\n    extra_body: %{\n      \"response_format\" =\u003e %{\"type\" =\u003e \"json_object\"},\n      \"temperature\" =\u003e 0.3\n    }\n  }\n)\n```\n\nWorks for any provider param: `reasoning_effort`, `max_completion_tokens`,\n`presence_penalty`, etc.\n\n### Telemetry\n\nAlloy emits telemetry events for observability. Attach handlers for OTEL,\nlogging, or custom metrics:\n\n```elixir\n:telemetry.attach_many(\"my-handler\", [\n  [:alloy, :run, :start],\n  [:alloy, :run, :stop],\n  [:alloy, :turn, :start],\n  [:alloy, :turn, :stop],\n  [:alloy, :provider, :request],\n  [:alloy, :compaction, :done],\n  [:alloy, :tool, :start],\n  [:alloy, :tool, :stop],\n  [:alloy, :event]\n], \u0026MyApp.Telemetry.handle_event/4, nil)\n```\n\n| Event | Measurements | Metadata |\n|-------|-------------|----------|\n| `[:alloy, :run, :start]` | `system_time` | `model` |\n| `[:alloy, :run, :stop]` | `duration_ms` | `status`, `turns`, `model` |\n| `[:alloy, :turn, :start]` | `system_time` | `turn` |\n| `[:alloy, :turn, :stop]` | — | `turn`, `status` |\n| `[:alloy, :provider, :request]` | `duration_ms` | `provider`, `model`, `streaming`, `attempt`, `result` |\n| `[:alloy, :compaction, :done]` | `messages_before`, `messages_after` | `turn` |\n| `[:alloy, :tool, :start]` | — | tool identity, correlation |\n| `[:alloy, :tool, :stop]` | `duration_ms` | tool identity, result |\n\n### Supervised GenServer agent\n\n```elixir\n{:ok, agent} = Alloy.Agent.Server.start_link(\n  provider: {Alloy.Provider.Anthropic, api_key: \"...\", model: \"claude-sonnet-4-6\"},\n  tools: [Alloy.Tool.Core.Read, Alloy.Tool.Core.Edit, Alloy.Tool.Core.Bash],\n  system_prompt: \"You are a senior Elixir developer.\"\n)\n\n{:ok, response} = Alloy.Agent.Server.chat(agent, \"What does this project do?\")\n{:ok, response} = Alloy.Agent.Server.chat(agent, \"Now refactor the main module\")\n```\n\n### Async dispatch (Phoenix LiveView)\n\nFire a message without blocking the caller — ideal for LiveView and background jobs:\n\n```elixir\n# Subscribe to receive the result\nPhoenix.PubSub.subscribe(MyApp.PubSub, \"agent:#{session_id}:responses\")\n\n# Returns {:ok, request_id} immediately — agent works in the background\n{:ok, req_id} = Alloy.Agent.Server.send_message(agent, \"Summarise this report\",\n  request_id: \"req-123\"\n)\n\n# Handle the result whenever it arrives\ndef handle_info({:agent_response, %{text: text, request_id: \"req-123\"}}, socket) do\n  {:noreply, assign(socket, :response, text)}\nend\n```\n\n## Providers\n\n| Vendor | Recommended Module | Example Models |\n|--------|---------------------|----------------|\n| Anthropic | `Alloy.Provider.Anthropic` | `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5` |\n| OpenAI | `Alloy.Provider.OpenAI` | `gpt-5.4` |\n| xAI | `Alloy.Provider.OpenAI` with `api_url: \"https://api.x.ai\"` | `grok-4`, `grok-4.1-fast`, `grok-4-fast-reasoning`, `grok-code-fast-1` |\n| Gemini | `Alloy.Provider.OpenAICompat` | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-lite`, `gemini-3-pro-preview` |\n| Other OpenAI-compatible APIs | `Alloy.Provider.OpenAICompat` | Ollama, OpenRouter, DeepSeek, Mistral, Groq, Together |\n\nUse `Alloy.Provider.OpenAI` for native Responses APIs like OpenAI and xAI.\nUse `Alloy.Provider.OpenAICompat` for chat-completions compatible APIs and local runtimes.\n\n`OpenAICompat` works with any API that implements the OpenAI chat completions format.\nJust set `api_url`, `model`, and optionally `api_key` and `chat_path`.\n\n## Built-in Tools\n\n| Tool | Module | Description |\n|------|--------|-------------|\n| **read** | `Alloy.Tool.Core.Read` | Read files from disk |\n| **write** | `Alloy.Tool.Core.Write` | Write files to disk |\n| **edit** | `Alloy.Tool.Core.Edit` | Search-and-replace editing |\n| **bash** | `Alloy.Tool.Core.Bash` | Execute shell commands (restricted shell by default) |\n\n### Custom tools\n\n```elixir\ndefmodule MyApp.Tools.WebSearch do\n  @behaviour Alloy.Tool\n\n  @impl true\n  def name, do: \"web_search\"\n\n  @impl true\n  def description, do: \"Search the web for information\"\n\n  @impl true\n  def input_schema do\n    %{\n      type: \"object\",\n      properties: %{query: %{type: \"string\", description: \"Search query\"}},\n      required: [\"query\"]\n    }\n  end\n\n  @impl true\n  def execute(%{\"query\" =\u003e query}, _context) do\n    # Your implementation here\n    {:ok, \"Results for: #{query}\"}\n  end\nend\n```\n\n### Code execution (Anthropic)\n\nEnable Anthropic's server-side code execution sandbox:\n\n```elixir\n{:ok, result} = Alloy.run(\"Calculate the first 20 Fibonacci numbers\",\n  provider: {Alloy.Provider.Anthropic, api_key: \"...\", model: \"claude-sonnet-4-6\"},\n  code_execution: true\n)\n```\n\n## Architecture\n\n```\nAlloy.run/2                    One-shot agent loop (pure function)\nAlloy.Agent.Server             GenServer wrapper (stateful, supervisable)\nAlloy.Agent.Turn               Single turn: call provider → execute tools → return\nAlloy.Provider                 Behaviour: translate wire format ↔ Alloy.Message\nAlloy.Tool                     Behaviour: name, description, input_schema, execute\nAlloy.Middleware               Pipeline: custom hooks, tool blocking\nAlloy.Context.Compactor        Automatic conversation summarization\n```\n\nSessions, persistence, multi-agent coordination, scheduling, skills, and UI\nbelong in your application layer. See [Anvil](https://github.com/alloy-ex/anvil)\nfor a reference Phoenix application built on Alloy.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\n## Releases\n\nHex.pm publishing is handled by GitHub Actions on `v*` tags.\nSuccessful publishes also dispatch the landing-site version sync workflow.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falloy-ex%2Falloy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falloy-ex%2Falloy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falloy-ex%2Falloy/lists"}