{"id":31173288,"url":"https://github.com/hud-evals/hud-python","last_synced_at":"2026-04-07T10:01:09.019Z","repository":{"id":280567301,"uuid":"941344611","full_name":"hud-evals/hud-python","owner":"hud-evals","description":"OSS RL environment + evals toolkit","archived":false,"fork":false,"pushed_at":"2026-04-01T03:37:42.000Z","size":65007,"stargazers_count":319,"open_issues_count":7,"forks_count":54,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-04-01T04:26:27.593Z","etag":null,"topics":["grpo","llm","llms","lora","qwen","qwen3","reinforcement-learning","reinforcement-learning-environments","rl"],"latest_commit_sha":null,"homepage":"https://www.hud.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hud-evals.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-03-02T04:05:49.000Z","updated_at":"2026-04-01T03:37:46.000Z","dependencies_parsed_at":"2026-02-07T03:06:36.219Z","dependency_job_id":null,"html_url":"https://github.com/hud-evals/hud-python","commit_stats":null,"previous_names":["human-data/hud-sdk","hud-evals/hud-sdk","hud-evals/hud-python"],"tags_count":133,"template":false,"template_full_name":null,"purl":"pkg:github/hud-evals/hud-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hud-evals%2Fhud-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hud-evals%2Fhud-python/tags","releases_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/hud-evals%2Fhud-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hud-evals%2Fhud-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hud-evals","download_url":"https://codeload.github.com/hud-evals/hud-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hud-evals%2Fhud-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31508282,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T03:10:19.677Z","status":"ssl_error","status_checked_at":"2026-04-07T03:10:13.982Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["grpo","llm","llms","lora","qwen","qwen3","reinforcement-learning","reinforcement-learning-environments","rl"],"created_at":"2025-09-19T12:47:59.270Z","updated_at":"2026-04-07T10:01:09.005Z","avatar_url":"https://github.com/hud-evals.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"left\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/logo/hud_logo_dark.svg\"\u003e\n    \u003csource media=\"(prefers-color-scheme: 
light)\" srcset=\"https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/logo/hud_logo.svg\"\u003e\n    \u003cimg src=\"https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/logo/hud_logo.svg\" alt=\"HUD\" width=\"150\" style=\"margin-bottom: 24px;\"/\u003e\n  \u003c/picture\u003e\n\u003c/div\u003e\n\nHUD is a platform for building RL environments for AI agents. Define agent-callable tools, write evaluation scenarios, run evals at scale, and train models on the results.\n\nTo learn more, check out our [Documentation](https://docs.hud.ai) and [API Reference](https://docs.hud.ai/reference).\n\n[![PyPI](https://img.shields.io/pypi/v/hud-python?style=flat-square)](https://pypi.org/project/hud-python/)\n[![License](https://img.shields.io/badge/license-MIT-green?style=flat-square)](LICENSE)\n[![Add docs to Cursor](https://img.shields.io/badge/Add%20docs%20to-Cursor-black?style=flat-square)](https://cursor.com/en/install-mcp?name=docs-hud-python\u0026config=eyJ1cmwiOiJodHRwczovL2RvY3MuaHVkLmFpL21jcCJ9)\n[![Discord](https://img.shields.io/discord/1327447144772407390?label=Discord\u0026logo=discord\u0026style=flat-square)](https://discord.gg/wkjtmHYYjm)\n[![X Follow](https://img.shields.io/twitter/follow/hud_evals?style=social)](https://x.com/intent/user?screen_name=hud_evals)\n[![Scarf](https://static.scarf.sh/a.png?x-pxid=6530ff33-4945-452b-81f9-626872593933)](https://scarf.sh)\n[![Docs](https://img.shields.io/badge/docs-hud.ai-blue?style=flat-square)](https://docs.hud.ai)\n\n## Install\n\n```bash\n# Install CLI (recommended)\nuv tool install hud-python --python 3.12\n```\n\nGet your API key at [hud.ai/project/api-keys](https://hud.ai/project/api-keys) and set it:\n\n```bash\nexport HUD_API_KEY=your-key-here\n```\n\n\u003e Or install as a library: `pip install hud-python`\n\n![Agent running on SheetBench](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/trace_sheet.gif)\n\n## 
Environments\n\nAn environment is the harness an agent operates in. It packages tools (functions agents can call) and scenarios (how agents are evaluated) into a single deployable unit. Each environment spins up fresh and isolated for every evaluation.\n\n```python\nfrom hud import Environment\n\nenv = Environment(\"my-env\")\n\n@env.scenario(\"count\")\nasync def count(word: str, letter: str):\n    # PROMPT — send a question to the agent.\n    # The agent runs its reasoning loop and returns an answer.\n    answer = yield f\"How many '{letter}' in '{word}'?\"\n\n    # SCORE — check the agent's answer against the correct count.\n    # Return a reward: 1.0 for correct, 0.0 for wrong.\n    correct = str(word.lower().count(letter.lower()))\n    yield 1.0 if answer and correct in answer else 0.0\n```\n\nA scenario has two yields. The first sends a prompt — the agent runs between the yields, calling tools and reasoning. The second checks the result and returns a reward (0.0 to 1.0). → [Core Concepts](https://docs.hud.ai/concepts)\n\n## Run an Agent\n\n```python\nimport hud\nfrom hud.agents import create_agent\n\ntask = env(\"count\", word=\"strawberry\", letter=\"r\")\nagent = create_agent(\"claude-sonnet-4-5\")\n\nasync with hud.eval(task) as ctx:\n    result = await agent.run(ctx)\n\nprint(f\"Reward: {result.reward}\")  # 1.0 if agent answers \"3\"\n```\n\n`create_agent()` picks the right agent class and native tools for each model. 
→ [Environments](https://docs.hud.ai/quick-links/environments)\n\n## Workflow\n\n```bash\nhud init my-env          # Scaffold environment\ncd my-env\nhud dev env:env -w env.py    # Run locally with hot-reload\nhud eval tasks.py claude     # Run evals locally\nhud deploy                   # Deploy to platform\nhud sync tasks my-taskset    # Sync tasks to platform\n```\n\nOnce deployed, run evals at scale from the CLI or the [platform UI](https://hud.ai):\n\n```bash\nhud eval my-taskset claude --remote --full\n```\n\n→ [Deploy](https://docs.hud.ai/quick-links/deploy) · [Testing \u0026 Evaluation](https://docs.hud.ai/advanced/testing-environments)\n\n## Pre-built Tools\n\nHUD ships tools for computer control, shell execution, file editing, browser automation, and web search. Add them to any environment:\n\n```python\nfrom hud.tools import AnthropicComputerTool, BashTool, EditTool\n\nenv.add_tool(AnthropicComputerTool())  # Mouse, keyboard, screenshots\nenv.add_tool(BashTool())               # Persistent bash shell\nenv.add_tool(EditTool())               # File viewing and editing\n```\n\nHUD adapts each tool to the model's native format — Claude gets `computer_20250124`, OpenAI gets `computer_use_preview`, Gemini gets `ComputerUse`. → [Tools Reference](https://docs.hud.ai/tools/computer)\n\n## Model Gateway\n\nUse Claude, GPT, Gemini, or Grok through one OpenAI-compatible endpoint:\n\n```python\nfrom openai import AsyncOpenAI\nimport os\n\nclient = AsyncOpenAI(\n    base_url=\"https://inference.hud.ai\",\n    api_key=os.environ[\"HUD_API_KEY\"]\n)\n\nresponse = await client.chat.completions.create(\n    model=\"claude-sonnet-4-5\",  # or gpt-4o, gemini-2.5-pro (https://hud.ai/models)\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\n)\n```\n\nEvery call is traced at [hud.ai](https://hud.ai). 
→ [Models](https://docs.hud.ai/quick-links/models)\n\n## Links\n\n- 📖 [Documentation](https://docs.hud.ai)\n- ⌨️ [CLI Reference](https://docs.hud.ai/reference/cli/overview)\n- 🏆 [Leaderboards](https://hud.ai/leaderboards)\n- 🌐 [Environment Templates](https://hud.ai/environments)\n- 🤖 [Supported Models](https://hud.ai/models)\n- 💬 [Discord](https://discord.gg/wkjtmHYYjm)\n\n## Enterprise\n\nBuilding agents at scale? We work with teams on custom environments, benchmarks, and training.\n\n[📅 Book a call](https://cal.com/jay-hud) · [📧 founders@hud.ai](mailto:founders@hud.ai)\n\n## Contributing\n\nWe welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md).\n\nKey areas: [Agents](hud/agents/) · [Tools](hud/tools/) · [Environments](https://hud.ai/environments)\n\n\u003ca href=\"https://github.com/hud-evals/hud-python/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=hud-evals/hud-python\u0026max=50\" /\u003e\n\u003c/a\u003e\n\n## Citation\n\n```bibtex\n@software{hud2025agentevalplatform,\n  author = {HUD and Jay Ram and Lorenss Martinsons and Parth Patel and Govind Pimpale and Dylan Bowman and Jaideep and Nguyen Nhat Minh},\n  title  = {HUD: An Evaluation and RL Environments Platform for Agents},\n  date   = {2025-04},\n  url    = {https://github.com/hud-evals/hud-python},\n  langid = {en}\n}\n```\n\nMIT License · [LICENSE](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhud-evals%2Fhud-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhud-evals%2Fhud-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhud-evals%2Fhud-python/lists"}