awesome-agent-learning

Guides, courses & reading lists for learning to build autonomous LLM agents
https://github.com/artnitolog/awesome-agent-learning

  • Framework Tutorials

  • Evaluation Benchmarks

    • AgentBench - Multi-domain benchmark for measuring LLMs in autonomous agent roles, spanning 8 tasks (including OS management, SQL operations, web browsing and shopping, games, and puzzles). Introduced a leaderboard for reproducible comparison.
    • BrowseComp - Crafted questions designed to assess agents' ability to find hard-to-locate information across the web.
    • GAIA - Question benchmark that tests an agent's reasoning, web search, tool use, and multimodal understanding through short-answer tasks split into 3 difficulty levels.
    • OSWorld - Desktop tasks, including UI/GUI manipulation, graded only by execution traces.
    • SWE-bench - Large-scale benchmark of 2k+ real-world GitHub issues from Python repositories, where LLM agents must generate patches that pass tests, all inside a Docker environment for fully reproducible evaluation. Includes several additional versions: Lite with easy tasks, Verified with hand-validated issues, and Multimodal.
    • ToolBench - Benchmark that asks agents to call real APIs from different web services, including weather, spreadsheets, shopping, reservations, and virtual environments.
    • WebArena - Self-hosted benchmark environment comprising 4 web applications (e-commerce, forums, CMS, code) with 800+ long-horizon tasks.
  • Foundational Courses

    • Agentic AI and AI Agents: A Primer for Leaders - Course for non-technical executives and product managers who want to learn the fundamentals of AI agents. Provides relevant theory and teaches no-code approaches to implementing AI agents using custom GPTs.
    • Learn AI Agents Handbook
    • Microsoft's AI Agents for Beginners - Beginner-friendly but comprehensive open-source course comprising 11 lessons on building AI agents. Covers tool integration, RAG, agentic design patterns, multi-agent systems, and deploying in production. Lessons include written materials, code samples, and videos. Focused on Microsoft frameworks (Azure AI Agent Service,
    • Multi AI Agent Systems with crewAI - Beginner-friendly course teaching how to build and deploy AI agents using the CrewAI framework. Covers basic concepts including tool management, memory organization, error handling, and agent cooperation. Introduces many AI agent examples for common business processes.
    • Advanced Large Language Model Agents - Advanced course exploring the design and deployment of LLM-powered agents. Covers LLM foundations and infrastructure, reasoning, tool use, multi-agent collaboration, and various applications. Features guest lectures from leading researchers.
    • AI Agents Masterclass - Beginner-friendly episodic series with full code walkthroughs for building AI agents. Covers LangChain, LangGraph, RAG techniques, and n8n workflow agents. Each episode is accompanied by the exact code used in the videos.
  • Conceptual Guides