{"id":34051705,"url":"https://github.com/qx-labs/agents-deep-research","last_synced_at":"2026-04-09T03:33:00.307Z","repository":{"id":283199812,"uuid":"950839330","full_name":"qx-labs/agents-deep-research","owner":"qx-labs","description":"An implementation of iterative deep research using the OpenAI Agents SDK","archived":false,"fork":false,"pushed_at":"2025-12-27T02:02:57.000Z","size":184,"stargazers_count":690,"open_issues_count":10,"forks_count":78,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-12-28T17:50:11.139Z","etag":null,"topics":["agentic-ai","agents","deep-research","deepresearch","llms","openai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qx-labs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-03-18T19:00:49.000Z","updated_at":"2025-12-27T13:15:48.000Z","dependencies_parsed_at":"2025-04-07T03:26:53.138Z","dependency_job_id":"836ac1b7-bff5-449d-9688-efd893be1d1a","html_url":"https://github.com/qx-labs/agents-deep-research","commit_stats":null,"previous_names":["qx-labs/agents-sdk-deep-research","qx-labs/agents-deep-research"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/qx-labs/agents-deep-research","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qx-labs%2Fagents-deep-research","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qx-labs%2Fagents-d
eep-research/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qx-labs%2Fagents-deep-research/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qx-labs%2Fagents-deep-research/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qx-labs","download_url":"https://codeload.github.com/qx-labs/agents-deep-research/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qx-labs%2Fagents-deep-research/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31584578,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"online","status_checked_at":"2026-04-09T02:00:06.848Z","response_time":112,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agents","deep-research","deepresearch","llms","openai"],"created_at":"2025-12-14T01:36:21.767Z","updated_at":"2026-04-09T03:33:00.300Z","avatar_url":"https://github.com/qx-labs.png","language":"Python","funding_links":[],"categories":["The latest additions 🎉"],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n[![GitHub Stars](https://img.shields.io/github/stars/qx-labs/agents-deep-research?style=social)](https://github.com/qx-labs/agents-deep-research/stargazers)\n[![GitHub 
Forks](https://img.shields.io/github/forks/qx-labs/agents-deep-research?style=social)](https://github.com/qx-labs/agents-deep-research/network/members)\n\n[![PyPI version](https://badge.fury.io/py/deep-researcher.svg)](https://pypi.org/project/deep-researcher/)\n[![License](https://img.shields.io/github/license/qx-labs/agents-deep-research)](https://github.com/qx-labs/agents-deep-research/blob/main/LICENSE)\n[![PyPI Downloads](https://static.pepy.tech/badge/deep-researcher)](https://pepy.tech/projects/deep-researcher)\n\n\u003c/div\u003e\n\n# Agentic Deep Research using the OpenAI Agents SDK\n\nA powerful deep research assistant built using the [OpenAI Agents SDK](https://github.com/openai/openai-agents-python), designed to perform in-depth research on any given topic. Compatible with Azure OpenAI, OpenAI, Anthropic, Gemini, DeepSeek, Perplexity, OpenRouter, Hugging Face and local models such as Ollama.\n\nIt uses a multi-agent architecture that works iteratively, continually refining its understanding of a topic and producing increasingly detailed insights that feed the final report.\n\nDesigned to be extendable to use custom tools and any other 3rd party LLMs compatible with the OpenAI API spec. 
LLM and tool calls can be optionally traced using OpenAI's tracing feature.\n\nSome background reading [here](https://www.j2.gg/thoughts/deep-research-how-it-works).\n\n## Overview\n\nThis package has two modes of research:\n\n- An `IterativeResearcher` which runs a continuous loop of research on a topic or sub-topic and drafts a report\n  - This is preferred and sufficient for shorter reports (up to 5 pages / 1,000 words)\n  - The user can specify constraints such as research depth, time limits, report length and formatting instructions\n- A `DeepResearcher` which runs a more thorough and structured process, first forming a report outline, and then running concurrent `IterativeResearcher` instances for each section of the report\n  - This is useful for longer reports (e.g. 20+ pages)\n\nThe flow of the `DeepResearcher` is as follows:\n\n1. Takes a research topic and conducts preliminary research to form a report outline / plan\n2. For each section of the report plan, runs parallel instances of the `IterativeResearcher`, which:\n   1. Identifies knowledge gaps in the current research\n   2. Strategically selects the appropriate tools to fill those gaps\n   3. Executes research actions through specialized agents\n   4. Synthesizes findings into a comprehensive section\n3. 
Compiles all of the sections into a coherent and well-structured report\n\nIt is worth noting that the deep research agent does not ask clarifying questions at the start, so can be used in an automated fashion.\n\n## Sample Output\n\nDeep Research Examples (using DeepResearcher):\n- [Life and Works of Plato](examples/sample_output/plato.md) - 7,980 words\n- [Text Book on Quantum Computing](examples/sample_output/quantum_computing.md) - 5,253 words\n- [Deep-Dive on Tesla](examples/sample_output/tesla.md) - 4,732 words\n\nSimple Research Examples (using IterativeResearcher):\n- [Quantera Market Size](examples/sample_output/quantera_market_size.md) - 1,001 words\n- [UK Government Policies](examples/sample_output/labour_policies.md) - 1,077 words\n\n## Flow Diagram\n\n### IterativeResearcher Flow\n\n```mermaid\nflowchart LR\n    A[\"User Input\u003cbr\u003e- query\u003cbr\u003e- max_iterations\u003cbr\u003e- max_time\u003cbr\u003e- output_instructions\"] --\u003e B\n\n    subgraph \"Deep Research Loop\"\n        B[\"Knowledge\u003cbr\u003eGap Agent\"] --\u003e|\"Current gaps\u003cbr\u003e\u0026 objective\"| C[\"Tool Selector\u003cbr\u003eAgent\"]\n        C --\u003e|\"Tool queries\u003cbr\u003e(run in parallel)\"| D[\"Tool Agents\u003cbr\u003e- Web Search\u003cbr\u003e- Crawler\u003cbr\u003e- Custom tools\"]\n        D --\u003e|\"New findings\"| E[\"Observations\u003cbr\u003eAgent\"]\n        E --\u003e |\"Thoughts on findings\u003cbr\u003eand research strategy\"| B\n    end\n\n    E --\u003e F[\"Writer Agent\u003cbr\u003e(final output\u003cbr\u003ewith references)\"]\n```\n\n### DeepResearcher Flow\n\n```mermaid\nflowchart LR\n    A[\"User Input\u003cbr\u003e- query\u003cbr\u003e- max_iterations\u003cbr\u003e- max_time\"] --\u003e B[\"Planner Agent\"]\n    \n    B --\u003e|\"Report plan\u003cbr\u003e(sections \u0026 background context)\"| D2\n    \n    subgraph Parallel[\"Parallel Section Research\"]\n        D1[\"IterativeResearcher\u003cbr\u003e(Section 1)\"]\n      
  D2[\"IterativeResearcher\u003cbr\u003e(Section 2)\"]\n        D3[\"IterativeResearcher\u003cbr\u003e(Section 3)\"]\n    end\n    \n    D1 --\u003e|\"Section 1\u003cbr\u003eDraft\"| E[\"Proofreader\u003cbr\u003eAgent\"]\n    D2 --\u003e|\"Section 2\u003cbr\u003eDraft\"| E\n    D3 --\u003e|\"Section 3\u003cbr\u003eDraft\"| E\n    \n    E --\u003e F[\"Final\u003cbr\u003eResearch\u003cbr\u003eReport\"]\n```\n\n## Installation\n\nInstall using `pip`:\n\n```\npip install deep-researcher\n```\n\nOr clone the GitHub repo:\n\n```sh\ngit clone https://github.com/qx-labs/agents-deep-research.git\ncd agents-deep-research\npip install -r requirements.txt\n```\n\nThen create a `.env` file with your API keys:\n\n```sh\ncp .env.example .env\n```\n\nEdit the `.env` file to add your OpenAI, Serper and other settings as needed, e.g.:\n\n```sh\nOPENAI_API_KEY=\u003cyour_key\u003e\nSEARCH_PROVIDER=serper  # or set to openai\nSERPER_API_KEY=\u003cyour_key\u003e\n```\n\n## Usage\n\n### Python Module\n\n```python\n# See the /examples folder for working examples\nimport asyncio\nfrom deep_researcher import IterativeResearcher, DeepResearcher\n\n# Run the IterativeResearcher for simple queries\nresearcher = IterativeResearcher(max_iterations=5, max_time_minutes=5)\nquery = \"Provide a comprehensive overview of quantum computing\"\nreport = asyncio.run(\n    researcher.run(query, output_length=\"5 pages\")\n)\n\n# Run the DeepResearcher for more lengthy and structured reports\nresearcher = DeepResearcher(max_iterations=3, max_time_minutes=5)\nreport = asyncio.run(\n    researcher.run(query)\n)\n\nprint(report)\n```\n\n#### Custom LLM Configuration at Runtime\n\nWhen running the deep researcher in Python, you have the option to set custom LLM configuration variables at runtime. 
This gives you flexibility to dynamically change the model choice within your code.\n\n```python\nimport asyncio\nfrom deep_researcher import DeepResearcher, LLMConfig\n\n# These configuration options will take precedence over the environment variables\nllm_config = LLMConfig(\n    search_provider=\"serper\",\n    reasoning_model_provider=\"openai\",\n    reasoning_model=\"o3-mini\",\n    main_model_provider=\"openai\",\n    main_model=\"gpt-4o\",\n    fast_model_provider=\"openai\",\n    fast_model=\"gpt-4o-mini\"\n)\nresearcher = DeepResearcher(max_iterations=3, max_time_minutes=5, config=llm_config)\nreport = asyncio.run(\n    researcher.run(query)\n)\n```\n\n### Command Line\n\nRun the research assistant from the command line.\n\nIf you've installed via `pip`:\n```sh\ndeep-researcher --mode deep --query \"Provide a comprehensive overview of quantum computing\" --max-iterations 3 --max-time 10 --verbose\n```\n\nOr if you've cloned the GitHub repo:\n\n```sh\npython -m deep_researcher.main --mode deep --query \"Provide a comprehensive overview of quantum computing\" --max-iterations 3 --max-time 10 --verbose\n```\n\nParameters:\n\n- `--query`: The research topic or question (if not provided, you'll be prompted)\n- `--mode`: If `deep` uses the DeepResearcher, if `simple` uses the IterativeResearcher (default: deep)\n- `--max-iterations`: Maximum number of research iterations (default: 5)\n- `--max-time`: Maximum time in minutes before the research loop auto-exits to produce a final output (default: 10)\n- `--output-length`: Desired output length for the report (default: \"5 pages\")\n- `--output-instructions`: Additional formatting instructions for the final report\n\nBoolean Flags:\n\n- `--verbose`: Prints the research progress to console\n- `--tracing`: Traces the workflow on the OpenAI platform (only works for OpenAI models)\n\n## Compatible Models\n\nThe deep researcher is designed to run any model compatible with the OpenAI API spec, and does so by adjusting 
the `base_url` parameter to the relevant model provider. Compatible providers include Azure OpenAI, OpenAI, Anthropic, Gemini, DeepSeek, Hugging Face and OpenRouter as well as locally hosted models via Ollama and LM Studio.\n\nHowever, to run without errors, the deep researcher relies on models that perform well at tool calling.\n\n- If using OpenAI models, we find that `gpt-4o-mini` is as good if not better at tool selection than `o3-mini` (which is consistent with [this leaderboard](https://gorilla.cs.berkeley.edu/leaderboard.html)). Given the speed and cost benefits we therefore advise using `gpt-4o-mini` as the model for the majority of agents in our workflow, with `o3-mini` for planning tasks and `gpt-4o` for final writing.\n- If using Gemini models, note that only Gemini 2.5 Pro (currently `gemini-2.5-pro-preview-03-25`) works well. Gemini 2.0 Flash (`gemini-2.0-flash`), despite being listed as compatible with tool calling, very frequently fails to call any tools.\n\n## Architecture\n\nThe Deep Research Assistant is built with the following components:\n\n### Core Components\n\n- **IterativeResearcher**: Orchestrates the iterative research workflow on a single topic or subtopic\n- **DeepResearcher**: Orchestrates a deeper and broader workflow that includes an initial report outline, multiple parallel `IterativeResearcher` instances, and a final proofreading step\n- **LLMConfig**: Manages interactions with language models so that these can be swapped out as needed\n\n### Agent System\n\n- **Knowledge Gap Agent**: Analyzes current research state and identifies gaps in knowledge\n- **Tool Selector Agent**: Determines which tools to use for addressing specific knowledge gaps\n- **Tool Agents**: Specialized agents for executing specific research actions (can be extended to add custom tools):\n  - Web Search Agent\n  - Website Crawler Agent\n- **Writer Agent**: Synthesizes research findings into coherent reports\n\n### Research 
Tools\n\n- **Web Search**: Finds relevant information from SERP queries\n  - Our implementation uses [Serper](https://www.serper.dev) to run Google searches by default, which requires an API key set to the `SERPER_API_KEY` env variable.\n  - You can replace this with the native web search tool from OpenAI by setting the environment variable `SEARCH_PROVIDER` to `openai`\n- **Website Crawler**: Extracts detailed content from the pages of a given website\n\n### Implementing Custom Tool Agents\n\nTool agents are agents specialized in carrying out specific tasks using one or more tools (e.g. web searches, fetching and interpreting data from an API, etc). To implement a custom tool agent:\n* Create any tools that the agent will use in the `deep_researcher/tools` folder\n* Create a new tool agent that calls this tool in the `deep_researcher/agents/tool_agents` folder\n* Add the tool agent definition to the `init_tool_agents` function in `deep_researcher/agents/tool_agents/__init__.py`\n* Update the system prompt of `deep_researcher/agents/tool_selector_agent.py` to include the name and description of the new agent, so that the ToolSelectorAgent knows of its existence\n\n### Configuring Custom LLMs\n\nThis repository is in theory compatible with any LLMs that follow the OpenAI API specs. This includes the likes of DeepSeek as well as models served through OpenRouter. However, the models need to be compatible with [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs) in the OpenAI API spec (i.e. being able to set `response_format: {type: \"json_schema\", ...}`).\n\nLLMs are configured and managed in the `deep_researcher/llm_config.py` file.\n\n## Trace Monitoring\n\nThe Deep Research assistant integrates with OpenAI's trace monitoring system. 
Each research session generates a trace ID that can be used to monitor the execution flow and agent interactions in real-time through the OpenAI platform.\n\n## Observations and Limitations\n\n### Rate Limits\n- The `DeepResearcher` runs a lot of searches and API calls in parallel (at any given point in time it could be ingesting 50-60 different web pages). As a result you may find yourself hitting rate limits for OpenAI, Gemini, Anthropic and other model providers, particularly if you are on lower or free tiers. \n- If you run into these errors, you may wish to use the `IterativeResearcher` instead, which makes fewer API calls.\n\n### Output Length\n\nLLMs are not good at following guidelines on output length. You typically run into two issues:\n\n- LLMs are bad at counting. When giving length instructions, it's better to provide a reference that the model will be familiar with from its training data (e.g. 'length of a tweet', 'a few paragraphs', 'length of a book') rather than a specific word count. \n- Even though the output token limit on many of these models is massive, it is very difficult to get them to produce more than 1,000-2,000 words per response. There are methods such as [this one](https://medium.com/@techsachin/longwriter-using-llm-agent-based-pipeline-to-scale-llms-output-window-size-to-10-000-words-33210d299e2b) to produce longer outputs.\n\nWe include an `output_length` parameter for the `IterativeResearcher` to give the user control, but bear in mind the above limitations.\n\n## TODOs:\n\n- [ ] Add unit tests for different model providers\n- [ ] Add example implementation for different models\n- [ ] Add compatibility with other search providers (e.g. SearXNG, Bing, Tavily, DuckDuckGo etc.)\n- [ ] Add caching (e.g. Redis) of scraped web pages to avoid duplicate work/calls\n- [ ] Add more specialized research tools (e.g. 
Wikipedia, arXiv, data analysis etc.)\n- [ ] Add PDF parser\n- [ ] Add integration / RAG for local files\n\n## Author\n\nCreated by Jai Juneja at [QX Labs](https://www.qxlabs.com).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqx-labs%2Fagents-deep-research","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqx-labs%2Fagents-deep-research","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqx-labs%2Fagents-deep-research/lists"}