{"id":50566401,"url":"https://github.com/neuron-core/raptor-retrieval","last_synced_at":"2026-06-04T15:01:47.563Z","repository":{"id":315608990,"uuid":"1055642078","full_name":"neuron-core/raptor-retrieval","owner":"neuron-core","description":"Recursive Abstractive Processing for Tree-Organized Retrieval - Neuron PHP Framework","archived":false,"fork":false,"pushed_at":"2026-02-25T13:37:48.000Z","size":1271,"stargazers_count":8,"open_issues_count":1,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-25T17:12:05.615Z","etag":null,"topics":["agent","agentic-ai","agentic-workflow","ai","ai-agent","ai-agents","ai-framework","embedding","llm","php","rag","rag-chatbot","rag-pipeline","vector-database","vector-search"],"latest_commit_sha":null,"homepage":"https://docs.neuron-ai.dev/rag/retrieval","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/neuron-core.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-12T15:13:07.000Z","updated_at":"2026-02-25T13:37:52.000Z","dependencies_parsed_at":"2025-09-19T23:31:15.205Z","dependency_job_id":null,"html_url":"https://github.com/neuron-core/raptor-retrieval","commit_stats":null,"previous_names":["neuron-core/raptor-retrieval"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/neuron-core/raptor-retrieval","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neuron-core%2Fraptor-retrieval","tag
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neuron-core%2Fraptor-retrieval/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neuron-core%2Fraptor-retrieval/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neuron-core%2Fraptor-retrieval/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/neuron-core","download_url":"https://codeload.github.com/neuron-core/raptor-retrieval/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neuron-core%2Fraptor-retrieval/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33910137,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-04T02:00:06.755Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","agentic-ai","agentic-workflow","ai","ai-agent","ai-agents","ai-framework","embedding","llm","php","rag","rag-chatbot","rag-pipeline","vector-database","vector-search"],"created_at":"2026-06-04T15:01:47.313Z","updated_at":"2026-06-04T15:01:47.536Z","avatar_url":"https://github.com/neuron-core.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Recursive Abstractive Processing for Tree-Organized Retrieval\n\nThis module implements the RAPTOR retrieval strategy for the [Neuron 
PHP AI framework](https://neuron-ai.dev).\n\n## The Problem with Traditional Retrieval\n\nMost retrieval-augmented models work by breaking down documents into small chunks and retrieving only the most relevant ones. However, this approach has some limitations:\n\n- **Loss of Context**: Retrieving only small, isolated chunks may miss the bigger picture, especially for documents with long contexts.\n- **Difficulty in Multi-Step Reasoning**: Some questions require information from multiple sections of a document.\n\n**Use RAPTOR when:**\n- Users ask open-ended questions that require comprehensive coverage\n- Your domain involves complex topics where context matters as much as facts\n- You need to handle queries about themes, trends, or relationships across documents\n\n**Stick with traditional RAG when:**\n- Users primarily need quick, specific fact retrieval\n- Processing speed and token efficiency are critical constraints\n\nCheck out the example in the [examples](./examples/raptor.php) folder.\n\n## Requirements\n\n- PHP: ^8.1\n- Neuron: ^2.0\n\n## Install RAPTOR retrieval\n\nInstall the latest version of the package:\n\n```\ncomposer require neuron-core/raptor-retrieval\n```\n\n## How to use RAPTOR in your agent\n\nUse the RAPTOR component directly in your agent.\n
RAPTOR needs a vector store and an embedding provider, and uses an LLM\nto perform the summarization:\n\n```php\nuse NeuronAI\\RAG\\Retrieval\\RetrievalInterface;\nuse NeuronCore\\RaptorRetrieval\\RaptorRetrieval;\n\nclass WorkoutTipsAgent extends RAG\n{\n    protected function retrieval(): RetrievalInterface\n    {\n        return new RaptorRetrieval(\n            $this-\u003eresolveVectorStore(),\n            $this-\u003eresolveEmbeddingsProvider(),\n            $this-\u003eresolveProvider(), // Used for summarization\n        );\n    }\n\n    protected function embeddings(): EmbeddingsProviderInterface\n    {\n        return new ...\n    }\n\n    protected function vectorStore(): VectorStoreInterface\n    {\n        return new ...\n    }\n}\n```\n\n## Clustering strategy\n\nThe RAPTOR algorithm uses a clustering strategy to group the retrieved documents into clusters. Choose one based on the characteristics of your content:\n\n### Similarity Clustering (default)\n\nGroups documents with clear thematic boundaries.\n
Best for already well-organized content with distinct topics.\n\n**Use Similarity Clustering when:**\n\n- You have heterogeneous content with clear topic boundaries\n- Your documents have distinct themes that don't overlap much\n- Performance is important (faster processing)\n\n```php\nuse NeuronAI\\RAG\\Retrieval\\RetrievalInterface;\nuse NeuronCore\\RaptorRetrieval\\RaptorRetrieval;\nuse NeuronCore\\RaptorRetrieval\\Clustering\\SimilarityClustering;\n\nclass WorkoutTipsAgent extends RAG\n{\n    protected function retrieval(): RetrievalInterface\n    {\n        return new RaptorRetrieval(\n            $this-\u003eresolveVectorStore(),\n            $this-\u003eresolveEmbeddingsProvider(),\n            $this-\u003eresolveProvider(), // Used for summarization\n            new SimilarityClustering()\n        );\n    }\n\n    protected function embeddings(): EmbeddingsProviderInterface\n    {\n        return new ...\n    }\n\n    protected function vectorStore(): VectorStoreInterface\n    {\n        return new ...\n    }\n}\n```\n\n### Gaussian Mixture Clustering\n\nHandles overlapping topics where documents may belong to multiple themes simultaneously.\nUseful for research papers, news articles, or any content where topics naturally blend together rather than having sharp boundaries.\n\n**Use GMM when:**\n\n- Documents may relate to multiple topics simultaneously\n- You want the algorithm to discover the \"natural\" number of clusters in your data\n- You're dealing with research papers, news, or complex content where multiple topics blend\n\n```php\nuse NeuronAI\\RAG\\Retrieval\\RetrievalInterface;\nuse NeuronCore\\RaptorRetrieval\\RaptorRetrieval;\nuse NeuronCore\\RaptorRetrieval\\Clustering\\GaussianMixtureClustering;\n\nclass WorkoutTipsAgent extends RAG\n{\n    protected function retrieval(): RetrievalInterface\n    {\n        return new RaptorRetrieval(\n            $this-\u003eresolveVectorStore(),\n            $this-\u003eresolveEmbeddingsProvider(),\n          
  $this-\u003eresolveProvider(), // Used for summarization\n            new GaussianMixtureClustering()\n        );\n    }\n\n    protected function embeddings(): EmbeddingsProviderInterface\n    {\n        return new ...\n    }\n\n    protected function vectorStore(): VectorStoreInterface\n    {\n        return new ...\n    }\n}\n```\n\n## What is Neuron?\n\nNeuron is a PHP framework for creating and orchestrating AI Agents. It allows you to integrate AI entities in your existing\nPHP applications with a powerful and flexible architecture. We provide tools for the entire agentic application development lifecycle,\nfrom LLM interfaces, to data loading, to multi-agent orchestration, to monitoring and debugging.\nIn addition, we provide tutorials and other educational content to help you get started using AI Agents in your projects.\n\n**[Go to the official documentation](https://docs.neuron-ai.dev/)**\n\n[**Video Tutorial**](https://www.youtube.com/watch?v=oSA1bP_j41w)\n\n[![Neuron \u0026 Inspector](./docs/youtube.png)](https://www.youtube.com/watch?v=oSA1bP_j41w)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneuron-core%2Fraptor-retrieval","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneuron-core%2Fraptor-retrieval","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneuron-core%2Fraptor-retrieval/lists"}