{"id":34255711,"url":"https://github.com/martin-papy/qdrant-loader","last_synced_at":"2026-05-12T06:03:24.080Z","repository":{"id":286572118,"uuid":"961224554","full_name":"martin-papy/qdrant-loader","owner":"martin-papy","description":"Enterprise-ready vector database toolkit for building searchable knowledge bases from multiple data sources. Supports multi-project management, automatic ingestion from Confluence/JIRA/Git, intelligent file conversion (PDF/Office/images), and semantic search. Includes MCP server for seamless AI assistant integration.","archived":false,"fork":false,"pushed_at":"2026-05-04T07:57:41.000Z","size":33162,"stargazers_count":39,"open_issues_count":18,"forks_count":24,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-04T09:42:29.732Z","etag":null,"topics":["cli-tool","confluence-integration","cursor-ide","developer-tools","document-processing","embbedings","enterprise-ready","file-conversion","git-integration","jira-integration","knowledge-base","llm-integration","mcp-server","multi-project","openai","python","rag","semantic-search"],"latest_commit_sha":null,"homepage":"https://qdrant-loader.net","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/martin-papy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-04-06T03:13:17.000Z","updated_at":"2026-05-04T07:57:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"c80f2682-844e-4151-9eef-2
96925aed816","html_url":"https://github.com/martin-papy/qdrant-loader","commit_stats":null,"previous_names":["kheldar666/qdrant-loader","martin-papy/qdrant-loader"],"tags_count":83,"template":false,"template_full_name":null,"purl":"pkg:github/martin-papy/qdrant-loader","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martin-papy%2Fqdrant-loader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martin-papy%2Fqdrant-loader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martin-papy%2Fqdrant-loader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martin-papy%2Fqdrant-loader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/martin-papy","download_url":"https://codeload.github.com/martin-papy/qdrant-loader/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martin-papy%2Fqdrant-loader/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32807874,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"online","status_checked_at":"2026-05-09T02:00:06.633Z","response_time":123,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli-tool","confluence-integration","cursor-ide","developer-tools","document-processing","embbedings","enterprise-ready","file-conversion","git-integration","j
ira-integration","knowledge-base","llm-integration","mcp-server","multi-project","openai","python","rag","semantic-search"],"created_at":"2025-12-16T13:03:56.767Z","updated_at":"2026-05-12T06:03:24.075Z","avatar_url":"https://github.com/martin-papy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# QDrant Loader\n\n[![PyPI - qdrant-loader](https://img.shields.io/pypi/v/qdrant-loader?label=qdrant-loader)](https://pypi.org/project/qdrant-loader/)\n[![PyPI - mcp-server](https://img.shields.io/pypi/v/qdrant-loader-mcp-server?label=mcp-server)](https://pypi.org/project/qdrant-loader-mcp-server/)\n[![PyPI - qdrant-loader-core](https://img.shields.io/pypi/v/qdrant-loader-core?label=qdrant-loader-core)](https://pypi.org/project/qdrant-loader-core/)\n![CodeRabbit Pull Request Reviews](https://img.shields.io/coderabbit/prs/github/martin-papy/qdrant-loader?labelColor=171717\u0026color=FF570A\u0026link=https%3A%2F%2Fcoderabbit.ai\u0026label=CodeRabbit+Reviews)\n[![Test Coverage](https://img.shields.io/badge/coverage-view%20reports-blue)](https://qdrant-loader.net/coverage/)\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n\n📝 **[Changelog v1.0.2](./CHANGELOG.md)** - Latest improvements and bug fixes\n\n\u003cdiv align=\"left\"\u003e\nA comprehensive toolkit for loading data into Qdrant vector database with advanced MCP server support for AI-powered development workflows.\n\u003c/div\u003e\n\n## 🎯 What is QDrant Loader?\n\nQDrant Loader is a data ingestion and retrieval system that collects content from multiple sources, processes and vectorizes it, then provides intelligent search capabilities through a Model Context Protocol (MCP) server for AI development tools.\n\n**Perfect for:**\n\n- 🤖 **AI-powered development** with Cursor, Windsurf, and other MCP-compatible tools\n- 📚 **Knowledge base creation** from technical documentation\n- 🔍 **Intelligent code assistance** with 
contextual information\n- 🏢 **Enterprise content integration** from multiple data sources\n\n## 📦 Packages\n\nThis monorepo contains three complementary packages:\n\n### 🔄 [QDrant Loader](./packages/qdrant-loader/)\n\nData ingestion and processing engine\n\nCollects and vectorizes content from multiple sources into QDrant vector database.\n\n**Key Features:**\n\n- **Multi-source connectors**: Git, Confluence (Cloud \u0026 Data Center), JIRA (Cloud \u0026 Data Center), Public Docs, Local Files\n- **File conversion**: PDF, Office docs (Word, Excel, PowerPoint), images, audio, EPUB, ZIP, and more using MarkItDown\n- **Smart chunking**: Modular chunking strategies with intelligent document processing and hierarchical context\n- **Incremental updates**: Change detection and efficient synchronization\n- **Multi-project support**: Organize sources into projects with shared collections\n- **Provider-agnostic LLM**: OpenAI, Azure OpenAI, Ollama, and custom endpoints with unified configuration\n\n### ⚙️ [QDrant Loader Core](./packages/qdrant-loader-core/)\n\nCore library and LLM abstraction layer\n\nProvides the foundational components and provider-agnostic LLM interface used by other packages.\n\n**Key Features:**\n\n- **LLM Provider Abstraction**: Unified interface for OpenAI, Azure OpenAI, Ollama, and custom endpoints\n- **Configuration Management**: Centralized settings and validation for LLM providers\n- **Rate Limiting**: Built-in rate limiting and request management\n- **Error Handling**: Robust error handling and retry mechanisms\n- **Logging**: Structured logging with configurable levels\n\n### 🔌 [QDrant Loader MCP Server](./packages/qdrant-loader-mcp-server/)\n\nAI development integration layer\n\nModel Context Protocol server providing search capabilities to AI development tools.\n\n**Key Features:**\n\n- **MCP Protocol 2025-06-18**: Latest protocol compliance with dual transport support (stdio + HTTP)\n- **Advanced search tools**: Semantic search, hierarchy-aware 
search, attachment discovery, and conflict detection\n- **Cross-document intelligence**: Document similarity, clustering, relationship analysis, and knowledge graphs\n- **Streaming capabilities**: Server-Sent Events (SSE) for real-time search results\n- **Production-ready**: HTTP transport with security, session management, and health checks\n\n## 🚀 Quick Start\n\n### Installation\n\n```bash\n# Install both packages\npip install qdrant-loader qdrant-loader-mcp-server\n\n# Or install individually\npip install qdrant-loader          # Data ingestion only\npip install qdrant-loader-mcp-server  # MCP server only\n```\n\n### 5-Minute Setup\n\n1. **Create a workspace**\n\n   ```bash\n   mkdir my-workspace \u0026\u0026 cd my-workspace\n   ```\n\n2. **Initialize workspace with templates**\n\n   ```bash\n   qdrant-loader init --workspace .\n   ```\n\n3. **Configure your environment** (edit `.env`)\n\n   ```bash\n   # Qdrant connection\n   QDRANT_URL=http://localhost:6333\n   QDRANT_COLLECTION_NAME=my_docs\n\n   # LLM provider (new unified configuration)\n   OPENAI_API_KEY=your_openai_key\n   LLM_PROVIDER=openai\n   LLM_BASE_URL=https://api.openai.com/v1\n   LLM_EMBEDDING_MODEL=text-embedding-3-small\n   LLM_CHAT_MODEL=gpt-4o-mini\n   ```\n\n4. **Configure data sources** (edit `config.yaml`)\n\n   ```yaml\n   global:\n     qdrant:\n       url: \"http://localhost:6333\"\n       collection_name: \"my_docs\"\n     llm:\n       provider: \"openai\"\n       base_url: \"https://api.openai.com/v1\"\n       api_key: \"${OPENAI_API_KEY}\"\n       models:\n         embeddings: \"text-embedding-3-small\"\n         chat: \"gpt-4o-mini\"\n       embeddings:\n         vector_size: 1536\n\n   projects:\n     my-project:\n       project_id: \"my-project\"\n       sources:\n         git:\n           docs-repo:\n             base_url: \"https://github.com/your-org/your-repo.git\"\n             branch: \"main\"\n             file_types: [\"*.md\", \"*.rst\"]\n   ```\n\n5. 
**Load your data**\n\n   ```bash\n   qdrant-loader ingest --workspace .\n   ```\n\n6. **Start the MCP server**\n\n   ```bash\n   mcp-qdrant-loader --env /path/to/your/.env\n   ```\n\n## 🔧 MCP-Compatible IDE Setup\n\nQDrant Loader works with any IDE/tool that supports MCP, including Cursor, Windsurf, and Claude Desktop.\n\nMinimal MCP server entry (adapt path/format to your tool):\n\n```json\n{\n  \"mcpServers\": {\n    \"qdrant-loader\": {\n      \"command\": \"/path/to/venv/bin/mcp-qdrant-loader\",\n      \"env\": {\n        \"QDRANT_URL\": \"http://localhost:6333\",\n        \"QDRANT_COLLECTION_NAME\": \"my_docs\",\n        \"OPENAI_API_KEY\": \"your_key\"\n      }\n    }\n  }\n}\n```\n\n**Alternative: Use configuration file** (recommended for complex setups):\n\n```json\n{\n  \"mcpServers\": {\n    \"qdrant-loader\": {\n      \"command\": \"/path/to/venv/bin/mcp-qdrant-loader\",\n      \"args\": [\n        \"--config\",\n        \"/path/to/your/config.yaml\",\n        \"--env\",\n        \"/path/to/your/.env\"\n      ]\n    }\n  }\n}\n```\n\nFor tool-specific setup and exact config format:\n\n- **[MCP Setup and Integration](./docs/users/detailed-guides/mcp-server/setup-and-integration.md)** - Full guide\n- **[Cursor Setup](./docs/users/detailed-guides/mcp-server/setup-and-integration.md#-cursor-ide)**\n- **[Windsurf Setup](./docs/users/detailed-guides/mcp-server/setup-and-integration.md#-windsurf)**\n- **[Claude Desktop Setup](./docs/users/detailed-guides/mcp-server/setup-and-integration.md#-claude-desktop)**\n\n**Example queries in AI tools:**\n\n- _\"Find documentation about authentication in our API\"_\n- _\"Show me examples of error handling patterns\"_\n- _\"What are the deployment requirements for this service?\"_\n- _\"Find all attachments related to database schema\"_\n\n## 📚 Documentation\n\n### Getting Started\n\n- **[Getting Started](./docs/getting-started/)** - Quick start and core concepts\n- **[Installation 
Guide](./docs/getting-started/installation.md)** - Complete setup instructions\n- **[Quick Start](./docs/getting-started/quick-start.md)** - Step-by-step tutorial\n- **[Core Concepts](./docs/getting-started/README.md#-core-concepts)** - Understand the core architecture: workspace model, projects and sources, ingestion pipeline, and MCP search flow\n\n### User Guides\n\n- **[User Guides](./docs/users/)** - Detailed usage instructions\n- **[Configuration](./docs/users/configuration/)** - Complete configuration reference\n- **[Data Sources](./docs/users/detailed-guides/data-sources/)** - Git, Confluence, JIRA setup\n- **[File Conversion](./docs/users/detailed-guides/file-conversion/)** - File processing capabilities\n- **[MCP Server](./docs/users/detailed-guides/mcp-server/)** - AI tool integration\n\n## 🛠️ Developer Resources\n\n- **[Developer hub](./docs/developers)** - Developer guides for architecture, testing, deployment, and contribution workflows.\n- **[Architecture](./docs/developers/architecture/)** - System design overview\n- **[Testing](./docs/developers/testing/)** - Testing guide and best practices\n\n## 🆘 Support\n\n- **[Issues](https://github.com/martin-papy/qdrant-loader/issues)** - Bug reports and feature requests\n- **[Discussions](https://github.com/martin-papy/qdrant-loader/discussions)** - Community Q\u0026A\n\n## 🤝 Contributing\n\nWe welcome contributions! 
See our [Contributing Guide](./CONTRIBUTING.md) for:\n\n- Development environment setup\n- Code style and standards\n- Pull request process\n\n### Quick Development Setup\n\n```bash\n# Clone and setup\ngit clone https://github.com/martin-papy/qdrant-loader.git\ncd qdrant-loader\n\n# Sync workspace environment (recommended)\nuv sync --all-packages --all-extras\n\n# Add a new dependency during development\nuv add fastapi\nuv sync\n```\n\n## 📄 License\n\nThis project is licensed under the GNU GPLv3 - see the [LICENSE](LICENSE) file for details.\n\n---\n\n**Ready to get started?** Check out our [Quick Start Guide](./docs/getting-started/quick-start.md) or browse the [complete documentation](./docs/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartin-papy%2Fqdrant-loader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmartin-papy%2Fqdrant-loader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartin-papy%2Fqdrant-loader/lists"}