{"id":34666772,"url":"https://github.com/emredeveloper/quick-rag","last_synced_at":"2026-02-05T20:13:57.387Z","repository":{"id":322433708,"uuid":"1088308350","full_name":"emredeveloper/quick-rag","owner":"emredeveloper","description":"Production-ready RAG for JavaScript \u0026 React. Built on Ollama \u0026 LM Studio SDKs with hybrid search, caching, conversation management \u0026 evaluation.","archived":false,"fork":false,"pushed_at":"2025-12-21T21:44:34.000Z","size":97287,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-26T08:35:11.648Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/quick-rag","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/emredeveloper.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-02T18:06:35.000Z","updated_at":"2025-12-21T21:44:37.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/emredeveloper/quick-rag","commit_stats":null,"previous_names":["emredeveloper/quick-rag"],"tags_count":28,"template":false,"template_full_name":null,"purl":"pkg:github/emredeveloper/quick-rag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emredeveloper%2Fquick-rag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emredeveloper%2Fquick-rag/tags","releases_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emredeveloper%2Fquick-rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emredeveloper%2Fquick-rag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/emredeveloper","download_url":"https://codeload.github.com/emredeveloper/quick-rag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emredeveloper%2Fquick-rag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29133242,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-05T19:36:52.185Z","status":"ssl_error","status_checked_at":"2026-02-05T19:35:40.941Z","response_time":65,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-12-24T19:11:59.043Z","updated_at":"2026-02-05T20:13:57.380Z","avatar_url":"https://github.com/emredeveloper.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Quick RAG ⚡\n\n[![npm version](https://img.shields.io/npm/v/quick-rag.svg)](https://www.npmjs.com/package/quick-rag)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n🚀 **Production-ready RAG (Retrieval-Augmented Generation) for JavaScript \u0026 React**  \nBuilt on official 
[Ollama](https://github.com/ollama/ollama-js) \u0026 [LM Studio](https://github.com/lmstudio-ai/lmstudio-js) SDKs.\n\n\u003e **🎉 v2.5.2 Released!** React export fixes, deterministic init tests, and Ollama base model alignment (`granite4:3b`). See [CHANGELOG.md](CHANGELOG.md) for details.\n\n## ✨ Features\n\n### 🆕 v2.5.2 - Stability \u0026 Compatibility\n- ✅ **React Export Fix** - `quick-rag/react` now resolves correctly to `useRAG`\n- ✅ **Deterministic initRAG Tests** - core test suite no longer depends on external Ollama availability\n- ✅ **Ollama Base Model Alignment** - examples and tests standardized to `granite4:3b`\n- 🐛 **Critical Bug Fix** - `ConversationManager.addAssistantMessage()` now correctly passes content\n- 🌐 **Browser Compatibility** - Cross-platform UUID generation (Node.js + Browser)\n- 📦 **Cleaner Dependencies** - Removed invalid self-referencing dependency\n- 🤖 **Updated Default Models** - `qwen3-embedding:0.6b` (Ollama) \u0026 `google/gemma-3-4b` (LM Studio)\n\n### v2.4.0 - Robustness \u0026 Explainability\n- 🔪 **Robust Chunking** - Abbreviation-aware sentence splitting \u0026 word-safe text chunking\n- 🔍 **Rich Explainability** - Detailed retrieval snippets, keyword density \u0026 term match metrics\n- 🚀 **BM25 Optimization** - Min-Heap based top-K selection for fast retrieval in large datasets\n- 🌐 **Environment Stability** - Universal UUID support for Node.js and Browser (globalThis.crypto)\n\n### v2.3.0 - Performance \u0026 Evaluation\n- 🚀 **Caching Layer** - LRU cache, embedding cache, query cache for 10x speedup\n- 💬 **Conversation Manager** - Context window management \u0026 auto-summarization\n- 📊 **RAG Evaluation** - Precision@K, Recall, MRR, NDCG metrics\n- 🗄️ **Vector DB Connectors** - ChromaDB \u0026 Qdrant adapters\n\n### 🔍 v2.2.0 - Advanced Search\n- 🔍 **BM25 Sparse Search** - Pure JS keyword-based retrieval (no dependencies!)\n- 🔀 **Hybrid Search** - Combines BM25 + Vector with RRF fusion (20-30% better retrieval)\n- 📊 
**Reranking** - Multi-signal scoring (keyword, semantic, coverage, coherence)\n- 🔄 **Query Transformation** - Expansion, decomposition, multi-query, HyDE\n\n### Core Features\n- 🎯 **Official SDKs** - Built on `ollama` and `@lmstudio/sdk` packages\n- 💾 **Embedded Persistence** - SQLite-based vector store (No server required!)\n- 🛡️ **Robust Error Handling** - 7 custom error classes with recovery suggestions\n- 📊 **Telemetry \u0026 Metrics** - Track performance, latency, and usage\n- 📝 **Structured Logging** - JSON logging with Pino integration\n- ⚡ **5x Faster** - Parallel batch embedding\n- 📄 **Document Loaders** - PDF, Word, Excel, Text, Markdown, URLs\n- 🔪 **Robust Chunking** - Intelligent splitting that respects abbreviations (Dr., Prof.) and avoids word cutting\n- 🏷️ **Metadata Filtering** - Filter by document properties\n- 🔍 **Rich Query Explainability** - See WHY docs were retrieved with snippets and density metrics (unique!)\n- 🎨 **Dynamic Prompts** - 10 built-in templates + full customization\n- 🧠 **Weighted Decision Making** - Multi-criteria document scoring\n- 🎯 **Heuristic Reasoning** - Pattern learning and query optimization\n- 🔄 **CRUD Operations** - Add, update, delete documents on the fly\n- 🌊 **Streaming Support** - Real-time AI responses\n- 🔧 **Zero Config** - Works with React, Next.js, Vite, Node.js\n- 💪 **Type Safe** - Full TypeScript support\n\n## 📦 Installation\n\n```bash\nnpm install quick-rag\n```\n\n**Default Ollama models (examples/docs):**\n```bash\nollama pull granite4:3b\nollama pull qwen3-embedding:0.6b\n```\n\n**Optional Dependencies:**\n```bash\n# For embedded persistence\nnpm install better-sqlite3\n\n# For vector databases (optional)\nnpm install chromadb @qdrant/js-client-rest\n```\n\n## 🆕 What's New in v2.3.0\n\n### 🚀 Caching Layer\nSpeed up repeated operations with intelligent caching:\n\n```javascript\nimport { CacheManager, EmbeddingCache } from 'quick-rag';\n\n// Unified cache manager\nconst cache = new CacheManager({\n  
embeddings: { maxSize: 5000, ttl: 3600000 }, // 1 hour\n  queries: { maxSize: 500, ttl: 1800000 }      // 30 min\n});\n\n// Wrap embedding function for automatic caching\nconst cachedEmbed = cache.wrapEmbedding(embedFn);\n\n// Check statistics\nconsole.log(cache.getStats());\n// { embeddings: { size: 100, cacheHits: 450, cacheMisses: 50, hitRate: 0.9 } }\n```\n\n### 💬 Conversation Manager\nManage chat history with context window limits:\n\n```javascript\nimport { ConversationManager, getContextLimit } from 'quick-rag';\n\nconst conversation = new ConversationManager({\n  maxTokens: getContextLimit('llama3'), // 8192\n  autoSummarize: true,\n  systemPrompt: 'You are a helpful assistant.'\n});\n\nconversation.addMessage('user', 'What is RAG?');\nconversation.addMessage('assistant', 'RAG stands for...');\n\n// Get context for LLM (respects token limits)\nconst context = conversation.getContext();\n\n// Fork, export, or summarize\nconst forked = conversation.fork();\nconst json = conversation.toJSON();\n```\n\n### 📊 RAG Evaluation\nMeasure retrieval quality with standard metrics:\n\n```javascript\nimport { precisionAtK, meanReciprocalRank, RAGEvaluator } from 'quick-rag';\n\n// Individual metrics\nconst retrieved = ['doc1', 'doc4', 'doc2'];\nconst relevant = ['doc1', 'doc2', 'doc3'];\n\nconsole.log(precisionAtK(retrieved, relevant, 3));  // 0.667\nconsole.log(meanReciprocalRank(retrieved, relevant)); // 1.0\n\n// Full evaluation\nconst evaluator = new RAGEvaluator(retriever);\nconst results = await evaluator.evaluate(testQueries);\nconsole.log(results.metrics); // { precision, recall, mrr, ndcg }\n```\n\n### 🗄️ Vector Database Connectors\nConnect to external vector databases:\n\n```javascript\nimport { createVectorStore, ChromaVectorStore, QdrantVectorStore } from 'quick-rag';\n\n// Factory pattern\nconst store = await createVectorStore('chroma', embedFn, {\n  collectionName: 'my-docs',\n  host: 'localhost',\n  port: 8000\n});\n\n// Or direct usage\nconst qdrant = new 
QdrantVectorStore(embedFn, {\n  url: 'http://localhost:6333',\n  collectionName: 'documents'\n});\n```\n\n## 🆕 What's New in v2.4.0\n\n### 🔪 Robust Chunking\nIntelligent text splitting that handles abbreviations and prevents word splitting:\n\n```javascript\nimport { chunkBySentences, chunkText } from 'quick-rag';\n\n// Handles Dr., Prof., LTD., approx., etc.\nconst chunks = chunkBySentences(text, { \n  sentencesPerChunk: 3,\n  overlapSentences: 1 \n});\n\n// Avoids cutting words in half\nconst textChunks = chunkText(text, { \n  chunkSize: 500,\n  overlap: 50,\n  separator: ' ' // Word-safe splitting\n});\n```\n\n### 🔍 Rich Query Explainability\nGet deep insights into why a document was retrieved:\n\n```javascript\nconst results = await retriever.getRelevant(query, 3, { explain: true });\n\nconsole.log(results[0].explanation);\n/*\n{\n  score: 0.88,\n  snippet: \"...context surrounding the match...\",\n  relevanceFactors: {\n    semanticScore: 0.88,\n    termMatch: 0.75,   // 3/4 terms matched\n    density: 0.15      // concentration of keywords\n  }\n}\n*/\n```\n\n## 🔍 What's in v2.2.0\n\n### 🔍 BM25 Sparse Search\nPure JavaScript implementation - no external dependencies!\n\n```javascript\nimport { BM25 } from 'quick-rag';\n\nconst bm25 = new BM25({ k1: 1.2, b: 0.75 });\nbm25.addDocuments([\n  { id: '1', text: 'Machine learning is a subset of AI' },\n  { id: '2', text: 'Deep learning uses neural networks' },\n  { id: '3', text: 'Natural language processing handles text' }\n]);\n\nconst results = bm25.search('neural networks AI', 2);\n// Fast keyword-based retrieval with TF-IDF scoring\n```\n\n### 🔀 Hybrid Search (BM25 + Vector)\nCombine sparse and dense retrieval for 20-30% better results!\n\n```javascript\nimport { HybridRetriever, InMemoryVectorStore } from 'quick-rag';\n\nconst vectorStore = new InMemoryVectorStore(embedFn);\nawait vectorStore.addDocuments(docs);\n\nconst hybrid = new HybridRetriever(vectorStore, {\n  alpha: 0.5,           // Balance: 0=sparse 
only, 1=dense only\n  fusionMethod: 'rrf',  // Reciprocal Rank Fusion\n  rrfK: 60\n});\n\nconst results = await hybrid.search('query', 5, { explain: true });\n// Results include both dense and sparse scores\n```\n\n### 📊 Reranking\nMulti-signal scoring to improve top-K precision:\n\n```javascript\nimport { Reranker, createRerankedRetriever } from 'quick-rag';\n\nconst reranker = new Reranker({\n  keywordWeight: 0.35,   // Keyword overlap\n  semanticWeight: 0.35,  // Semantic similarity\n  coverageWeight: 0.20,  // Query term coverage\n  coherenceWeight: 0.10  // Text coherence\n});\n\n// Rerank any retriever's results\nconst reranked = reranker.rerank(query, initialResults, { explain: true });\n\n// Or wrap a retriever for automatic reranking\nconst smartRetriever = createRerankedRetriever(hybridRetriever, rerankerOptions);\n```\n\n### 🔄 Query Transformation\nAdvanced query processing techniques:\n\n```javascript\nimport { QueryExpander, QueryDecomposer, MultiQueryGenerator } from 'quick-rag';\n\n// 1. Query Expansion - Add synonyms\nconst expander = new QueryExpander();\nexpander.addSynonyms('ml', ['machine learning', 'AI']);\nconst expanded = expander.expand('ml models');\n// \"ml models machine learning AI\"\n\n// 2. Query Decomposition - Split complex queries\nconst decomposer = new QueryDecomposer();\nconst parts = decomposer.decompose('Compare BM25 with vector search and explain differences');\n// [\"Compare BM25 with vector search\", \"explain differences\"]\n\n// 3. 
Multi-Query - Generate variations\nconst generator = new MultiQueryGenerator();\nconst variations = generator.generate('How does RAG work?');\n// [\"How does RAG work?\", \"What is RAG?\", \"RAG explanation\"]\n```\n\n### 🎯 Full Pipeline Example\nCombine all features for maximum retrieval quality:\n\n```javascript\nimport {\n  OllamaRAGClient,\n  createOllamaRAGEmbedding,\n  InMemoryVectorStore,\n  HybridRetriever,\n  createRerankedRetriever,\n  QueryExpander,\n  generateWithRAG\n} from 'quick-rag';\n\n// Setup\nconst client = new OllamaRAGClient();\nconst embed = createOllamaRAGEmbedding(client, 'qwen3-embedding:0.6b');\nconst store = new InMemoryVectorStore(embed);\nawait store.addDocuments(documents);\n\n// Create hybrid + reranked retriever\nconst hybrid = new HybridRetriever(store, { alpha: 0.5, fusionMethod: 'rrf' });\nconst retriever = createRerankedRetriever(hybrid, { keywordWeight: 0.3 });\n\n// Expand query and retrieve\nconst expander = new QueryExpander();\nconst { expanded } = expander.expand(userQuery);\nconst results = await retriever.getRelevant(expanded, 5);\n\n// Generate response\nconst response = await generateWithRAG(client, 'llama3', userQuery, results);\n```\n\n---\n\n## 📚 Previous Features\n\n### 💾 Embedded Persistence (v2.1.0)\nStore your vectors locally without setting up a complex database server!\n- **Zero Setup:** Just provide a file path (`./rag.db`)\n- **Fast:** Built on `better-sqlite3`\n- **Full Features:** Batch insert, metadata filtering, CRUD\n\n### 🛡️ Advanced Error Handling\nNever crash without knowing why. 
The error system provides:\n- **Specific Error Types:** `RAGError`, `EmbeddingError`, `RetrievalError`, etc.\n- **Error Codes:** Programmatic handling\n- **Recovery Hints:** Actionable suggestions in error messages\n\n### 📊 Metrics \u0026 Logging\nMonitor your RAG pipeline in production:\n- **Performance Tracking:** Embedding time, search latency, generation speed\n- **Structured Logs:** JSON format for easy parsing\n- **Prometheus Support:** Export metrics for monitoring dashboards\n\n### 🏷️ Advanced Filtering\nFilter documents with custom logic using JavaScript functions:\n```javascript\nconst results = await retriever.getRelevant('latest AI news', 5, {\n  filter: (meta) =\u003e {\n    return meta.year === 2024 \u0026\u0026 \n           meta.tags.includes('AI') \u0026\u0026\n           meta.difficulty !== 'beginner';\n  }\n});\n```\n\n### 📽️ PowerPoint Support\nLoad .pptx and .ppt files with `officeparser`:\n```javascript\nimport { loadDocument } from 'quick-rag';\nconst pptDoc = await loadDocument('./presentation.pptx');\n```\n\n### 📁 Organized Examples\n12 comprehensive examples covering all features:\n- Basic Usage (Ollama \u0026 LM Studio)\n- Document Loading (PDF, Word, Excel)\n- Metadata Filtering\n- Streaming Responses\n- Advanced Filtering\n- Query Explainability\n- Prompt Management\n- Decision Engine (Simple \u0026 Real-World)\n- Conversation History \u0026 Export\n- New `examples/` folder for direct `npm i quick-rag` usage\n\n---\n\n## 🆕 Previous Features (v1.1.x)\n\n### 📝 Internationalization Update\n- Translated all example files to English for better international accessibility\n- `examples/10-decision-engine.js` - Smart Document Selection example\n- `examples/11-loaders.js` - Document loaders example\n\n### 🧠 Decision Engine (v1.1.0)\n\n**Revolutionary AI-powered retrieval system** - The most advanced RAG retrieval available!\n\nQuick RAG now includes a **Decision Engine** that goes far beyond simple cosine similarity. 
It combines:\n- 🎯 **Multi-Criteria Weighted Scoring** - 5 factors evaluated together\n- 🧠 **Heuristic Reasoning** - Pattern-based query optimization  \n- **Adaptive Learning** - Learns from user feedback\n- 🔍 **Full Transparency** - See exactly why each document was selected\n\n#### Multi-Criteria Scoring\n\n**5 weighted factors beyond similarity:**\n\n1. **📊 Semantic Similarity** (50%) - Cosine similarity score\n2. **🔤 Keyword Match** (20%) - Term matching in document\n3. **📅 Recency** (15%) - Document freshness with exponential decay\n4. **⭐ Source Quality** (10%) - Source reliability (official=1.0, research=0.9, blog=0.7, forum=0.6)\n5. **🎯 Context Relevance** (5%) - Contextual fit\n\n```javascript\nimport { SmartRetriever, DEFAULT_WEIGHTS } from 'quick-rag';\n\n// Create smart retriever with default weights\nconst defaultRetriever = new SmartRetriever(basicRetriever);\n\n// Or customize weights for your use case\nconst smartRetriever = new SmartRetriever(basicRetriever, {\n  weights: {\n    semanticSimilarity: 0.35,\n    keywordMatch: 0.20,\n    recency: 0.30,         // Higher for news sites\n    sourceQuality: 0.10,\n    contextRelevance: 0.05\n  }\n});\n\n// Get results with decision transparency\nconst response = await smartRetriever.getRelevant('latest AI news', 3);\n\n// See scoring breakdown for each document\nconsole.log(response.results[0]);\n// {\n//   text: \"...\",\n//   weightedScore: 0.742,\n//   scoreBreakdown: {\n//     semanticSimilarity: { score: 0.85, weight: 0.35, contribution: 0.298 },\n//     keywordMatch: { score: 0.67, weight: 0.20, contribution: 0.134 },\n//     recency: { score: 0.95, weight: 0.30, contribution: 0.285 },\n//     sourceQuality: { score: 0.90, weight: 0.10, contribution: 0.090 },\n//     contextRelevance: { score: 1.00, weight: 0.05, contribution: 0.050 }\n//   }\n// }\n\n// Decision context shows WHY these results\nconsole.log(response.decisions);\n// {\n//   weights: { ... 
},\n//   appliedRules: [\"boost-recent-for-news\"],\n//   suggestions: [\n//     \"Time-sensitive query detected. Prioritizing recent documents.\",\n//     \"Consider using filters if you need older historical content.\"\n//   ]\n// }\n```\n\n#### Heuristic Reasoning\n\n**Pattern-based optimization that learns:**\n\n```javascript\n// Enable learning mode\nconst smartRetriever = new SmartRetriever(basicRetriever, {\n  enableLearning: true,\n  enableHeuristics: true\n});\n\n// Add custom rules\nsmartRetriever.heuristicEngine.addRule(\n  'boost-documentation',\n  (query, context) =\u003e query.includes('documentation'),\n  (query, context) =\u003e {\n    context.adjustWeight('sourceQuality', 0.15);  // Increase quality weight\n    return { adjusted: true, reason: 'Documentation query prioritizes quality' };\n  },\n  5  // Priority\n);\n\n// Provide feedback to enable learning\nsmartRetriever.provideFeedback(query, results, {\n  rating: 5,           // 1-5 rating\n  hasFilters: true,    // User applied filters\n  comment: 'Perfect results!'\n});\n\n// System learns successful patterns\nconst insights = smartRetriever.getInsights();\nconsole.log(insights.heuristics.successfulPatterns);\n// [\"latest\", \"documentation\", \"official release\"]\n\n// Export learned knowledge\nconst knowledge = smartRetriever.exportKnowledge();\n\n// Import to another instance\nnewRetriever.importKnowledge(knowledge);\n```\n\n#### Scenario Customization\n\n**Different weights for different use cases:**\n\n```javascript\n// News Platform - Recency Priority\nconst newsRetriever = new SmartRetriever(basicRetriever, {\n  weights: {\n    semanticSimilarity: 0.30,\n    keywordMatch: 0.20,\n    recency: 0.40,         // 🔥 High recency\n    sourceQuality: 0.05,\n    contextRelevance: 0.05\n  }\n});\n\n// Documentation Site - Quality Priority  \nconst docsRetriever = new SmartRetriever(basicRetriever, {\n  weights: {\n    semanticSimilarity: 0.35,\n    keywordMatch: 0.20,\n    recency: 0.10,\n    
sourceQuality: 0.30,   // 🔥 High quality\n    contextRelevance: 0.05\n  }\n});\n\n// Research Platform - Balanced\nconst researchRetriever = new SmartRetriever(basicRetriever, {\n  weights: DEFAULT_WEIGHTS  // Balanced approach\n});\n```\n\n#### Real-World Example\n\nSee `examples/11-loaders.js` for a complete example with:\n- PDF document loading\n- Multiple source types (official, blog, research, forum)\n- 3 different scenarios (news, documentation, research)\n- RAG generation with quality metrics\n- Decision transparency and explanations\n\n**Benefits:**\n- ✅ More accurate retrieval than pure similarity\n- ✅ Adapts to different content types automatically\n- ✅ Learns from user interactions\n- ✅ Fully explainable decisions\n- ✅ Customizable for any use case\n- ✅ Production-ready with proven patterns\n\n### 🔍 Query Explainability (v1.1.0)\n**Understand WHY documents were retrieved** - A first-of-its-kind feature!\n\n```javascript\nconst results = await retriever.getRelevant('What is Ollama?', 3, {\n  explain: true\n});\n\n// Each result includes detailed explanation:\nconsole.log(results[0].explanation);\n// {\n//   queryTerms: [\"ollama\", \"local\", \"ai\"],\n//   matchedTerms: [\"ollama\", \"local\"],\n//   matchCount: 2,\n//   matchRatio: 0.67,\n//   cosineSimilarity: 0.856,\n//   relevanceFactors: {\n//     termMatches: 2,\n//     semanticSimilarity: 0.856,\n//     coverage: \"67%\"\n//   }\n// }\n```\n\n**Use cases:** Debug searches, optimize queries, validate accuracy, explain to users\n\n### 🎨 Dynamic Prompt Management (v1.1.0)\n**10 built-in templates + full customization**\n\n```javascript\n// Quick template selection\nawait generateWithRAG(client, model, query, docs, {\n  template: 'conversational'  // or: technical, academic, code, etc.\n});\n\n// System prompts for role definition\nawait generateWithRAG(client, model, query, docs, {\n  systemPrompt: 'You are a helpful programming tutor',\n  template: 'instructional'\n});\n\n// Advanced: Reusable 
PromptManager\nimport { createPromptManager } from 'quick-rag';\n\nconst promptMgr = createPromptManager({\n  systemPrompt: 'You are an expert engineer',\n  template: 'technical'\n});\n\nawait generateWithRAG(client, model, query, docs, {\n  promptManager: promptMgr\n});\n```\n\n**Templates:** `default`, `conversational`, `technical`, `academic`, `code`, `concise`, `detailed`, `qa`, `instructional`, `creative`\n\n---\n\n## 🚀 Quick Start\n\n### Option 1: With Official Ollama SDK (Recommended)\n\n```javascript\nimport { \n  OllamaRAGClient, \n  createOllamaRAGEmbedding,\n  InMemoryVectorStore, \n  Retriever \n} from 'quick-rag';\n\n// 1. Initialize client (official SDK)\nconst client = new OllamaRAGClient({\n  host: 'http://127.0.0.1:11434'\n});\n\n// 2. Setup embedding\nconst embed = createOllamaRAGEmbedding(client, 'qwen3-embedding:0.6b');\n\n// 3. Create vector store\nconst vectorStore = new InMemoryVectorStore(embed);\nconst retriever = new Retriever(vectorStore);\n\n// 4. Add documents\nawait vectorStore.addDocument({ \n  text: 'Ollama provides local LLM hosting.' \n});\n\n// 5. 
Query with streaming (official SDK feature!)\nconst results = await retriever.getRelevant('What is Ollama?', 2);\nconst context = results.map(d =\u003e d.text).join('\\n');\n\nconst response = await client.chat({\n  model: 'granite4:3b',\n  messages: [{ \n    role: 'user', \n    content: `Context: ${context}\\n\\nQuestion: What is Ollama?` \n  }],\n  stream: true, // Official SDK streaming!\n});\n\n// Stream response\nfor await (const part of response) {\n  process.stdout.write(part.message?.content || '');\n}\n```\n\n---\n\n### Option 2: React with Vite\n\n\u003e **💡 Starting from scratch?** Check out the detailed step-by-step guide in [QUICKSTART_REACT.md](./QUICKSTART_REACT.md)!\n\n**Step 1:** Create your project\n\n```bash\nnpm create vite@latest my-rag-app -- --template react\ncd my-rag-app\nnpm install quick-rag express concurrently\n```\n\n**Step 2:** Create backend proxy (`server.js` in project root)\n\n```javascript\nimport express from 'express';\nimport { OllamaRAGClient } from 'quick-rag';\n\nconst app = express();\napp.use(express.json());\n\nconst client = new OllamaRAGClient({ host: 'http://127.0.0.1:11434' });\n\napp.post('/api/generate', async (req, res) =\u003e {\n  const { model = 'granite4:3b', messages } = req.body;\n  const response = await client.chat({ model, messages, stream: false });\n  res.json({ response: response.message.content });\n});\n\napp.post('/api/embed', async (req, res) =\u003e {\n  const { model = 'qwen3-embedding:0.6b', input } = req.body;\n  const response = await client.embed(model, input);\n  res.json(response);\n});\n\napp.listen(3001, () =\u003e console.log('🚀 Server: http://127.0.0.1:3001'));\n```\n\n**Step 3:** Configure Vite proxy (`vite.config.js`)\n\n```javascript\nimport { defineConfig } from 'vite';\nimport react from '@vitejs/plugin-react';\n\nexport default defineConfig({\n  plugins: [react()],\n  server: {\n    proxy: {\n      '/api': {\n        target: 'http://127.0.0.1:3001',\n        changeOrigin: true\n   
   }\n    }\n  }\n});\n```\n\n**Step 4:** Update `package.json` scripts\n\n```json\n{\n  \"scripts\": {\n    \"dev\": \"concurrently \\\"npm:server\\\" \\\"npm:client\\\"\",\n    \"server\": \"node server.js\",\n    \"client\": \"vite\"\n  }\n}\n```\n\n**Step 5:** Use in your React component (`src/App.jsx`)\n\n```jsx\nimport { useState, useEffect } from 'react';\nimport { useRAG, initRAG, createBrowserModelClient } from 'quick-rag';\n\nconst docs = [\n  { id: '1', text: 'React is a JavaScript library for building user interfaces.' },\n  { id: '2', text: 'Ollama provides local LLM hosting.' },\n  { id: '3', text: 'RAG combines retrieval with AI generation.' }\n];\n\nexport default function App() {\n  const [rag, setRAG] = useState(null);\n  const [query, setQuery] = useState('');\n  \n  const { run, loading, response, docs: results } = useRAG({\n    retriever: rag?.retriever,\n    modelClient: createBrowserModelClient(),\n    model: 'granite4:3b'\n  });\n\n  useEffect(() =\u003e {\n    initRAG(docs, {\n      baseEmbeddingOptions: {\n        useBrowser: true,\n        baseUrl: '/api/embed',\n        model: 'qwen3-embedding:0.6b'\n      }\n    }).then(core =\u003e setRAG(core));\n  }, []);\n\n  return (\n    \u003cdiv style={{ padding: 40 }}\u003e\n      \u003ch1\u003e🤖 RAG Demo\u003c/h1\u003e\n      \u003cinput \n        value={query} \n        onChange={e =\u003e setQuery(e.target.value)}\n        placeholder=\"Ask something...\"\n        style={{ width: 300, padding: 10 }}\n      /\u003e\n      \u003cbutton onClick={() =\u003e run(query)} disabled={loading}\u003e\n        {loading ? 'Thinking...' 
: 'Ask AI'}\n      \u003c/button\u003e\n      \n      {results \u0026\u0026 (\n        \u003cdiv\u003e\n          \u003ch3\u003e📚 Retrieved:\u003c/h3\u003e\n          {results.map(d =\u003e \u003cp key={d.id}\u003e{d.text}\u003c/p\u003e)}\n        \u003c/div\u003e\n      )}\n      \n      {response \u0026\u0026 (\n        \u003cdiv\u003e\n          \u003ch3\u003e✨ Answer:\u003c/h3\u003e\n          \u003cp\u003e{response}\u003c/p\u003e\n        \u003c/div\u003e\n      )}\n    \u003c/div\u003e\n  );\n}\n```\n\n**Step 6:** Run your app\n\n```bash\nnpm run dev\n```\n\nOpen `http://localhost:5173` 🎉\n\n---\n\n### Option 2 (Alternative): Next.js (Pages Router)\n\n**Step 1:** Create API routes\n\n```javascript\n// pages/api/generate.js\nimport { OllamaClient } from 'quick-rag';\n\nexport default async function handler(req, res) {\n  const client = new OllamaClient();\n  const { model = 'granite4:3b', prompt } = req.body;\n  const response = await client.generate(model, prompt);\n  res.json({ response });\n}\n```\n\n```javascript\n// pages/api/embed.js\nimport { OllamaClient } from 'quick-rag';\n\nexport default async function handler(req, res) {\n  const client = new OllamaClient();\n  const { model = 'qwen3-embedding:0.6b', input } = req.body;\n  const response = await client.embed(model, input);\n  res.json(response);\n}\n```\n\n**Step 2:** Use in your page (same React component as above)\n\n---\n\n### Option 3: Vanilla JavaScript (Node.js)\n\n**Simple approach with official Ollama SDK:**\n\n```javascript\nimport { \n  OllamaRAGClient, \n  createOllamaRAGEmbedding, \n  InMemoryVectorStore, \n  Retriever \n} from 'quick-rag';\n\n// 1. Initialize client\nconst client = new OllamaRAGClient();\n\n// 2. Setup embedding\nconst embed = createOllamaRAGEmbedding(client, 'qwen3-embedding:0.6b');\n\n// 3. Create vector store and retriever\nconst vectorStore = new InMemoryVectorStore(embed);\nconst retriever = new Retriever(vectorStore);\n\n// 4. 
Add documents\nawait vectorStore.addDocuments([\n  { text: 'JavaScript is a programming language.' },\n  { text: 'Python is great for data science.' },\n  { text: 'Rust is a systems programming language.' }\n]);\n\n// 5. Query\nconst query = 'What is JavaScript?';\nconst results = await retriever.getRelevant(query, 2);\n\n// 6. Generate answer\nconst context = results.map(d =\u003e d.text).join('\\n');\nconst response = await client.chat({\n  model: 'granite4:3b',\n  messages: [{ \n    role: 'user', \n    content: `Context:\\n${context}\\n\\nQuestion: ${query}\\n\\nAnswer:` \n  }]\n});\n\n// Clean output\nconsole.log('📚 Retrieved:', results.map(d =\u003e d.text));\nconsole.log('🤖 Answer:', response.message.content);\n```\n\n**Output:**\n```\n📚 Retrieved: [\n  'JavaScript is a programming language.',\n  'Python is great for data science.'\n]\n🤖 Answer: JavaScript is a programming language that allows developers \nto write code and implement functionality in web browsers...\n```\n\n---\n\n### Option 4: LM Studio 🎨\n\nUse LM Studio instead of Ollama with OpenAI-compatible API:\n\n```javascript\nimport { \n  LMStudioRAGClient, \n  createLMStudioRAGEmbedding, \n  InMemoryVectorStore, \n  Retriever, \n  generateWithRAG \n} from 'quick-rag';\n\n// 1. Initialize LM Studio client\nconst client = new LMStudioRAGClient();\n\n// 2. Setup embedding (use your embedding model from LM Studio)\nconst embed = createLMStudioRAGEmbedding(client, 'text-embedding-embeddinggemma-300m');\n\n// 3. Create vector store and retriever\nconst vectorStore = new InMemoryVectorStore(embed);\nconst retriever = new Retriever(vectorStore);\n\n// 4. Add documents\nawait vectorStore.addDocuments([\n  { text: 'LM Studio is a desktop app for running LLMs locally.' },\n  { text: 'It provides an OpenAI-compatible API.' },\n  { text: 'You can use models like Llama, Mistral, and more.' }\n]);\n\n// 5. 
Query with RAG\nconst results = await retriever.getRelevant('What is LM Studio?', 2);\nconst answer = await generateWithRAG(\n  client,\n  'google/gemma-3-4b', // or your model name\n  'What is LM Studio?',\n  results\n);\n\nconsole.log('Answer:', answer);\n```\n\n**Prerequisites for LM Studio:**\n\n1. Download and install [LM Studio](https://lmstudio.ai)\n2. Download a language model (e.g., Llama 3.2, Mistral)\n3. Download an embedding model (e.g., text-embedding-embeddinggemma-300m)\n4. Start the local server: `Developer \u003e Local Server` (default: `http://localhost:1234`)\n\n**For React projects:** Import from `'quick-rag/react'` to use hooks:\n\n```javascript\nimport { useRAG } from 'quick-rag/react';\n// or\nimport { useRAG } from 'quick-rag'; // Also works in React projects\n```\n\n---\n\n## 📖 API Reference\n\n### React Hook: `useRAG`\n\n```javascript\nconst { run, loading, response, docs, streaming, error } = useRAG({\n  retriever,        // Retriever instance\n  modelClient,      // Model client (OllamaClient or BrowserModelClient)\n  model            // Model name (e.g., 'granite4:3b')\n});\n\n// Ask a question\nawait run('What is React?');\n\n// With options\nawait run('What is React?', {\n  topK: 5,           // Number of documents to retrieve\n  stream: true,      // Enable streaming\n  onDelta: (chunk, fullText) =\u003e console.log(chunk)\n});\n```\n\n### Core Functions\n\n**Initialize RAG**\n\n```javascript\nconst { retriever, store, mrl } = await initRAG(documents, {\n  defaultDim: 128,              // Embedding dimension\n  k: 2,                         // Default number of results\n  mrlBaseDim: 768,             // Base embedding dimension\n  baseEmbeddingOptions: {\n    useBrowser: true,           // Use browser-safe fetch\n    baseUrl: '/api/embed',      // Embedding endpoint\n    model: 'qwen3-embedding:0.6b'    // Embedding model\n  }\n});\n```\n\n**Generate with RAG**\n\n```javascript\nconst result = await generateWithRAG({\n  retriever,\n  
modelClient,\n  model,\n  query: 'Your question',\n  topK: 3              // Optional: override default k\n});\n\n// Returns: { docs, response, prompt }\n```\n\n### VectorStore API\n\n```javascript\nconst store = new InMemoryVectorStore(embeddingFn, { defaultDim: 128 });\n\n// Add documents\nawait store.addDocument({ id: '1', text: 'Document text' });\n\n// Add multiple documents with batch processing (v2.0.3!)\nawait store.addDocuments([{ id: '1', text: '...' }], { \n  dim: 128,\n  batchSize: 20,        // Process 20 chunks at a time\n  maxConcurrent: 5,     // Max 5 concurrent requests\n  onProgress: (current, total) =\u003e {\n    console.log(`Progress: ${current}/${total}`);\n  }\n});\n\n// Query\nconst results = await store.similaritySearch('query', k, queryDim);\n\n// CRUD\nconst doc = store.getDocument('id');\nconst all = store.getAllDocuments();\nawait store.updateDocument('id', 'new text', { meta: 'data' });\nstore.deleteDocument('id');\nstore.clear();\n```\n\n**Batch Processing for Large Documents (v2.0.3):**\n\n```javascript\n// Process large PDFs efficiently\nconst chunks = chunkDocuments([largePDF], { chunkSize: 1000, overlap: 100 });\n\nawait store.addDocuments(chunks, {\n  batchSize: 20,        // Process 20 chunks per batch\n  maxConcurrent: 5,     // Max 5 concurrent embedding requests\n  onProgress: (current, total) =\u003e {\n    console.log(`Embedding progress: ${current}/${total} (${Math.round(current/total*100)}%)`);\n  }\n});\n```\n\n### Model Clients\n\n**Browser (with proxy)**\n\n```javascript\nconst client = createBrowserModelClient({\n  endpoint: '/api/generate'  // Your proxy endpoint\n});\n```\n\n**Node.js (direct)**\n\n```javascript\nconst client = new OllamaClient({\n  baseUrl: 'http://127.0.0.1:11434/api'\n});\n```\n\n---\n\n## 💡 Examples\n\n### CRUD Operations\n\n```javascript\n// Add document dynamically\nawait store.addDocument({ \n  id: 'new-doc', \n  text: 'TypeScript adds types to JavaScript.' 
\n});\n\n// Add multiple documents with batch processing (v2.0.3!)\nawait store.addDocuments([\n  { id: 'doc1', text: 'First document' },\n  { id: 'doc2', text: 'Second document' }\n], {\n  batchSize: 10,        // Process in batches\n  maxConcurrent: 5,     // Rate limiting\n  onProgress: (current, total) =\u003e {\n    console.log(`Added ${current}/${total} documents`);\n  }\n});\n\n// Update existing\nawait store.updateDocument('1', 'React 19 is the latest version.', {\n  version: '19',\n  updated: Date.now()\n});\n\n// Delete\nstore.deleteDocument('2');\n\n// Query all\nconst allDocs = store.getAllDocuments();\nconsole.log(`Total documents: ${allDocs.length}`);\n```\n\n### Dynamic Retrieval\n\n```javascript\n// Ask with different topK values\nconst result1 = await run('What is JavaScript?', { topK: 1 }); // Get 1 doc\nconst result2 = await run('What is JavaScript?', { topK: 5 }); // Get 5 docs\n```\n\n### Streaming Responses\n\n```javascript\nawait run('Explain React hooks', {\n  stream: true,\n  onDelta: (chunk, fullText) =\u003e {\n    console.log('New chunk:', chunk);\n    // Update UI in real-time\n  }\n});\n```\n\n### Custom Embedding Models\n\n```javascript\n// Use different embedding models\nconst rag = await initRAG(docs, {\n  baseEmbeddingOptions: {\n    useBrowser: true,\n    baseUrl: '/api/embed',\n    model: 'qwen3-embedding:0.6b'  // or another compatible embedding model\n  }\n});\n```\n\n**More examples:** Check the [`examples/`](./examples) folder for complete demos.\n\n---\n\n## 📄 Document Loaders (v0.7.4+)\n\nLoad documents from various formats and use them with RAG!\n\n### Supported Formats\n\n| Format | Function | Requires |\n|--------|----------|----------|\n| PDF | `loadPDF()` | `npm install pdf-parse` |\n| Word (.docx) | `loadWord()` | `npm install mammoth` |\n| Excel (.xlsx) | `loadExcel()` | `npm install xlsx` |\n| Text (.txt) | `loadText()` | Built-in ✅ |\n| JSON | `loadJSON()` | Built-in ✅ |\n| Markdown | `loadMarkdown()` | Built-in ✅ 
|\n| Web URLs | `loadURL()` | Built-in ✅ |\n\n### Quick Start\n\n**Load PDF:**\n```javascript\nimport { loadPDF, chunkDocuments } from 'quick-rag';\n\n// Load PDF\nconst pdf = await loadPDF('./document.pdf');\nconsole.log(`Loaded ${pdf.meta.pages} pages`);\n\n// Chunk and add to RAG\nconst chunks = chunkDocuments([pdf], { \n  chunkSize: 500, \n  overlap: 50 \n});\nawait store.addDocuments(chunks);\n```\n\n**Load from URL:**\n```javascript\nimport { loadURL } from 'quick-rag';\n\nconst doc = await loadURL('https://example.com', {\n  extractText: true  // Convert HTML to plain text\n});\nawait store.addDocuments([doc]);\n```\n\n**Load Directory:**\n```javascript\nimport { loadDirectory } from 'quick-rag';\n\n// Load all supported documents from a folder\nconst docs = await loadDirectory('./documents', {\n  extensions: ['.pdf', '.docx', '.txt', '.md'],\n  recursive: true\n});\n\nconsole.log(`Loaded ${docs.length} documents`);\n\n// Chunk and add to vector store\nconst chunks = chunkDocuments(docs, { chunkSize: 500 });\nawait store.addDocuments(chunks);\n```\n\n**Auto-Detect Format:**\n```javascript\nimport { loadDocument } from 'quick-rag';\n\n// Automatically detects file type\nconst doc = await loadDocument('./file.pdf');\n// Works with: .pdf, .docx, .xlsx, .txt, .md, .json\n```\n\n### Installation\n\n```bash\n# Core package (includes text, JSON, markdown, URL loaders)\nnpm install quick-rag\n\n# Optional: PDF support\nnpm install pdf-parse\n\n# Optional: Word support\nnpm install mammoth\n\n# Optional: Excel support\nnpm install xlsx\n\n# Or install all at once:\nnpm install quick-rag pdf-parse mammoth xlsx\n```\n\n### Complete Example\n\n```javascript\nimport {\n  loadPDF,\n  loadDirectory,\n  chunkDocuments,\n  InMemoryVectorStore,\n  Retriever,\n  OllamaRAGClient,\n  createOllamaRAGEmbedding,\n  generateWithRAG\n} from 'quick-rag';\n\n// Load documents\nconst pdf = await loadPDF('./research.pdf');\nconst docs = await loadDirectory('./articles');\n\n// Combine 
and chunk\nconst allDocs = [pdf, ...docs];\nconst chunks = chunkDocuments(allDocs, { \n  chunkSize: 500,\n  overlap: 50 \n});\n\n// Setup RAG\nconst client = new OllamaRAGClient();\nconst embed = createOllamaRAGEmbedding(client, 'qwen3-embedding:0.6b');\nconst store = new InMemoryVectorStore(embed);\nconst retriever = new Retriever(store);\n\n// Add to vector store\nawait store.addDocuments(chunks);\n\n// Query\nconst results = await retriever.getRelevant('What is the main topic?', 3);\nconst answer = await generateWithRAG(client, 'granite4:3b', \n  'What is the main topic?', results);\n\nconsole.log(answer);\n```\n\n**See full example:** [`examples/11-loaders.js`](./examples/11-loaders.js)\n\n---\n\n## ❓ Troubleshooting\n\n| Problem | Solution |\n|---------|----------|\n| 🚫 **CORS errors** | Use a proxy server (Express/Next.js API routes) |\n| 🔌 **Connection refused** | Ensure Ollama is running: `ollama serve` |\n| 📦 **Models not found** | Pull models: `ollama pull granite4:3b \u0026\u0026 ollama pull qwen3-embedding:0.6b` |\n| 🌐 **404 on `/api/embed`** | Check your proxy configuration in `vite.config.js` or API routes |\n| 💻 **Windows IPv6 issues** | Use `127.0.0.1` instead of `localhost` |\n| 📦 **Module not found** | Check imports: use `'quick-rag'` not `'quick-rag/...'` |\n\n\u003e **Note:** v0.6.5+ automatically detects and uses the correct API (generate or chat) for any model.\n\n---\n\n## 📚 Documentation\n\n- **📖 [API Reference](./docs/API_REFERENCE.md)** - Complete API documentation\n- **🛡️ [Error Handling](./docs/ERROR_HANDLING.md)** - Error handling best practices\n- **💾 [SQLite Persistence](./docs/SQLITE_PERSISTENCE.md)** - Embedded storage guide\n- **📊 [Metrics \u0026 Telemetry](./docs/METRICS_TELEMETRY.md)** - Monitoring and logging\n- **🤝 [Contributing](./CONTRIBUTING.md)** - Contribution guidelines\n- **📝 [Changelog](./CHANGELOG.md)** - Version history\n- **💡 [Examples](./examples)** - Working code examples\n- **🚀 [Quickstart](./quickstart)** - Quick 
start guides\n\n## 🔗 Resources\n\n- **Ollama Models:** [ollama.ai/library](https://ollama.ai/library)\n- **LM Studio:** [lmstudio.ai](https://lmstudio.ai)\n- **Issues:** [GitHub Issues](https://github.com/emredeveloper/quick-rag/issues)\n- **Discussions:** [GitHub Discussions](https://github.com/emredeveloper/quick-rag/discussions)\n- **NPM Package:** [npmjs.com/package/quick-rag](https://www.npmjs.com/package/quick-rag)\n\n---\n\n## 📄 License\n\nMIT © [Cihat Emre Karataş](https://github.com/emredeveloper)\n\n---\n\n## 🙏 Acknowledgments\n\nBuilt with:\n- [Ollama JS SDK](https://github.com/ollama/ollama-js)\n- [LM Studio SDK](https://github.com/lmstudio-ai/lmstudio-js)\n- [Pino](https://github.com/pinojs/pino) - Fast logging\n- [Better SQLite3](https://github.com/WiseLibs/better-sqlite3) - Embedded database\n\nSpecial thanks to all contributors and the open-source community!\n\n---\n\n**Made with ❤️ for the JavaScript \u0026 AI community**\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femredeveloper%2Fquick-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Femredeveloper%2Fquick-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femredeveloper%2Fquick-rag/lists"}