{"id":49618832,"url":"https://github.com/shrimpy8/semantic-search-next","last_synced_at":"2026-05-05T00:39:26.531Z","repository":{"id":328103653,"uuid":"1112379636","full_name":"shrimpy8/semantic-search-next","owner":"shrimpy8","description":"A full-stack RAG (Retrieval Augmented Generation) application with hybrid search, cross-encoder reranking, citation-verified AI answers, and LLM-as-Judge evaluation. Supports multiple AI providers including OpenAI, Anthropic, and Ollama for fully local operation.","archived":false,"fork":false,"pushed_at":"2026-02-13T07:44:09.000Z","size":6806,"stargazers_count":1,"open_issues_count":5,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-05T00:38:58.918Z","etag":null,"topics":["ai-answers","anthropic","chromadb","citation-validation","document-search","embeddings","fastapi","hybrid-search","jina","llm","natural-language-processing","nextjs","ollama","openai","postgresql","python","rag","retrieval-augmented-generation","semantic-search","vector-database"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shrimpy8.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-08T14:44:44.000Z","updated_at":"2026-02-13T07:44:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/shrimpy8/semantic-search-next","commit_stats":null,"previous_names":["shrimpy8/semantic-search-next"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/shrimpy8/semantic-search-next","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrimpy8%2Fsemantic-search-next","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrimpy8%2Fsemantic-search-next/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrimpy8%2Fsemantic-search-next/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrimpy8%2Fsemantic-search-next/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shrimpy8","download_url":"https://codeload.github.com/shrimpy8/semantic-search-next/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrimpy8%2Fsemantic-search-next/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32631058,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"ssl_error","status_checked_at":"2026-05-04T10:08:02.005Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-answers","anthropic","chromadb","citation-validation","document-search","embeddings","fastapi","hybrid-search","jina","llm","natural-language-processing","nextjs","ollama","openai","postgresql","python","rag","retrieval-augmented-generation","semantic-search","vector-database"],"created_at":"2026-05-05T00:39:23.818Z","updated_at":"2026-05-05T00:39:26.513Z","avatar_url":"https://github.com/shrimpy8.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Semantic Search Next\n\nA full-stack RAG (Retrieval Augmented Generation) application with hybrid search, cross-encoder reranking, citation-verified AI answers, and LLM-as-Judge evaluation. Supports multiple AI providers including OpenAI, Anthropic, and Ollama for fully local operation.\n\n## Pipeline Overview\n\nWhen you search, here's what happens behind the scenes:\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e📄\u003c/h3\u003e\n\u003cb\u003e1. Document\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eUpload \u0026 organize into collections\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e✂️\u003c/h3\u003e\n\u003cb\u003e2. Chunk\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eSplit into searchable pieces\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e🧮\u003c/h3\u003e\n\u003cb\u003e3. Embed\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eConvert to vector embeddings\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e🔀\u003c/h3\u003e\n\u003cb\u003e4. Search\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eHybrid BM25 + Semantic\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e🏆\u003c/h3\u003e\n\u003cb\u003e5. Rerank\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eCross-encoder refinement\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e💬\u003c/h3\u003e\n\u003cb\u003e6. Answer\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eRAG-powered response\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd align=\"center\" width=\"14%\"\u003e\n\u003ch3\u003e📊\u003c/h3\u003e\n\u003cb\u003e7. Eval\u003c/b\u003e\u003cbr/\u003e\n\u003csub\u003eLLM-as-Judge quality\u003c/sub\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### Pipeline Options\n\n| Step | What It Does | Provider Options | Key Parameters |\n|:-----|:-------------|:-----------------|:---------------|\n| **Chunk** | Splits documents into searchable pieces | Built-in | `chunk_size` (default: 1000), `chunk_overlap` (200) |\n| **Embed** | Converts text to vectors for semantic search | **Cloud:** OpenAI, Voyage, Cohere, Jina\u003cbr/\u003e**Local:** Ollama | `embedding_model` |\n| **Search** | Finds relevant chunks using hybrid retrieval | BM25 (keywords) + ChromaDB (semantic) | `alpha` (0=keywords, 1=semantic), `top_k`, `preset` |\n| **Rerank** | AI scores each result for precise ranking | **Cloud:** Cohere\u003cbr/\u003e**Local:** Jina | `reranker_provider` (auto/jina/cohere/none) |\n| **Answer** | Generates response from retrieved context | **Cloud:** OpenAI, Anthropic\u003cbr/\u003e**Local:** Ollama | `answer_provider`, `answer_model` |\n| **Eval** | Measures retrieval \u0026 answer quality | **Cloud:** OpenAI, Anthropic\u003cbr/\u003e**Local:** Ollama | `eval_judge_provider`, `eval_judge_model` |\n\n\u003e **All settings configurable via the Settings page (`/settings`)**. For fully local operation, use Ollama + Jina — no API keys required.\n\n## How Search Works\n\n```\n┌─────────────────────────────────────────────────────────────────────────────────┐\n│                              SEARCH FLOW                                        │\n└─────────────────────────────────────────────────────────────────────────────────┘\n\n  ┌─────────────┐\n  │  Your Query │\n  │ \"How does   │\n  │  auth work?\"│\n  └──────┬──────┘\n         │\n         ▼\n  ┌─────────────┐     ┌────────────────────────────────────────────────────────┐\n  │   EMBED     │     │  Convert query to 3072-dimensional vector using AI     │\n  │   QUERY     │────▶│  (OpenAI text-embedding-3-large)                       │\n  └──────┬──────┘     └────────────────────────────────────────────────────────┘\n         │\n         ▼\n  ┌─────────────────────────────────────────┐\n  │         PARALLEL RETRIEVAL              │\n  │  ┌─────────────┐    ┌─────────────┐     │\n  │  │  SEMANTIC   │    │    BM25     │     │\n  │  │   SEARCH    │    │  KEYWORDS   │     │\n  │  │ (ChromaDB)  │    │ (In-memory) │     │\n  │  │             │    │             │     │\n  │  │ Finds by    │    │ Finds by    │     │\n  │  │  meaning    │    │ exact terms │     │\n  │  └──────┬──────┘    └──────┬──────┘     │\n  │         │                  │            │\n  └─────────┼──────────────────┼────────────┘\n            │                  │\n            ▼                  ▼\n  ┌─────────────────────────────────────────┐\n  │    RECIPROCAL RANK FUSION (RRF)         │\n  │                                         │\n  │  Intelligently merge both result sets   │\n  │  α=0.5 → 50% semantic + 50% keywords    │\n  └──────────────────┬──────────────────────┘\n                     │\n                     ▼\n  ┌─────────────────────────────────────────┐\n  │    CROSS-ENCODER RERANKING              │\n  │                                         │\n  │  AI model scores each query-document    │\n  │  pair for precise relevance (0-100%)    │\n  └──────────────────┬──────────────────────┘\n                     │\n                     ▼\n  ┌─────────────────────────────────────────┐\n  │    CONFIDENCE FILTERING                 │\n  │                                         │\n  │  ┌─────────────┐    ┌─────────────┐     │\n  │  │    HIGH     │    │     LOW     │     │\n  │  │ CONFIDENCE  │    │ CONFIDENCE  │     │\n  │  │  (≥30%)     │    │  (\u003c30%)     │     │\n  │  │  Shown      │    │  Hidden     │     │\n  │  └─────────────┘    └─────────────┘     │\n  └──────────────────┬──────────────────────┘\n                     │\n                     ▼\n  ┌─────────────────────────────────────────┐\n  │    OPTIONAL: AI ANSWER + CITATIONS      │\n  │                                         │\n  │  RAG-powered answer with verification   │\n  │  Each claim linked to source document   │\n  └─────────────────────────────────────────┘\n```\n\n\u003e **Want to learn more?** The app includes an interactive **\"How It Works\"** page with detailed explanations of each concept, search quality progression, and settings guidance. See the [screenshots below](#how-it-works-page) or explore it yourself when running the app.\n\n## Features\n\n| Feature | Description |\n|---------|-------------|\n| **Hybrid Retrieval** | Combines BM25 keyword search with semantic embeddings using Reciprocal Rank Fusion (RRF) |\n| **AI Answer Generation** | RAG-powered answers with citation verification and hallucination detection |\n| **[RAG Evaluations](#rag-evaluations)** | LLM-as-Judge evaluation with retrieval \u0026 answer quality metrics |\n| **AI Reranking** | Uses Jina cross-encoder (local) or Cohere API to rerank results for relevance |\n| **Confidence Filtering** | Separates high-confidence from low-confidence results based on configurable threshold |\n| **Answer Verification** | Extracts claims from AI answers and verifies them against source documents |\n| **Search Analytics** | Dashboard with search history, latency trends, and usage statistics |\n| **Document Preview** | View full document content with chunk navigation |\n| **Collection Scoping** | Search across all documents or within specific collections |\n| **Retrieval Presets** | High Precision / Balanced / High Recall modes |\n| **Score Transparency** | View semantic, BM25, rerank, and final scores on results |\n| **Multiple Providers** | Support for OpenAI, Anthropic, Ollama (local), Jina, Cohere, and Voyage AI |\n| **Dark Mode** | Full theme support with system preference detection |\n\n## Screenshots\n\n### Semantic Search with AI Answer\n![Semantic Search Result](screenshots/1_Semantic-Search-Result.png)\n\n### Detailed Relevance Scores\n![Search Result Scores](screenshots/2_Semantic-Search-Result-Score.png)\n\n### LLM-as-Judge Evaluation\nRun evaluations to measure search quality with configurable judge models.\n\n| Run Evaluation | Evaluation Output |\n|----------------|-------------------|\n| ![Run Eval](screenshots/3_Semantic-Search-Result-Run-Eval.png) | ![Eval Output](screenshots/4_Semantic-Search-Result-Eval-Output.png) |\n\n### More Screenshots\n\n| Feature | Description |\n|---------|-------------|\n| [Evaluation Details](screenshots/5_Semantic-Search-Result-Eval-Details.png) | Detailed breakdown of evaluation metrics and scores |\n| [Collections](screenshots/6_Semantic-Search-Collections.png) | Organize documents into searchable collections |\n| [Documents](screenshots/7_Semantic-Search-Collections-Documents.png) | View and manage documents within collections |\n| [Analytics](screenshots/8_Semantic-Search-Analytics.png) | Track search history, latency trends, and query patterns |\n| [How It Works](screenshots/9_Semantic-Search-HowItWorks.png) | Interactive documentation explaining search technology |\n| [Settings](screenshots/10_Semantic-Search-Settings.png) | Configure providers, models, and search parameters |\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│                         FRONTEND                                │\n│                    Next.js 15 (App Router)                      │\n│              Shadcn/ui + Tailwind + TypeScript                  │\n└─────────────────────────┬───────────────────────────────────────┘\n                          │ HTTP/REST\n                          ▼\n┌─────────────────────────────────────────────────────────────────┐\n│                         BACKEND                                 │\n│                      FastAPI (Python)                           │\n│  ┌─────────────┬─────────────┬─────────────┬─────────────────┐  │\n│  │ Collections │  Documents  │   Search    │    Settings     │  │\n│  │   API       │    API      │    API      │      API        │  │\n│  └──────┬──────┴──────┬──────┴──────┬──────┴────────┬────────┘  │\n│         │             │             │               │           │\n│  ┌──────▼─────────────▼─────────────▼───────────────▼────────┐  │\n│  │                    CORE SERVICES                          │  │\n│  │  HybridSearchService │ Reranker │ VectorStore │ BM25Cache │  │\n│  └──────┬───────────────┴──────────┴─────────────┬───────────┘  │\n└─────────┼────────────────────────────────────────┼──────────────┘\n          │                                        │\n          ▼                                        ▼\n┌─────────────────────┐                 ┌─────────────────────────┐\n│     PostgreSQL      │                 │       ChromaDB          │\n│  (Metadata + Config)│                 │    (Vector Store)       │\n└─────────────────────┘                 └─────────────────────────┘\n```\n\n## Search Flow\n\n1. **Query Embedding** - Generate embedding via OpenAI `text-embedding-3-large`\n2. **Parallel Retrieval**:\n   - Semantic search via ChromaDB (cosine similarity)\n   - BM25 keyword search (in-memory, per-collection cache with auto-invalidation)\n3. **Reciprocal Rank Fusion (RRF)** - Merge results with configurable alpha\n4. **Reranking** - Jina cross-encoder (local) or Cohere API\n5. **Confidence Filtering** - Split results by `min_score_threshold` (default: 35%)\n6. **Response** - High-confidence results + hidden low-confidence results\n\n## Tech Stack\n\n### Backend\n- **FastAPI** - Python web framework with async support\n- **PostgreSQL** - Relational database (metadata, settings, search history)\n- **ChromaDB** - Vector database for semantic search\n- **OpenAI** - Embeddings (`text-embedding-3-large`)\n- **Jina/Cohere** - Cross-encoder reranking\n- **BM25** - Keyword search via `rank_bm25`\n\n### Frontend\n- **Next.js 15** - React framework with App Router\n- **TypeScript** - Type safety\n- **Tailwind CSS** - Utility-first styling\n- **Shadcn/ui** - Component library\n- **Lucide** - Icons\n\n## Prerequisites\n\n- Node.js 18+\n- Python 3.11+\n- Docker \u0026 Docker Compose\n\n\u003e **Detailed Setup Guide**: See [INFRASTRUCTURE.md](./docs/INFRASTRUCTURE.md) for comprehensive setup instructions including:\n\u003e - PostgreSQL \u0026 ChromaDB configuration\n\u003e - Local AI providers (Ollama, Jina reranker)\n\u003e - Cloud provider setup (OpenAI, Cohere, Voyage AI)\n\u003e - Troubleshooting guide\n\u003e\n\u003e **Quick Start**: See [SETUP.md](./docs/SETUP.md) for a concise, working local setup checklist.\n\u003e\n\u003e **API Reference**: See [API.md](./docs/API.md) for full endpoint documentation with examples.\n\n## Local AI with Ollama (Optional)\n\nRun the entire pipeline locally without API keys using Ollama:\n\n### Install Ollama\n\n```bash\n# macOS\nbrew install ollama\n\n# Linux\ncurl -fsSL https://ollama.com/install.sh | sh\n\n# Windows - Download from https://ollama.com/download\n```\n\n### Pull Required Models\n\n```bash\n# Embedding model (choose one)\nollama pull nomic-embed-text-v2-moe  # Latest MoE, strong retrieval (recommended)\nollama pull nomic-embed-text      # Fast, good quality\nollama pull mxbai-embed-large     # Higher quality, slower\n\n# LLM for answers \u0026 evaluation (choose one)\nollama pull llama3.2              # Fast, 3B params (recommended)\nollama pull llama3.1:8b           # Better quality, 8B params\nollama pull mistral               # Good balance\n```\n\n### Start Ollama Server\n\n```bash\nollama serve  # Runs on http://localhost:11434\n```\n\n### Configure in App\n\n1. Start the app and go to **Settings** (`/settings`)\n2. Configure Ollama models:\n   - **Embedding Model**: `ollama:nomic-embed-text-v2-moe:latest` (or `ollama:nomic-embed-text`)\n   - **Answer Provider**: `ollama` → Model: `llama3.2`\n   - **Eval Provider**: `ollama` → Model: `llama3.1:8b`\n\n\u003e **Note**: Ollama runs locally - no API keys required. First request may be slow as models load into memory.\n\n## Quick Start\n\n### 1. Clone and setup environment\n\n```bash\ngit clone https://github.com/shrimpy8/semantic-search-next.git\ncd semantic-search-next\n\n# Copy environment files\ncp backend/.env.example backend/.env\ncp frontend/.env.example frontend/.env.local\n# Edit with your API keys\n```\n\n### 2. Start Docker Services\n\n```bash\n# Start PostgreSQL + pgAdmin\ndocker-compose up -d\n\n# Start ChromaDB (separate container)\ndocker run -d --name chromadb -p 8000:8000 chromadb/chroma\n```\n\nServices started:\n- **PostgreSQL**: `localhost:5432`\n- **ChromaDB**: `localhost:8000`\n- **pgAdmin**: `http://localhost:3001` (login: `admin@local.dev` / `admin`)\n\n### 3. Backend Setup\n\n```bash\ncd backend\n\n# Create virtual environment\npython -m venv .venv\nsource .venv/bin/activate  # macOS/Linux\n# .venv\\Scripts\\activate   # Windows\n\n# Install dependencies\npip install -e \".[dev]\"\n\n# Run FastAPI server\nuvicorn app.main:app --reload --port 8080\n```\n\n- API: `http://localhost:8080`\n- Swagger docs: `http://localhost:8080/api/v1/docs`\n\n### 4. Frontend Setup\n\n```bash\ncd frontend\n\n# Install dependencies\nnpm install\n\n# Run development server\nnpm run dev\n```\n\nFrontend: `http://localhost:3000`\n\n## Environment Variables\n\nCopy `.env.example` files in `backend/` and `frontend/` directories. See `.env.example` for comprehensive documentation.\n\n### Backend (backend/.env)\n\n```env\n# Debug Mode\nDEBUG=false                          # Set true for verbose logging\n\n# OpenAI (required for default config)\nOPENAI_API_KEY=sk-...\nEMBEDDING_MODEL=text-embedding-3-large\nLLM_MODEL=gpt-4o-mini\n\n# Alternative Embedding Providers (optional)\nOLLAMA_BASE_URL=http://localhost:11434  # Local, no API key needed\nJINA_API_KEY=...                         # Free tier: 1M tokens/mo\nCOHERE_API_KEY=...                       # Also used for reranking\nVOYAGE_API_KEY=...                       # RAG optimized\n\n# Database\nPOSTGRES_HOST=localhost\nPOSTGRES_PORT=5432\nPOSTGRES_DB=semantic_search\nPOSTGRES_USER=postgres\nPOSTGRES_PASSWORD=postgres\n\n# ChromaDB\nCHROMA_HOST=localhost\nCHROMA_PORT=8000\n\n# Reranking\nRERANKER_PROVIDER=auto               # auto | jina | cohere\nUSE_RERANKING=true\n```\n\n### Frontend (frontend/.env.local)\n\n```env\nNEXT_PUBLIC_API_URL=http://localhost:8080/api/v1\nNEXT_PUBLIC_DEBUG=false              # Set true for console logging\n```\n\n## Project Structure\n\n```\nsemantic-search-next/\n├── docker-compose.yml           # PostgreSQL + ChromaDB + pgAdmin\n├── docs/\n│   ├── ARCHITECTURE.md          # Detailed system design\n│   ├── INFRASTRUCTURE.md        # Setup guide for all services\n│   ├── PROJECT_STATUS.md        # Implementation status \u0026 roadmap\n│   └── SETUP.md                 # Quick-start setup checklist\n├── backend/\n│   ├── .env.example             # Backend environment template\n│   ├── app/\n│   │   ├── main.py              # FastAPI entry\n│   │   ├── config.py            # Settings\n│   │   ├── api/v1/              # REST endpoints\n│   │   │   ├── collections.py   # Collection CRUD\n│   │   │   ├── documents.py     # Document upload/delete\n│   │   │   ├── search.py        # Search with AI answers\n│   │   │   ├── analytics.py     # Search analytics\n│   │   │   ├── settings.py      # App settings\n│   │   │   └── health.py        # Health check\n│   │   ├── core/                # Business logic\n│   │   │   ├── hybrid_retriever.py  # RRF fusion\n│   │   │   ├── reranker.py      # Jina/Cohere reranking\n│   │   │   ├── qa_chain.py      # RAG answer generation\n│   │   │   ├── answer_verifier.py   # Citation verification\n│   │   │   └── embeddings.py    # Multi-provider embeddings\n│   │   ├── prompts/             # Externalized LLM prompts\n│   │   │   ├── qa.yaml          # QA generation prompts\n│   │   │   └── verification.yaml    # Verification prompts\n│   │   ├── services/\n│   │   │   └── retrieval.py     # HybridSearchService + BM25 cache\n│   │   ├── db/\n│   │   │   └── models.py        # SQLAlchemy models\n│   │   └── api/\n│   │       └── schemas.py       # Pydantic schemas\n│   └── pyproject.toml\n├── frontend/\n│   ├── .env.example             # Frontend environment template\n│   ├── src/\n│   │   ├── app/                 # Next.js App Router\n│   │   │   ├── page.tsx         # Main search page\n│   │   │   ├── analytics/       # Search analytics dashboard\n│   │   │   ├── documents/[id]/  # Document preview\n│   │   │   ├── collections/     # Collection management\n│   │   │   └── settings/        # Settings page\n│   │   ├── components/\n│   │   │   ├── ui/              # Shadcn components\n│   │   │   ├── layout/          # Header, sidebar\n│   │   │   ├── search/          # Search components\n│   │   │   ├── analytics/       # Analytics charts\n│   │   │   └── documents/       # Document viewer\n│   │   ├── lib/\n│   │   │   ├── api/             # API client \u0026 types\n│   │   │   └── debug.ts         # Debug logging utility\n│   │   └── hooks/               # TanStack Query hooks\n│   ├── package.json\n│   └── tsconfig.json\n└── README.md\n```\n\n## API Endpoints\n\n### Collections\n```\nPOST   /api/v1/collections              Create collection\nGET    /api/v1/collections              List collections\nGET    /api/v1/collections/{id}         Get collection\nPATCH  /api/v1/collections/{id}         Update collection\nDELETE /api/v1/collections/{id}         Delete collection\n```\n\n### Documents\n```\nPOST   /api/v1/collections/{id}/documents   Upload document (invalidates BM25 cache)\nGET    /api/v1/collections/{id}/documents   List documents\nGET    /api/v1/documents/{id}               Get document\nDELETE /api/v1/documents/{id}               Delete document (invalidates BM25 cache)\n```\n\n### Search\n```\nPOST   /api/v1/search                   Execute search with optional AI answer\n```\n\n**Request:**\n```json\n{\n  \"query\": \"machine learning\",\n  \"preset\": \"balanced\",\n  \"top_k\": 10,\n  \"collection_id\": \"optional-uuid\",\n  \"generate_answer\": true\n}\n```\n\n**Response:**\n```json\n{\n  \"query\": \"machine learning\",\n  \"results\": [...],\n  \"low_confidence_results\": [...],\n  \"low_confidence_count\": 3,\n  \"min_score_threshold\": 0.35,\n  \"answer\": \"Machine learning is...\",\n  \"answer_verification\": {\n    \"confidence\": \"high\",\n    \"citations\": [...],\n    \"verified_claims\": 3,\n    \"total_claims\": 3,\n    \"coverage_percent\": 100\n  },\n  \"latency_ms\": 245,\n  \"retrieval_method\": \"balanced\"\n}\n```\n\n### Analytics\n```\nGET    /api/v1/analytics/searches       Search history (paginated)\nGET    /api/v1/analytics/stats          Aggregate statistics\nGET    /api/v1/analytics/trends         Time-series data\n```\n\n### Evaluations\n```\nPOST   /api/v1/evals/evaluate           Run LLM-as-Judge evaluation\nGET    /api/v1/evals/results            List evaluation results\nGET    /api/v1/evals/results/{id}       Get single evaluation\nGET    /api/v1/evals/stats              Aggregate evaluation stats\nGET    /api/v1/evals/providers          List available judge providers\n```\n\n### Settings\n```\nGET    /api/v1/settings                 Get current settings\nPATCH  /api/v1/settings                 Update settings\nPOST   /api/v1/settings/reset           Reset to defaults\n```\n\n**Key Settings:**\n| Setting | Type | Default | Description |\n|---------|------|---------|-------------|\n| `default_preset` | string | `balanced` | Retrieval preset |\n| `default_alpha` | float | 0.5 | Semantic vs BM25 weight |\n| `default_use_reranker` | bool | true | Enable reranking |\n| `default_top_k` | int | 10 | Results to return |\n| `min_score_threshold` | float | 0.30 | Low-confidence cutoff |\n| `default_generate_answer` | bool | false | Enable AI answer generation |\n| `default_context_window` | int | 1 | Chunks before/after for context |\n| `show_scores` | bool | true | Display score breakdown |\n\n### Health\n```\nGET    /api/v1/health                   Health check\n```\n\n## Search Result Scores\n\nEach result includes a `scores` object:\n\n```json\n{\n  \"scores\": {\n    \"semantic_score\": 0.85,    // Normalized 0-1 (cosine similarity)\n    \"bm25_score\": 0.72,        // Normalized 0-1 (keyword match)\n    \"rerank_score\": 0.92,      // Cross-encoder 0-1 (when enabled)\n    \"final_score\": 0.92,       // Used for ranking/filtering\n    \"relevance_percent\": 92    // Display value (0-100%)\n  }\n}\n```\n\n## Development\n\n### Backend\n\n```bash\ncd backend\nsource .venv/bin/activate\n\n# Lint\nruff check .\n\n# Format\nruff format .\n\n# Type check\nmypy app\n\n# Test\npytest\n```\n\n### Frontend\n\n```bash\ncd frontend\n\n# Lint\nnpm run lint\n\n# Format\nnpm run format\n\n# Build\nnpm run build\n```\n\n## Test Queries\n\n```bash\n# High-confidence query\ncurl -s -X POST \"http://localhost:8080/api/v1/search\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"query\": \"machine learning\", \"preset\": \"balanced\", \"top_k\": 5}'\n\n# Low-confidence query (unrelated to docs)\ncurl -s -X POST \"http://localhost:8080/api/v1/search\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"query\": \"quantum entanglement physics\", \"preset\": \"balanced\", \"top_k\": 10}'\n\n# Check settings\ncurl -s http://localhost:8080/api/v1/settings\n\n# Health check\ncurl -s http://localhost:8080/api/v1/health\n```\n\n## Retrieval Presets\n\n| Preset | Alpha | Use Reranker | Description |\n|--------|-------|--------------|-------------|\n| `high_precision` | 0.8 | true | Emphasizes semantic similarity, best for specific queries |\n| `balanced` | 0.5 | true | Equal weight to semantic and keyword, good default |\n| `high_recall` | 0.3 | true | Emphasizes keyword matching, better for exploratory search |\n\n## Security Hardening\n\nThe application includes layered security protections for the RAG pipeline:\n\n### Input Sanitization\nUser queries are sanitized before reaching embedding and LLM providers. High-confidence injection patterns (prompt delimiters like `[INST]`, instruction overrides, system extraction attempts) are stripped automatically. Includes NFKC unicode normalization to defeat homoglyph evasion (e.g., Cyrillic characters) and zero-width character stripping.\n\n### Injection Detection\nQueries and retrieved document chunks are scanned for prompt injection patterns using a weighted pattern detector. When suspicious patterns are detected, a warning banner is shown to the user without blocking results. This is observe-only — it does not modify content.\n\n### Strict Output Parsing\nAll LLM responses (AI answers, claim verification, evaluation judgments) are validated through Pydantic schemas with fallback extraction. This prevents malformed or manipulated LLM output from propagating through the system.\n\n### Trust Boundaries\nCollections can be marked as **Trusted** or **Unverified**. Trust status flows through to search results and AI answers:\n- Search result cards show trust indicators (shield icons with labels)\n- AI answers that use content from unverified sources display a warning banner\n- Trust status is editable in collection settings\n\n### Configuration\n\nThese features are controlled via environment variables in `backend/.env`:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `ENABLE_INJECTION_DETECTION` | `true` | Enable/disable injection pattern scanning |\n| `ENABLE_INPUT_SANITIZATION` | `true` | Enable/disable query sanitization |\n| `SANITIZATION_THRESHOLD` | `0.8` | Minimum pattern weight (0.0-1.0) to trigger stripping |\n\n## Known Considerations\n\n- **BM25 Cache**: Automatically invalidated when documents are uploaded/deleted\n- **Confidence Threshold**: Adjustable via Settings API (`min_score_threshold`)\n- **Reranking**: Falls back to Jina local model if Cohere unavailable\n\n## RAG Evaluations\n\nMeasure and improve your search quality with LLM-as-Judge evaluation. The system evaluates both retrieval quality (finding the right chunks) and answer quality (generating accurate responses).\n\n### Evaluation Metrics\n\n| Category | Metric | Description |\n|----------|--------|-------------|\n| **Retrieval** | Context Relevance | How relevant are the retrieved chunks? |\n| **Retrieval** | Context Precision | Are irrelevant chunks filtered out? |\n| **Retrieval** | Context Coverage | Is all needed information present? |\n| **Answer** | Faithfulness | Is the answer grounded in the chunks? |\n| **Answer** | Answer Relevance | Does it answer the question? |\n| **Answer** | Completeness | Is anything missing? |\n\n### Score Interpretation\n\n| Score Range | Quality | Action |\n|-------------|---------|--------|\n| \u003e 0.8 | Excellent | System working well |\n| 0.6 - 0.8 | Good | Minor improvements possible |\n| 0.4 - 0.6 | Moderate | Review retrieval/generation settings |\n| \u003c 0.4 | Poor | Significant tuning needed |\n\n### Judge Providers\n\nConfigure the evaluation LLM in Settings (`/settings`):\n\n- **OpenAI** - GPT-4o-mini (fast), GPT-4o (best quality)\n- **Anthropic** - Claude Sonnet 4, Claude Opus 4\n- **Ollama** - Llama 3.2, Llama 3.1 (local, free)\n\n### Learn More\n\nVisit `/learn-evals` in the app for an interactive guide explaining evaluation concepts, when to use them, and how to act on results.\n\n## License\n\nMIT License\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshrimpy8%2Fsemantic-search-next","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshrimpy8%2Fsemantic-search-next","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshrimpy8%2Fsemantic-search-next/lists"}