{"id":31881217,"url":"https://github.com/vezlo/ai-validator","last_synced_at":"2026-01-20T16:27:59.005Z","repository":{"id":317696290,"uuid":"1068503476","full_name":"vezlo/ai-validator","owner":"vezlo","description":"AI Response Validator - Automated accuracy checking, hallucination prevention, and confidence scoring for AI responses","archived":false,"fork":false,"pushed_at":"2025-12-30T06:47:03.000Z","size":62,"stargazers_count":0,"open_issues_count":3,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-02T17:59:25.798Z","etag":null,"topics":["accuracy","ai","ai-response","claude","confidence","hallucination","knowlege-base","llm","openai","rag","response-validation","validation","validator"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vezlo.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-02T13:37:36.000Z","updated_at":"2025-12-30T06:47:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"231b6d7e-b936-4fe2-9f1a-8a90a2744837","html_url":"https://github.com/vezlo/ai-validator","commit_stats":null,"previous_names":["vezlo/ai-validator"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vezlo/ai-validator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vezlo%2Fai-validator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vezlo%2Fai-validator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vezlo%2Fai-validator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vezlo%2Fai-validator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vezlo","download_url":"https://codeload.github.com/vezlo/ai-validator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vezlo%2Fai-validator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28607087,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T16:10:39.856Z","status":"ssl_error","status_checked_at":"2026-01-20T16:10:39.493Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accuracy","ai","ai-response","claude","confidence","hallucination","knowlege-base","llm","openai","rag","response-validation","validation","validator"],"created_at":"2025-10-13T01:58:10.745Z","updated_at":"2026-01-20T16:27:58.905Z","avatar_url":"https://github.com/vezlo.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Validator\n\n[![npm version](https://img.shields.io/npm/v/@vezlo/ai-validator.svg)](https://www.npmjs.com/package/@vezlo/ai-validator)\n[![License: AGPL-3.0](https://img.shields.io/badge/License-AGPL%203.0-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)\n\n**AI Response Validator** - Automated accuracy checking, hallucination prevention, and confidence scoring for AI responses.\n\n## 🎯 Purpose\n\nAI Validator helps you ensure the quality and reliability of AI-generated responses by:\n\n- ✅ **Automated Accuracy Checking** - Verify AI responses against source documents\n- ✅ **Hallucination Prevention** - Detect when AI invents information not in sources\n- ✅ **Confidence Scoring** - Get reliability scores for every response\n- ✅ **Query Classification** - Skip validation for greetings, typos, and small talk\n- ✅ **Multi-LLM Support** - Works with OpenAI and Claude\n\nPerfect for RAG systems, knowledge bases, and any application where AI response quality matters.\n\n## 🚀 Quick Start\n\n### Installation\n\n```bash\nnpm install @vezlo/ai-validator\n```\n\nOr install globally for CLI access:\n\n```bash\nnpm install -g @vezlo/ai-validator\n```\n\n### For Local Development/Testing\n\n```bash\n# Clone the repository\ngit clone https://github.com/vezlo/ai-validator.git\ncd ai-validator\n\n# Install dependencies\nnpm install\n\n# Build the project\nnpm run build\n\n# Run the test CLI\nnpm test\n```\n\n## 💻 Usage\n\n### 1. CLI Testing (Interactive)\n\nTest the validator interactively without writing code:\n\n```bash\n# Using npx (no installation required)\nnpx vezlo-validator-test\n\n# Or if installed globally\nvezlo-validator-test\n```\n\nThe CLI will guide you through:\n- Selecting LLM provider (OpenAI or Claude)\n- Entering API keys\n- Choosing models (any OpenAI or Claude model)\n- Configuring validation settings\n- Testing with your own queries and responses\n- Easy text input for sources (no JSON required)\n\n### 2. Code Usage (Programmatic)\n\n#### Basic Example\n\n```typescript\nimport { AIValidator } from '@vezlo/ai-validator';\n\n// Initialize with your API key and provider\nconst validator = new AIValidator({\n  openaiApiKey: 'sk-your-openai-key',  // Your OpenAI API key\n  llmProvider: 'openai'                 // 'openai' or 'claude'\n});\n\n// Validate a response\nconst validation = await validator.validate({\n  query: \"What is machine learning?\",\n  response: \"Machine learning is a subset of AI that focuses on algorithms.\",\n  sources: [\n    {\n      content: \"Machine learning is a subset of artificial intelligence that focuses on algorithms and statistical models.\",\n      title: \"ML Guide\",\n      url: \"https://example.com/ml-guide\"\n    }\n  ]\n});\n\n// Check results\nconsole.log(`Confidence: ${(validation.confidence * 100).toFixed(1)}%`);\nconsole.log(`Valid: ${validation.valid}`);\nconsole.log(`Accuracy: ${validation.accuracy.verified ? 'Verified' : 'Not verified'}`);\nconsole.log(`Hallucination Risk: ${(validation.hallucination.risk * 100).toFixed(1)}%`);\nconsole.log(`Warnings: ${validation.warnings.join(', ')}`);\n```\n\n#### Advanced Configuration\n\n```typescript\nimport { AIValidator } from '@vezlo/ai-validator';\n\nconst validator = new AIValidator({\n  // API Keys (at least one required)\n  openaiApiKey: 'sk-your-openai-key',\n  claudeApiKey: 'sk-ant-your-claude-key',\n  \n  // LLM Provider (required)\n  llmProvider: 'openai', // 'openai' or 'claude'\n  \n  // Model Selection (optional - you can specify any model from the provider)\n  openaiModel: 'gpt-4o',  // Any OpenAI model: gpt-4o, gpt-4o-mini, gpt-4, etc.\n  claudeModel: 'claude-sonnet-4-5-20250929',  // Any Claude model\n  \n  // Validation Settings (optional)\n  confidenceThreshold: 0.7,           // 0.0 - 1.0 (default: 0.7)\n  enableQueryClassification: true,     // Skip validation for greetings/typos\n  enableAccuracyCheck: true,          // LLM-based accuracy checking\n  enableHallucinationDetection: true  // LLM-based hallucination detection\n});\n```\n\n### Integration with RAG Systems\n\n```typescript\n// Example with a RAG system\nconst ragResponse = await yourRAGSystem.query(userQuestion);\nconst sources = await yourRAGSystem.getSources(userQuestion);\n\nconst validation = await validator.validate({\n  query: userQuestion,\n  response: ragResponse.content,\n  sources: sources.map(s =\u003e ({\n    content: s.text,\n    title: s.title,\n    url: s.url\n  }))\n});\n\nif (validation.valid) {\n  // Show response to user\n  return ragResponse.content;\n} else {\n  // Handle low confidence response\n  console.warn('Low confidence response:', validation.warnings);\n  return \"I'm not confident about this answer. Please consult additional sources.\";\n}\n```\n\n## 📊 Validation Results\n\n```typescript\ninterface ValidationResult {\n  confidence: number;        // 0.0 - 1.0\n  valid: boolean;            // true if confidence \u003e= threshold\n  accuracy: {\n    verified: boolean;\n    verification_rate: number;\n    reason?: string;\n  };\n  context: {\n    source_relevance: number;\n    source_usage_rate: number;\n    valid: boolean;\n  };\n  hallucination: {\n    detected: boolean;\n    risk: number;\n    hallucinated_parts?: string[];\n  };\n  warnings: string[];\n  query_type?: string;       // 'greeting', 'question', etc.\n  skip_validation?: boolean; // true for greetings/typos\n}\n```\n\n## 🔧 Configuration\n\n### Configuration Options\n\nAll configuration is done in code when initializing the validator:\n\n```typescript\ninterface AIValidatorConfig {\n  // API Keys (at least one required)\n  openaiApiKey?: string;      // Your OpenAI API key\n  claudeApiKey?: string;       // Your Claude API key\n  \n  // Provider (required)\n  llmProvider: 'openai' | 'claude';\n  \n  // Models (optional - specify any valid model from the chosen provider)\n  openaiModel?: string;        // Default: 'gpt-4o'\n  claudeModel?: string;        // Default: 'claude-sonnet-4-5-20250929'\n  \n  // Validation Settings (optional)\n  confidenceThreshold?: number;           // Default: 0.7\n  enableQueryClassification?: boolean;    // Default: true\n  enableAccuracyCheck?: boolean;         // Default: true\n  enableHallucinationDetection?: boolean; // Default: true\n}\n```\n\n### Model Support\n\n**OpenAI Models:**\nYou can use any OpenAI chat model by specifying it in `openaiModel`. Common choices include:\n- `gpt-4o` (default, recommended)\n- `gpt-4o-mini` (faster, cheaper)\n- `gpt-4` (previous flagship)\n- `gpt-4-turbo`\n- Or any other OpenAI chat completion model\n\n**Claude Models:**\nYou can use any Claude model by specifying it in `claudeModel`. Common choices include:\n- `claude-sonnet-4-5-20250929` (default, Claude 4.5 Sonnet)\n- `claude-opus-4-1-20250805` (Claude 4.1 Opus)\n- `claude-3-7-sonnet-20250219` (Claude 3.7 Sonnet)\n- Or any other Claude model identifier\n\nThe validator will work with any model supported by the respective provider's API.\n\n### CLI Commands\n\n```bash\n# Interactive testing CLI\nnpx vezlo-validator-test\n\n# Development commands\nnpm run build   # Build the project\nnpm run clean   # Clean build files\nnpm test        # Run the test CLI\n```\n\n## 🎯 Use Cases\n\n### 1. RAG Systems\nValidate responses against retrieved documents to ensure accuracy.\n\n### 2. Customer Support Bots\nPrevent incorrect information from reaching customers.\n\n### 3. Knowledge Base Applications\nEnsure AI answers are grounded in your documentation.\n\n### 4. Content Generation\nValidate AI-generated content against source materials.\n\n### 5. Educational Applications\nEnsure AI tutoring responses are accurate and helpful.\n\n## ⚡ Performance\n\n- **Validation Time**: 2-5 seconds per response (depending on LLM provider)\n- **Cost**: Additional LLM API calls for validation\n- **Accuracy**: High accuracy for responses with good sources\n- **Reliability**: Graceful handling of edge cases\n\n## 🔍 How It Works\n\n1. **Query Classification** - Identifies greetings, typos, and small talk (skips validation)\n2. **Accuracy Checking** - Uses LLM to verify facts against source documents\n3. **Hallucination Detection** - Identifies information not present in sources\n4. **Context Validation** - Ensures response relevance to the query\n5. **Confidence Scoring** - Combines all metrics into a single score\n\n## 📝 Examples\n\n### High Confidence Response\n```typescript\n{\n  confidence: 0.92,\n  valid: true,\n  accuracy: { verified: true, verification_rate: 0.95 },\n  hallucination: { detected: false, risk: 0.05 },\n  warnings: []\n}\n```\n\n### Low Confidence Response\n```typescript\n{\n  confidence: 0.35,\n  valid: false,\n  accuracy: { verified: false, verification_rate: 0.2 },\n  hallucination: { detected: true, risk: 0.8 },\n  warnings: [\"No sources provided - high hallucination risk\"]\n}\n```\n\n### Skipped Validation (Greeting)\n```typescript\n{\n  confidence: 1.0,\n  valid: true,\n  query_type: \"greeting\",\n  skip_validation: true,\n  warnings: []\n}\n```\n\n## 🤝 Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## 📄 License\n\nThis project is dual-licensed:\n\n- **Non-Commercial Use**: Free under AGPL-3.0 license\n- **Commercial Use**: Requires a commercial license - contact us for details\n\nSee the [LICENSE](LICENSE) file for complete AGPL-3.0 license terms.\n\n## 🆘 Support\n\n- **Issues**: [GitHub Issues](https://github.com/vezlo/ai-validator/issues)\n- **Documentation**: [GitHub Wiki](https://github.com/vezlo/ai-validator/wiki)\n- **Discussions**: [GitHub Discussions](https://github.com/vezlo/ai-validator/discussions)\n\n## 🔗 Related Projects\n\n- [@vezlo/assistant-server](https://www.npmjs.com/package/@vezlo/assistant-server) - AI Assistant Server with RAG capabilities\n- [@vezlo/src-to-kb](https://www.npmjs.com/package/@vezlo/src-to-kb) - Convert source code to knowledge base\n\n---\n\n**Status**: ✅ Production Ready | **Version**: 1.0.2 | **License**: AGPL-3.0 | **Node.js**: 20+\n\n**Made with ❤️ by [Vezlo](https://vezlo.org)**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvezlo%2Fai-validator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvezlo%2Fai-validator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvezlo%2Fai-validator/lists"}