{"id":46253204,"url":"https://github.com/benaah/amaniquery","last_synced_at":"2026-03-03T23:01:18.087Z","repository":{"id":323680297,"uuid":"1094173743","full_name":"Benaah/amaniquery","owner":"Benaah","description":"A Retrieval-Augmented Generation (RAG) system for Kenyan legal, parliamentary, and news intelligence. NIRU: Neural Intelligence Retrieval Unit","archived":false,"fork":false,"pushed_at":"2026-01-20T23:10:49.000Z","size":126858,"stargazers_count":1,"open_issues_count":22,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-01-21T09:30:32.931Z","etag":null,"topics":["ai-agents","langchain","langgraph-agents","rag-chatbot","rag-pipeline"],"latest_commit_sha":null,"homepage":"https://amaniquery.vercel.app","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Benaah.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":".github/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":null}},"created_at":"2025-11-11T11:02:58.000Z","updated_at":"2026-01-20T23:09:19.000Z","dependencies_parsed_at":"2025-11-27T14:07:58.257Z","dependency_job_id":null,"html_url":"https://github.com/Benaah/amaniquery","commit_stats":null,"previous_names":["benaah/amaniquery"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/Benaah/amaniquery","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Benaah%2Famaniquery","tags_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/Benaah%2Famaniquery/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Benaah%2Famaniquery/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Benaah%2Famaniquery/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Benaah","download_url":"https://codeload.github.com/Benaah/amaniquery/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Benaah%2Famaniquery/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30064780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T18:21:05.932Z","status":"ssl_error","status_checked_at":"2026-03-03T18:20:59.341Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","langchain","langgraph-agents","rag-chatbot","rag-pipeline"],"created_at":"2026-03-03T23:01:17.070Z","updated_at":"2026-03-03T23:01:18.079Z","avatar_url":"https://github.com/Benaah.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AmaniQuery 🇰🇪\n\n![AmaniQuery](imgs/readme.png)\n\nA Retrieval-Augmented Generation (RAG) system for Kenyan legal, parliamentary, and news intelligence with **four unique \"wow\" features**: Constitutional Alignment 
Analysis, Public Sentiment Gauge, InfoSMS Gateway, and Parliament Video Indexer.\n\n## 🌟 Unique Features\n\n### 1. 📊 Public Sentiment Gauge\n\n**Track public sentiment on trending topics from news coverage**\n\n- Sentiment analysis on all news articles (positive/negative/neutral)\n- Real-time aggregation by topic with percentage breakdowns\n- Visual sentiment distribution for policies, bills, and events\n- Example: \"Finance Bill: 70% negative, 20% neutral, 10% positive\"\n\n```bash\nGET /sentiment?topic=Finance%20Bill\u0026days=30\n```\n\n### 2. 📱 InfoSMS Gateway (Kabambe Accessibility)\n\n**SMS-based queries for feature phone users**\n\n- 160-character intelligent responses via SMS\n- English and Swahili language support\n- Africa's Talking integration for Kenya\n- Automatic query type detection (legal/parliament/news)\n- Works on feature phones without internet\n\n```bash\nUser SMS: \"Finance Bill\"\nAmaniQuery: \"Finance Bill 2025 raises revenue through digital service tax...\"\n```\n\n### 3. 🎥 Parliament Video Indexer\n\n**Searchable YouTube transcripts with timestamp citations**\n\n- Automatic transcript extraction from Parliament YouTube channels\n- Timestamp-based citations (jump to exact moment)\n- 60-second chunks with contextual overlap\n- Vector search for semantic matching\n- Direct YouTube links with `\u0026t=XXs` parameters\n\n```bash\nQuery: \"budget allocation for education\"\nResponse: \"At 15:42 in the Finance Committee session...\"\nLink: https://youtube.com/watch?v=abc123\u0026t=942s\n```\n\n### 4. ⚖️ Constitutional Alignment Analysis\n\n**Compare Bills and Acts against the Constitution**\n\n- Dual-retrieval RAG system (Bill + Constitution chunks separately)\n- Granular legal metadata extraction (articles, clauses)\n- Structured comparative analysis with citations\n- Quick-check endpoint for specific constitutional topics\n\n## 🏛️ Architecture\n\nAmaniQuery is built as an 8-module pipeline:\n\n1. 
**[NiruSpider](Module1_NiruSpider/README.md)** - Web crawler for data ingestion\n2. **[NiruParser](Module2_NiruParser/README.md)** - ETL pipeline with embedding generation\n3. **[NiruDB](Module3_NiruDB/README.md)** - Vector database with metadata storage\n4. **[NiruAPI](Module4_NiruAPI/README.md)** - RAG-powered query interface with multi-model support\n5. **[NiruShare](Module5_NiruShare/README.md)** - Social media sharing service\n6. **[NiruVoice](Module6_NiruVoice/README.md)** - Voice agent for real-time conversations\n7. **[NiruHybrid](Module7_NiruHybrid/README.md)** - Enhanced RAG with hybrid encoder and adaptive retrieval\n8. **[NiruAuth](Module8_NiruAuth/README.md)** - Authentication and authorization system for users and third-party integrations\n\n## 📸 Screenshots\n\n### Homepage\n\n![Homepage](imgs/homepage.png)\n\n### Chat Interface\n\n![Chat Interface](imgs/chat_1.png)\n\n### Voice Agent\n\n![Voice Agent 1](imgs/voice_1.png)\n![Voice Agent 2](imgs/voice_2.png)\n\n### Admin Dashboard\n\n![Admin Dashboard 1](imgs/admin_1.png)\n![Admin Dashboard 2](imgs/admin_2.png)\n![Admin Dashboard 3](imgs/admin_3.png)\n\n### AI Integration\n\n![AI Integration](imgs/ai_integration.png)\n\n## 📚 Documentation\n\n**Comprehensive documentation is available:**\n\n- **[📖 Documentation Index](./docs/DOCUMENTATION_INDEX.md)** - Central navigation hub with organized paths for different user roles\n- **[🚀 Quick Start Guide](./QUICKSTART.md)** - Detailed installation instructions\n- **[🏗️ Architecture Docs](./docs)** - System design and module documentation\n\n### Documentation by Role\n\n| Role | Start With | Key Docs |\n|------|------------|----------|\n| **End Users** | [Main Features](#-unique-features) | [InfoSMS Guide](./docs/SHARING_GUIDE.md) |\n| **Developers** | [📖 Documentation Index](./docs/DOCUMENTATION_INDEX.md) | [Module READMEs](./Module1_NiruSpider/README.md) |\n| **DevOps** | [Deployment Guide](./docs/DEPLOYMENT_GUIDE.md) | [Docker/K8s Docs](./docs) |\n| 
**Contributors** | [Contributing Guide](./CONTRIBUTING.md) | [Architecture Docs](./docs/AMANIQ_V2_ARCHITECTURE.md) |\n\n## 🚀 Quick Start\n\nSee the [Quick Start Guide](./QUICKSTART.md) for detailed installation instructions.\n\n**tl;dr:**\n\n```bash\n# 1. Set up environment\npython setup.py\n\n# 2. Run API (includes all modules)\npython start_api.py\n```\n\nFor detailed module-specific instructions, see [📖 Documentation Index](./docs/DOCUMENTATION_INDEX.md).\n\n- Authentication system (Module 8, if `ENABLE_AUTH=true`)\n- All API endpoints\n\n### 4. Initialize Authentication (Optional)\n\nIf you want to enable authentication:\n\n```bash\n# Run database migration for auth tables\npython migrate_auth_db.py\n\n# Set environment variable (exported so the API process can read it)\nexport ENABLE_AUTH=true\n```\n\nSee [NiruAuth README](Module8_NiruAuth/README.md) for detailed setup instructions.\n\n### 5. Query and Share\n\n```python\nimport requests\n\n# Standard query\nresponse = requests.post(\"http://localhost:8000/query\", json={\n    \"query\": \"What does the Constitution say about freedom of expression?\"\n})\nresult = response.json()\n\n# Streaming query (real-time token-by-token)\nresponse = requests.post(\"http://localhost:8000/query/stream\", json={\n    \"query\": \"What does the Constitution say about freedom of expression?\",\n    \"top_k\": 5,\n    \"include_sources\": True\n}, stream=True)\n\nfor line in response.iter_lines():\n    if line:\n        print(line.decode('utf-8'))\n\n# Hybrid RAG query (enhanced retrieval)\nresponse = requests.post(\"http://localhost:8000/query/hybrid\", json={\n    \"query\": \"What does the Constitution say about freedom of expression?\",\n    \"top_k\": 5,\n    \"use_hybrid\": True\n})\nresult = response.json()\n\n# Share to Twitter\nshare = requests.post(\"http://localhost:8000/share/format\", json={\n    \"answer\": result[\"answer\"],\n    \"sources\": result[\"sources\"],\n    \"platform\": \"twitter\",\n    \"query\": \"Constitutional 
rights\"\n})\nprint(share.json()[\"content\"])\n```\n\n## 🎯 Data Sources\n\n### Kenyan Laws \u0026 Constitution\n\n- **Source**: \u003chttps://new.kenyalaw.org/\u003e\n- **Strategy**: Comprehensive crawl + periodic updates\n- **Content**:\n  - Constitution of Kenya 2010 (article-level)\n  - Acts of Parliament (500+ acts, section-level)\n  - Bills (all types)\n  - Subsidiary \u0026 County Legislation\n  - Case Law \u0026 Judgments (300k+ decisions, all courts)\n  - Kenya Gazette (8,000+ gazettes, 1899-2025)\n  - Treaties \u0026 International Agreements\n  - Legal Publications \u0026 Journals\n  - Daily Cause Lists\n\n### Parliament\n\n- **Source**: \u003chttps://www.parliament.go.ke/\u003e\n- **Strategy**: Weekly crawl\n- **Content**: Hansards, Bills, Publications\n\n### Kenyan News (High-Frequency)\n\n- **Sources**:\n  - nation.africa/rss\n  - standardmedia.co.ke/rss\n  - the-star.co.ke/rss\n  - businessdailyafrica.com/rss\n- **Strategy**: Daily RSS feed parsing\n\n### Global News \u0026 International Affairs\n\n- **Sources**:\n  - Geopolitics: Reuters, BBC, Al Jazeera, Foreign Policy\n  - International Organizations: UN, WHO, World Bank, IMF, African Union\n  - Technology: Reuters Tech, TechCrunch, MIT Tech Review\n  - Policy: The Economist, Brookings, CFR\n  - Climate \u0026 Development: UN Climate, UNDP\n- **Strategy**: Daily RSS feed parsing\n- **Focus**: Africa-relevant global news, international policy, institutional announcements\n\n## 🚀 Features\n\n### Core Features\n\n- ✅ Automated web crawling from Kenyan sources\n- ✅ Intelligent text processing \u0026 chunking\n- ✅ Vector embeddings for semantic search\n- ✅ RAG-powered Q\u0026A with multi-model support (OpenAI, Moonshot, Anthropic, Gemini)\n- ✅ **Real-time streaming responses** - Token-by-token generation for faster perceived speed\n- ✅ **Multi-model ensemble** - When context is limited, queries all available models and combines responses for accuracy\n- ✅ **Hybrid RAG Pipeline** - Enhanced retrieval 
with hybrid encoder and adaptive retrieval\n- ✅ Source citation \u0026 verification\n- ✅ REST API with interactive documentation\n\n### Unique Differentiators\n\n- ✅ **Public Sentiment Gauge** - Track news sentiment by topic\n- ✅ **InfoSMS Gateway** - SMS queries via Africa's Talking (kabambe accessibility)\n- ✅ **Parliament Video Indexer** - Searchable YouTube transcripts with timestamps\n- ✅ **Constitutional Alignment Analysis** - Dual-retrieval Bill-Constitution comparison\n- ✅ **Vision RAG** - Multimodal RAG with Cohere Embed-4 and Gemini 2.5 Flash for image/PDF analysis\n- ✅ **Social media sharing** - Intelligent formatting for Twitter/X, LinkedIn, Facebook\n- ✅ **Chat interface** - Modern, responsive UI with copy/edit/resend for failed queries\n- ✅ **Voice agent** - Real-time voice conversations via VibeVoice\n- ✅ **Authentication \u0026 Authorization** - User accounts, API keys, OAuth 2.0, RBAC, rate limiting, usage tracking\n\n## 🧠 RAG Pipeline\n\n### Standard RAG\n\n1. **Chunking**: 500-1000 characters with 100-char overlap\n2. **Embedding Model**: all-MiniLM-L6-v2\n3. **Vector DB**: ChromaDB / FAISS / Upstash / Qdrant\n4. 
**LLM**: Moonshot AI (default), OpenAI, Anthropic, Gemini\n\n### Enhanced Features\n\n#### Multi-Model Ensemble\n\nWhen context is limited or unavailable in vector storage, AmaniQuery automatically:\n\n- Queries all available models (OpenAI, Moonshot, Anthropic, Gemini) in parallel\n- Combines responses intelligently to remove redundancy\n- Streams the synthesized response for better accuracy\n\n#### Hybrid RAG (Module 7)\n\n- **Hybrid Encoder**: Combines convolutional and transformer architectures for enhanced embeddings\n- **Adaptive Retrieval**: Multi-stage retrieval with context-aware thresholds\n- **Streaming Support**: Optimized for real-time token-by-token responses\n- **Improved Response Format**: Concise, scannable responses with clear structure\n\n#### Response Formatting\n\n- **Concise structure**: Summary → Key Points → Important Details\n- **Better readability**: Proper spacing, bullet points, limited section length\n- **No redundant disclaimers**: Only cites sources when directly used\n\n## 📊 Feature Details\n\n### Public Sentiment Gauge\n\nAnalyze news sentiment on any topic:\n\n```python\n# Get sentiment breakdown\nGET /sentiment?topic=Finance%20Bill\u0026days=30\n\n# Response\n{\n  \"sentiment_percentages\": {\n    \"positive\": 15.0,\n    \"negative\": 70.0,\n    \"neutral\": 15.0\n  },\n  \"average_polarity\": -0.35,\n  \"total_articles\": 20\n}\n```\n\n**Use Cases:**\n\n- Track public reaction to legislation\n- Monitor news tone on policies\n- Identify controversial topics\n- Compare Kenyan vs Global coverage sentiment\n\n### InfoSMS Gateway\n\nQuery AmaniQuery via SMS (no internet needed):\n\n```python\n# Webhook for incoming SMS\nPOST /sms-webhook\n\n# Preview SMS response (testing)\nGET /sms-query?query=Finance%20Bill\u0026language=en\n\n# Manual SMS send (the leading + of the phone number is URL-encoded as %2B)\nPOST /sms-send?phone_number=%2B254712345678\u0026message=...\n```\n\n**Setup:**\n\n1. Sign up at \u003chttps://africastalking.com\u003e\n2. 
Set environment variables: `AT_USERNAME`, `AT_API_KEY`\n3. Configure webhook URL in Africa's Talking dashboard\n4. Users send SMS to your shortcode\n\n**Features:**\n\n- 160-character concise responses\n- English and Swahili support\n- ~KES 0.80 per SMS in Kenya\n- Feature phone accessibility (kabambe)\n\n### Parliament Video Indexer\n\nSearch Parliament YouTube videos with timestamp citations:\n\n```python\n# Search videos\nPOST /query\n{\n  \"query\": \"budget allocation for education\",\n  \"category\": \"Parliamentary Record\"\n}\n\n# Response includes timestamp URLs\n{\n  \"sources\": [{\n    \"title\": \"Finance Committee Session\",\n    \"timestamp_url\": \"https://youtube.com/watch?v=abc\u0026t=942s\",\n    \"timestamp_formatted\": \"15:42\",\n    \"excerpt\": \"Budget allocation discussion...\"\n  }]\n}\n```\n\n**How it works:**\n\n1. Spider scrapes Parliament YouTube channels\n2. youtube-transcript-api extracts transcripts with timestamps\n3. 60-second chunks with 10-second overlap\n4. Each chunk indexed with `start_time_seconds`\n5. Citations include YouTube links with `\u0026t=XXs` parameter\n\n## 🏛️ Constitutional Alignment Module\n\nAmaniQuery's **core legal feature**: Dual-retrieval RAG for constitutional compliance analysis.\n\n**How it works:**\n\n1. Analyzes query to identify Bill and constitutional concepts\n2. Retrieves Bill chunks (filtered by `category='Bill'`)\n3. Retrieves Constitution chunks (filtered by `category='Constitution'`)\n4. Generates structured comparative analysis with citations\n\n**Example:**\n\n```python\nresponse = requests.post(\"http://localhost:8000/alignment-check\", json={\n    \"query\": \"How does the Finance Bill housing levy align with the constitution?\"\n})\n\n# Returns structured analysis:\n# 1. The Bill's Proposal (with citations)\n# 2. Relevant Constitutional Provisions\n# 3. Alignment Analysis (objective comparison)\n# 4. 
Key Considerations\n```\n\n**API Endpoints:**\n\n- `POST /alignment-check` - Full constitutional alignment analysis\n- `POST /alignment-quick-check` - Quick bill vs concept check\n\nSee [Constitutional Alignment Guide](docs/CONSTITUTIONAL_ALIGNMENT.md) for details.\n\n## Documentation\n\n- `GET /docs` - Interactive API documentation (Swagger UI)\n- `GET /redoc` - Alternative documentation (ReDoc)\n\n## 📱 Social Media Sharing\n\nModule 5 provides intelligent formatting for:\n\n- **Twitter/X**: Auto-threading for long content (280 char limit)\n- **LinkedIn**: Professional posts with key takeaways (3000 char)\n- **Facebook**: Engaging posts with call-to-action\n\nSee [Sharing Guide](docs/SHARING_GUIDE.md) for details.\n\n## 📊 Metadata Structure\n\nEach chunk stores:\n\n- `source_url`: Original article/document URL\n- `title`: Document title\n- `publication_date`: ISO format date\n- `category`: [\"Kenyan Law\", \"Parliament\", \"Kenyan News\", \"Global Trend\"]\n- `chunk_id`: Unique identifier (e.g., article-xyz_chunk_3)\n- `author`: When available\n- `summary`: Auto-generated snippet\n\n## 🔧 Configuration\n\nEdit `config/sources.yaml` to:\n\n- Add/remove data sources\n- Adjust crawl schedules\n- Configure chunk sizes\n- Set embedding parameters\n\n## 📅 Automated Scheduling\n\nUse Windows Task Scheduler or cron (Linux):\n\n```bash\n# Daily news crawl at 6 AM\n# Weekly parliament crawl on Mondays\n# Monthly law database update\n```\n\nSee `scripts/scheduler_setup.md` for details.\n\n## 🛡️ Ethical Crawling\n\n- Respects `robots.txt`\n- 2-3 second delays between requests\n- User-agent identification\n- Rate limiting on RSS feeds\n\n## 📚 Documentation\n\n- [Quick Start Guide](QUICKSTART.md) - Step-by-step setup\n- [Constitutional Alignment](docs/CONSTITUTIONAL_ALIGNMENT.md) - **Core feature guide**\n- [Moonshot AI Setup](docs/MOONSHOT_SETUP.md) - LLM configuration\n- [Social Media Sharing](docs/SHARING_GUIDE.md) - Sharing guide\n- [Authentication 
System](Module8_NiruAuth/README.md) - **Auth module guide**\n- [Email Setup](Module8_NiruAuth/README_EMAIL_SETUP.md) - Gmail SMTP configuration\n- [API Documentation](http://localhost:8000/docs) - Interactive docs\n\n## 💡 Use Cases\n\n- 📚 Legal research \u0026 constitutional queries\n- ⚖️ **Constitutional alignment analysis** (Bills vs Constitution)\n- 🏛️ Parliamentary proceedings analysis\n- 📰 News aggregation \u0026 summarization\n- 🌍 Policy \u0026 global trend tracking\n- 📱 Social media content creation\n- 🎓 Educational resource for Kenyan civics\n- 💼 Legislative due diligence\n- 💬 **Real-time chat interface** - Interactive Q\u0026A with streaming responses\n- 🎤 **Voice queries** - Ask questions via voice (VibeVoice integration)\n- 🔄 **Multi-model accuracy** - Enhanced responses when context is limited\n- 📊 **Hybrid retrieval** - Improved accuracy with adaptive retrieval\n- 🔐 **Third-party integrations** - API keys and OAuth 2.0 for external applications\n- 📊 **Usage analytics** - Track API usage and costs for integrations\n\n## 📝 License\n\nApache License 2.0 - See LICENSE file\n\n## 🤝 Contributing\n\nAll contributions are welcome! Refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for details.\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenaah%2Famaniquery","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenaah%2Famaniquery","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenaah%2Famaniquery/lists"}