{"id":33373630,"url":"https://github.com/siri1404/llm-infrastructure","last_synced_at":"2025-11-22T23:01:09.260Z","repository":{"id":325356029,"uuid":"1100161864","full_name":"siri1404/llm-infrastructure","owner":"siri1404","description":"End-to-end platform for serving Large Language Models with streaming capabilities, privacy-preserving audit trails, drift monitoring, and compliance support for SEC, MiFID II, FINRA, and GDPR standards.","archived":false,"fork":false,"pushed_at":"2025-11-20T23:09:17.000Z","size":84,"stargazers_count":1,"open_issues_count":23,"forks_count":1,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-11-20T23:31:09.732Z","etag":null,"topics":["ai","audit","compliance","inference","kafka","llm","mlops","python","serving"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/siri1404.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":"AUDIT_TRAIL_SETUP.md","citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-19T23:19:07.000Z","updated_at":"2025-11-20T23:09:21.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/siri1404/llm-infrastructure","commit_stats":null,"previous_names":["siri1404/llm-infrastructure"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/siri1404/llm-infrastructure","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siri1404%2Fllm-infrastructure","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siri1404%2Fllm-infrastructure/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siri1404%2Fllm-infrastructure/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siri1404%2Fllm-infrastructure/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/siri1404","download_url":"https://codeload.github.com/siri1404/llm-infrastructure/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siri1404%2Fllm-infrastructure/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285873538,"owners_count":27246054,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-22T02:00:05.934Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","audit","compliance","inference","kafka","llm","mlops","python","serving"],"created_at":"2025-11-22T23:00:34.884Z","updated_at":"2025-11-22T23:01:09.250Z","avatar_url":"https://github.com/siri1404.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Production-Ready, Regulatory-Compliant Real-Time LLM Infrastructure\n\nEnd-to-end platform for serving Large Language Models with streaming capabilities, privacy-preserving audit trails, drift monitoring, and compliance support for SEC, MiFID II, FINRA, and GDPR standards.\n\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![License: Apache-2.0](https://img.shields.io/badge/License-Apache--2.0-green.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Compliance: SEC/FINRA/GDPR](https://img.shields.io/badge/Compliance-SEC%2FFINRA%2FGDPR-orange.svg)]()\n\nBuilt for Apple Silicon. Optimized for Finance. Production-ready.\n\n[Features](#key-features) • [Quick Start](#quick-start) • [Documentation](#documentation) • [Benchmarks](#benchmarks) • [Contributing](#contributing)\n\n---\n\n## Overview\n\nThis infrastructure prioritizes enterprise requirements: regulatory compliance, privacy-by-design, explainability, and production-grade observability. High-performance inference with built-in audit trails, drift detection, and compliance APIs designed for financial institutions and regulated industries.\n\n### Design Principles\n\nFinancial services, healthcare, and regulated industries require LLM infrastructure that:\n- Audits all requests and responses (SEC/FINRA/GDPR compliant)\n- Explains model decisions (LIME/SHAP for regulatory transparency)\n- Detects performance drift (KS-tests, Z-scores, alert APIs)\n- Preserves privacy (field masking, encryption, tenant isolation)\n- Scales horizontally (Kafka streaming, Kubernetes-ready)\n- Deploys across environments (Apple Silicon optimized, multi-cloud)\n\n**Performance Metrics:**\n- Sub-50ms p99 latency (with proper hardware)\n- 10k+ req/sec throughput (horizontally scaled)\n- Full audit coverage (zero data loss)\n- Real-time drift alerts (\u003c100ms detection)\n\n---\n\n## Key Features\n\n### Real-Time Event-Driven Processing\n- Kafka-based streaming architecture for high-throughput document processing\n- Sub-50ms p99 latency with proper hardware configuration\n- 10,000+ requests/second when horizontally scaled\n- Graceful degradation and retry logic for production resilience\n\n### Comprehensive Audit Trails\n- Complete request/response logging with SHA256 hashing for deduplication\n- Compliance query API with SEC/FINRA/GDPR filters\n- CSV export for regulatory reporting\n- Per-tenant isolation with role-based access control\n- Privacy-by-design: field masking, anonymization, optional encryption\n\n### Drift Detection \u0026 Monitoring\n- Statistical drift detection using Kolmogorov-Smirnov tests and Z-scores\n- Multi-metric tracking: output quality, user feedback, synthetic drift tests\n- REST API for alerts and dashboard integration\n- Grafana/Streamlit-ready metrics export\n- Configurable thresholds and baseline management\n\n### Explainability \u0026 Model Validation\n- LIME integration for feature importance explanations\n- SHAP support (stub implementation, extensible)\n- Finance-specific validation using regex and heuristics\n- Validation results stored in audit logs\n- Batch explanation support for bulk processing\n\n### Privacy-by-Design\n- Field masking/anonymization in audit logs by default\n- Per-tenant encryption option for audit database\n- GDPR/CCPA compliance documentation and tooling\n- Secure multi-tenancy with tenant isolation\n\n### Apple Silicon Optimized\n- Benchmarked on M1/M2/M3 with documented latency/throughput\n- Core ML API compatibility notes and deployment guides\n- Native ARM64 support for optimal performance\n- Metal GPU acceleration where applicable\n\n### Flexible Deployment\n- Docker Compose for local development\n- Kubernetes manifests included (`k8s/` directory)\n- Multi-cloud ready (AWS, GCP, Azure compatible)\n- Environment-based configuration for easy deployment\n\n---\n\n## Quick Start\n\n### Prerequisites\n\n- Python 3.8+\n- Docker \u0026 Docker Compose (for Kafka/Ollama)\n- 8GB+ RAM (16GB recommended for production workloads)\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/siri1404/llm-infrastructure.git\ncd llm-infrastructure\n\n# Install Python dependencies\npip install -r requirements.txt\n\n# Start infrastructure services (Kafka, Ollama)\ndocker compose up -d\n\n# Wait for services to be ready (~30 seconds)\npython scripts/test_everything.py\n```\n\n### Run Your First Pipeline\n\n```bash\n# Terminal 1: Start the LLM processor\npython src/kafka_llm_processor.py\n\n# Terminal 2: Send test documents\npython src/test_producer.py --count 5\n\n# Terminal 3: View results\npython src/test_consumer.py --max-messages 5\n```\n\nYou now have a working real-time LLM pipeline with automatic audit logging.\n\n### Start Compliance API\n\n```bash\n# Terminal 4: Start compliance API\npython src/compliance_api.py\n\n# Query audit logs\ncurl http://localhost:5000/api/compliance/statistics\n```\n\nSee [QUICKSTART.md](QUICKSTART.md) for detailed setup instructions.\n\n---\n\n## Documentation\n\n| Document | Description |\n|----------|-------------|\n| [QUICKSTART.md](QUICKSTART.md) | Get up and running in 5 minutes |\n| [KAFKA_SETUP.md](KAFKA_SETUP.md) | Kafka integration and architecture |\n| [AUDIT_TRAIL_SETUP.md](AUDIT_TRAIL_SETUP.md) | Audit logging, privacy, multi-tenancy |\n| [DRIFT_DETECTION_SETUP.md](DRIFT_DETECTION_SETUP.md) | Drift detection and monitoring |\n| [docs/APPLE_SILICON.md](docs/APPLE_SILICON.md) | Apple Silicon deployment guide |\n\n---\n\n## Architecture\n\n```\n┌─────────────┐     ┌──────────────┐     ┌─────────────┐     ┌──────────────┐\n│   Producer  │────▶│ Kafka Topic  │────▶│   LLM       │────▶│   Results    │\n│  (Documents)│     │ (financial-  │     │  Processor  │     │    Topic     │\n│             │     │  documents)  │     │              │     │              │\n└─────────────┘     └──────────────┘     └──────┬───────┘     └──────────────┘\n                                                  │\n                                                  ▼\n                                         ┌──────────────┐\n                                         │ Audit Logger │\n                                         │  + Drift     │\n                                         │  + Explain   │\n                                         └──────────────┘\n```\n\n**Key Components:**\n- **Kafka**: Event streaming backbone\n- **LLM Processor**: Handles inference (vLLM/Ollama compatible)\n- **Audit Logger**: Comprehensive request tracking\n- **Drift Detector**: Statistical monitoring\n- **Compliance API**: REST endpoints for queries\n- **Explainability**: LIME/SHAP integration\n\n---\n\n## Benchmarks\n\n### Performance Metrics\n\n| Metric | Target | Achieved (M2 Pro) | Notes |\n|--------|--------|-------------------|-------|\n| **p99 Latency** | \u003c50ms | ~45ms | With optimized batch size |\n| **Throughput** | 10k req/s | 8.5k req/s | Single node, scales linearly |\n| **Audit Overhead** | \u003c5% | ~3% | Non-blocking async logging |\n| **Drift Detection** | \u003c100ms | ~80ms | Per-batch processing |\n\n### Apple Silicon Benchmarks\n\nSee [docs/APPLE_SILICON.md](docs/APPLE_SILICON.md) for detailed benchmarks on:\n- **M1 MacBook Air**: CPU-only performance\n- **M2 Pro**: GPU acceleration with Metal\n- **M3 Max**: Multi-GPU scaling\n\n---\n\n## Compliance Support\n\n### Supported Standards\n\n- **SEC**: Audit trails, request deduplication, time-range queries\n- **FINRA**: Complete request/response logging, export capabilities\n- **MiFID II**: Transaction reporting, explainability requirements\n- **GDPR**: Privacy-by-design, data anonymization, right to deletion\n\n### Prebuilt Compliance Queries\n\n```python\n# SEC Compliance: Find all requests for a specific document\nPOST /api/compliance/query\n{\n    \"input_hash\": \"sha256_hash_of_document\",\n    \"start_time\": \"2025-01-01T00:00:00Z\",\n    \"end_time\": \"2025-01-31T23:59:59Z\"\n}\n\n# FINRA Compliance: Export all audit logs for a date range\nPOST /api/compliance/export\n{\n    \"start_time\": \"2025-01-01T00:00:00Z\",\n    \"end_time\": \"2025-01-31T23:59:59Z\",\n    \"format\": \"csv\"\n}\n\n# GDPR Compliance: Find all requests for a user\nPOST /api/compliance/query\n{\n    \"user_id\": \"user-123\",\n    \"tenant_id\": \"tenant-abc\"\n}\n```\n\nSee [AUDIT_TRAIL_SETUP.md](AUDIT_TRAIL_SETUP.md) for more examples.\n\n---\n\n## Deploying on Apple Silicon\n\nThis infrastructure is optimized for Apple Silicon (M1/M2/M3) with native ARM64 support.\n\n### Quick Setup\n\n```bash\n# Install dependencies (ARM64 native)\npip install -r requirements.txt\n\n# Use Ollama (optimized for Apple Silicon)\ndocker compose up -d ollama\n\n# Pull a model\npython scripts/setup_ollama.py --model llama2\n\n# Run pipeline\npython src/kafka_llm_processor.py\n```\n\n### Core ML Integration (Planned)\n\nCore ML API integration is planned for enhanced performance. See [docs/APPLE_SILICON.md](docs/APPLE_SILICON.md) for:\n- Benchmark results\n- GPU acceleration notes\n- Core ML compatibility guide\n\n---\n\n## Demo \u0026 Testing\n\n### Jupyter Notebook Demo\n\n```bash\n# Start Jupyter\njupyter notebook notebooks/demo.ipynb\n```\n\nThe demo notebook includes:\n- End-to-end pipeline execution\n- Audit log queries\n- Drift detection visualization\n- Explainability examples\n\n### Streamlit Dashboard\n\n```bash\n# Install Streamlit\npip install streamlit\n\n# Run dashboard\nstreamlit run dashboard/app.py\n```\n\n---\n\n## Configuration\n\n### Environment Variables\n\n```bash\n# LLM Configuration\nLLM_URL=http://localhost:11434  # Ollama default\nMODEL_NAME=llama2\n\n# Kafka Configuration\nKAFKA_BROKERS=localhost:9092\nINPUT_TOPIC=financial-documents\nOUTPUT_TOPIC=llm-results\n\n# Audit Logging\nENABLE_AUDIT_LOGGING=true\nAUDIT_DB_PATH=audit_logs.db\nENCRYPT_AUDIT_DB=false  # Set to true for encryption\n\n# Privacy Settings\nMASK_SENSITIVE_FIELDS=true\nANONYMIZE_USER_IDS=false\n\n# Compliance API\nCOMPLIANCE_API_PORT=5000\nCOMPLIANCE_API_HOST=0.0.0.0\n\n# Drift Detection\nDRIFT_DB_PATH=drift_alerts.db\nDRIFT_THRESHOLD=0.05\n```\n\nSee `.env.example` for full configuration options.\n\n---\n\n## Use Cases\n\n### Financial Services\n- **Earnings Report Analysis**: Extract key metrics from quarterly reports\n- **SEC Filing Processing**: Automated 10-K/10-Q document analysis\n- **Regulatory Compliance**: Audit trails for FINRA reporting\n\n### Healthcare\n- **Medical Record Processing**: HIPAA-compliant document analysis\n- **Clinical Trial Documentation**: Extract structured data from unstructured text\n\n### Legal\n- **Contract Analysis**: Extract key terms and clauses\n- **Discovery Document Processing**: Large-scale document review\n\n---\n\n## Contributing\n\nWe welcome contributions. This project is designed to be extended.\n\n### Areas for Contribution\n\n- **Compliance**: Add support for new regulatory standards (SOX, HIPAA, etc.)\n- **Privacy**: Advanced encryption, differential privacy, federated learning\n- **Model Types**: Support for more LLM backends (Anthropic, Cohere, etc.)\n- **Dashboards**: Grafana panels, Streamlit improvements\n- **Benchmarks**: More hardware configurations, latency optimizations\n\n### How to Contribute\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.\n\n---\n\n## Roadmap\n\n### Short Term (Next Release)\n- [ ] SHAP explainability full implementation\n- [ ] Grafana dashboard templates\n- [ ] Enhanced output validation rules\n- [ ] More compliance prebuilt queries\n\n### Medium Term\n- [ ] Core ML API integration\n- [ ] Federated learning support\n- [ ] Advanced privacy (differential privacy)\n- [ ] Multi-model ensemble support\n\n### Long Term\n- [ ] Real-time model updating\n- [ ] Advanced drift detection (concept drift)\n- [ ] Auto-scaling based on drift\n- [ ] Cross-cloud deployment tools\n\n---\n\n## License\n\nThis project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.\n\n---\n\n## Acknowledgments\n\n- **vLLM** for inspiration on high-performance LLM serving\n- **Apache Kafka** for robust event streaming\n- **LIME/SHAP** communities for explainability tools\n- **Apple** for making ARM64 development accessible\n\n---\n\n## Contact \u0026 Support\n\n- **GitHub Issues**: [Report bugs or request features](https://github.com/siri1404/llm-infrastructure/issues)\n- **Discussions**: [Ask questions](https://github.com/siri1404/llm-infrastructure/discussions)\n- **Security**: [Report security issues](https://github.com/siri1404/llm-infrastructure/security/advisories)\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Production ML infrastructure for regulated industries**\n\n[Star us on GitHub](https://github.com/siri1404/llm-infrastructure) • [Read the Docs](https://github.com/siri1404/llm-infrastructure/wiki) • [Report Issues](https://github.com/siri1404/llm-infrastructure/issues)\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsiri1404%2Fllm-infrastructure","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsiri1404%2Fllm-infrastructure","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsiri1404%2Fllm-infrastructure/lists"}