{"id":26129302,"url":"https://github.com/stackloklabs/haistings","last_synced_at":"2025-04-13T18:42:22.730Z","repository":{"id":280814021,"uuid":"932652518","full_name":"StacklokLabs/HAIstings","owner":"StacklokLabs","description":"An AI assistant to prioritize security vulnerabilities","archived":false,"fork":false,"pushed_at":"2025-03-24T18:44:31.000Z","size":1371,"stargazers_count":9,"open_issues_count":6,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-27T09:23:33.078Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/StacklokLabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-14T09:19:38.000Z","updated_at":"2025-03-19T00:53:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"1028d8b6-aa99-4ef4-bcd3-f6e6815d6856","html_url":"https://github.com/StacklokLabs/HAIstings","commit_stats":null,"previous_names":["stackloklabs/haistings"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StacklokLabs%2FHAIstings","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StacklokLabs%2FHAIstings/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StacklokLabs%2FHAIstings/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StacklokLabs%2FHAIstings/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/StacklokLabs","download_url":"https://codeload.github.com/StacklokLabs/HAIstings/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248764725,"owners_count":21158144,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-10T19:48:01.295Z","updated_at":"2025-04-13T18:42:22.723Z","avatar_url":"https://github.com/StacklokLabs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HAIstings\n\nHAIstings is an AI-powered companion designed to help you assess and prioritize Common Vulnerabilities and Exposures (CVEs) within your Kubernetes infrastructure. Drawing inspiration from Agatha Christie's legendary character Arthur Hastings, the crime-solving partner of Hercule Poirot, HAIstings partners with you to ensure robust security measures in your Kubernetes environments.\n\n## Overview\n\nHAIstings analyzes vulnerability reports from tools like trivy-operator, generates prioritized reports, and engages in an interactive conversation to refine its recommendations based on your specific context and requirements.\n\n## Features\n\n- **Vulnerability Prioritization**: Automatically prioritizes vulnerabilities based on severity, impact, and context\n- **Interactive Refinement**: Engages in a conversation to gather more context and refine prioritization\n- **Infrastructure Context**: Ingests infrastructure repository information to provide more relevant recommendations\n- **Persistent Memory**: Maintains conversation history across sessions using checkpoints\n- **Customizable Output**: Adjusts recommendations based on user-provided context\n- **Retrieval-Augmented Generation (RAG)**: Selectively includes only relevant infrastructure files in the context, reducing overall context size and improving performance\n\n## Installation\n\n### Prerequisites\n\n- Python 3.12\n- Kubernetes cluster with trivy-operator installed\n- Properly configured kubeconfig file\n\n### Using Poetry\n\n```bash\n# Clone the repository\ngit clone https://github.com/stacklok/HAIstings.git\ncd HAIstings\n\n# Install dependencies\npoetry install\n```\n\n### Using pip\n\n```bash\npip install haistings\n```\n\n## Usage\n\n### Basic Usage\n\nGenerate a vulnerability report showing the top 25 most critical vulnerabilities:\n\n```bash\nhaistings\n```\n\n### Customizing Output\n\nSpecify the number of vulnerabilities to show:\n\n```bash\nhaistings --top 30\n```\n\n### Providing Context\n\nProvide additional context to improve prioritization:\n\n```bash\nhaistings --notes usercontext.txt\n```\n\nWhere `usercontext.txt` contains information about your infrastructure, such as:\n\n```\nexample-service is a very critical service that is internet-facing. We should assign more priority to it.\n\nFlux is critical to our infrastructure, so if it has a vulnerability on anything related to how it processes git requests, then we should assign it very high priority.\n```\n\n### Ingesting Infrastructure Repository\n\nProvide your infrastructure repository for additional context:\n\n```bash\nhaistings --infra-repo https://github.com/yourusername/infra-repo --gh-token YOUR_GITHUB_TOKEN\n```\n\nFor a specific subdirectory:\n\n```bash\nhaistings --infra-repo https://github.com/yourusername/infra-repo --infra-repo-subdir kubernetes --gh-token YOUR_GITHUB_TOKEN\n```\n\n### RAG Configuration\n\nControl the Retrieval-Augmented Generation functionality:\n\n```bash\n# Disable RAG (use traditional approach)\nhaistings --use-vectordb false\n\n# Specify maximum number of relevant files per component\nhaistings --max-relevant-files 10\n```\n\n### Persistent Conversations\n\nUse SQLite to persist conversation history:\n\n```bash\nhaistings --checkpoint-saver-driver sqlite\n```\n\n### Full Example\n\n```bash\nhaistings --top 30 --notes usercontext.txt --infra-repo https://github.com/yourusername/infra-repo --max-relevant-files 8 --checkpoint-saver-driver sqlite\n```\n\n## How It Works\n\n1. **Vulnerability Collection**: HAIstings connects to your Kubernetes cluster and collects vulnerability reports from trivy-operator.\n2. **Prioritization**: Vulnerabilities are prioritized based on severity (critical vulnerabilities are weighted 10x more than high vulnerabilities).\n3. **Repository Ingestion**: Infrastructure repository files are ingested and stored in a vector database for efficient retrieval.\n4. **Relevant File Retrieval**: Using RAG (Retrieval-Augmented Generation), only the most relevant files for each vulnerability are retrieved based on similarity search.\n5. **Context Integration**: User-provided context and relevant infrastructure files are integrated into the analysis.\n6. **Report Generation**: A prioritized report is generated in a conversational style inspired by Arthur Hastings.\n7. **Interactive Refinement**: HAIstings engages in a conversation to gather more context and refine its recommendations.\n\n## Command Line Options\n\n| Option | Description | Default |\n|--------|-------------|---------|\n| `--top` | Number of vulnerabilities to show | 25 |\n| `--notes` | Path to a file containing additional context | None |\n| `--infra-repo` | URL to your infrastructure repository | None |\n| `--infra-repo-subdir` | Subdirectory in the repository to ingest | None |\n| `--gh-token` | GitHub Personal Access Token for private repositories | None |\n| `--checkpoint-saver-driver` | Memory persistence driver (`memory` or `sqlite`) | `memory` |\n| `--use-vectordb` | Use vector database for repository ingestion | `true` |\n| `--max-relevant-files` | Maximum number of relevant files per component | 5 |\n| `--debug` | Enable debug mode | False |\n| `--model` | LLM model to use (when not using CodeGate) | `this-makes-no-difference-to-codegate` |\n| `--model-provider` | Model provider | `openai` |\n| `--api-key` | API key for the model provider (when not using CodeGate) | `fake-api-key` |\n| `--base-url` | Base URL for the model provider | `http://127.0.0.1:8989/v1/mux` |\n\n## Example Output\n\n```markdown\n# HAIsting's Security Report\n\n## Introduction\n\nGood day! Arthur Hastings at your service. I've meticulously examined the vulnerability reports from your Kubernetes infrastructure and prepared a prioritized assessment of the security concerns that require your immediate attention.\n\n## Summary\n\nAfter careful analysis, I've identified several critical vulnerabilities that demand prompt remediation:\n\n1. **example-service (internet-facing service)**\n   - Critical vulnerabilities: 3\n   - High vulnerabilities: 7\n   - Most concerning: CVE-2023-1234 (Remote code execution)\n   \n   This service is particularly concerning due to its internet-facing nature, as mentioned in your notes. I recommend addressing these vulnerabilities with the utmost urgency.\n\n2. **Flux (GitOps controller)**\n   - Critical vulnerabilities: 2\n   - High vulnerabilities: 5\n   - Most concerning: CVE-2023-5678 (Git request processing vulnerability)\n   \n   As you've noted, Flux is critical to your infrastructure, and this Git request processing vulnerability aligns with your specific concerns.\n\n[Additional entries...]\n\n## Conclusion\n\nI say, these vulnerabilities require prompt attention, particularly the ones affecting your internet-facing services and deployment controllers. I recommend addressing the critical vulnerabilities in example-service and Flux as your top priorities. Should you require any further assistance or have additional context to share, I remain at your service.\n```\n\n## Development\n\n### Setting Up Development Environment\n\n```bash\n# Clone the repository\ngit clone https://github.com/stacklok/HAIstings.git\ncd HAIstings\n\n# Install dependencies including development dependencies\npoetry install\n\n# Run tests\npoetry run pytest\n```\n\n### Code Style\n\nThis project uses:\n- Black for code formatting\n- isort for import sorting\n- mypy for type checking\n- flake8 for linting\n\n```bash\n# Format code\npoetry run black .\npoetry run isort .\n\n# Type check\npoetry run mypy .\n\n# Lint\npoetry run flake8\n```\n\n## Future Improvements / TODO\n\n- **Custom Vulnerability Scoring**: Add support for custom vulnerability scoring based on user-defined criteria beyond just severity.\n- **Integration with More Scanners**: Extend beyond trivy-operator to support other vulnerability scanners.\n- **Visualization Dashboard**: Create a web interface to visualize vulnerability reports and trends over time.\n- **Automated Remediation Suggestions**: Provide specific remediation steps for common vulnerabilities.\n- **Multi-Cluster Support**: Add support for analyzing vulnerabilities across multiple Kubernetes clusters.\n\n## License\n\nApache-2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstackloklabs%2Fhaistings","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstackloklabs%2Fhaistings","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstackloklabs%2Fhaistings/lists"}