https://github.com/robusta-dev/holmesgpt

Your 24/7 On-Call AI Agent - Solve Alerts Faster with Automatic Correlations, Investigations, and More

aiops chatbot chatops devops devops-tools incident incident-management incident-response jira kubernetes llm llm-agent llm-framework llms monitoring observability prometheus site-reliability-engineering slack sre

AI Agent for Cloud Troubleshooting and Alert Investigation

HolmesGPT is an AI agent for investigating problems in your cloud, finding the root cause, and suggesting remediations. It has dozens of built-in integrations for cloud providers, observability tools, and on-call systems.

HolmesGPT has been submitted to the CNCF as a sandbox project ([view status](https://github.com/cncf/sandbox/issues/392)). You can learn more about HolmesGPT's maintainers and adopters [here](./ADOPTERS.md).




![HolmesGPT Investigation Demo](https://holmesgpt.dev/assets/HolmesInvestigation.gif)

## How it Works

HolmesGPT connects AI models with live observability data and organizational knowledge. It uses an **agentic loop** to analyze data from multiple sources and identify possible root causes.

*Figure: HolmesGPT architecture diagram.*

### 🔗 Data Sources

HolmesGPT integrates with popular observability and cloud platforms. The following data sources ("toolsets") are built-in. [Add your own](#customizing-holmesgpt).

| Data Source | Status | Notes |
|-------------|--------|-------|
| [**ArgoCD**](https://holmesgpt.dev/data-sources/builtin-toolsets/argocd/) | ✅ | Get status, history, manifests, and more for apps, projects, and clusters |
| [**AWS RDS**](https://holmesgpt.dev/data-sources/builtin-toolsets/aws/) | ✅ | Fetch events, instances, slow query logs, and more |
| [**Confluence**](https://holmesgpt.dev/data-sources/builtin-toolsets/confluence/) | ✅ | Private runbooks and documentation |
| [**Coralogix Logs**](https://holmesgpt.dev/data-sources/builtin-toolsets/coralogix-logs/) | ✅ | Retrieve logs for any resource |
| [**Datetime**](https://holmesgpt.dev/data-sources/builtin-toolsets/datetime/) | ✅ | Date and time-related operations |
| [**Docker**](https://holmesgpt.dev/data-sources/builtin-toolsets/docker/) | ✅ | Get images, logs, events, history, and more |
| [**GitHub**](https://holmesgpt.dev/data-sources/builtin-toolsets/github/) | 🟡 Beta | Remediate alerts by opening pull requests with fixes |
| [**Datadog**](https://holmesgpt.dev/data-sources/builtin-toolsets/datadog/) | 🟡 Beta | Fetch log data from Datadog |
| [**Loki**](https://holmesgpt.dev/data-sources/builtin-toolsets/grafanaloki/) | ✅ | Query logs for Kubernetes resources or run any log query |
| [**Tempo**](https://holmesgpt.dev/data-sources/builtin-toolsets/grafanatempo/) | ✅ | Fetch trace info and debug issues such as high application latency |
| [**Helm**](https://holmesgpt.dev/data-sources/builtin-toolsets/helm/) | ✅ | Release status, chart metadata, and values |
| [**Internet**](https://holmesgpt.dev/data-sources/builtin-toolsets/internet/) | ✅ | Public runbooks, community docs, etc. |
| [**Kafka**](https://holmesgpt.dev/data-sources/builtin-toolsets/kafka/) | ✅ | Fetch metadata, list consumers and topics, or find lagging consumer groups |
| [**Kubernetes**](https://holmesgpt.dev/data-sources/builtin-toolsets/kubernetes/) | ✅ | Pod logs, K8s events, and resource status (kubectl describe) |
| [**New Relic**](https://holmesgpt.dev/data-sources/builtin-toolsets/newrelic/) | 🟡 Beta | Investigate alerts, query tracing data |
| [**OpenSearch**](https://holmesgpt.dev/data-sources/builtin-toolsets/opensearch-status/) | ✅ | Query health, shard, and settings info for one or more clusters |
| [**Prometheus**](https://holmesgpt.dev/data-sources/builtin-toolsets/prometheus/) | ✅ | Investigate alerts, query metrics, and generate PromQL queries |
| [**RabbitMQ**](https://holmesgpt.dev/data-sources/builtin-toolsets/rabbitmq/) | ✅ | Info about partitions and memory/disk alerts to troubleshoot split-brain scenarios and more |
| [**Robusta**](https://holmesgpt.dev/data-sources/builtin-toolsets/robusta/) | ✅ | Multi-cluster monitoring, historical change data, user-configured runbooks, PromQL graphs, and more |
| [**Slab**](https://holmesgpt.dev/data-sources/builtin-toolsets/slab/) | ✅ | Team knowledge base and runbooks on demand |

### 🚀 End-to-End Automation

HolmesGPT can fetch alerts/tickets to investigate from external systems, then write the analysis back to the source or Slack.

| Integration | Status | Notes |
|-------------------------|-----------|-------|
| Slack | 🟡 Beta | [Demo.](https://www.loom.com/share/afcd81444b1a4adfaa0bbe01c37a4847) Tag HolmesGPT bot in any Slack message |
| Prometheus/AlertManager | ✅ | Robusta SaaS or HolmesGPT CLI |
| PagerDuty | ✅ | HolmesGPT CLI only |
| OpsGenie | ✅ | HolmesGPT CLI only |
| Jira | ✅ | HolmesGPT CLI only |
| GitHub | ✅ | HolmesGPT CLI only |

## Installation



Read the [installation documentation](https://holmesgpt.dev/installation/cli-installation/) to learn how to install HolmesGPT.

## Supported LLM Providers



Read the [LLM Providers documentation](https://holmesgpt.dev/ai-providers/) to learn how to set up your LLM API key.

## Using HolmesGPT

- In the Robusta SaaS: Go to [platform.robusta.dev](https://platform.robusta.dev/signup/?utm_source=github&utm_medium=holmesgpt-readme&utm_content=ways_to_use_holmesgpt_section) and use Holmes from your browser
- With HolmesGPT CLI: [set up an LLM API key](https://holmesgpt.dev/ai-providers/) and ask Holmes a question 👇

```bash
holmes ask "what pods are unhealthy and why?"
```

You can also provide files as context:
```bash
holmes ask "summarize the key points in this document" -f ./mydocument.txt
```

You can also load the prompt from a file using the `--prompt-file` option:
```bash
holmes ask --prompt-file ~/long-prompt.txt
```

Enter interactive mode to ask follow-up questions:
```bash
holmes ask "what pods are unhealthy and why?" --interactive
# or
holmes ask "what pods are unhealthy and why?" -i
```

Also supported:

**HolmesGPT CLI: investigate Prometheus alerts**

Pull alerts from AlertManager and investigate them with HolmesGPT:

```bash
holmes investigate alertmanager --alertmanager-url http://localhost:9093
# if on macOS and using the Holmes Docker image 👇
# holmes investigate alertmanager --alertmanager-url http://docker.for.mac.localhost:9093
```

To investigate alerts in your browser, sign up for a free trial of [Robusta SaaS](https://platform.robusta.dev/signup/?utm_source=github&utm_medium=holmesgpt-readme&utm_content=ways_to_use_holmesgpt_section).

Optional: if Prometheus runs inside Kubernetes, port-forward to AlertManager before running the `holmes investigate alertmanager` command above:

```bash
kubectl port-forward alertmanager-robusta-kube-prometheus-st-alertmanager-0 9093:9093 &
```
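The port-forward and investigation steps can be combined so the background tunnel is always stopped afterwards. The following is a minimal sketch, not part of HolmesGPT itself: the function name and the `ALERTMANAGER_POD`, `ALERTMANAGER_PORT`, and `WAIT_SECONDS` variables are assumptions made for this example, and the fixed pause is only a crude way to wait for the tunnel to come up.

```bash
# Sketch: port-forward to AlertManager, investigate, then stop the port-forward.
# The defaults below match the commands shown above; the variable names are
# hypothetical and exist only in this example.
ALERTMANAGER_POD="${ALERTMANAGER_POD:-alertmanager-robusta-kube-prometheus-st-alertmanager-0}"
ALERTMANAGER_PORT="${ALERTMANAGER_PORT:-9093}"
WAIT_SECONDS="${WAIT_SECONDS:-3}"

investigate_alertmanager() {
  # Run the port-forward in the background and remember its PID.
  kubectl port-forward "$ALERTMANAGER_POD" "${ALERTMANAGER_PORT}:${ALERTMANAGER_PORT}" >/dev/null &
  pf_pid=$!

  # Crude pause while the tunnel is established.
  sleep "$WAIT_SECONDS"

  # Keep the exit status so cleanup still runs if the investigation fails.
  status=0
  holmes investigate alertmanager \
    --alertmanager-url "http://localhost:${ALERTMANAGER_PORT}" || status=$?

  # Stop the background port-forward.
  kill "$pf_pid" 2>/dev/null || true
  return "$status"
}
```

Source the snippet and call `investigate_alertmanager`. If your AlertManager pod has a different name, set `ALERTMANAGER_POD` before calling it.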

**HolmesGPT CLI: investigate PagerDuty and OpsGenie alerts**

```bash
holmes investigate opsgenie --opsgenie-api-key <OPSGENIE_API_KEY>
holmes investigate pagerduty --pagerduty-api-key <PAGERDUTY_API_KEY>
# to write the analysis back to the incident as a comment
holmes investigate pagerduty --pagerduty-api-key <PAGERDUTY_API_KEY> --update
```
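To keep API keys out of your shell history, you may prefer to read the key from an environment variable. This is a minimal sketch under that assumption: the `PAGERDUTY_API_KEY` variable name and the wrapper function are hypothetical, while the `--pagerduty-api-key` and `--update` flags are the ones shown above.

```bash
# Sketch: investigate PagerDuty incidents with a key taken from the environment.
# PAGERDUTY_API_KEY is a variable name chosen for this example only.
investigate_pagerduty() {
  # Fail early with a clear message if the key is missing.
  if [ -z "${PAGERDUTY_API_KEY:-}" ]; then
    echo "PAGERDUTY_API_KEY is not set" >&2
    return 1
  fi
  # Extra arguments (for example --update) are passed through unchanged.
  holmes investigate pagerduty --pagerduty-api-key "$PAGERDUTY_API_KEY" "$@"
}
```

Call `investigate_pagerduty` on its own to investigate, or `investigate_pagerduty --update` to also write the analysis back to the incident as a comment.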

For more details, run `holmes investigate --help`

## Customizing HolmesGPT

HolmesGPT can investigate many issues out of the box, with no customization or training. Optionally, you can extend Holmes to improve results:

**Custom Data Sources**: Add data sources (toolsets) to improve investigations
- If using Robusta SaaS: See [here](https://holmesgpt.dev/data-sources/custom-toolsets/)
- If using the CLI: Use `-t` flag with [custom toolset files](./examples/custom_toolset.yaml) or add to `~/.holmes/config.yaml`

**Custom Runbooks**: Give HolmesGPT instructions for known alerts:
- If using Robusta SaaS: Use the Robusta UI to add runbooks
- If using the CLI: Use `-r` flag with [custom runbook files](./examples/custom_runbooks.yaml) or add to `~/.holmes/config.yaml`
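As an illustration of the CLI option for custom data sources, the sketch below wraps `holmes ask` so that a toolset file is checked before it is passed with `-t`. The function name is hypothetical, and the default path is simply the example file linked above, so adjust it to wherever your toolset file lives.

```bash
# Sketch: ask a question with an extra toolset file loaded through -t.
# ask_with_toolset is a helper invented for this example.
ask_with_toolset() {
  question="$1"
  toolset="${2:-./examples/custom_toolset.yaml}"

  # Report a missing file here instead of relying on the CLI's own error.
  if [ ! -f "$toolset" ]; then
    echo "toolset file not found: $toolset" >&2
    return 1
  fi

  holmes ask "$question" -t "$toolset"
}
```

For example, `ask_with_toolset "what pods are unhealthy and why?" ./my_toolset.yaml` runs the question with `./my_toolset.yaml` loaded. The same pattern should apply to runbook files and the `-r` flag; check `holmes investigate --help` for where that flag is accepted.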

You can save common settings and API keys in a config file to avoid passing them on the command line each time. Place the config file at `~/.holmes/config.yaml` or pass its path using the `--config` flag.

You can view an example config file with all available settings [here](config.example.yaml).

### Tool Output Transformers

HolmesGPT supports **transformers** to process large tool outputs before sending them to your primary LLM. This feature helps manage context window limits while preserving essential information.

The most common transformer is `llm_summarize`, which uses a fast secondary model to summarize lengthy outputs from tools like `kubectl describe`, log queries, or metrics collection.

📖 **Learn more**: [Tool Output Transformers Documentation](docs/transformers.md)

## 🔐 Data Privacy

By design, HolmesGPT has **read-only access** and respects RBAC permissions. It is safe to run in production environments.

We do **not** train HolmesGPT on your data. Data sent to Robusta SaaS is private to your account.

For extra privacy, [bring an API key](https://holmesgpt.dev/ai-providers/) for your own AI model.

## Evals

Because HolmesGPT depends on LLMs, it uses [a suite of pytest-based evaluations](https://holmesgpt.dev/development/evals/) to ensure that the prompt and HolmesGPT's default set of tools work as expected with LLMs.

- [Introduction to HolmesGPT's evals](https://holmesgpt.dev/development/evals/).
- [Write your own evals](https://holmesgpt.dev/development/evals/adding-new-eval/).
- [Use Braintrust to view and analyze results (optional)](https://holmesgpt.dev/development/evals/reporting/).

## License
Distributed under the MIT License. See [LICENSE.txt](https://github.com/robusta-dev/holmesgpt/blob/master/LICENSE.txt) for more information.

## Community

Join our community to discuss the HolmesGPT roadmap and share feedback:

📹 **First Community Meetup Recording:** [Watch on YouTube](https://youtu.be/slQRc6nlFQU)
- **Topics:** Roadmap discussion, community feedback, and Q&A
- **Resources:** [📝 Meeting Notes](https://docs.google.com/document/d/1sIHCcTivyzrF5XNvos7ZT_UcxEOqgwfawsTbb9wMJe4/edit?tab=t.0) | [📋 Community Page](https://holmesgpt.dev/community/)

## Support

If you have any questions, feel free to message us on [robustacommunity.slack.com](https://bit.ly/robusta-slack)

## How to Contribute

Please read our [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines and instructions.

For help, contact us on [Slack](https://bit.ly/robusta-slack) or ask [DeepWiki AI](https://deepwiki.com/robusta-dev/holmesgpt) your questions.

[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/robusta-dev/holmesgpt)