Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/skeeey/acm-troubleshooter
https://github.com/skeeey/acm-troubleshooter
Last synced: about 1 month ago
JSON representation
- Host: GitHub
- URL: https://github.com/skeeey/acm-troubleshooter
- Owner: skeeey
- Created: 2024-10-27T06:46:32.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-08T07:35:40.000Z (about 1 month ago)
- Last Synced: 2024-11-08T08:30:23.028Z (about 1 month ago)
- Language: Python
- Size: 95.7 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
## Prepare
1. Create a virtual environment
```sh
VENV= # e.g. $HOME/acm-troubleshooter/.venv
mkdir -p $VENV
python -m venv $VENV
source $VENV/bin/activate
```2. Install dependents
```sh
make deps
```## Diagnose an ACM issue with runbooks and must-gather
```sh
export GROQ_API_KEY= # https://console.groq.com/docs/models
python -m apps.troubleshooter --runbooks= \
--hub-mg= \
--cluster-mg= \
```**Note**: you should install the [`omc` command](https://github.com/gmeghnag/omc) on your host machine.
### How does this work
![diagnosis-flow](./imgs/troubleshooting-flow.png)
Refers to
- https://langchain-ai.github.io/langgraph/tutorials/plan-and-execute/plan-and-execute/
- https://microsoft.github.io/autogen/
- https://github.com/gmeghnag/omc### Runbook Guideline
For a runbook, it should
1. Give a title for the runbook, the title summarizes the issue that this runbook aims to diagnosis.
2. Clearly and detailedly describe the issue, for example
- describe the issue's symptom, for example, the condition status of a resource when the issue happens, the error message, etc;
- describe why this issue happens and which components or resources will be impacted when this issue happens, etc.
3. Clearly and detailedly describe describe the diagnosis steps, for example
- list the diagnosis steps one-by-one;
- specify whether the step should be run on the hub cluster or the managed cluster;
- if one step needs to refer to the other runbooks, using the related runbook title as the markdown link text, for example, `[runbook_title](runbook_location)`## Generate a draft runbooks from a Jira issue
```sh
export JIRA_TOKEN= # https://issues.redhat.com/secure/ViewProfile.jspa?selectedTab=com.atlassian.pats.pats-plugin:jira-user-personal-access-tokens
export GROQ_API_KEY=python -m apps.rbgenerator
```### How does this works
1. Using Jira SDK to get the Jira issue content.
2. Invoking the LLM with the below prompt,
```
You are a Red Hat Advanced Cluster Management for Kubernetes (ACM or RHACM) assistant.
The following content is about an ACM issue, please
- Summarize the issue
- Give the issue's symptom
- Give the troubleshooting steps for this issue
- Give the solution of this issueHere is the content
{text}
```
## Diagnose an ACM issue using a chatbot with ACM doc1. Download the ACM doc embedding data from [here](https://drive.google.com/file/d/1H-ColxJFdH2KiSqdmKITmrJm3VToADb0/view?usp=sharing)
2. Unzip the embedding data to a directory
3. Run the following command```sh
export INDEX_DIR=
export GROQ_API_KEY=
python -m apps.chatbot
```