https://github.com/sammcj/confuddlement
Confluence API scraper
- Host: GitHub
- URL: https://github.com/sammcj/confuddlement
- Owner: sammcj
- License: mit
- Created: 2024-05-22T12:05:56.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-23T01:08:47.000Z (over 1 year ago)
- Last Synced: 2024-05-23T01:31:58.467Z (over 1 year ago)
- Language: Go
- Size: 13.7 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Confuddlement
Confuddlement is a command-line tool that downloads Confluence pages and saves them as Markdown files.
It uses the Confluence REST API to fetch page content and convert it to Markdown.
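
The fetch step might look roughly like the sketch below. It is not taken from the Confuddlement source: the helper names `buildContentURL` and `fetchPages` and the `contentBatch` type are hypothetical, and it assumes the standard Confluence content endpoint (`/rest/api/content` with `spaceKey`, `limit`, `start` and `expand=body.storage`) and basic authentication with the configured user and API token. Converting the returned storage-format XHTML to Markdown is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
	"strings"
)

// contentBatch mirrors only the response fields this sketch reads.
type contentBatch struct {
	Results []struct {
		Title string `json:"title"`
		Body  struct {
			Storage struct {
				Value string `json:"value"` // page body in storage format (XHTML)
			} `json:"storage"`
		} `json:"body"`
	} `json:"results"`
}

// buildContentURL returns the request URL for one batch of pages in a space.
// Hypothetical helper; the endpoint and parameters are assumptions.
func buildContentURL(baseURL, spaceKey string, limit, start int) string {
	q := url.Values{}
	q.Set("spaceKey", spaceKey)
	q.Set("limit", strconv.Itoa(limit))
	q.Set("start", strconv.Itoa(start))
	q.Set("expand", "body.storage")
	return strings.TrimRight(baseURL, "/") + "/rest/api/content?" + q.Encode()
}

// fetchPages requests one batch of pages using basic auth (user + API token).
func fetchPages(client *http.Client, reqURL, user, token string) (*contentBatch, error) {
	req, err := http.NewRequest(http.MethodGet, reqURL, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, token)
	req.Header.Set("Accept", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		return nil, fmt.Errorf("confluence returned %s: %s", resp.Status, body)
	}

	var batch contentBatch
	if err := json.NewDecoder(resp.Body).Decode(&batch); err != nil {
		return nil, err
	}
	return &batch, nil
}

func main() {
	base := os.Getenv("CONFLUENCE_BASE_URL")
	if base == "" {
		// No instance configured: only show the request that would be made.
		fmt.Println(buildContentURL("https://example.atlassian.net/wiki", "COOLTEAM", 25, 0))
		return
	}

	// Use the first configured space key for this illustration.
	space := strings.TrimSpace(strings.Split(os.Getenv("CONFLUENCE_SPACES"), ",")[0])
	reqURL := buildContentURL(base, space, 25, 0)

	batch, err := fetchPages(http.DefaultClient, reqURL,
		os.Getenv("CONFLUENCE_USER"), os.Getenv("CONFLUENCE_API_TOKEN"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
		os.Exit(1)
	}
	for _, p := range batch.Results {
		fmt.Printf("%s (%d bytes of storage-format XHTML)\n", p.Title, len(p.Body.Storage.Value))
	}
}
```

Paging through a whole space would mean repeating the request with `start` increased by `limit` until a batch comes back empty.
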
The program can also summarise the content of a fetched page using the Ollama API.

```plain
$ go run main.go

Confuddlement 0.2.0
Spaces: [COOLTEAM, MANAGEMENT]
Fetching content from space COOLTEAM

COOLTEAM (Totally Cool Team Homepage)
Retrospectives
Decision log
Development Onboarding
Saved page COOLTEAM - Feature List to ./confluence_dump/COOLTEAM - Feature List.md
Skipping page 7. Support, less than 300 characters

MANAGEMENT (Department of Overhead and Bureaucracy)
Painful Change Management
Illogical Diagrams
Saved page ./confluence_dump/Painful Change Management.md
Saved page Illogical Diagrams to ./confluence_dump/Illogical Diagrams.md

Done!
```

```plain
$ go run main.go summarise
Select a file to summarise:
0: + COOLTEAM - Feature List
1: + Painful Change Management
2: + Illogical Diagrams
Enter the number of the file to summarise: 1

Summarising Painful Change Management...
"Change management in the enterprise is painful and slow. It involves many forms and approvals."
```

```plain
$ go run main.go -q 'who is the CEO?' -s 'management' -r 2

Querying the LLM with the prompt 'who is the CEO?'...
"The CEO of the company is Peewee Herman."
```

## Usage
### Running the Program
1. Copy [.env.template](.env.template) to `.env` and update the environment variables.
2. Run the program with `go run main.go`, or build it with `go build` and run the resulting executable.
3. The program will fetch Confluence pages and save them as Markdown files in the specified directory.

#### Querying the documents with AI
You can summarise the content of a fetched page using the Ollama API by running the program with the `summarise` argument:
```shell
go run main.go summarise
```

To perform a custom query, pass the following flags:
- `-q`: The query to provide to the LLM.
- `-s`: The search term to match documents against.
- `-r`: The number of lines before and after the search term to include in the context to the LLM.

```plain
go run main.go -q 'who is the CEO?' -s 'management' -r 2
```

### Environment Variables
The following environment variables must be set (those marked optional are only needed for summarisation):
> - `CONFLUENCE_DUMP_DIR`: The directory where the Markdown files will be saved.
> - `CONFLUENCE_LIMIT`: The number of pages to fetch per API request.
> - `CONFLUENCE_BASE_URL`: The base URL of the Confluence instance.
> - `CONFLUENCE_USER`: The username to use for API authentication.
> - `CONFLUENCE_SPACES`: The space keys to fetch pages from, separated by commas.
> - `CONFLUENCE_API_TOKEN`: The API token to use for authentication.
> - `DELETE_PREVIOUS_DUMP`: Set to `true` to delete the previous dump directory (and state) before fetching pages.
> - `MIN_PAGE_LENGTH`: The minimum length of a page, in characters; shorter pages are skipped.
> - `SKIP_FETCHED_PAGES`: Set to `true` to skip pages that have already been fetched.
> - `DEBUG`: Set to `true` to enable debug logging.
> - `OLLAMA_HOST`: The host of the Ollama API (optional, only required for summarisation).
> - `OLLAMA_MODEL`: The model to use for summarisation (optional, only required for summarisation).
> - `OLLAMA_NUM_CTX`: The context window size, in tokens, for the model (Ollama's `num_ctx` option; optional, only required for summarisation).
> - `OLLAMA_NUM_PREDICT`: The maximum number of tokens to generate (Ollama's `num_predict` option; optional, only required for summarisation).

## License
This program is licensed under the MIT License.
Copyright (c) 2024, Sam McLeod