https://github.com/sammcj/treesummary
- Host: GitHub
- URL: https://github.com/sammcj/treesummary
- Owner: sammcj
- License: apache-2.0
- Created: 2024-10-17T06:59:26.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-10-22T05:06:32.000Z (12 months ago)
- Last Synced: 2024-11-14T03:35:05.419Z (11 months ago)
- Language: Python
- Size: 805 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Tree Summary
This script generates a summary of the code within a directory tree. It produces a Markdown summary of each file matching the configured extensions, plus a summary of summaries every *n* file summaries (configurable in `config.json`).
It currently only supports Amazon Bedrock as the LLM backend; support for OpenAI-compatible APIs will be added in the future.
- [Tree Summary](#tree-summary)
- [Usage](#usage)
- [Config](#config)
- [Example Output](#example-output)
- [Requirements](#requirements)

## Usage
1. Edit `config.json` with your desired settings.
2. Install the dependencies: `pip install -r requirements.txt`
3. Run `python3 treesummary.py <directory>`, where `<directory>` is the path to the directory containing the code.
You can optionally pass `--md-to-mermaid-only <path>` to generate HTML and Mermaid diagrams from existing Markdown files.
State is stored in `output/treesummary_state.pkl`, so you can run the script multiple times to generate summaries for different directories, or resume from a previous run.
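The resumable-state mechanism described above can be sketched roughly as follows. This is a hypothetical illustration of pickle-based state, not the script's actual code; the helper names are made up, only the state file path comes from the README:

```python
import pickle
from pathlib import Path

# Path used by TreeSummary for its run state (per the README).
STATE_FILE = Path("output/treesummary_state.pkl")


def load_state() -> set:
    """Return the set of already-summarised file paths, or an empty set."""
    if STATE_FILE.exists():
        with STATE_FILE.open("rb") as f:
            return pickle.load(f)
    return set()


def save_state(done: set) -> None:
    """Persist the set of processed files so a later run can skip them."""
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    with STATE_FILE.open("wb") as f:
        pickle.dump(done, f)
```

On each run, files already present in the loaded set would simply be skipped, which is what makes re-running the script against the same tree cheap.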
If you have [Ingest](https://github.com/sammcj/ingest) installed, TreeSummary will give you an estimate of the number of tokens that will be used by the LLM.
## Config
- `aws_region`: The AWS region to use for the LLM
- `model_id`: The model ID to use for the LLM
- `file_extensions`: A list of file extensions to process
- `max_tokens`: The maximum number of tokens to generate
- `system_prompt`: The system prompt for the LLM
- `file_prompt`: The prompt to use for each file
- `summary_prompt`: The prompt to use for the summary
- `limit`: The number of files to process (0 for all)
- `parallel`: The number of files to process in parallel
- `supersummary_interval`: The number of files to process before generating a supersummary
- `generate_final_summary`: Whether to generate a final summary
- `final_summary_prompt`: The prompt to use for the final summary
- `generate_file_modernisation_recommendations`: Whether to generate file level modernisation recommendations
- `file_modernisation_prompt`: The prompt to use for the file level modernisation recommendations
- `generate_modernisation_summary`: Whether to generate a modernisation recommendations summary
- `modernisation_summary_prompt`: The prompt to use for the modernisation summary
- `temperature`: The sampling temperature for the LLM
- `top_p`: The top_p sampling value for the LLM
- `ignore_paths`: A list of paths to ignore

## Example Output
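Pulling the options above together, a `config.json` might look like the following. The keys are the documented options; every value is an illustrative assumption, not a documented default:

```json
{
  "aws_region": "us-east-1",
  "model_id": "anthropic.claude-3-5-sonnet-20240620-v1:0",
  "file_extensions": [".py", ".js", ".go"],
  "max_tokens": 4096,
  "system_prompt": "You are a senior engineer summarising code.",
  "file_prompt": "Summarise this file:",
  "summary_prompt": "Summarise these file summaries:",
  "limit": 0,
  "parallel": 4,
  "supersummary_interval": 10,
  "generate_final_summary": true,
  "final_summary_prompt": "Write a final summary of the whole project.",
  "generate_file_modernisation_recommendations": false,
  "file_modernisation_prompt": "Suggest modernisation improvements for this file:",
  "generate_modernisation_summary": false,
  "modernisation_summary_prompt": "Summarise the modernisation recommendations:",
  "temperature": 0.2,
  "top_p": 0.9,
  "ignore_paths": ["node_modules", ".git", "vendor"]
}
```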
For example output see [example_output](example_output).
```shell
python treesummary.py .
Total files found to process: 1
Starting fresh processing of 1 files.
Processing batch of 1 files.
Attempting to run ingest on 1 files.
Running ingest command: ingest ./treesummary.py
Ingest command output:
⠋ Traversing directory and building tree.. [0s]
[ℹ️] Top 10 largest files (by estimated token count):
- 1. /Users/samm/git/sammcj/treesummary/treesummary.py (3,737 tokens)
[✅] Copied to clipboard successfully.
[ℹ️] Tokens (Approximate): 3,782
Extracted token count: 3782
Estimated total tokens for this batch: 3782
Processing batch of 1 files.
Processing files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:06<00:00, 6.59s/it]
Results have been saved to /Users/samm/git/sammcj/treesummary/output/summary_output_20241017-1854.md
Total files processed: 1
Generating final summary...
```

## Requirements
- Python 3.12.x
- AWS Bedrock Access (via `aws sso login`)

Author: Sam McLeod