https://github.com/oskarrough/llmlake
- Host: GitHub
- URL: https://github.com/oskarrough/llmlake
- Owner: oskarrough
- Created: 2026-05-11T12:48:36.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-28T15:12:27.000Z (about 1 month ago)
- Last Synced: 2026-05-28T16:22:30.633Z (about 1 month ago)
- Language: TypeScript
- Size: 165 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# llmlake
A local tool that helps you learn from your (possibly many) LLM sessions across agents (Claude Code, Pi, Codex, Hermes): it transforms the raw session files into denormalized .parquet files you can query with DuckDB and turn into (HTML) insights using installable agent skills.
```
~/.claude  ~/.codex ~/.pi   ~/.hermes
    └──────────┴──┬───┴─────────┘
                  ▼
     collect → data/sessions  (↔ optional rsync)
                  │
                  ▼
       build → data/parquet
                  │
       ┌──────────┴──────────┐
     query               ai skills
     (SQL)            (HTML insights)
```
---
Install [DuckDB](https://duckdb.org/install/), then clone the repository: `git clone https://github.com/oskarrough/llmlake`.
Install the skills into any supported coding agent:
```sh
bunx skills add oskarrough/llmlake
```
Once inside the cloned repo, you can _collect_ sessions, _build_ them into .parquet files, and _query_ the result with SQL.
`./llmlake collect`
Copies all raw session files from your local computer into `./data/sessions`. The `data` folder is gitignored.
`./llmlake build`
Transforms the collected sessions into Parquet files inside `data/parquet`.
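To make "denormalized" concrete, here is a minimal TypeScript sketch of the idea behind `build`: each line of a raw JSONL session file becomes one flat row that repeats the session-level fields. The `RawLine` and `EventRow` shapes and the `flattenSession` helper are hypothetical and only illustrate the technique. `session_id` and `agent` are the only column names this README confirms, so check the repo for the real schema.

```ts
// Hypothetical shapes: illustrative only, not llmlake's actual schema.
type RawLine = { type?: unknown; timestamp?: unknown; [key: string]: unknown };

interface EventRow {
  agent: string; // e.g. "claude", "codex", "pi", "hermes"
  session_id: string; // repeated on every row (this is the denormalization)
  line_no: number; // 1-based position of the event in the session file
  type: string | null;
  timestamp: string | null;
}

// Parse one JSONL document (one JSON object per line) into flat rows.
// Blank lines and lines that are not valid JSON objects are skipped.
function flattenSession(agent: string, sessionId: string, jsonl: string): EventRow[] {
  const rows: EventRow[] = [];
  jsonl.split("\n").forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed === "") return;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      return; // tolerate a truncated or corrupt line
    }
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return;
    const raw = parsed as RawLine;
    rows.push({
      agent,
      session_id: sessionId,
      line_no: index + 1,
      type: typeof raw.type === "string" ? raw.type : null,
      timestamp: typeof raw.timestamp === "string" ? raw.timestamp : null,
    });
  });
  return rows;
}

// Usage: two valid events and one corrupt line that gets skipped.
const sample = '{"type":"user","timestamp":"2026-05-11T12:00:00Z"}\nnot json\n{"type":"assistant"}\n';
console.log(flattenSession("pi", "abc123", sample));
```

Repeating `agent` and `session_id` on every row costs little in a columnar format like Parquet and lets queries filter and group without joins.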
```sh
./llmlake query -c "SELECT session_id, count(*) FROM events WHERE agent='pi' GROUP BY 1 ORDER BY 2 DESC LIMIT 10;"
```
Queries the Parquet files with DuckDB (or ask your agent to do it).
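If you script queries instead of typing them, it may help to build the SQL and the argument list in code. The sketch below reproduces the query shown above for any agent name. `sqlString`, `topSessionsSql`, and `queryArgs` are hypothetical helpers, not part of llmlake. They rely only on the `events` view, the `session_id` and `agent` columns, and the `-c` flag that appear in this README.

```ts
// Escape a value for use inside a single-quoted SQL string literal.
function sqlString(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Build the "busiest sessions" query from this README for a given agent.
function topSessionsSql(agent: string, limit: number = 10): string {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError("limit must be a positive integer");
  }
  return (
    "SELECT session_id, count(*) AS n FROM events " +
    `WHERE agent = ${sqlString(agent)} ` +
    `GROUP BY 1 ORDER BY 2 DESC LIMIT ${limit};`
  );
}

// Argument vector for `./llmlake query -c <sql>`. Passing an array to a
// process spawner (instead of a shell string) avoids shell-quoting issues.
function queryArgs(sql: string): string[] {
  return ["query", "-c", sql];
}

console.log(queryArgs(topSessionsSql("pi")));
```

The resulting array can be handed to a process spawner such as `execFileSync("./llmlake", args)` from `node:child_process`, run from inside the cloned repo.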
`./llmlake sync `
Bonus feature: two-way rsync between `data/sessions/` and a shared folder, so multiple devices share one library. For example, I use it to store my data in Dropbox: `./llmlake sync ~/Dropbox/my-ai-sessions`.
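A common way to get a two-way sync out of rsync is to run it once in each direction and let the newer copy of a file win. The TypeScript sketch below only builds the two argument lists and does not run anything. It is an assumption about the general technique, not a description of the flags llmlake actually passes, and `twoWaySyncPlan` and the example path are hypothetical.

```ts
interface RsyncCall {
  command: "rsync";
  args: string[];
}

// rsync treats "dir/" as "the contents of dir", so normalise to exactly
// one trailing slash on both sides.
function withTrailingSlash(dir: string): string {
  return dir.replace(/\/+$/, "") + "/";
}

// Two one-way passes approximate a two-way sync:
//   -a        archive mode (recursive, preserves times and permissions)
//   --update  skip files that are newer on the receiving side
// Neither pass deletes anything, so files only ever get added or refreshed.
function twoWaySyncPlan(localDir: string, sharedDir: string): RsyncCall[] {
  const local = withTrailingSlash(localDir);
  const shared = withTrailingSlash(sharedDir);
  return [
    { command: "rsync", args: ["-a", "--update", local, shared] }, // push
    { command: "rsync", args: ["-a", "--update", shared, local] }, // pull
  ];
}

// Note: "~" is expanded by the shell, not by rsync, so pass an absolute
// path when spawning rsync without a shell.
console.log(twoWaySyncPlan("data/sessions", "/home/me/Dropbox/my-ai-sessions"));
```

Because nothing is deleted, a session removed on one device reappears after the next sync. For an append-mostly library of session files that is usually the safer trade-off.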
## Skills
After `collect` and `build`, open any coding agent with skills support in this repo and run one of these:
- `llmlake:explore-lake` — ask questions about sessions, costs, tools, models, projects, or activity patterns.
- `llmlake:generate-insights` — create an HTML report for a period, such as last week, last month, or all time.
- `llmlake:inspect-session` — create a focused HTML deep-dive for one session id.
## File overview for contributors
- `collect-{claude,codex,hermes,pi}` — copy sessions
- `build` — parse every collected JSONL file into parquet
- `build-one` — parse a single raw JSONL file into parquet
- `query` — DuckDB shell over `data/parquet/` with an `events` view