https://github.com/oskarrough/llmlake

Host: GitHub
URL: https://github.com/oskarrough/llmlake
Owner: oskarrough
Created: 2026-05-11T12:48:36.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2026-05-28T15:12:27.000Z (about 1 month ago)
Last Synced: 2026-05-28T16:22:30.633Z (about 1 month ago)
Language: TypeScript
Size: 165 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# llmlake

A local tool that helps you learn from your (possibly many) LLM sessions across agents (Claude Code, Pi, Codex, Hermes). It transforms the raw session files into denormalized `.parquet` files that you can query with DuckDB and turn into HTML insights using installable agent skills.

```
~/.claude  ~/.codex ~/.pi   ~/.hermes
    └──────────┴──┬───┴─────────┘
                  ▼
          collect → data/sessions (↔ optional rsync)
                  │
                  ▼
            build → data/parquet
                  │
       ┌──────────┴──────────┐
     query               ai skills
     (SQL)            (HTML insights)
```

---

Install [DuckDB](https://duckdb.org/install/), then clone the repo with `git clone https://github.com/oskarrough/llmlake`.

Install the skills into any supported coding agent:

```sh
bunx skills add oskarrough/llmlake
```

Once inside the cloned repo, you can _collect_ sessions, _build_ them into `.parquet` files, and _query_ the result with SQL.

`./llmlake collect`

Copies all raw session files from your local computer into `./data/sessions`. The `data` folder is gitignored.
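To make the collect step concrete, here is a minimal sketch, not the repo's actual code. It assumes the step copies (rather than moves) JSONL session files, as the contributor overview's "copy sessions" suggests, and that each agent gets its own subfolder under `data/sessions`. The function names `listJsonlFiles` and `collectAgent` and the per-agent layout are hypothetical.

```typescript
import { copyFileSync, existsSync, mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, relative } from "node:path";

/** Recursively list every `.jsonl` file under `dir` (empty if `dir` is missing). */
function listJsonlFiles(dir: string): string[] {
  if (!existsSync(dir)) return [];
  const found: string[] = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) found.push(...listJsonlFiles(full));
    else if (entry.isFile() && entry.name.endsWith(".jsonl")) found.push(full);
  }
  return found.sort();
}

/**
 * Copy one agent's JSONL files into `<destRoot>/<agent>/`, preserving relative paths.
 * Source files are left untouched. Returns the number of files copied.
 * Assumption: the `<destRoot>/<agent>/` layout is illustrative only.
 */
function collectAgent(agent: string, sourceDir: string, destRoot: string): number {
  const files = listJsonlFiles(sourceDir);
  for (const file of files) {
    const target = join(destRoot, agent, relative(sourceDir, file));
    mkdirSync(dirname(target), { recursive: true });
    copyFileSync(file, target);
  }
  return files.length;
}

// Demo on throwaway directories instead of a real ~/.pi folder.
const sandbox = mkdtempSync(join(tmpdir(), "llmlake-sketch-"));
const fakeAgentDir = join(sandbox, "pi");
mkdirSync(join(fakeAgentDir, "project-a"), { recursive: true });
writeFileSync(join(fakeAgentDir, "project-a", "s1.jsonl"), '{"type":"user"}\n');
writeFileSync(join(fakeAgentDir, "settings.json"), "{}"); // ignored: not a session file

const copied = collectAgent("pi", fakeAgentDir, join(sandbox, "data", "sessions"));
console.log(copied); // 1
```

Copying keeps the agents' own history intact, so the step can be re-run at any time; re-running simply overwrites the copies.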

`./llmlake build`

Transforms the collected session files into Parquet files inside `data/parquet`.
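As a rough illustration of what "denormalized" can mean here, the sketch below flattens one JSONL session into rows that each repeat the `agent` and `session_id` values. Only those two column names come from this README's query example. The `EventRow` type, the `line_no`, `type`, and `raw` fields, the `sessionToRows` function, and the choice to skip malformed lines are all assumptions. Writing the rows to Parquet is left out.

```typescript
// Hypothetical flat row. Only `agent` and `session_id` appear in the README.
type EventRow = {
  agent: string;
  session_id: string;
  line_no: number;
  type: string | null;
  raw: string;
};

/** Turn the text of one JSONL session file into flat rows, one per valid JSON line. */
function sessionToRows(agent: string, sessionId: string, jsonl: string): EventRow[] {
  const rows: EventRow[] = [];
  jsonl.split("\n").forEach((line, index) => {
    const raw = line.trim();
    if (raw === "") return; // blank line
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      return; // assumption: skip malformed lines instead of failing the session
    }
    const maybeType =
      typeof parsed === "object" && parsed !== null
        ? (parsed as Record<string, unknown>).type
        : undefined;
    rows.push({
      agent,
      session_id: sessionId,
      line_no: index + 1, // 1-based line number in the source file
      type: typeof maybeType === "string" ? maybeType : null,
      raw,
    });
  });
  return rows;
}

const sampleJsonl = [
  '{"type":"user","text":"hi"}',
  "",
  "not json",
  '{"type":"assistant","text":"hello"}',
].join("\n");

const sampleRows = sessionToRows("pi", "abc123", sampleJsonl);
console.log(sampleRows.length); // 2
console.log(sampleRows.map((row) => row.line_no)); // [ 1, 4 ]
```

Repeating the session-level values on every row is what lets a single `events` table answer per-session and per-agent questions without joins.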

```sh
./llmlake query -c "SELECT session_id, count(*) FROM events WHERE agent='pi' GROUP BY 1 ORDER BY 2 DESC LIMIT 10;"
```

Runs a SQL query against the Parquet files with DuckDB (or ask your agent to do it).
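If you want to run the same query from a script, a small wrapper could look like the sketch below. It uses only what is shown above: the `query -c` invocation and the `events`, `session_id`, and `agent` names. The helper names `sqlString`, `busiestSessionsSql`, and `runQuery` are hypothetical, and nothing is assumed about the format of the CLI's output.

```typescript
import { spawnSync } from "node:child_process";

/** Quote a value as a SQL string literal, doubling any single quotes. */
function sqlString(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

/** Build the README's "sessions with the most events" query for one agent. */
function busiestSessionsSql(agent: string, limit = 10): string {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  return `SELECT session_id, count(*) FROM events WHERE agent=${sqlString(agent)} GROUP BY 1 ORDER BY 2 DESC LIMIT ${limit};`;
}

/** Run a query through the CLI (from the repo root) and return its stdout as text. */
function runQuery(sql: string, cli = "./llmlake"): string {
  const result = spawnSync(cli, ["query", "-c", sql], { encoding: "utf8" });
  if (result.error) throw result.error;
  if (result.status !== 0) throw new Error(result.stderr);
  return result.stdout;
}

const piSql = busiestSessionsSql("pi", 5);
console.log(piSql);
// From inside the cloned repo, after `collect` and `build`:
// console.log(runQuery(piSql));
```

Passing the SQL as a single argument to `spawnSync` avoids shell quoting problems, which matter here because the query itself contains quotes.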

`./llmlake sync`

Bonus feature: two-way rsync between `data/sessions/` and a shared folder, so multiple devices share one library. For example, I store my data in Dropbox with `./llmlake sync ~/Dropbox/my-ai-sessions`.
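One common way to get a two-way sync out of rsync is to run it once in each direction. The sketch below shows that pattern; it is an assumption about the general approach, not a description of what `sync` actually runs. The `-a` and `-u` flags are standard rsync options. The function names and the `/path/to/shared-folder` placeholder are hypothetical.

```typescript
import { spawnSync } from "node:child_process";

/** rsync treats `dir/` as "the contents of dir", so normalize both sides. */
function withTrailingSlash(path: string): string {
  return path.endsWith("/") ? path : `${path}/`;
}

/**
 * Build the rsync argument lists for a push pass and a pull pass.
 * `-a` recurses and preserves metadata; `-u` skips files that are newer on the receiver.
 */
function twoWayRsyncArgs(localDir: string, sharedDir: string): [string[], string[]] {
  const local = withTrailingSlash(localDir);
  const shared = withTrailingSlash(sharedDir);
  return [
    ["-au", local, shared], // push: local -> shared
    ["-au", shared, local], // pull: shared -> local
  ];
}

/** Run both passes; throws if rsync is missing or exits with a non-zero status. */
function runTwoWaySync(localDir: string, sharedDir: string): void {
  for (const args of twoWayRsyncArgs(localDir, sharedDir)) {
    const result = spawnSync("rsync", args, { stdio: "inherit" });
    if (result.error) throw result.error;
    if (result.status !== 0) throw new Error(`rsync exited with status ${result.status}`);
  }
}

const [pushArgs, pullArgs] = twoWayRsyncArgs("data/sessions", "/path/to/shared-folder");
console.log(["rsync", ...pushArgs].join(" "));
console.log(["rsync", ...pullArgs].join(" "));
// To actually sync: runTwoWaySync("data/sessions", "/path/to/shared-folder");
```

Two caveats with this pattern: it never deletes anything, so a file removed on one device comes back on the next sync, and `~` is not expanded when arguments bypass the shell, so pass an absolute path.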

## Skills

After `collect` and `build`, open any coding agent with skills support in this repo and run one of these:

- `llmlake:explore-lake` — ask questions about sessions, costs, tools, models, projects, or activity patterns.
- `llmlake:generate-insights` — create an HTML report for a period, such as last week, last month, or all time.
- `llmlake:inspect-session` — create a focused HTML deep-dive for one session id.

## File overview for contributors

- `collect-{claude,codex,hermes,pi}` — copy sessions
- `build` — parse every collected JSONL file into parquet
- `build-one` — parse a single raw JSONL file into parquet
- `query` — DuckDB shell over `data/parquet/` with an `events` view
