https://github.com/informigados/analyze-codex-tokens
Analyze Codex token usage, sessions, prompts, and subagent overhead to optimize AI workflows.
- Host: GitHub
- URL: https://github.com/informigados/analyze-codex-tokens
- Owner: informigados
- License: mit
- Created: 2026-04-10T04:09:31.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-10T17:37:14.000Z (about 2 months ago)
- Last Synced: 2026-05-03T23:39:20.318Z (about 1 month ago)
- Topics: ai, ai-tools, analytics, app, cli, codex, cost-optimization, developer-tools, jsonl, openai, prompt-engineering, python, subagents, token-analysis, tokens, vscode, vscode-extension
- Language: Python
- Homepage:
- Size: 146 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Security: SECURITY.md
# Analyze Codex Tokens
Understand exactly how your Codex sessions consume tokens.
> **Codex** is an OpenAI product. This project is an independent local analyzer for Codex session logs.
This tool scans your local Codex logs and generates a **clear, structured analysis** of usage, costs, prompts, and agent behavior.
It also supports localized console/report output with English as the default base language.
## What It Does
Analyzes `.jsonl` session logs from:
* `~/.codex/sessions`
* `~/.codex/archived_sessions`
Then generates:
### Token Report
A complete breakdown of:
* Total tokens (input, output, cached, reasoning)
* Usage by project
* Most expensive sessions
* Subagent usage & overhead
* Subagents without parent in selected range
* Input vs output ratios
* Instruction-heavy sessions
* Optimization insights
* Structured JSON output for automation
### Prompt Extraction
Creates a `/prompts` folder with:
* All user prompts
* Organized by project
* Sorted by time
## Requirements
* Python 3.10+
* No external dependencies
## How to Run
### Run the script
```bash
python analyze-codex-tokens.py
```
or on Windows:
```powershell
py analyze-codex-tokens.py
```
## Output
Default location:
```
./reports/<lang>-YYYY-MM-DD_HHMMSS/
```
Files generated:
* `token_report.md`
* `token_report.json`
* `/prompts/*.md`
## CLI Options
You can run with direct CLI flags (recommended for CI/scripts):
```bash
python analyze-codex-tokens.py \
--since-days 7 \
--lang pt-br \
--output-dir ./reports \
--redact-prompts \
--json
```
Windows PowerShell:
```powershell
py analyze-codex-tokens.py --since-days 7 --lang pt-br --output-dir .\reports --redact-prompts --json
```
Available flags:
* `--since-days N`
* `--since-date YYYY-MM-DD`
* `--codex-home PATH`
* `--output-dir PATH`
* `--lang en|pt-br|pt-pt|es`
* `--redact-prompts` / `--no-redact-prompts`
* `--json` / `--no-json`
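For CI or scripted runs, the flags above can also be assembled from Python. The following is a minimal sketch, not part of the project: `build_command` and `run_report` are hypothetical helper names, and the script is assumed to sit in the current working directory. Only the flags listed above are used.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(
    since_days: Optional[int] = None,
    lang: Optional[str] = None,
    output_dir: Optional[str] = None,
    redact_prompts: Optional[bool] = None,
    write_json: Optional[bool] = None,
    script: str = "analyze-codex-tokens.py",  # assumed location of the script
) -> List[str]:
    """Build an argv list using only the documented CLI flags."""
    cmd = [sys.executable, script]
    if since_days is not None:
        cmd += ["--since-days", str(since_days)]
    if lang is not None:
        cmd += ["--lang", lang]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if redact_prompts is not None:
        cmd.append("--redact-prompts" if redact_prompts else "--no-redact-prompts")
    if write_json is not None:
        cmd.append("--json" if write_json else "--no-json")
    return cmd


def run_report(**options) -> None:
    """Run the analyzer and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(**options), check=True)
```

Calling `run_report(since_days=7, lang="pt-br", output_dir="./reports", redact_prompts=True, write_json=True)` would be equivalent to the Bash example above. Options left as `None` are omitted, so the script's own defaults and environment fallbacks still apply.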
## Optional Configuration (ENV Fallback)
### Filter by last N days
```bash
export SINCE_DAYS=7
```
```powershell
$env:SINCE_DAYS="7"
```
### Filter by date
```bash
export SINCE_DATE="2026-03-30"
```
```powershell
$env:SINCE_DATE="2026-03-30"
```
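To illustrate how the two filters above could translate into a cutoff timestamp, here is a rough sketch. It is not the project's code: `resolve_cutoff` is a hypothetical name, and both the choice to let `SINCE_DATE` win when both variables are set and the use of UTC are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional


def resolve_cutoff(env: Mapping[str, str], now: datetime) -> Optional[datetime]:
    """Return the earliest timestamp to include, or None for no filtering."""
    since_date = env.get("SINCE_DATE", "").strip()
    if since_date:
        # Assumption: an explicit date wins over a relative day count.
        parsed = datetime.strptime(since_date, "%Y-%m-%d")
        return parsed.replace(tzinfo=timezone.utc)  # assumption: dates are UTC
    since_days = env.get("SINCE_DAYS", "").strip()
    if since_days:
        return now - timedelta(days=int(since_days))
    return None
```

In a real run, `env` would be `os.environ` and `now` would be `datetime.now(timezone.utc)`. Passing both in explicitly keeps the function easy to test.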
### Custom Codex directory
```bash
export CODEX_HOME="/path/to/.codex"
```
```powershell
$env:CODEX_HOME="C:\path\to\.codex"
```
### Custom output directory
```bash
export OUTPUT_DIR="/path/to/output"
```
```powershell
$env:OUTPUT_DIR="C:\path\to\output"
```
If `OUTPUT_DIR` is not set and `--output-dir` is not provided, the script creates a language-prefixed timestamped folder under:
```
./reports/<lang>-<timestamp>/
```
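A language-prefixed timestamped folder name of this kind could be built as follows. This is a sketch under assumptions: `default_report_dir` is a hypothetical name, and the hyphen separator and the `YYYY-MM-DD_HHMMSS` timestamp layout are inferred from the default location shown under Output rather than taken from the script's source.

```python
from datetime import datetime
from pathlib import Path


def default_report_dir(lang: str, now: datetime, base: Path = Path("reports")) -> Path:
    """Return base/<lang>-<timestamp>, with the timestamp as YYYY-MM-DD_HHMMSS."""
    stamp = now.strftime("%Y-%m-%d_%H%M%S")  # assumed timestamp layout
    return base / f"{lang}-{stamp}"
```

Because the timestamp has one-second resolution, two runs started within the same second would map to the same folder.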
### Redact prompts in outputs
```bash
export REDACT_PROMPTS=true
```
```powershell
$env:REDACT_PROMPTS="true"
```
### Toggle JSON output
```bash
export WRITE_JSON=true
```
```powershell
$env:WRITE_JSON="true"
```
### Language for console/report output
```bash
export REPORT_LANG="pt-br"
```
```powershell
$env:REPORT_LANG="pt-br"
```
Supported values:
* `en` (default)
* `pt-br`
* `pt-pt`
* `es`
Examples:
Generate report in Brazilian Portuguese with automatic folder naming (PowerShell):
```powershell
py analyze-codex-tokens.py --lang pt-br
```
Generate report in European Portuguese with automatic folder naming (Bash):
```bash
python analyze-codex-tokens.py --lang pt-pt
```
Use environment variable in PowerShell, then run normally:
```powershell
$env:REPORT_LANG="es"
py analyze-codex-tokens.py
```
Use environment variable in Windows CMD:
```cmd
set REPORT_LANG=en
py analyze-codex-tokens.py
```
`--lang` takes precedence over `REPORT_LANG` when both are set.
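This precedence rule can be sketched with `argparse` and an environment lookup. The code below is illustrative only: `resolve_lang` is a hypothetical name, and falling back to `en` when `REPORT_LANG` holds an unsupported value is an assumption, not documented behavior.

```python
import argparse
from typing import Mapping, Sequence

SUPPORTED_LANGS = ("en", "pt-br", "pt-pt", "es")


def resolve_lang(argv: Sequence[str], env: Mapping[str, str]) -> str:
    """Pick the report language: --lang first, then REPORT_LANG, then 'en'."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--lang", choices=SUPPORTED_LANGS, default=None)
    # parse_known_args ignores the other flags this sketch does not model.
    args, _unknown = parser.parse_known_args(list(argv))
    if args.lang is not None:
        return args.lang
    env_lang = env.get("REPORT_LANG", "").strip().lower()
    # Assumption: an unsupported or empty value falls back to the default.
    return env_lang if env_lang in SUPPORTED_LANGS else "en"
```

For example, `resolve_lang(["--lang", "pt-br"], {"REPORT_LANG": "es"})` selects `pt-br`, while the same call without the flag selects `es`.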
If you use `--output-dir .\reports` (or `./reports/`), the tool also auto-creates a `<lang>-<timestamp>` folder inside `reports`.
## Key Features
* Recursive `.jsonl` discovery (works with VS Code extension)
* Subagent tracking and cost analysis
* Markdown-safe report formatting (better organization/readability)
* JSON report export (`token_report.json`)
* Optional prompt redaction mode
* Deep token breakdown
* Identify inefficiencies fast
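As an illustration of recursive `.jsonl` discovery, the sketch below walks the two session folders named under What It Does. It is a minimal example, not the project's implementation, and `find_session_logs` is a hypothetical name.

```python
from pathlib import Path
from typing import List


def find_session_logs(codex_home: Path) -> List[Path]:
    """Return every .jsonl file under sessions/ and archived_sessions/."""
    logs: List[Path] = []
    for folder in ("sessions", "archived_sessions"):
        root = codex_home / folder
        if root.is_dir():  # a missing folder is simply skipped
            logs.extend(p for p in root.rglob("*.jsonl") if p.is_file())
    return sorted(logs)
```

Calling `find_session_logs(Path.home() / ".codex")` would list the logs in the default location. Searching with `rglob` picks up files in nested subfolders, whatever their depth.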
## Notes
* Requires local Codex logs
* Only sessions with `total_tokens > 0` are included in the analysis
* If no data appears, check your `.codex` folder
* VS Code extension may store additional data in `.sqlite` (not parsed yet)
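
When no data appears, a quick check of whether a log file contains parseable records can help narrow the problem down. The helper below is a generic JSONL sanity check, not part of the tool: `count_records` is a hypothetical name, and nothing is assumed about the fields inside each record.

```python
import json
from pathlib import Path
from typing import Tuple


def count_records(log_path: Path) -> Tuple[int, int]:
    """Return (parsed, skipped) counts for the non-blank lines of a JSONL file."""
    parsed = skipped = 0
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # blank lines are not records
            try:
                json.loads(line)
                parsed += 1
            except json.JSONDecodeError:
                skipped += 1  # malformed line, counted but not fatal
    return parsed, skipped
```

A result of `(0, 0)` points to an empty file, while a high skipped count suggests the file is truncated or not JSONL at all.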
## Tests
Run unit tests:
```bash
python -m unittest discover -s tests -v
```
PowerShell:
```powershell
py -m unittest discover -s tests -v
```
## Changelog
### 2026-04-10 (1.0.0)
- Initial release.
## Authors
- INformigados: https://github.com/informigados/
- Alex Brito: https://github.com/alexbritodev
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.