# 🧠 Analyze Codex Tokens

[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![Tests](https://img.shields.io/badge/tests-passing-brightgreen.svg)](#-tests)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/informigados/analyze-codex-tokens)

Understand exactly how your Codex sessions consume tokens.

> ℹ️ **Codex** is an OpenAI product. This project is an independent local analyzer for Codex session logs.

This tool scans your local Codex logs and generates a **clear, structured analysis** of usage, costs, prompts, and agent behavior.
It also supports localized console/report output with English as the default base language.

## 🚀 What It Does

Analyzes `.jsonl` session logs from:

* `~/.codex/sessions`
* `~/.codex/archived_sessions`

Then generates:

### 📊 Token Report

A complete breakdown of:

* 🔢 Total tokens (input, output, cached, reasoning)
* 📁 Usage by project
* 💸 Most expensive sessions
* 🤖 Subagent usage & overhead
* 🔗 Subagents without parent in selected range
* ⚖️ Input vs output ratios
* 🧠 Instruction-heavy sessions
* 📉 Optimization insights
* 🧾 Structured JSON output for automation
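
As an illustration of the arithmetic behind a breakdown like the one above, the sketch below sums per-session counters by project and derives an input-to-output ratio. The `SessionTotals` type and its field names are hypothetical and chosen for this example; they are not taken from the tool's source or from the Codex log schema.

```python
from dataclasses import dataclass


@dataclass
class SessionTotals:
    """Hypothetical per-session counters (field names are assumptions)."""
    project: str
    input_tokens: int
    output_tokens: int
    cached_tokens: int = 0
    reasoning_tokens: int = 0


def usage_by_project(sessions: list[SessionTotals]) -> dict[str, int]:
    """Sum input + output tokens for each project."""
    totals: dict[str, int] = {}
    for s in sessions:
        totals[s.project] = totals.get(s.project, 0) + s.input_tokens + s.output_tokens
    return totals


def input_output_ratio(session: SessionTotals) -> float:
    """Input tokens per output token; infinity when nothing was generated."""
    if session.output_tokens == 0:
        return float("inf")
    return session.input_tokens / session.output_tokens


sessions = [
    SessionTotals("api", input_tokens=9_000, output_tokens=1_000),
    SessionTotals("api", input_tokens=500, output_tokens=500),
    SessionTotals("docs", input_tokens=2_000, output_tokens=400),
]
print(usage_by_project(sessions))
print(input_output_ratio(sessions[0]))
```

A high ratio for a session suggests that most of its tokens went into context rather than generated output.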

### 🧾 Prompt Extraction

Creates a `/prompts` folder with:

* All user prompts
* Organized by project
* Sorted by time

## ⚙️ Requirements

* Python 3.10+
* No external dependencies

## ▶️ How to Run

### Run the script

```bash
python analyze-codex-tokens.py
```

or on Windows:

```powershell
py analyze-codex-tokens.py
```

## 📂 Output

Default location:

```
./reports/<lang>-YYYY-MM-DD_HHMMSS/
```

Files generated:

* `token_report.md`
* `token_report.json`
* `/prompts/*.md`

## 🧩 CLI Options

You can run with direct CLI flags (recommended for CI/scripts):

```bash
python analyze-codex-tokens.py \
--since-days 7 \
--lang pt-br \
--output-dir ./reports \
--redact-prompts \
--json
```

Windows PowerShell:

```powershell
py analyze-codex-tokens.py --since-days 7 --lang pt-br --output-dir .\reports --redact-prompts --json
```

Available flags:

* `--since-days N`
* `--since-date YYYY-MM-DD`
* `--codex-home PATH`
* `--output-dir PATH`
* `--lang en|pt-br|pt-pt|es`
* `--redact-prompts` / `--no-redact-prompts`
* `--json` / `--no-json`
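
For readers wiring these flags into their own scripts, the sketch below shows one way the flag set could be declared with the standard-library `argparse` module. It is not the tool's actual parser; `build_parser` is a hypothetical name, and using `None` as the "not given" default is an assumption made so that environment fallbacks can be applied afterwards.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags (illustrative, not the tool's own code)."""
    parser = argparse.ArgumentParser(prog="analyze-codex-tokens.py")
    parser.add_argument("--since-days", type=int, metavar="N")
    parser.add_argument("--since-date", metavar="YYYY-MM-DD")
    parser.add_argument("--codex-home", metavar="PATH")
    parser.add_argument("--output-dir", metavar="PATH")
    parser.add_argument("--lang", choices=["en", "pt-br", "pt-pt", "es"])
    # BooleanOptionalAction generates the matching --no-* variant automatically.
    # default=None distinguishes "flag not given" from an explicit choice.
    parser.add_argument("--redact-prompts", action=argparse.BooleanOptionalAction, default=None)
    parser.add_argument("--json", action=argparse.BooleanOptionalAction, default=None)
    return parser


args = build_parser().parse_args(["--since-days", "7", "--lang", "pt-br", "--no-json"])
print(args.since_days, args.lang, args.json, args.redact_prompts)
```

`argparse.BooleanOptionalAction` is available from Python 3.9, which fits the 3.10+ requirement stated above.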

## 🔧 Optional Configuration (ENV Fallback)

### Filter by last N days

```bash
export SINCE_DAYS=7
```

```powershell
$env:SINCE_DAYS="7"
```

### Filter by date

```bash
export SINCE_DATE="2026-03-30"
```

```powershell
$env:SINCE_DATE="2026-03-30"
```

### Custom Codex directory

```bash
export CODEX_HOME="/path/to/.codex"
```

```powershell
$env:CODEX_HOME="C:\path\to\.codex"
```

### Custom output directory

```bash
export OUTPUT_DIR="/path/to/output"
```

```powershell
$env:OUTPUT_DIR="C:\path\to\output"
```

If `OUTPUT_DIR` is not set and `--output-dir` is not provided, the script creates a language-prefixed timestamped folder under:

```
./reports/<lang>-<timestamp>/
```
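
A language-prefixed, timestamped folder name of this kind can be built as in the sketch below. `default_report_dir` is a hypothetical helper, the `YYYY-MM-DD_HHMMSS` pattern follows the Output section, and the use of local time is an assumption.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_report_dir(lang: str, base: Path = Path("reports"),
                       now: Optional[datetime] = None) -> Path:
    """Build <base>/<lang>-YYYY-MM-DD_HHMMSS without creating it on disk."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    return base / f"{lang}-{stamp}"


report_dir = default_report_dir("pt-br", now=datetime(2026, 4, 10, 14, 30, 5))
print(report_dir)
```

Calling `report_dir.mkdir(parents=True, exist_ok=True)` afterwards would create the folder.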

### Redact prompts in outputs

```bash
export REDACT_PROMPTS=true
```

```powershell
$env:REDACT_PROMPTS="true"
```

### Toggle JSON output

```bash
export WRITE_JSON=true
```

```powershell
$env:WRITE_JSON="true"
```
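
Boolean settings such as `REDACT_PROMPTS` and `WRITE_JSON` arrive as strings, so they have to be interpreted before use. The sketch below shows one common approach; `env_flag` is a hypothetical helper, and the set of accepted truthy spellings is an assumption (the examples above only show `true`).

```python
import os
from typing import Mapping

# Assumption: these spellings count as "on"; anything else counts as "off".
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool, env: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean environment variable, falling back to a default when unset or blank."""
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in TRUTHY


print(env_flag("REDACT_PROMPTS", default=False, env={"REDACT_PROMPTS": "true"}))
print(env_flag("WRITE_JSON", default=True, env={}))
```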

### Language for console/report output

```bash
export REPORT_LANG="pt-br"
```

```powershell
$env:REPORT_LANG="pt-br"
```

Supported values:

* `en` (default)
* `pt-br`
* `pt-pt`
* `es`

Examples:

Generate report in Brazilian Portuguese with automatic folder naming (PowerShell):

```powershell
py analyze-codex-tokens.py --lang pt-br
```

Generate report in European Portuguese with automatic folder naming (Bash):

```bash
python analyze-codex-tokens.py --lang pt-pt
```

Use environment variable in PowerShell, then run normally:

```powershell
$env:REPORT_LANG="es"
py analyze-codex-tokens.py
```

Use environment variable in Windows CMD:

```cmd
set REPORT_LANG=en
py analyze-codex-tokens.py
```

`--lang` takes precedence over `REPORT_LANG` when both are set.
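
The precedence rule can be expressed as a small resolver: CLI value first, then the environment variable, then the `en` default. `resolve_lang` is a hypothetical name, and falling back to English for unsupported values is an assumption of this sketch.

```python
import os
from typing import Mapping, Optional

SUPPORTED_LANGS = ("en", "pt-br", "pt-pt", "es")


def resolve_lang(cli_value: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Pick the report language: --lang, then REPORT_LANG, then 'en'."""
    candidate = cli_value or env.get("REPORT_LANG") or "en"
    candidate = candidate.strip().lower()
    # Assumption: unknown values fall back to English instead of raising.
    return candidate if candidate in SUPPORTED_LANGS else "en"


print(resolve_lang("pt-br", {"REPORT_LANG": "es"}))  # CLI value wins
print(resolve_lang(None, {"REPORT_LANG": "es"}))     # environment fallback
print(resolve_lang(None, {}))                        # default
```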

If you use `--output-dir .\reports` (or `./reports/`), the tool also auto-creates `<lang>-<timestamp>` inside `reports`.

## 🧠 Key Features

* 🔍 Recursive `.jsonl` discovery (works with VS Code extension)
* 🤖 Subagent tracking and cost analysis
* 🧱 Markdown-safe report formatting (better organization/readability)
* 🧾 JSON report export (`token_report.json`)
* 🔒 Optional prompt redaction mode
* 📊 Deep token breakdown
* 📈 Identify inefficiencies fast
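
Recursive discovery of `.jsonl` files can be done with `pathlib` alone. The sketch below walks the two session folders named earlier; `find_session_logs` is a hypothetical helper written for this example, not the tool's own function.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def find_session_logs(codex_home: Path) -> Iterator[Path]:
    """Yield every .jsonl file under sessions/ and archived_sessions/, at any depth."""
    for sub in ("sessions", "archived_sessions"):
        root = codex_home / sub
        if root.is_dir():  # a missing folder is skipped rather than treated as an error
            yield from sorted(root.rglob("*.jsonl"))


# Demonstration against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "sessions" / "2026" / "04").mkdir(parents=True)
    (home / "sessions" / "2026" / "04" / "a.jsonl").touch()
    (home / "sessions" / "notes.txt").touch()
    (home / "archived_sessions").mkdir()
    (home / "archived_sessions" / "b.jsonl").touch()
    found = [p.name for p in find_session_logs(home)]

print(found)
```

In real use, `codex_home` would be `Path.home() / ".codex"` or the value of `CODEX_HOME`.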

## ⚠️ Notes

* Requires local Codex logs
* Only sessions with `total_tokens > 0` are included in the analysis
* If no data appears, check your `.codex` folder
* The VS Code extension may store additional data in `.sqlite` files (not parsed yet)
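
To make the `total_tokens > 0` rule concrete, the sketch below reads JSONL lines and keeps only records with a positive count. The flat record shape with a top-level `total_tokens` key is an assumption for illustration; real Codex logs may nest or name this differently. Skipping malformed lines is also a choice of this example.

```python
import json
from typing import Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line, skipping anything that is not a JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: tolerate truncated or corrupt lines
        if isinstance(record, dict):
            yield record


def with_tokens(summaries: Iterable[dict]) -> list[dict]:
    """Keep only summaries whose total_tokens is greater than zero."""
    return [s for s in summaries if (s.get("total_tokens") or 0) > 0]


raw = [
    '{"id": "s1", "total_tokens": 1200}',
    '{"id": "s2", "total_tokens": 0}',
    "",
    "{not valid json",
    '{"id": "s3"}',
]
kept = with_tokens(read_jsonl(raw))
print([s["id"] for s in kept])
```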

## ✅ Tests

Run unit tests:

```bash
python -m unittest discover -s tests -v
```

PowerShell:

```powershell
py -m unittest discover -s tests -v
```

## 📝 Changelog

### 2026-04-10 (1.0.0)

- Initial release.

## 👥 Authors

- INformigados: https://github.com/informigados/
- Alex Brito: https://github.com/alexbritodev

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.