https://github.com/semcod/code2llm
Python Code Flow Analysis Tool - Static analysis for control flow graphs (CFG), data flow graphs (DFG), and call graph extraction
ast cfg code code2data code2logic code2process data dfg diagram flow graphs llm
Last synced: 8 days ago
- Host: GitHub
- URL: https://github.com/semcod/code2llm
- Owner: semcod
- License: apache-2.0
- Created: 2026-02-28T19:03:31.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-05-24T18:26:34.000Z (15 days ago)
- Last Synced: 2026-05-24T20:18:18.994Z (15 days ago)
- Topics: ast, cfg, code, code2data, code2logic, code2process, data, dfg, diagram, flow, graphs, llm
- Language: Python
- Homepage: https://wronai.github.io/code2flow/
- Size: 75.1 MB
- Stars: 0
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Roadmap: ROADMAP.md
# code2llm - Generated Analysis Files
## AI Cost Tracking
   
  
- **LLM usage:** $7.5000 (240 commits)
- **Human dev:** ~$8680 (86.8h @ $100/h, 30min dedup)
Generated on 2026-05-26 using [openrouter/qwen/qwen3-coder-next](https://openrouter.ai/qwen/qwen3-coder-next)
---
This directory contains the complete analysis of your project generated by `code2llm`. Each file serves a specific purpose for understanding, refactoring, and documenting your codebase.
## Generated Files Overview
When you run `code2llm ./ -f all`, the following files are created:
### 🎯 Core Analysis Files
| File | Format | Purpose | Key Insights |
|------|--------|---------|--------------|
| `evolution.toon.yaml` | **YAML** | **Refactoring queue** - Prioritized improvements | 0 refactoring actions needed |
| `map.toon.yaml` | **YAML** | **🗺️ Structural map + project header** - Modules, imports, exports, signatures, stats, alerts, hotspots, trend | Project architecture overview |
### 🤖 LLM-Ready Documentation
| File | Format | Purpose | Use Case |
|------|--------|---------|----------|
| `context.md` | **Markdown** | **LLM narrative** - Architecture summary | Paste into ChatGPT/Claude for code analysis |
### Visualizations
| File | Format | Purpose | Description |
|------|--------|---------|-------------|
| `calls.mmd` | **Mermaid** | **Call graph** | Function dependencies (edges only) |
## Quick Start Commands
### Basic Analysis
```bash
# Quick health check (TOON format only)
code2llm ./ -f toon
# Generate all formats (what created these files)
code2llm ./ -f all
# LLM-ready context only
code2llm ./ -f context
```
### Performance Options
```bash
# Fast analysis for large projects
code2llm ./ -f toon --strategy quick
# Memory-limited analysis
code2llm ./ -f all --max-memory 500
# Skip PNG generation (faster)
code2llm ./ -f all --no-png
```
### Refactoring Focus
```bash
# Get refactoring recommendations
code2llm ./ -f evolution
# Focus on specific code smells
code2llm ./ -f toon --refactor --smell god_function
# Data flow analysis
code2llm ./ -f flow --data-flow
```
## Understanding Each File
### `analysis.toon` - Health Diagnostics
**Purpose**: Quick overview of code health issues
**Key sections**:
- **HEALTH**: Critical issues (🔴) and warnings (🟡)
- **REFACTOR**: Prioritized refactoring actions
- **COUPLING**: Module dependencies and potential cycles
- **LAYERS**: Package complexity metrics
- **FUNCTIONS**: High-complexity functions (CC ≥ 10)
- **CLASSES**: Complex classes needing attention
**Example usage**:
```bash
# View health issues
cat analysis.toon | head -30
# Check refactoring priorities
grep "REFACTOR" analysis.toon
```
### `evolution.toon.yaml` - Refactoring Queue
**Purpose**: Step-by-step refactoring plan
**Key sections**:
- **NEXT**: Immediate actions to take
- **RISKS**: Potential breaking changes
- **METRICS-TARGET**: Success criteria
**Example usage**:
```bash
# Get refactoring plan
cat evolution.toon.yaml
# Track progress
grep "NEXT" evolution.toon.yaml
```
### `flow.toon` - Legacy Data Flow Analysis
**Purpose**: Understand data movement through the system (legacy / explicit opt-in)
**Key sections**:
- **PIPELINES**: Data processing chains
- **CONTRACTS**: Function input/output contracts
- **SIDE_EFFECTS**: Functions with external impacts
**Example usage**:
```bash
# Find data pipelines
grep "PIPELINES" flow.toon
# Identify side effects
grep "SIDE_EFFECTS" flow.toon
```
### `map.toon.yaml` - Structural Map + Project Header
**Purpose**: High-level architecture overview plus compact project header
**Key sections**:
- **MODULES**: All modules with basic stats
- **IMPORTS**: Dependency relationships
- **EXPORTS**: Public API surface and signatures
- **HEADER**: Stats, alerts, hotspots, evolution trend
**Example usage**:
```bash
# See project structure
cat map.toon.yaml | head -50
# Find public APIs
grep "SIGNATURES" map.toon.yaml
```
### `project.toon.yaml` - Compact Analysis View
**Purpose**: Compact module view generated from project.yaml data
**Status**: Legacy view generated on demand from unified project.yaml
**Example usage**:
```bash
# View compact project structure
cat project.toon.yaml | head -30
# Find largest files
grep -E "^ .*[0-9]{3,}$" project.toon.yaml | sort -t',' -k2 -n -r | head -10
```
### `prompt.txt` - Ready-to-Send LLM Prompt
**Purpose**: Pre-formatted prompt listing all generated files for LLM conversation
**Generation**: Written when `code2llm` is run with a source path and the requested format is `-f all` (including with `--no-chunk`) or `code2logic`
**Contents**:
- **Files section**: Lists all existing generated files with descriptions, including `project.toon.yaml` when generated by `-f all`
- **Source files section**: Highlights important source files such as `cli_exports/orchestrator.py`
- **Missing section**: Shows which files weren't generated (if any)
- **Task section**: Refactoring brief with concrete execution instructions, not just analysis
- **Priority Order section**: State-dependent refactoring priorities, starting with blockers and then architecture cleanup
- **Requirements section**: Guidelines for suggested changes
**Example usage**:
```bash
# View the prompt
cat prompt.txt
# Copy to clipboard and paste into ChatGPT/Claude
cat prompt.txt | pbcopy # macOS
cat prompt.txt | xclip -sel clip # Linux
```
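The Files and Missing sections described above amount to a presence check over the output directory. A minimal sketch of that idea, assuming the file names mentioned in this README; `EXPECTED` and `split_generated` are hypothetical names for illustration and are not part of code2llm, whose actual logic may differ:

```python
from pathlib import Path

# File names taken from this README; the real set depends on the flags used.
EXPECTED = [
    "analysis.toon",
    "evolution.toon.yaml",
    "map.toon.yaml",
    "context.md",
    "calls.mmd",
    "prompt.txt",
]


def split_generated(output_dir, expected=EXPECTED):
    """Return (present, missing) file names for an output directory."""
    root = Path(output_dir)
    # A file counts as present only if it exists as a regular file.
    present = [name for name in expected if (root / name).is_file()]
    missing = [name for name in expected if name not in present]
    return present, missing
```

Calling `split_generated("./docs")` after a run with `-o ./docs/` would give two lists that can be compared against what `prompt.txt` reports.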
### `context.md` - LLM Narrative
**Purpose**: Ready-to-paste context for AI assistants
**Key sections**:
- **Overview**: Project statistics
- **Architecture**: Module breakdown
- **Entry Points**: Public interfaces
- **Patterns**: Design patterns detected
**Example usage**:
```bash
# Copy to clipboard for LLM
cat context.md | pbcopy # macOS
cat context.md | xclip -sel clip # Linux
# Use with Claude/ChatGPT for code analysis
```
### Visualization Files (`*.mmd`, `*.png`)
**Purpose**: Visual understanding of code structure
**Files**:
- `flow.mmd` - Detailed control flow with complexity colors
- `calls.mmd` - Simple call graph
- `compact_flow.mmd` - High-level module view
- `*.png` - Pre-rendered images
**Example usage**:
```bash
# View diagrams
open flow.png # macOS
xdg-open flow.png # Linux
# Edit in Mermaid Live Editor
# Copy content of .mmd files to https://mermaid.live
```
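To make the "edges only" call graph concrete, the sketch below collects caller/callee pairs with Python's standard `ast` module and renders them as Mermaid text. It is an illustration under stated assumptions, not code2llm's implementation: it only sees top-level functions and direct calls to plain names, and `call_edges`, `to_mermaid`, and `SAMPLE` are hypothetical names.

```python
import ast

SAMPLE = '''
def load():
    return [1, 2, 3]

def total(items):
    return sum(items)

def main():
    print(total(load()))
'''


def call_edges(source):
    """Collect sorted (caller, callee) pairs for direct calls to plain names."""
    tree = ast.parse(source)
    edges = set()
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        # Walk the function body and record every call like name(...).
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return sorted(edges)


def to_mermaid(edges):
    """Render edges as a Mermaid graph, one arrow per line."""
    lines = ["graph TD"]
    lines += [f"    {src} --> {dst}" for src, dst in edges]
    return "\n".join(lines)


print(to_mermaid(call_edges(SAMPLE)))
```

The printed text can be pasted into the Mermaid Live Editor in the same way as the generated `.mmd` files.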
## Common Analysis Patterns
### 1. Code Health Assessment
```bash
# Quick health check
code2llm ./ -f toon
cat analysis.toon | grep -E "(HEALTH|REFACTOR)"
```
### 2. Refactoring Planning
```bash
# Get refactoring queue
code2llm ./ -f evolution
cat evolution.toon.yaml
# Focus on specific issues
code2llm ./ -f toon --refactor --smell god_function
```
### 3. LLM Assistance
```bash
# Generate context for AI
code2llm ./ -f context
cat context.md
# Use with Claude: "Based on this context, help me refactor the god modules"
```
### 4. Team Documentation
```bash
# Generate all docs for team
code2llm ./ -f all -o ./docs/
# Create visual diagrams
open docs/flow.png
```
## Interpreting Metrics
### Complexity Metrics (CC)
- **🔴 Critical (≥5.0)**: Immediate refactoring needed
- **🟠 High (3.0-4.9)**: Consider refactoring
- **🟡 Medium (1.5-2.9)**: Monitor complexity
- **🟢 Low (0.1-1.4)**: Acceptable
- **⚪ Basic (0.0)**: Simple functions
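The bands above can be read as a threshold lookup. A minimal sketch, assuming the score is a non-negative float and that values falling between the listed ranges (for example 4.95, or anything between 0.0 and 0.1) belong to the lower band; `classify_cc` is a hypothetical helper, not part of code2llm:

```python
def classify_cc(score):
    """Map a complexity score to the band names listed above."""
    if score < 0:
        raise ValueError("complexity score must be non-negative")
    # Thresholds are checked from highest to lowest; the first match wins.
    if score >= 5.0:
        return "Critical"
    if score >= 3.0:
        return "High"
    if score >= 1.5:
        return "Medium"
    if score > 0.0:
        return "Low"
    return "Basic"


for value in (0.0, 0.8, 2.0, 4.95, 7.3):
    print(value, "->", classify_cc(value))
```

Checking from the highest threshold down keeps the function free of overlapping range tests and makes the handling of in-between values explicit.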
### Module Health
- **GOD Module**: Too large (>500 lines, >20 methods)
- **HUB**: High fan-out (calls many modules)
- **FAN-IN**: High incoming dependencies
- **CYCLES**: Circular dependencies
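These labels can be derived from a module dependency graph. The sketch below is one way to compute them, not necessarily how code2llm does it: it assumes dependencies are given as `(importer, imported)` pairs and that a GOD module must exceed both limits (the list above does not say whether one is enough). All function names are hypothetical.

```python
from collections import defaultdict


def is_god_module(lines, methods):
    """Assumption: both limits from the list above must be exceeded."""
    return lines > 500 and methods > 20


def fan_counts(edges):
    """Return (fan_out, fan_in) dicts counting distinct neighbours."""
    out_sets, in_sets = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        out_sets[src].add(dst)
        in_sets[dst].add(src)
    fan_out = {m: len(s) for m, s in out_sets.items()}
    fan_in = {m: len(s) for m, s in in_sets.items()}
    return fan_out, fan_in


def find_cycle(edges):
    """Return one dependency cycle as a list of modules, or None."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = defaultdict(int)
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:
                # Back edge: the cycle is the path from nxt to here.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for start in list(graph):
        if color[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

A module with a high `fan_out` count corresponds to a HUB, a high `fan_in` count to FAN-IN, and any non-`None` result from `find_cycle` to CYCLES. The recursive search is fine for a sketch but would need an explicit stack for very deep graphs.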
### Data Flow Indicators
- **PIPELINE**: Sequential data processing
- **CONTRACT**: Clear input/output specification
- **SIDE_EFFECT**: External state modification
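As an illustration of the SIDE_EFFECT indicator, the sketch below flags functions that declare `global` variables or call a small set of I/O built-ins. This is a deliberately narrow heuristic written for this README, not code2llm's detection logic; `IMPURE_CALLS` and `functions_with_side_effects` are hypothetical names, and the walk also descends into nested functions.

```python
import ast

# Names treated as side-effecting for illustration only (an assumption).
IMPURE_CALLS = {"print", "open"}

SAMPLE = '''
counter = 0

def add(a, b):
    return a + b

def bump():
    global counter
    counter += 1
    print(counter)
'''


def functions_with_side_effects(source):
    """Map function name -> reasons it appears to touch external state."""
    findings = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        reasons = []
        for node in ast.walk(func):
            if isinstance(node, ast.Global):
                reasons.append("global " + ", ".join(node.names))
            elif (isinstance(node, ast.Call)
                  and isinstance(node.func, ast.Name)
                  and node.func.id in IMPURE_CALLS):
                reasons.append(f"calls {node.func.id}()")
        if reasons:
            findings[func.name] = reasons
    return findings


print(functions_with_side_effects(SAMPLE))
```

In the sample, `add` is not reported because it only computes a return value, while `bump` is reported for both the `global` declaration and the `print` call.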
## 🛠️ Integration Examples
### CI/CD Pipeline
```bash
#!/bin/bash
# Analyze code quality in CI
code2llm ./ -f toon -o ./analysis
if grep -q "🔴 GOD" ./analysis/analysis.toon; then
    echo "God modules detected"
    exit 1
fi
```
### Pre-commit Hook
```bash
#!/bin/sh
# .git/hooks/pre-commit
code2llm ./ -f toon -o ./temp_analysis
if grep -q "🔴" ./temp_analysis/analysis.toon; then
    echo "⚠️ Critical issues found. Review before committing."
fi
rm -rf ./temp_analysis
```
### Documentation Generation
```bash
# Generate docs for README
code2llm ./ -f context -o ./docs/
echo "## Architecture" >> README.md
cat docs/context.md >> README.md
```
## Next Steps
1. **Review `analysis.toon`** - Identify critical issues
2. **Check `evolution.toon.yaml`** - Plan refactoring priorities
3. **Use `context.md`** - Get LLM assistance for complex changes
4. **Reference visualizations** - Understand system architecture
5. **Track progress** - Re-run analysis after changes
## 🔧 Advanced Usage
### Custom Analysis
```bash
# Deep analysis with all insights
code2llm ./ -m hybrid -f all --max-depth 15 -v
# Performance-optimized
code2llm ./ -m static -f toon --strategy quick
# Refactoring-focused
code2llm ./ -f toon,evolution --refactor
```
### Output Customization
```bash
# Separate output directories
code2llm ./ -f all -o ./analysis-$(date +%Y%m%d)
# Split YAML into multiple files
code2llm ./ -f yaml --split-output
# Separate orphaned functions
code2llm ./ -f yaml --separate-orphans
```
---
**Generated by**: `code2llm ./ -f all --readme`
**Analysis Date**: 2026-05-25
**Total Functions**: 1312
**Total Classes**: 143
**Modules**: 427
For more information about code2llm, visit: https://github.com/tom-sapletta/code2llm
## License
Licensed under Apache-2.0.