https://github.com/wronai/code2logic
Code2Logic analyzes codebases and generates compact, LLM-friendly representations with semantic understanding. Perfect for feeding project context to AI assistants, building code documentation, or analyzing code structure.
https://github.com/wronai/code2logic
code2logic compactness framework library litellm llm ollama python refactoring
Last synced: 3 months ago
JSON representation
Code2Logic analyzes codebases and generates compact, LLM-friendly representations with semantic understanding. Perfect for feeding project context to AI assistants, building code documentation, or analyzing code structure.
- Host: GitHub
- URL: https://github.com/wronai/code2logic
- Owner: wronai
- License: apache-2.0
- Created: 2026-01-03T10:07:28.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-01-12T14:04:15.000Z (3 months ago)
- Last Synced: 2026-01-12T16:13:13.290Z (3 months ago)
- Topics: code2logic, compactness, framework, library, litellm, llm, ollama, python, refactoring
- Language: Python
- Homepage:
- Size: 2.15 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# Code2Logic

[](https://badge.fury.io/py/code2logic)
[](https://www.python.org/downloads/)
[](https://www.apache.org/licenses/LICENSE-2.0)
**Convert source code to logical representation for LLM analysis.**
Code2Logic analyzes codebases and generates compact, LLM-friendly representations with semantic understanding.
Perfect for feeding project context to AI assistants, building code documentation, or analyzing code structure.
## โจ Features
- ๐ณ **Multi-language support** - Python, JavaScript, TypeScript, Java, Go, Rust, and more
- ๐ฏ **Tree-sitter AST parsing** - 99% accuracy with graceful fallback
- ๐ **NetworkX dependency graphs** - PageRank, hub detection, cycle analysis
- ๐ **Rapidfuzz similarity** - Find duplicate and similar functions
- ๐ง **NLP intent extraction** - Human-readable function descriptions
- ๐ฆ **Zero dependencies** - Core works without any external libs
## ๐ Installation
### Basic (no dependencies)
```bash
pip install code2logic
```
### Full (all features)
```bash
pip install code2logic[full]
```
### Selective features
```bash
pip install code2logic[treesitter] # High-accuracy AST parsing
pip install code2logic[graph] # Dependency analysis
pip install code2logic[similarity] # Similar function detection
pip install code2logic[nlp] # Enhanced intents
```
## ๐ Quick Start
```bash
code2logic ./ -f yaml --compact --function-logic --with-schema -o project.yaml
code2logic ./ -f toon --ultra-compact --function-logic --with-schema -o project.toon
```
### Command Line
```bash
# Standard Markdown output
code2logic /path/to/project
# If the `code2logic` entrypoint is not available (e.g. running from source without install):
python -m code2logic /path/to/project
# Compact YAML (14% smaller, meta.legend transparency)
code2logic /path/to/project -f yaml --compact -o analysis-compact.yaml
# Ultra-compact TOON (71% smaller, single-letter keys)
code2logic /path/to/project -f toon --ultra-compact -o analysis-ultra.toon
# Generate schema alongside output
code2logic /path/to/project -f yaml --compact --with-schema
# With detailed analysis
code2logic /path/to/project -d detailed
```

### Python API
```python
from code2logic import analyze_project, MarkdownGenerator
# Analyze a project
project = analyze_project("/path/to/project")
# Generate output
generator = MarkdownGenerator()
output = generator.generate(project, detail_level='standard')
print(output)
# Access analysis results
print(f"Files: {project.total_files}")
print(f"Lines: {project.total_lines}")
print(f"Languages: {project.languages}")
# Get hub modules (most important)
hubs = [p for p, n in project.dependency_metrics.items() if n.is_hub]
print(f"Key modules: {hubs}")
```
### Organized Imports
```python
# Core analysis
from code2logic import ProjectInfo, ProjectAnalyzer, analyze_project
# Format generators
from code2logic import (
YAMLGenerator,
JSONGenerator,
TOONGenerator,
LogicMLGenerator,
GherkinGenerator,
)
# LLM clients
from code2logic import get_client, BaseLLMClient
# Development tools
from code2logic import run_benchmark, CodeReviewer
```
## ๐ Output Formats
### Markdown (default)
Human-readable documentation with:
- Project structure tree with hub markers (โ
)
- Dependency graphs with PageRank scores
- Classes with methods and intents
- Functions with signatures and descriptions
### Compact
Ultra-compact format optimized for LLM context:
```text
# myproject | 102f 31875L | typescript:79/python:23
ENTRY: index.ts main.py
HUBS: evolution-manager llm-orchestrator
[core/evolution]
evolution-manager.ts (3719L) C:EvolutionManager | F:createEvolutionManager
task-queue.ts (139L) C:TaskQueue,Task
```
### JSON
Machine-readable format for:
- RAG (Retrieval-Augmented Generation)
- Database storage
- Further analysis
## ๐ง Configuration
### Library Status
Check which features are available:
```bash
code2logic --status
```
```text
Library Status:
tree_sitter: โ
networkx: โ
rapidfuzz: โ
nltk: โ
spacy: โ
```
### LLM Configuration
Manage LLM providers, models, API keys, and routing priorities:
```bash
code2logic llm status
code2logic llm set-provider auto
code2logic llm set-model openrouter nvidia/nemotron-3-nano-30b-a3b:free
code2logic llm key set openrouter
code2logic llm priority set-provider openrouter 10
code2logic llm priority set-mode provider-first
code2logic llm priority set-llm-model nvidia/nemotron-3-nano-30b-a3b:free 5
code2logic llm priority set-llm-family nvidia/ 5
code2logic llm config list
```
Notes:
- `code2logic llm set-provider auto` enables automatic fallback selection: providers are tried in priority order.
- API keys should be stored in `.env` (or environment variables), not in `litellm_config.yaml`.
- These commands write configuration files:
- `.env` in the current working directory
- `litellm_config.yaml` in the current working directory
- `~/.code2logic/llm_config.json` in your home directory
#### Priority modes
You can choose how automatic fallback ordering is computed:
- `provider-first`
providers are ordered by provider priority (defaults + overrides)
- `model-first`
providers are ordered by priority rules for the provider's configured model (exact/prefix)
- `mixed`
providers are ordered by the best (lowest) priority from either provider priority or model rules
Configure the mode:
```bash
code2logic llm priority set-mode provider-first
code2logic llm priority set-mode model-first
code2logic llm priority set-mode mixed
```
Model priority rules are stored in `~/.code2logic/llm_config.json`.
### Python API (Library Status)
```python
from code2logic import get_library_status
status = get_library_status()
# {'tree_sitter': True, 'networkx': True, ...}
```
## ๐ Analysis Features
### Dependency Analysis
- **PageRank** - Identifies most important modules
- **Hub detection** - Central modules marked with โ
- **Cycle detection** - Find circular dependencies
- **Clustering** - Group related modules
### Intent Generation
Functions get human-readable descriptions:
```yaml
methods:
async findById(id:string) -> Promise # retrieves user by id
async createUser(data:UserDTO) -> Promise # creates user
validateEmail(email:string) -> boolean # validates email
```
### Similarity Detection
Find duplicate and similar functions:
```yaml
Similar Functions:
core/auth.ts::validateToken:
- python/auth.py::validate_token (92%)
- services/jwt.ts::verifyToken (85%)
```
## ๐๏ธ Architecture
```text
code2logic/
โโโ analyzer.py # Main orchestrator
โโโ parsers.py # Tree-sitter + fallback parser
โโโ dependency.py # NetworkX dependency analysis
โโโ similarity.py # Rapidfuzz similar detection
โโโ intent.py # NLP intent generation
โโโ generators.py # Output generators (MD/Compact/JSON)
โโโ models.py # Data structures
โโโ cli.py # Command-line interface
```
## ๐ Integration Examples
### With Claude/ChatGPT
```python
from code2logic import analyze_project, CompactGenerator
project = analyze_project("./my-project")
context = CompactGenerator().generate(project)
# Use in your LLM prompt
prompt = f"""
Analyze this codebase and suggest improvements:
{context}
"""
```
### With RAG Systems
```python
import json
from code2logic import analyze_project, JSONGenerator
project = analyze_project("./my-project")
data = json.loads(JSONGenerator().generate(project))
# Index in vector DB
for module in data['modules']:
for func in module['functions']:
embed_and_store(
text=f"{func['name']}: {func['intent']}",
metadata={'path': module['path'], 'type': 'function'}
)
```
## ๐งช Development
### Setup
```bash
git clone https://github.com/wronai/code2logic
cd code2logic
poetry install --with dev -E full
poetry run pre-commit install
# Alternatively, you can use Makefile targets (prefer Poetry if available)
make install-full
```
### Tests
```bash
make test
make test-cov
# Or directly:
poetry run pytest
poetry run pytest --cov=code2logic --cov-report=html
```
### Type Checking
```bash
make typecheck
# Or directly:
poetry run mypy code2logic
```
### Linting
```bash
make lint
make format
# Or directly:
poetry run ruff check code2logic
poetry run black code2logic
```
## ๐ Performance
| Codebase Size | Files | Lines | Time | Output Size |
| --- | --- | --- | --- | --- |
| Small | 10 | 1K | <1s | ~5KB |
| Medium | 100 | 30K | ~2s | ~50KB |
| Large | 500 | 150K | ~10s | ~200KB |
Compact format is ~10-15x smaller than Markdown.
## ๐ฌ Code Reproduction Benchmarks
Code2Logic can reproduce code from specifications using LLMs. Benchmark results:
### Format Comparison (Token Efficiency)
| Format | Score | Token Efficiency | Spec Tokens | Runs OK |
| --- | --- | --- | --- | --- |
| **YAML** | **71.1%** | 42.1 | **366** | 66.7% |
| **Markdown** | 65.6% | **48.7** | 385 | **100%** |
| JSON | 61.9% | 23.7 | 605 | 66.7% |
| Gherkin | 51.3% | 19.1 | 411 | 66.7% |
### Key Findings
- **YAML is best for score** - 71.1% reproduction accuracy
- **Markdown is best for token efficiency** - 48.7 score/1000 tokens
- **YAML uses 39.6% fewer tokens than JSON** with 9.2% higher score
- **Markdown has 100% runs OK** - generated code always executes
### Run Benchmarks
```bash
# Token-aware benchmark
python examples/11_token_benchmark.py --folder tests/samples/ --no-llm
# Async multi-format benchmark
python examples/09_async_benchmark.py --folder tests/samples/ --no-llm
# Function-level reproduction
python examples/10_function_reproduction.py --file tests/samples/sample_functions.py --no-llm
python examples/15_unified_benchmark.py --folder tests/samples/ --no-llm
# Terminal markdown rendering demo
python examples/16_terminal_demo.py --folder tests/samples/
```
## ๐ค Contributing
Contributions welcome! Please read our [Contributing Guide](CONTRIBUTING.md).
## ๐ License
Apache 2 License - see [LICENSE](LICENSE) for details.
## ๐ Companion Packages
### logic2test - Generate Tests from Logic
Generate test scaffolds from Code2Logic output:
```bash
# Show what can be generated
python -m logic2test out/code2logic/project.c2l.yaml --summary
# Generate unit tests
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/
# Generate all test types (unit, integration, property)
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/ --type all
```
```python
from logic2test import TestGenerator
generator = TestGenerator('out/code2logic/project.c2l.yaml')
result = generator.generate_unit_tests('out/logic2test/tests/')
print(f"Generated {result.tests_generated} tests")
```
### logic2code - Generate Code from Logic
Generate source code from Code2Logic output:
```bash
# Show what can be generated
python -m logic2code out/code2logic/project.c2l.yaml --summary
# Generate Python code
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/
# Generate stubs only
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/ --stubs-only
```
```python
from logic2code import CodeGenerator
generator = CodeGenerator('out/code2logic/project.c2l.yaml')
result = generator.generate('out/logic2code/generated_code/')
print(f"Generated {result.files_generated} files")
```
### Full Workflow: Code โ Logic โ Tests/Code
```bash
# 1. Analyze existing codebase
code2logic src/ -f yaml -o out/code2logic/project.c2l.yaml
# 2. Generate tests for the codebase
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/ --type all
# 3. Generate code scaffolds (for refactoring)
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/ --stubs-only
```
## ๐ Documentation
- [00 - Docs Index](docs/00-index.md) - Documentation home (start here)
- [01 - Getting Started](docs/01-getting-started.md) - Install and first steps
- [02 - Configuration](docs/02-configuration.md) - API keys, environment setup
- [03 - CLI Reference](docs/03-cli-reference.md) - Command-line usage
- [04 - Python API](docs/04-python-api.md) - Programmatic usage
- [05 - Output Formats](docs/05-output-formats.md) - Format comparison and usage
- [06 - Format Specifications](docs/06-format-specifications.md) - Detailed format specs
- [07 - TOON Format](docs/07-toon.md) - Token-Oriented Object Notation
- [08 - LLM Integration](docs/08-llm-integration.md) - OpenRouter/Ollama/LiteLLM
- [09 - LLM Comparison](docs/09-llm-comparison-report.md) - Provider/model comparison
- [10 - Benchmarking](docs/10-benchmark.md) - Benchmark methodology and results
- [11 - Repeatability](docs/11-repeatability.md) - Repeatability testing
- [12 - Examples](docs/12-examples.md) - Usage workflows and examples
- [13 - Architecture](docs/13-architecture.md) - System design and components
- [14 - Format Analysis](docs/14-format-analysis.md) - Deeper format evaluation
- [15 - Logic2Test](docs/15-logic2test.md) - Test generation from logic files
- [16 - Logic2Code](docs/16-logic2code.md) - Code generation from logic files
- [17 - LOLM](docs/17-lolm.md) - LLM provider management
- [18 - Reproduction Testing](docs/18-reproduction-testing.md) - Format validation and code regeneration
- [19 - Monorepo Workflow](docs/19-monorepo-workflow.md) - Managing all packages from repo root
## ๐งฉ Examples
- [examples/](examples/) - All runnable examples
- [examples/run_examples.sh](examples/run_examples.sh) - Example runner script (multi-command workflows)
- [examples/code2logic/](examples/code2logic/) - Minimal project + docker example for code2logic
- [examples/logic2test/](examples/logic2test/) - Minimal project + docker example for logic2test
- [examples/logic2code/](examples/logic2code/) - Minimal project + docker example for logic2code
## ๐ Links
- [Documentation](https://code2logic.readthedocs.io)
- [PyPI](https://pypi.org/project/code2logic/)
- [GitHub](https://github.com/wronai/code2logic)
- [Issues](https://github.com/wronai/code2logic/issues)