https://github.com/node0/crystallizer
A power tool dressed up as a summarization assistant
- Host: GitHub
- URL: https://github.com/node0/crystallizer
- Owner: Node0
- License: agpl-3.0
- Created: 2025-05-09T01:35:14.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-08T11:12:10.000Z (about 1 year ago)
- Last Synced: 2025-06-18T20:44:52.994Z (about 1 year ago)
- Language: Python
- Size: 63.5 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Crystallizer
A Map -> Reduce powerhouse,
disguised as an insight summarization tool.
Crystallizer is a programmable, LLM-powered,
general-purpose data traversal and transformation tool.
Its default use case is insight extraction
and cohesion across N parts of long documents (think books).
However, it can be programmed to do a large number
of open-ended tasks, owing to its templated
system and task prompt design.
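
To make the Map -> Reduce idea concrete, here is a minimal sketch of the pattern. It is not Crystallizer's actual code: `map_reduce`, `fake_extract`, and `fake_merge` are hypothetical names, and the two `fake_*` functions stand in for the LLM calls that a real run would make.

```python
from typing import Callable, List


def map_reduce(
    parts: List[str],
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
) -> str:
    """Apply map_fn to every part, then combine the partial results."""
    partials = [map_fn(part) for part in parts]
    return reduce_fn(partials)


# Hypothetical stand-ins for LLM calls (assumption: a real run would
# send each part to a model with a templated prompt instead).
def fake_extract(part: str) -> str:
    # "Insight" here is simply the first sentence of the part.
    return part.split(".")[0].strip()


def fake_merge(partials: List[str]) -> str:
    # "Cohesion" here is simply joining the partial insights.
    return " | ".join(partials)


parts = ["Alpha is first. More text.", "Beta is second. More text."]
result = map_reduce(parts, fake_extract, fake_merge)
print(result)
```

Swapping in different map and reduce functions is what makes the same traversal usable for tasks other than summarization.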

## Installation
**📋 [Complete Installation Guide](INSTALLATION.md)** - Choose your preferred tool
### Quick Links by Tool:
- **🚀 [UV Installation](INSTALLATION.md#-uv-recommended---fastest)** - Fastest setup
- **📦 [Poetry Installation](INSTALLATION.md#-poetry-best-for-development)** - Best for development
- **🐍 [Pip Installation](INSTALLATION.md#-standard-pip-universal)** - Universal compatibility
### Quick Start (pip)
**Requirements**: Python 3.11+
```bash
# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Test installation
python crystallizer.py --help
```
## Usage
```bash
python crystallizer.py \
  --system-prompt system_prompt.j2 \
  --haystack-path ./chat_logs \
  --connection ollama-local \
  --task-label gluon_design \
  --output-dir ./crystals
```
## Configuration
Configure each LLM connection in `config.json`:
```json
{
  "inference_service_connections": {
    "ollama-local": {
      "api_type": "ollama",
      "base_url": "http://localhost:11434",
      "default_model": "qwen2.5-coder:32b",
      "default_ctx_len": 18000
    },
    "openai-main": {
      "api_type": "openai",
      "base_url": "https://api.openai.com/v1",
      "api_key": "sk-...",
      "default_model": "gpt-4o-mini",
      "default_ctx_len": 128000
    }
  }
}
```
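The name passed to `--connection` presumably selects one entry under `inference_service_connections`. The sketch below shows one way such a lookup could work. `get_connection` is a hypothetical helper, not a function from this repository, and the config is inlined here only so the example runs on its own.

```python
import json


def get_connection(config: dict, name: str) -> dict:
    """Return the named connection block, or raise a descriptive KeyError."""
    connections = config.get("inference_service_connections", {})
    if name not in connections:
        raise KeyError(
            f"Unknown connection {name!r}; available: {sorted(connections)}"
        )
    return connections[name]


# Inlined copy of part of the example config (assumption: in practice
# this would be read from config.json with json.load).
config = json.loads("""
{
  "inference_service_connections": {
    "ollama-local": {
      "api_type": "ollama",
      "base_url": "http://localhost:11434",
      "default_model": "qwen2.5-coder:32b",
      "default_ctx_len": 18000
    }
  }
}
""")

conn = get_connection(config, "ollama-local")
print(conn["default_model"], conn["default_ctx_len"])
```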
## Features
- **Token-Aware Windowing**: Automatically chunks large documents to fit LLM context limits
- **Multi-Provider Support**: Works with Ollama (local) and OpenAI (cloud) backends
- **Template-Driven Prompts**: Jinja2 templates for custom system prompts
- **Hierarchical Processing**: 3-segment micro-windowing with merge strategies
- **Professional Logging**: Semantic progress tracking with contextual semaphores
- **Batch Processing**: Handle single files or entire directories
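
As an illustration of token-aware windowing, the sketch below greedily packs paragraphs into windows that stay under a token budget. It rests on assumptions: `approx_tokens` counts whitespace-separated words rather than using a real tokenizer, and `window_text` is a hypothetical name, not Crystallizer's implementation.

```python
from typing import List


def approx_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate the token count.
    return len(text.split())


def window_text(text: str, max_tokens: int) -> List[str]:
    """Greedily pack paragraphs into windows of at most max_tokens.

    A single paragraph larger than the budget is kept whole as its own
    oversized window; this sketch does not split inside paragraphs.
    """
    windows: List[str] = []
    current: List[str] = []
    used = 0
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        cost = approx_tokens(para)
        # Flush the current window if this paragraph would overflow it.
        if current and used + cost > max_tokens:
            windows.append("\n\n".join(current))
            current, used = [], 0
        current.append(para)
        used += cost
    if current:
        windows.append("\n\n".join(current))
    return windows


sample = "a b c\n\nd e\n\nf g h i"
windows = window_text(sample, max_tokens=5)
print(windows)
```

In a setup like the one described in the configuration section, the budget would likely be derived from a connection's `default_ctx_len`, minus room for the system prompt and the model's reply.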
## License
GNU AGPLv3