https://github.com/forkwright/zetesis
Sovereign research substrate — free-first, self-hosted-orchestration, budget-capped research and search aggregation for agent runtimes.
- Host: GitHub
- URL: https://github.com/forkwright/zetesis
- Owner: forkwright
- License: agpl-3.0
- Created: 2026-04-23T12:45:59.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-22T04:31:03.000Z (22 days ago)
- Last Synced: 2026-05-22T11:39:58.112Z (22 days ago)
- Size: 84 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
README
# zetesis
*ζήτησις - systematic inquiry.*
Planned sovereign research substrate for unifying research and search providers behind one Rust interface with budget enforcement, rate-limit management, cited result normalization, and a cache layer.
**Status:** Phase 1 scaffold. The locked four-crate workspace has landed; `sylloge` owns the initial provider/result/budget surface, and `zetesis` re-exports the public facade.
**Canonical state:** See the internal planning record maintained for active roadmap and blocker status.
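As a rough illustration of what "cited result normalization" could mean at the type level, the sketch below makes it impossible to construct a result without provenance. Every name here (`Citation`, `CitedResult`, `NormalizeError`) is hypothetical and chosen for illustration; none of it is taken from the `sylloge` surface.

```rust
/// Hypothetical provenance record; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Citation {
    pub provider: String,
    pub url: String,
}

/// A normalized result. The `citations` field is private so the only way
/// to build a value is through `CitedResult::new`, which checks provenance.
#[derive(Debug, Clone, PartialEq)]
pub struct CitedResult {
    pub title: String,
    pub snippet: String,
    citations: Vec<Citation>,
}

#[derive(Debug, PartialEq)]
pub enum NormalizeError {
    /// The provider returned content with no source attached.
    MissingProvenance,
}

impl CitedResult {
    /// Trims whitespace and rejects results that carry no citation.
    pub fn new(
        title: &str,
        snippet: &str,
        citations: Vec<Citation>,
    ) -> Result<Self, NormalizeError> {
        if citations.is_empty() {
            return Err(NormalizeError::MissingProvenance);
        }
        Ok(Self {
            title: title.trim().to_string(),
            snippet: snippet.trim().to_string(),
            citations,
        })
    }

    /// Read-only view of the sources backing this result.
    pub fn citations(&self) -> &[Citation] {
        &self.citations
    }
}
```

Keeping the citation list private is one way to enforce the provenance rule at construction time rather than by convention; the real crate may well choose a different shape.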
## Why
Frontier-model built-in search is an opaque black box priced per token. Vendor aggregators (Brave / Exa / Tavily / You.com / Valyu / Perplexity) are pay-per-query, offer little architectural control, and carry costs that compound fast at agent scale.
Zetesis takes a different shape:
- **Free-first routing** across free academic and reference APIs (Semantic Scholar, arXiv, OpenAlex, Crossref, PubMed, Wikipedia). Most research queries resolve here.
- **Self-hosted orchestration** for deep research through a local-first research loop running against local LLMs on dedicated GPU time. Multi-step synthesis should not require paid model APIs by default.
- **Budget-capped paid APIs** (Brave, Exa) as a fallback, used only when the free tier (Tier 0) misses.
- **Cached by default** with per-provider freshness windows.
- **Cited + structured** output always; no synthesis without source provenance.
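The free-first and budget-capped rules above can be sketched as a two-pass router. This is a minimal illustration under stated assumptions, not the project's routing code: `Tier`, `Provider`, `Budget`, and `route` are hypothetical names, costs are modeled as whole cents per query, and a provider "misses" when it returns no hits.

```rust
/// Hypothetical tiers. The README names only "Tier 0"; modeling paid
/// APIs as a second tier is an assumption of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Free,
    Paid,
}

pub struct Provider {
    pub name: &'static str,
    pub tier: Tier,
    /// Cost per query in cents; 0 for free providers.
    pub cost_cents: u32,
    /// Stand-in for a real network call.
    pub search: fn(&str) -> Vec<String>,
}

pub struct Budget {
    remaining_cents: u32,
}

impl Budget {
    pub fn new(cap_cents: u32) -> Self {
        Self { remaining_cents: cap_cents }
    }

    pub fn remaining(&self) -> u32 {
        self.remaining_cents
    }

    /// Deducts `cost` if it fits under the cap; otherwise leaves the
    /// budget unchanged and returns false.
    fn try_spend(&mut self, cost: u32) -> bool {
        if cost <= self.remaining_cents {
            self.remaining_cents -= cost;
            true
        } else {
            false
        }
    }
}

/// Tries every free provider first, then paid providers the budget can
/// still afford. Returns the answering provider's name and its hits.
pub fn route(
    query: &str,
    providers: &[Provider],
    budget: &mut Budget,
) -> Option<(&'static str, Vec<String>)> {
    for p in providers.iter().filter(|p| p.tier == Tier::Free) {
        let hits = (p.search)(query);
        if !hits.is_empty() {
            return Some((p.name, hits));
        }
    }
    for p in providers.iter().filter(|p| p.tier == Tier::Paid) {
        // The query is charged before the call, so an empty paid
        // response still consumes budget in this sketch.
        if !budget.try_spend(p.cost_cents) {
            continue;
        }
        let hits = (p.search)(query);
        if !hits.is_empty() {
            return Some((p.name, hits));
        }
    }
    None
}
```

Charging before the call keeps the cap a hard limit even when a paid provider returns nothing; refunding on an empty response would be an equally plausible policy.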
## Architecture
> **Phase 1 scaffold.** The four-crate decomposition below is the locked public design; active planning and decision records are maintained internally.
| Crate | Role |
|-------|------|
| `zetesis` | Facade, CLI, daemon binary, adapter traits |
| `syllogē` | Provider abstraction, routing, budget, cache, deep-research-orchestrator wrapper |
| `elenkhos` | Retrospective steel-manning engine |
| `synopsis` | Briefing synthesizer |
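The table assigns the cache to the provider crate, and the Why section calls for per-provider freshness windows. A minimal in-memory sketch of that idea follows, using only the standard library. The `Cache` type, its methods, and the choice to pass `now` explicitly are assumptions made for illustration and testability; nothing here reflects the actual crate.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical cache keyed by (provider, query), with a freshness
/// window that can be overridden per provider.
pub struct Cache {
    windows: HashMap<String, Duration>,
    default_window: Duration,
    entries: HashMap<(String, String), (Instant, Vec<String>)>,
}

impl Cache {
    pub fn new(default_window: Duration) -> Self {
        Self {
            windows: HashMap::new(),
            default_window,
            entries: HashMap::new(),
        }
    }

    /// Overrides the freshness window for one provider.
    pub fn set_window(&mut self, provider: &str, window: Duration) {
        self.windows.insert(provider.to_string(), window);
    }

    /// Stores hits together with the time they were fetched.
    pub fn put(&mut self, provider: &str, query: &str, hits: Vec<String>, now: Instant) {
        self.entries
            .insert((provider.to_string(), query.to_string()), (now, hits));
    }

    /// Returns cached hits only while they are inside the provider's
    /// freshness window; stale entries are reported as a miss.
    pub fn get(&self, provider: &str, query: &str, now: Instant) -> Option<&[String]> {
        let key = (provider.to_string(), query.to_string());
        let (stored_at, hits) = self.entries.get(&key)?;
        let window = self
            .windows
            .get(provider)
            .copied()
            .unwrap_or(self.default_window);
        if now.duration_since(*stored_at) <= window {
            Some(hits.as_slice())
        } else {
            None
        }
    }
}
```

Slow-moving sources such as paper metadata could be given a long window while web search results get a short one; the specific durations are left to the operator in this sketch.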
## Consumer map
- **aletheia** - nous agents call zetesis as their primary research tool
- **dioptron** - sovereign web runtime; D5 tiered knowledge store ingests zetesis results
- **akroasis** - OSINT domain; public-source research
## Non-goals
- Not a conversational search UI (consumer concern)
- Not a content crawler (route to Firecrawl / trafilatura when needed)
- Not an LLM synthesis engine (consumer's LLM layer)
- Not a vector store (that is `heurēma`)
- Not a credentials manager (operator vault owns API keys)
## Development
Current planning authority lives in the internal project state record. This public README stays stable: purpose, boundaries, consumer map, and the locked crate shape. It should not duplicate the live roadmap.
## Design Notes
- [Multi-signal classifiers](docs/research/multi-signal-classifiers.md) - required evidence record shape for classifier designs that combine weighted signals before export.
- [Deep research provider decision](docs/research/deep-research-provider-decision.md) - Phase 05 decision to vendor the local-deep-researcher loop pattern, reuse the gpt-researcher `vllm_openai` adapter shape, and reject open_deep_research as the default contract.
## License
AGPL-3.0-or-later