https://github.com/greynewell/schemaflux
Structured data compiler. Pass pipeline, pluggable backends.
- Host: GitHub
- URL: https://github.com/greynewell/schemaflux
- Owner: greynewell
- Created: 2025-05-14T23:46:38.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-02-17T21:32:26.000Z (3 months ago)
- Last Synced: 2026-02-20T14:48:07.308Z (3 months ago)
- Topics: build-tool, code-generation, compiler, compiler-design, data-compiler, data-transformation, frontmatter, golang, json-ld, markdown, mist-stack, pass-pipeline, schema, sitemap, ssg, static-site-generator, structured-data, taxonomy, yaml, zero-dependencies
- Language: Go
- Homepage: https://schemaflux.dev
- Size: 245 KB
- Stars: 12
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Agents: AGENTS.md
README
# schemaflux
Structured data compiler. Part of the [MIST stack](https://github.com/greynewell/mist-go).
```
markdown + frontmatter -> frontend -> 12 passes -> backend -> output
```
Zero external deps. Single static binary.
## How it works
Entities (structured data with fields, taxonomies, relationships) go in. A config defines the schema. Passes resolve slugs, sort, enrich, group, score relationships, validate. Backends emit output from the finalized IR.
```
1,997 entities -> 2,328 pages in ~500ms
```
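The grouping step mentioned above can be pictured with a minimal sketch. It assumes each entity carries multi-valued fields and that a taxonomy is simply "every distinct value of one field". The `Entity` and `TaxonomyGroup` shapes and the `groupBy` helper are hypothetical, written for illustration; the field layout of the real `ir.TaxonomyGroup` is not documented here.

```go
package main

import (
	"fmt"
	"sort"
)

// Entity is a hypothetical, simplified IR node (not the real ir.ResolvedEntity).
type Entity struct {
	Slug   string
	Fields map[string][]string // field name -> values, e.g. "category" -> ["birds"]
}

// TaxonomyGroup is a hypothetical shape: one term plus the entities tagged with it.
type TaxonomyGroup struct {
	Term  string
	Slugs []string
}

// groupBy collects entity slugs under each distinct value of the given field.
// Groups and their members are sorted so the result is deterministic.
func groupBy(entities []Entity, field string) []TaxonomyGroup {
	index := map[string][]string{}
	for _, e := range entities {
		for _, term := range e.Fields[field] {
			index[term] = append(index[term], e.Slug)
		}
	}
	groups := make([]TaxonomyGroup, 0, len(index))
	for term, slugs := range index {
		sort.Strings(slugs)
		groups = append(groups, TaxonomyGroup{Term: term, Slugs: slugs})
	}
	sort.Slice(groups, func(i, j int) bool { return groups[i].Term < groups[j].Term })
	return groups
}

func main() {
	entities := []Entity{
		{Slug: "zebra-finch", Fields: map[string][]string{"category": {"birds"}}},
		{Slug: "arctic-tern", Fields: map[string][]string{"category": {"birds", "migrants"}}},
	}
	for _, g := range groupBy(entities, "category") {
		fmt.Println(g.Term, g.Slugs)
	}
}
```

Each resulting group would map to one taxonomy page in the output; an entity with several values for the field lands in several groups.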
1. **Frontend** parses markdown + YAML frontmatter into IR
2. **Passes** transform IR: slugs, sorting, enrichment, taxonomy grouping, related scoring, graph enrichment, content analysis, URL resolution, schema generation, validation
3. **Backend** emits output. IR is immutable at this point.
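The three stages above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the project's code: `Entity`, `Program`, `Pass`, `compile`, and `emitPaths` are hypothetical names, the "frontend" is reduced to a list of titles, and only two toy passes (slugs and sorting) are shown.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Entity and Program are hypothetical stand-ins for the IR.
type Entity struct {
	Title string
	Slug  string
}

type Program struct {
	Entities []Entity
}

// Pass mutates the IR; passes run in a fixed order before any backend sees it.
type Pass func(p *Program)

// slugPass derives a URL slug from each title (lowercase, whitespace -> "-").
func slugPass(p *Program) {
	for i := range p.Entities {
		s := strings.ToLower(strings.TrimSpace(p.Entities[i].Title))
		p.Entities[i].Slug = strings.Join(strings.Fields(s), "-")
	}
}

// sortPass orders entities by slug; it depends on slugPass having run first.
func sortPass(p *Program) {
	sort.Slice(p.Entities, func(i, j int) bool {
		return p.Entities[i].Slug < p.Entities[j].Slug
	})
}

// emitPaths is a toy backend: it only reads the IR and returns output paths.
func emitPaths(p Program) []string {
	out := make([]string, 0, len(p.Entities))
	for _, e := range p.Entities {
		out = append(out, "/"+e.Slug+"/index.html")
	}
	return out
}

// compile wires the stages together: frontend -> passes -> backend.
func compile(titles []string, passes []Pass, emit func(Program) []string) []string {
	prog := &Program{}
	for _, t := range titles { // stand-in for the frontend
		prog.Entities = append(prog.Entities, Entity{Title: t})
	}
	for _, pass := range passes {
		pass(prog)
	}
	// The backend gets the finalized IR. Immutability here is by convention:
	// passing by value still shares the underlying slice.
	return emit(*prog)
}

func main() {
	pages := compile([]string{"Zebra Finch", "  Arctic Tern "}, []Pass{slugPass, sortPass}, emitPaths)
	fmt.Println(pages) // [/arctic-tern/index.html /zebra-finch/index.html]
}
```

Keeping all mutation inside passes means a backend can be swapped without touching the transformation logic, which is the point of treating the IR as read-only once emission starts.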
The built-in HTML backend produces a complete static site: taxonomy pages, pagination, A-Z indices, search index, JSON-LD, Open Graph, sitemaps, RSS, `llms.txt`.
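As one example of what such a backend emits, the sketch below renders a JSON-LD `<script>` tag with only the standard library. The `jsonLD` struct, the `renderJSONLD` helper, and the choice of the generic schema.org `Thing` type are assumptions made for illustration; the schema types schemaflux actually generates are not stated here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// jsonLD is a hypothetical, minimal schema.org document.
type jsonLD struct {
	Context string `json:"@context"`
	Type    string `json:"@type"`
	Name    string `json:"name"`
	URL     string `json:"url"`
}

// renderJSONLD builds the canonical URL from base URL and slug, then wraps the
// marshaled document in a script tag. json.Marshal escapes <, > and & by
// default, so the payload cannot close the tag early.
func renderJSONLD(name, baseURL, slug string) (string, error) {
	doc := jsonLD{
		Context: "https://schema.org",
		Type:    "Thing",
		Name:    name,
		URL:     strings.TrimRight(baseURL, "/") + "/" + slug + "/",
	}
	b, err := json.Marshal(doc)
	if err != nil {
		return "", err
	}
	return `<script type="application/ld+json">` + string(b) + `</script>`, nil
}

func main() {
	tag, err := renderJSONLD("Arctic Tern", "https://example.com/", "arctic-tern")
	if err != nil {
		panic(err)
	}
	fmt.Println(tag)
}
```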
## Install
```bash
go install github.com/greynewell/schemaflux/cmd/schemaflux@latest
schemaflux build --config schemaflux.yaml
```
## Config
```yaml
site:
  name: "My Dataset"
  base_url: "https://example.com"
paths:
  content: "./content"
  output: "./output"
  templates: "./templates"
taxonomies:
  - name: category
    label: Categories
    field: category
templates:
  entity: entity.html
  homepage: index.html
```
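For readers who think in types, the config above could map onto Go structs roughly as follows. This is a sketch only: the struct names, the `Validate` method, and the validation rules (absolute `base_url`, required paths, unique taxonomy names) are assumptions, not the contents of `internal/config`.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Hypothetical types mirroring the YAML keys shown above.
type Site struct {
	Name    string // site.name
	BaseURL string // site.base_url
}

type Paths struct {
	Content, Output, Templates string
}

type Taxonomy struct {
	Name, Label, Field string
}

type Config struct {
	Site       Site
	Paths      Paths
	Taxonomies []Taxonomy
	Templates  map[string]string // e.g. "entity" -> "entity.html"
}

// Validate applies a few plausible sanity checks (assumed, not confirmed).
func (c Config) Validate() error {
	u, err := url.Parse(c.Site.BaseURL)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return fmt.Errorf("site.base_url %q is not an absolute URL", c.Site.BaseURL)
	}
	if c.Paths.Content == "" || c.Paths.Output == "" {
		return errors.New("paths.content and paths.output are required")
	}
	seen := map[string]bool{}
	for _, t := range c.Taxonomies {
		if t.Name == "" || t.Field == "" {
			return errors.New("each taxonomy needs a name and a field")
		}
		if seen[t.Name] {
			return fmt.Errorf("duplicate taxonomy %q", t.Name)
		}
		seen[t.Name] = true
	}
	return nil
}

func main() {
	cfg := Config{
		Site:       Site{Name: "My Dataset", BaseURL: "https://example.com"},
		Paths:      Paths{Content: "./content", Output: "./output", Templates: "./templates"},
		Taxonomies: []Taxonomy{{Name: "category", Label: "Categories", Field: "category"}},
		Templates:  map[string]string{"entity": "entity.html", "homepage": "index.html"},
	}
	fmt.Println("valid:", cfg.Validate() == nil)
}
```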
## Architecture
```
compiler.Compile(cfg)
  -> frontend.Parse()           # markdown + YAML -> IR
  -> pass.Registry.RunAll()     # 12 ordered passes
  -> backend.Emit()             # IR -> output
internal/
  compiler/
    frontend/     Parse structured data into IR
    ir/           Program, ResolvedEntity, TaxonomyGroup
    pass/         12 passes with declared dependencies
    backend/      Pluggable output (html/ ships built-in)
  config/         YAML config types
  entity/         Untyped AST
  markdown/       Markdown-to-HTML renderer
  yaml/           YAML parser
```
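"Passes with declared dependencies" suggests the run order is derived rather than hard-coded. One common way to do that is a depth-first topological sort, sketched below. `PassSpec` and `order` are hypothetical names and the pass names are illustrative; how `pass.Registry` actually resolves order is not described in this README.

```go
package main

import (
	"fmt"
	"sort"
)

// PassSpec is a hypothetical declaration: a pass name plus the passes it needs.
type PassSpec struct {
	Name string
	Deps []string
}

// order returns pass names so that every pass comes after its dependencies.
// It reports an error for an unknown dependency or a dependency cycle.
func order(specs []PassSpec) ([]string, error) {
	byName := make(map[string]PassSpec, len(specs))
	names := make([]string, 0, len(specs))
	for _, s := range specs {
		byName[s.Name] = s
		names = append(names, s.Name)
	}
	sort.Strings(names) // visit in a stable order so the result is deterministic

	const (
		unvisited = iota
		visiting
		done
	)
	state := make(map[string]int, len(specs))
	var out []string

	var visit func(name string) error
	visit = func(name string) error {
		spec, ok := byName[name]
		if !ok {
			return fmt.Errorf("unknown pass %q", name)
		}
		switch state[name] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle at pass %q", name)
		}
		state[name] = visiting
		for _, dep := range spec.Deps {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[name] = done
		out = append(out, name) // appended only after all dependencies
		return nil
	}

	for _, n := range names {
		if err := visit(n); err != nil {
			return nil, err
		}
	}
	return out, nil
}

func main() {
	specs := []PassSpec{
		{Name: "validation", Deps: []string{"urls", "taxonomy"}},
		{Name: "urls", Deps: []string{"slugs"}},
		{Name: "taxonomy", Deps: []string{"slugs"}},
		{Name: "slugs"},
	}
	names, err := order(specs)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // [slugs taxonomy urls validation]
}
```

Declaring dependencies per pass keeps registration order irrelevant and turns a misordered or circular setup into an error at startup instead of a silent wrong result.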
## Badge
[](https://schemaflux.dev)
```markdown
[](https://schemaflux.dev)
```