{"id":25961700,"url":"https://github.com/ttab/revisor","last_synced_at":"2026-02-25T07:08:25.339Z","repository":{"id":243849166,"uuid":"807627967","full_name":"ttab/revisor","owner":"ttab","description":null,"archived":false,"fork":false,"pushed_at":"2024-12-18T07:22:06.000Z","size":246,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-07T10:46:01.119Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ttab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-29T13:16:36.000Z","updated_at":"2024-12-18T07:08:42.000Z","dependencies_parsed_at":"2024-10-16T20:58:39.736Z","dependency_job_id":"44a31549-9e6c-4e49-bc8e-930c754640cb","html_url":"https://github.com/ttab/revisor","commit_stats":null,"previous_names":["ttab/revisor"],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/ttab/revisor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ttab%2Frevisor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ttab%2Frevisor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ttab%2Frevisor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ttab%2Frevisor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ttab","download_url":"https://codeload.github.com/ttab/revisor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/ttab%2Frevisor/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263061183,"owners_count":23407599,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-04T19:41:01.542Z","updated_at":"2026-02-25T07:08:25.236Z","avatar_url":"https://github.com/ttab.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Revisor\n\nRevisor allows you to define specifications for NewsDoc contents as a series of declarations and pattern matching extensions to existing declarations.\n\n## Breaking changes\n\n### v0.9.0\n\nIn this release we remove the ability to define global blocks and attributes that are automatically available for all document types. 
Globals were a mistake and have now been replaced by block definitions:\n\n``` json\n{\n  \"version\": 1,\n  \"name\": \"block-example\",\n  \"documents\": [\n    {\n      \"name\": \"Article\",\n      \"description\": \"An editorial article\",\n      \"declares\": \"core/article\",\n      \"content\": [\n        {\"ref\": \"core://text\"}\n      ]\n    }\n  ],\n  \"content\": [\n    {\n      \"id\": \"core://text\",\n      \"block\": {\n        \"description\": \"A standard text block\",\n        \"declares\": {\"type\":\"core/text\"},\n        \"attributes\": {\n          \"role\": {\n            \"optional\":true,\n            \"enum\": [\"heading-1\", \"heading-2\", \"preamble\"]\n          }\n        },\n        \"data\": {\n          \"text\":{\n            \"allowEmpty\":true,\n            \"format\": \"html\"\n          }\n        }\n      }\n    }\n  ]\n}\n```\n\nInline blocks can still be declared as before.\n\nWhen using `ref` it's possible to extend the block in the same block constraint. Writing this:\n\n``` json\n{\n  \"version\": 1,\n  \"name\": \"block-example\",\n  \"documents\": [\n    {\n      \"name\": \"Article\",\n      ...\n      \"meta\": [\n        {\n          \"ref\": \"core://newsvalue\",\n          \"count\": 1\n        }\n      ]\n      ...\n```\n\n...is equivalent to:\n\n``` json\n      ...\n      \"meta\": [\n        {\n          \"ref\": \"core://newsvalue\"\n        },\n        {\n          \"match\": {\"type\": \"core/newsvalue\"},\n          \"count\": 1\n        }\n      ]\n      ...\n```\n\n...as any constraints will be treated as a block constraint with a `match` directive equivalent to the `declares` object of the referenced block.\n\n## Writing specifications\n\nThe main entities in a specification are documents, blocks, and properties. Documents are declared by type, blocks by type, rel, and/or role, and properties by name. 
An entity is not valid if we don't have a matching declaration for it, regardless of whether somebody has pattern-matched against it.\n\nBoth pattern matching and much of the validation are done through key-value pairs of a name and a string constraint. Say that we want to match all links that have a rel of \"subject\", \"channel\", or \"section\" and allow \"broader\" links to be added to them. The specification would then look like this:\n\n``` json\n{\n  \"name\": \"Associated with and broader links\",\n  \"description\": \"Extends subject, channel, and section links with broader links\",\n  \"match\": {\"rel\": {\n    \"enum\": [\"subject\", \"channel\", \"section\"]\n  }},\n  \"links\": [\n    {\n      \"declares\": {\"rel\":\"broader\"},\n      \"attributes\": {\n        \"type\": {},\n        \"title\": {}\n      }\n    }\n  ]\n}\n```\n\nHere we declare that links with `rel` \"broader\" are valid for all blocks that match our expression; see \"Block attributes\" for a list of attributes that can be used in pattern matching. We also define that the attributes `type` and `title` must be present. The `{\"enum\":...}` object and the empty objects (`{}`) for `type` and `title` are all examples of string constraints.\n\n### String constraints\n\n| Name          | Use                                                                                                        |\n|:--------------|:-----------------------------------------------------------------------------------------------------------|\n| optional      | Set to `true` if the value doesn't have to be present                                                      |\n| allowEmpty    | Set to `true` if an empty value is ok.                                                                     
|\n| const         | A specific `\"value\"` that must match                                                                       |\n| enum          | A list `[\"of\", \"values\"]` where one must match                                                             |\n| pattern       | A regular expression that the value must match                                                             |\n| glob          | A list of glob patterns `[\"http://**\", \"https://**\"]` where one must match                                 |\n| format        | A named format that the value must follow                                                                  |\n| time          | A time format specification                                                                                |\n| colourFormats | Controls the \"colour\" format. Any combination of  \"hex\", \"rgb\", and \"rgba\". Defaults to `[\"rgb\", \"rgba\"]`. |\n| geometry      | The geometry and coordinate type that must be used for WKT strings.                                        |\n| labels        | Labels used to describe the value                                                                          |\n| hints         | Key value pairs used to describe the value                                                                 |\n\nThe distinction between optional and allowEmpty is only relevant for data attributes. The document and block attributes defined in the NewsDoc schema always exist, so `optional` and `allowEmpty` will be treated as equivalent. 
\n\n#### Formats\n\nThe following formats are available:\n\n* `RFC3339`: an RFC3339 timestamp (\"2022-05-11T14:10:32Z\")\n* `int`: an integer (\"1234\")\n* `float`: a floating point number (\"12.34\")\n* `bool`: a boolean (\"true\" or \"false\")\n* `html`: validate the contents as HTML\n* `uuid`: validate the string as a UUID\n* `wkt`: validate the string as a [WKT geometry](#wkt-geometry).\n* `colour`: a colour in one of the formats specified in `colourFormats`.\n\nWhen using the format \"html\" it's also possible to use `htmlPolicy` to select a specific HTML policy. See the section on [HTML policies](#html-policies).\n\nThe document and block `uuid` attributes are always validated as UUIDs and need no additional \"uuid\" format specified.\n\n#### Time formats\n\nA Go time parsing layout (see the [time package](https://pkg.go.dev/time#pkg-constants) for documentation) that should be used to validate the timestamp.\n\n#### Globs\n\nGlob matching uses [https://github.com/gobwas/glob](https://github.com/gobwas/glob) for matching, and the glob patterns are compiled with \"/\" and \"+\" as separators.\n\n#### WKT geometry\n\nThe geometry specification is a combination of the geometry type to expect, and optionally the types of coordinates it should contain, in the format `{geometry-type}[-{coordinates}]`. If no geometry is specified any of the supported types and coordinates are allowed. If no coordinates are specified the default X and Y coordinates are assumed.\n\nGeometry types:\n\n* `point`\n* `multipoint`\n* `linestring`\n* `multilinestring`\n* `polygon`\n* `multipolygon`\n* `circularstring`\n\nCoordinates (X and Y are the default if nothing else is specified):\n\n* `z`: X, Y, and Z coordinates\n* `m`: X and Y coordinates and a measurement\n* `zm`: X, Y, and Z coordinates and a measurement\n\n#### Labels and hints\n\nLabels and hints do not play any role in the validation of documents. 
They are instead meant to describe the value for systems that use the information in the revisor schema to process the data correctly. They could, for example, be used to tell a system that a specific WKT point is the position of the document itself, that a string value should be indexed as a keyword (non-tokenised), or provide other kinds of processing hints unrelated to the validation.\n\nExample block declaration:\n\n``` json\n{\n  \"declares\": {\n    \"type\": \"tt/slugline\"\n  },\n  \"maxCount\": 1,\n  \"attributes\": {\n    \"value\": {\n      \"labels\": [\"keyword\"],\n      \"hints\": {\"alias\": [\"slug\"]}\n    }\n  }\n}\n```\n\n### Writing a document specification\n\nA specification for a document contains:\n\n* documentation attributes `name` and `description`\n* a declaration (`declares`) or pattern matching rule (`match`) \n* attribute constraints (`attributes`)\n* `meta`, `links`, and `content` block specifications\n\n``` json\n{\n  \"name\": \"Planning item\",\n  \"description\": \"Planned news coverage\",\n  \"declares\": \"core/newscoverage\",\n  \"meta\": [\n    {\n      \"name\": \"Main metadata block\",\n      \"declares\": {\"type\":\"core/newscoverage\"},\n      \"count\": 1,\n      \"data\": {\n        \"dateGranularity\": {\"enum\":[\"date\", \"datetime\"]},\n        \"description\": {\"allowEmpty\":true},\n        \"start\": {\"format\":\"RFC3339\"},\n        \"end\": {\"format\":\"RFC3339\"},\n        \"priority\": {},\n        \"publicDescription\":{\"allowEmpty\":true},\n        \"slug\": {\"allowEmpty\":true}\n      }\n    }\n  ],\n  \"links\": [\n    {\n      \"declares\": {\"type\": \"x-im/assignment\"},\n      \"links\": [\n        {\n          \"declares\": {\n            \"rel\":\"assignment\", \"type\": \"x-im/assignment\"\n          },\n          \"attributes\": {\n            \"uuid\": {}\n          }\n        }\n      ]\n    }\n  ]\n}\n```\n\n### Writing a block specification\n\nA block specification can contain:\n\n* documentation 
attributes `name` and `description`\n* a declaration (`declares`) or pattern matching rule (`match`) \n* attribute constraints (`attributes`)\n* `data` constraints\n* `meta`, `links`, and `content` block specifications\n* `count`, `minCount` and `maxCount` to control how many times a block can occur in the list of blocks it's in\n* `blocksFrom` directives that borrow the allowed blocks from a declared document type.\n\n``` json\n{\n  \"declares\": {\"type\": \"core/socialembed\"},\n  \"links\": [\n    {\n      \"declares\": {\"rel\":\"self\", \"type\":\"core/tweet\"},\n      \"maxCount\": 1,\n      \"attributes\": {\n        \"uri\": {\"glob\":[\"core://tweet/*\"]},\n        \"url\": {\"glob\":[\"https://twitter.com/narendramodi/status/*\"]}\n      }\n    },\n    {\n      \"declares\": {\"rel\":\"alternate\", \"type\":\"text/html\"},\n      \"maxCount\": 1,\n      \"attributes\": {\n        \"url\": {\"glob\":[\"https://**\"]},\n        \"title\": {}\n      },\n      \"data\": {\n        \"context\": {},\n        \"provider\": {}\n      }\n    }\n  ]\n}\n```\n\n### HTML policies\n\nHTML policies are used to restrict what elements and attributes can be used in strings with the format \"html\". Attributes are defined as string constraints on elements. 
The default policy could look like this:\n\n``` json\n  \"htmlPolicies\": [\n    {\n      \"name\": \"default\",\n      \"elements\": {\n        \"strong\": {\n          \"attributes\": {\n            \"id\": {\"optional\":true}\n          }\n        },\n        \"a\": {\n          \"attributes\": {\n            \"id\": {\"optional\":true},\n            \"href\": {}\n          }\n        }\n      }\n    },\n    {\n      \"name\": \"table\",\n      \"uses\": \"default\",\n      \"elements\": {\n        \"tr\": {\n          \"attributes\": {\n            \"id\": {\"optional\":true}\n          }\n        },\n        \"td\": {\n          \"attributes\": {\n            \"id\": {\"optional\":true}\n          }\n        },\n        \"th\": {\n          \"attributes\": {\n            \"id\": {\"optional\":true}\n          }\n        }\n      }\n    }\n  ]\n```\n\nAll \"html\" strings that use the default policy would then be able to use `\u003cstrong\u003e` and `\u003ca\u003e`, and the \"href\" attribute would be required for `\u003ca\u003e`. An \"html\" string that uses the \"table\" policy would be able to use everything from the default policy *and* `\u003ctr\u003e`, `\u003ctd\u003e`, and `\u003cth\u003e`.\n\nA customer can extend HTML policies using the \"extends\" attribute:\n\n``` json\n  \"htmlPolicies\": [\n    {\n      \"extends\": \"default\",\n      \"elements\": {\n        \"personTag\": {\n          \"attributes\": {\n            \"id\": {}\n          }\n        }\n      }\n    }\n  ]\n```\n\nThis would add support for \"\u003cpersontag\u003e/\u003cpersonTag\u003e\" (HTML is case-insensitive) to the default policy, and any policies that use it. 
Only one level of \"extends\" and \"uses\" is allowed; chaining policies further will result in an error.\n\n### Attribute reference\n\n#### Document attributes\n\nA list of available document attributes, and whether they can be used in pattern matching.\n\n| Name     | Description                                    | Match |\n|:---------|------------------------------------------------|:------|\n| uuid     | The document uuid                              | No    |\n| uri      | The URI that identifies the document           | No    |\n| url      | A web-browsable location for the document      | No    |\n| type     | The type of the document                       | Yes   |\n| language | The document language                          | No    |\n| title    | The document title                             | No    |\n\n#### Block attributes\n\nA list of available block attributes, and whether they can be used in pattern matching.\n\n| Name        | Description                                               | Match |\n|:------------|:----------------------------------------------------------|:------|\n| uuid        | The UUID of the document the block represents             | No    |\n| type        | The type of the block                                     | Yes   |\n| uri         | Identifies a resource in URI form                         | Yes   |\n| url         | A web-browsable location for the block                    | Yes   |\n| title       | Human readable title of the block                         | No    |\n| rel         | The relationship the block describes                      | Yes   |\n| name        | A name that identifies the block                          | Yes   |\n| value       | A generic value for the block                             | Yes   |\n| contenttype | The content type of the resource that the block describes | Yes   |\n| role        | The role that the block or resource has                   | Yes   |\n\n## Testing\n\nRevisor implements 
a file-driven test in `TestValidateDocument` that checks that all the \"testdata/results/*.json\" files match the validation results for the corresponding document under \"testdata/\". Result files with the prefix \"base-\" will be validated against \"constraints/naviga.json\"; for result files with the prefix \"example-\", the \"constraints/example.json\" constraints will be used as well.\n\nIf the constraints have been updated, or new example documents have been added, the result files can be regenerated using `./update-test-results.sh`.\n\n### Benchmarks\n\nThe benchmark `BenchmarkValidateDocument` tests the performance of validating \"testdata/example-article.json\" against the naviga and example organisation constraint sets.\n\nTo run the benchmark execute:\n\n``` bash\n$ go test -bench . -benchmem -cpu 1\n```\n\nAdd the flags `-memprofile memprofile.out -cpuprofile profile.out` to collect CPU and memory profiles. Run `go tool pprof -web profile.out` for the respective profile files to open a profile graph in your web browser.\n\n#### Comparing benchmarks\n\nInstall `benchstat`: `go install golang.org/x/perf/cmd/benchstat@latest`.\n\nRun the benchmark on the unchanged code (stash your changes or check out main):\n\n``` bash\n$ go test -bench . -benchmem -count 5 -cpu 1 | tee old.txt\n```\n\nThen run the benchmarks on the new code:\n\n``` bash\n$ go test -bench . 
-benchmem -count 5 -cpu 1 | tee new.txt\n```\n\nFinally, run benchstat to get a summary of the change:\n\n``` bash\n$ benchstat old.txt new.txt\nname              old time/op    new time/op    delta\nValidateDocument     203µs ± 7%      99µs ± 3%  -51.03%  (p=0.008 n=5+5)\n\nname              old alloc/op   new alloc/op   delta\nValidateDocument     134kB ± 0%      35kB ± 0%  -73.74%  (p=0.008 n=5+5)\n\nname              old allocs/op  new allocs/op  delta\nValidateDocument     1.05k ± 0%     0.59k ± 0%  -43.48%  (p=0.008 n=5+5)\n```\n\n### Fuzz tests\n\nThere are two fuzz targets in the project. `FuzzValidationWide` allows fuzzing of the document and two constraint sets. It will load the core constraints, the example organisation constraints, and all documents in \"./testdata/\" and add them as fuzzing seeds. `FuzzValidationConstraints` loads all constraint sets from \"./constraints/\" and adds them as fuzzing seeds. The fuzzing operation is then done against all documents in \"./testdata/\".\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fttab%2Frevisor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fttab%2Frevisor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fttab%2Frevisor/lists"}