{"id":22226583,"url":"https://github.com/adah1972/json_schema_converter","last_synced_at":"2025-04-11T12:23:22.261Z","repository":{"id":147987525,"uuid":"199161560","full_name":"adah1972/json_schema_converter","owner":"adah1972","description":"Make JSON schemas suitable for use in validators and MongoDB","archived":false,"fork":false,"pushed_at":"2019-08-30T04:21:29.000Z","size":56,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-25T08:42:38.343Z","etag":null,"topics":["json","json-schema","mongodb","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/adah1972.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-07-27T12:22:43.000Z","updated_at":"2024-12-22T13:40:13.000Z","dependencies_parsed_at":"2023-04-14T19:04:17.667Z","dependency_job_id":null,"html_url":"https://github.com/adah1972/json_schema_converter","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adah1972%2Fjson_schema_converter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adah1972%2Fjson_schema_converter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adah1972%2Fjson_schema_converter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adah1972%2Fjson_schema_converter/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/adah1972","download_url":"https://codeload.github.com/adah1972/json_schema_converter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248401350,"owners_count":21097328,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["json","json-schema","mongodb","python"],"created_at":"2024-12-03T00:31:20.366Z","updated_at":"2025-04-11T12:23:22.237Z","avatar_url":"https://github.com/adah1972.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JSON Schema Converter\n\n## Problems\n\n[JSON Schema][1] provides a useful way to validate JSON data. However,\n[MongoDB][2] supports only a subset of JSON Schema specification draft 4.\nSpecifically, [definitions and references are left out][3] (as of 28\nJuly 2019). MongoDB 3.2 and 3.4 do not even support JSON Schema, but\nonly a [MongoDB-proprietary validation method][4]. Besides, [MongoDB has\na richer type system than JSON Schema][5]. . . .\n\n[1]: https://json-schema.org/\n[2]: https://www.mongodb.com/\n[3]: https://docs.mongodb.com/manual/reference/operator/query/jsonSchema/#json-schema-omission\n[4]: https://docs.mongodb.com/v3.2/core/document-validation/\n[5]: https://docs.mongodb.com/manual/reference/operator/query/type/#document-type-available-types\n\n## Solution\n\n### One input, multiple outputs\n\nMy decision is that all my schemas should take the same format, but can\nbe converted to serve different purposes. 
I also decide that all types\nshould be treated equally, so a custom type will be referenced simply as\n`\"type\": \"MyType\"`, instead of `\"$ref\": \"…\"` (or `\"bsonType\": \"…\"`, in\nthe case of MongoDB BSON-specific types). Apart from that, the input\nformat conforms to JSON Schema (draft 4). This will allow people to\nwrite simply `\"type\": \"objectId\"` when the MongoDB BSON type ObjectId is\nintended.\n\n### Example\n\nAn input (assume it is named *test.json*):\n\n```json\n{\n  \"type\": \"object\",\n  \"required\": [\"_id\", \"name\", \"gender\"],\n  \"properties\": {\n    \"_id\": {\n      \"type\": \"objectId\"\n    },\n    \"name\": {\n      \"type\": \"string\",\n      \"maxLength\": 80\n    },\n    \"gender\": {\n      \"type\": \"string\",\n      \"enum\": [\"male\", \"female\", \"ambiguous\", \"unknown\"]\n    },\n    \"age\": {\n      \"type\": \"number\"\n    }\n  },\n  \"additionalProperties\": false\n}\n```\n\n---\n\nOutput with `./convert_schema.py -t draft4 test.json` (`-t draft4` can\nbe omitted, as it is the default):\n\n```json\n{\n  \"$schema\": \"http://json-schema.org/draft-04/schema#\",\n  \"definitions\": {\n    \"objectId\": {\n      \"type\": \"object\",\n      \"required\": [\n        \"$oid\"\n      ],\n      \"properties\": {\n        \"$oid\": {\n          \"type\": \"string\",\n          \"pattern\": \"^[0-9A-Fa-f]{24}$\"\n        }\n      },\n      \"additionalProperties\": false\n    }\n  },\n  \"type\": \"object\",\n  \"required\": [\n    \"_id\",\n    \"name\",\n    \"gender\"\n  ],\n  \"properties\": {\n    \"_id\": {\n      \"$ref\": \"#/definitions/objectId\"\n    },\n    \"name\": {\n      \"type\": \"string\",\n      \"maxLength\": 80\n    },\n    \"gender\": {\n      \"type\": \"string\",\n      \"enum\": [\n        \"male\",\n        \"female\",\n        \"ambiguous\",\n        \"unknown\"\n      ]\n    },\n    \"age\": {\n      \"type\": \"number\"\n    }\n  },\n  \"additionalProperties\": false\n}\n```\n\n(You can 
see that `\"type\": \"objectId\"` is changed to `\"$ref\":\n\"#/definitions/objectId\"`, and a definition of `objectId`, included in\nmy script, is generated automatically. Of course, you can add your own\ndefinitions too.)\n\n---\n\nOutput with `./convert_schema.py -t mongo36 test.json`:\n\n```json\n{\n  \"$jsonSchema\": {\n    \"bsonType\": \"object\",\n    \"required\": [\n      \"_id\",\n      \"name\",\n      \"gender\"\n    ],\n    \"properties\": {\n      \"_id\": {\n        \"bsonType\": \"objectId\"\n      },\n      \"name\": {\n        \"bsonType\": \"string\",\n        \"maxLength\": 80\n      },\n      \"gender\": {\n        \"bsonType\": \"string\",\n        \"enum\": [\n          \"male\",\n          \"female\",\n          \"ambiguous\",\n          \"unknown\"\n        ]\n      },\n      \"age\": {\n        \"bsonType\": \"number\"\n      }\n    },\n    \"additionalProperties\": false\n  }\n}\n```\n\n(You can see that `\"type\"` is changed to `\"bsonType\"` throughout, which is\ndone for consistency and is strictly necessary only for types not present\nin standard JSON, and that the whole thing is wrapped in a `$jsonSchema`\nfield for easy use with MongoDB.)\n\n---\n\nOutput with `./convert_schema.py -t mongo32 test.json` (only basic\nsupport for this target type is implemented, as MongoDB 3.4 will soon\nreach its end of life):\n\n```json\n{\n  \"_id\": {\n    \"$type\": \"objectId\"\n  },\n  \"name\": {\n    \"$type\": \"string\"\n  },\n  \"gender\": {\n    \"$type\": \"string\",\n    \"$in\": [\n      \"male\",\n      \"female\",\n      \"ambiguous\",\n      \"unknown\"\n    ]\n  }\n}\n```\n\nThe old MongoDB validation method has the strange behaviour that `$type`\nmatches the type of the elements of an array, instead of the array type.\nThe current code takes this into account.\n\n### Expanding definitions for ‘mongo36’\n\nOne major shortcoming of the MongoDB schema validation is that\n**definitions** and **$ref** are not supported. 
My converter supports\ndefinition expansion for the ‘mongo36’ target. Of course, recursive\ntypes have to be crippled.\n\nLet us look at a loose definition of `geoJsonObject` as follows (please\nnotice the recursive definition of `coordinates`):\n\n```json\n{\n  \"definitions\": {\n    \"coordinates\": {\n      \"type\": \"array\",\n      \"items\": {\n        \"anyOf\": [\n          {\"type\": \"number\"},\n          {\"type\": \"coordinates\"}\n        ]\n      }\n    },\n    \"geoJsonObject\": {\n      \"type\": \"object\",\n      \"required\": [\"type\", \"coordinates\"],\n      \"properties\": {\n        \"type\": {\n          \"type\": \"string\",\n          \"enum\": [\n            \"Point\",\n            \"MultiPoint\",\n            \"LineString\",\n            \"MultiLineString\",\n            \"Polygon\",\n            \"MultiPolygon\"\n          ]\n        },\n        \"coordinates\": {\"type\": \"coordinates\"}\n      },\n      \"additionalProperties\": false\n    }\n  }\n}\n```\n\nIf a location was described this way:\n\n```json\n{\n  \"properties\": {\n    \"location\": {\n      \"type\": \"geoJsonObject\"\n    }\n  }\n}\n```\n\nThe ‘draft4’ output would be:\n\n```json\n{\n  \"properties\": {\n    \"location\": {\n      \"$ref\": \"#/definitions/geoJsonObject\"\n    }\n  }\n}\n```\n\nThe ‘mongo36’ output would be:\n\n```json\n{\n  \"properties\": {\n    \"location\": {\n      \"bsonType\": \"object\",\n      \"required\": [\n        \"type\",\n        \"coordinates\"\n      ],\n      \"properties\": {\n        \"type\": {\n          \"bsonType\": \"string\",\n          \"enum\": [\n            \"Point\",\n            \"MultiPoint\",\n            \"LineString\",\n            \"MultiLineString\",\n            \"Polygon\",\n            \"MultiPolygon\"\n          ]\n        },\n        \"coordinates\": {\n          \"bsonType\": \"array\"\n        }\n      },\n      \"additionalProperties\": false\n    }\n  }\n}\n```\n\n### Alternative 
definitions\n\nSometimes we may want one type to serve very different purposes for\ndifferent kinds of outputs. E.g. we may want to use a string in [data\nURI scheme][6] in JSON data input, but also to store the data in a more\nefficient way in the database — MongoDB has a `binData` type\nspecifically for this purpose. Therefore, we may want to use separate\ndefinitions for `json` schema output and `mongodb` schema output. For\nthis reason, I have introduced `alt_definitions`. Let us look at an\nexample:\n\n```json\n{\n  \"alt_definitions\": {\n    \"json\": {\n      \"binaryData\": {\n        \"type\": \"string\",\n        \"pattern\": \"^data:.*?,\"\n      }\n    },\n    \"mongodb\": {\n      \"binaryData\": {\n        \"type\": \"object\",\n        \"required\": [\"media_type\", \"data\"],\n        \"properties\": {\n          \"media_type\": {\"type\": \"string\"},\n          \"data\": {\"type\": \"binData\"}\n        },\n        \"additionalProperties\": false\n      }\n    }\n  }\n}\n```\n\nWith these definitions, a `binaryData` type can validate against a\nstring like `\"data:image/png;base64, …\"` as JSON input, but also\nvalidate against an object containing a `media_type` tag as well as real\nbinary `data` stored in MongoDB.\n\n[6]: https://en.wikipedia.org/wiki/Data_URI_scheme\n\n### Separate definitions\n\nThis script now supports providing the definitions file separately from\nthe schema so that it is easier to share common definitions in a\nproject. One can use the `-d` command-line option to pass additional\ndefinitions (this option can be repeated).\n\n### A last notice\n\nAn input may be valid for one target type but not for another. 
For\nexample, I do not use the BSON type *Timestamp*, and my converter does\nnot include special support for it (though it can be done quite\ntrivially): while a schema containing the type `timestamp` can be\nconverted targeting ‘mongo32’/‘mongo36’, it will fail when targeting\n‘draft4’.\n\n## System requirements\n\nFor simplicity, the code is written for Python 3.6+ only. No additional\npackages are needed.\n\n## Licence\n\nCopyright © 2019 Wu Yongwei.\n\nThis software is provided ‘as-is’, without any express or implied\nwarranty. In no event will the author be held liable for any damages\narising from the use of this software. Permission is granted to anyone\nto use this software for any purpose, including commercial applications,\nand to alter it and redistribute it freely, subject to the following\nrestrictions:\n\n1. The origin of this software must not be misrepresented; you must not\n   claim that you wrote the original software. If you use this software\n   in a product, an acknowledgement in the product documentation would\n   be appreciated but is not required.\n2. Altered source versions must be plainly marked as such, and must not\n   be misrepresented as being the original software.\n3. This notice may not be removed or altered from any source distribution.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadah1972%2Fjson_schema_converter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadah1972%2Fjson_schema_converter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadah1972%2Fjson_schema_converter/lists"}