Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pennsignals/aptos
:sunny: A tool for validating data using JSON Schema and converting JSON Schema documents into different data-interchange formats
avro avro-schema cli command-line-tool data-interchange json-schema python3 schema-conversion validation
- Host: GitHub
- URL: https://github.com/pennsignals/aptos
- Owner: pennsignals
- License: apache-2.0
- Archived: true
- Created: 2017-08-07T22:01:12.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-10-09T22:24:27.000Z (about 4 years ago)
- Last Synced: 2024-04-16T01:37:50.800Z (7 months ago)
- Topics: avro, avro-schema, cli, command-line-tool, data-interchange, json-schema, python3, schema-conversion, validation
- Language: Python
- Homepage:
- Size: 72.3 KB
- Stars: 150
- Watchers: 19
- Forks: 24
- Open Issues: 6
Metadata Files:
- Readme: readme.md
- Contributing: contributing.json
- License: LICENSE
README
> Validate client-submitted data using [JSON Schema](http://json-schema.org/) documents and convert JSON Schema documents into different data-interchange formats.
## Contents
- [Installation](#installation)
- [Usage](#usage)
- [Data Validation](#data-validation)
- [Data Validation CLI](#data-validation-cli)
- [Data Validation API](#data-validation-api)
- [Structured Message Generation](#structured-message-generation)
- [Supported Data-Interchange Formats](#supported-data-interchange-formats)
- [Avro](#avro)
- [Data-Interchange CLI](#data-interchange-cli)
- [Data-Interchange API](#data-interchange-api)
- [Testing](#testing)
- [Additional Resources](#additional-resources)
- [Future Considerations](#future-considerations)
- [Maintainers](#maintainers)
- [Contributing](#contributing)
- [License](#license)

## Why aptos?
- Validate client-submitted data
- Convert JSON Schema documents into different data-interchange formats
- Simple syntax
- CLI support for data validation and JSON Schema conversion
- [Stop Being a "Janitorial" Data Scientist](https://medium.com/@rightlag/stop-being-a-janitorial-data-scientist-5959cccbeac)

## Installation
**via pip**

    $ pip install aptos

**via git**

    $ git clone https://github.com/pennsignals/aptos.git && cd aptos
    $ python setup.py install

## Usage
`aptos` supports the following capabilities:
- **Data Validation:** Validate client-submitted data using [validation keywords](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6) described in the JSON Schema specification.
- **Schema Conversion:** Convert JSON Schema documents into different data-interchange formats. See the list of [supported data-interchange formats](#supported-data-interchange-formats) for more information.

```
usage: aptos [arguments] SCHEMA

aptos is a tool for validating client-submitted data using the JSON Schema
vocabulary and converts JSON Schema documents into different data-interchange
formats.

positional arguments:
  schema              JSON document containing the description

optional arguments:
  -h, --help          show this help message and exit

Arguments:
  {validate,convert}
    validate          Validate a JSON instance
    convert           Convert a JSON Schema into a different data-interchange
                      format

More information on JSON Schema: http://json-schema.org/
```
## Data Validation
Here is a basic example of a JSON Schema:
```json
{
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string"
        },
        "lastName": {
            "type": "string"
        },
        "age": {
            "description": "Age in years",
            "type": "integer",
            "minimum": 0
        }
    },
    "required": ["firstName", "lastName"]
}
```

Given a JSON Schema, `aptos` can validate client-submitted data to ensure that it satisfies a certain number of criteria.
JSON Schema [Validation keywords](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6) such as `minimum` and `required` can be used to impose requirements for successful validation of an instance. In the JSON Schema above, both the `firstName` and `lastName` properties are required, and the `age` property *MUST* have a value greater than or equal to 0.
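To make the two keywords concrete, here is a minimal standard-library sketch of what checking `required` and `minimum` amounts to for the `Person` schema. It is an illustration only, not how `aptos` implements validation; the names `PERSON_SCHEMA` and `validate_person` are hypothetical and do not come from the library.

```python
# Illustrative only: a hand-rolled check of `required` and `minimum`.
# PERSON_SCHEMA and validate_person are hypothetical names, not aptos APIs.
PERSON_SCHEMA = {
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"description": "Age in years", "type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}


def validate_person(instance, schema):
    """Return a list of violations of `required` and `minimum` (empty if none)."""
    errors = []
    # `required`: every listed property name must be present in the instance.
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property {name!r}")
    # `minimum`: a numeric value must be greater than or equal to the bound.
    for name, subschema in schema.get("properties", {}).items():
        if name in instance and "minimum" in subschema:
            value = instance[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_number and value < subschema["minimum"]:
                errors.append(f"{name!r} must be >= {subschema['minimum']}, got {value}")
    return errors


valid_errors = validate_person({"firstName": "John", "lastName": "Doe", "age": 42}, PERSON_SCHEMA)
invalid_errors = validate_person({"firstName": "John", "age": -15}, PERSON_SCHEMA)
print(valid_errors)
print(invalid_errors)
```

The second call reports both problems at once: the missing `lastName` property and the negative `age`.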
| Valid Instance :heavy_check_mark: | Invalid Instance :heavy_multiplication_x: |
|-------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|
| `{"firstName": "John", "lastName": "Doe", "age": 42}` | `{"firstName": "John", "age": -15}` (missing required property `lastName` and `age` is not greater than or equal to 0) |

`aptos` can validate client-submitted data using either the CLI or the API:
### Data Validation CLI
    $ aptos validate -instance INSTANCE SCHEMA

**Arguments:**
- **INSTANCE:** JSON document being validated
- **SCHEMA:** JSON document containing the description

**Example - macOS:**

    $ aptos validate -instance '{"firstName": "John"}' person.json

**Example - Windows:**

    > aptos validate -instance "{\"firstName\": \"John\"}" person.json

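The two examples differ only in how each shell quotes the JSON instance. One way to sidestep that, sketched below under the assumption that `aptos` is installed and on the `PATH`, is to build the argument list in Python and let `json.dumps` serialize the instance. The helper `build_validate_command` is a hypothetical name; the flag and argument order are copied from the examples above.

```python
import json
import shlex


def build_validate_command(instance, schema_path):
    """Return the argument list for `aptos validate`, with the instance serialized as JSON."""
    # Flag name and argument order follow the CLI examples in this README.
    return ["aptos", "validate", "-instance", json.dumps(instance), schema_path]


argv = build_validate_command({"firstName": "John"}, "person.json")

# shlex.join (Python 3.8+) shows the POSIX-shell form of the command.
print(shlex.join(argv))

# To actually run it (requires aptos to be installed):
#   import subprocess
#   subprocess.run(argv, check=False)
```

Passing a list to `subprocess.run` bypasses the shell, so no manual escaping of the double quotes is needed on either platform.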
| Successful Validation :heavy_check_mark: | Unsuccessful Validation :heavy_multiplication_x: |
|----------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------|
| ![](https://user-images.githubusercontent.com/2184329/29053486-5c787966-7bbe-11e7-8fd3-4cb51d87d7d9.png) | ![](https://user-images.githubusercontent.com/2184329/29053538-afcce9c6-7bbe-11e7-8be5-61ac1d876fc1.png) |

### Data Validation API
```python
import json

from aptos.parser import SchemaParser
from aptos.visitor import ValidationVisitor

with open('/path/to/schema') as fp:
    schema = json.load(fp)
component = SchemaParser.parse(schema)
# Invalid client-submitted data (instance)
instance = {
    'firstName': 'John'
}
try:
    component.accept(ValidationVisitor(instance))
except AssertionError as e:
    print(e)  # instance {'firstName': 'John'} is missing required property 'lastName'
```

## Structured Message Generation
Given a JSON Schema, `aptos` can generate different structured messages.
:warning: **Note:** The JSON Schema being converted *MUST* be a valid [JSON Object](https://spacetelescope.github.io/understanding-json-schema/reference/object.html).
## Supported Data-Interchange Formats
| Format | Supported | Notes |
|---------------------------------------------------------------------|:------------------------:|-----------------------------|
| [Apache Avro](https://avro.apache.org/) | :heavy_check_mark: | |
| [Protocol Buffers](https://developers.google.com/protocol-buffers/) | :heavy_multiplication_x: | Planned for future releases |
| [Apache Thrift](https://thrift.apache.org/) | :heavy_multiplication_x: | Planned for future releases |
| [Apache Parquet](https://parquet.apache.org/) | :heavy_multiplication_x: | Planned for future releases |

### Avro
Using the `Person` schema in the previous example, `aptos` can convert the schema into the Avro data-interchange format using either the CLI or the API.
`aptos` maps the following JSON Schema types to Avro types:
| JSON Schema Type | Avro Type |
|------------------|-----------|
| `string` | `string` |
| `boolean` | `boolean` |
| `null` | `null` |
| `integer` | `long` |
| `number` | `double` |
| `object` | `record` |
| `array` | `array` |

> JSON Schema documents containing the `enum` validation keyword are mapped to the Avro [`enum`](http://avro.apache.org/docs/current/spec.html#Enums) `symbols` attribute.
> JSON Schema documents with the `type` keyword as an array are mapped to Avro [Union](http://avro.apache.org/docs/current/spec.html#Unions) types.
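
The type table and the union rule above can be expressed as a small lookup. This is a sketch of the mapping only, not the code `aptos` uses; `JSON_TO_AVRO` and `avro_type` are hypothetical names, and a complete conversion of `object` and `array` would also need to emit fields and item types.

```python
# Illustrative only: the JSON Schema -> Avro type table as a dictionary.
# JSON_TO_AVRO and avro_type are hypothetical names, not aptos APIs.
JSON_TO_AVRO = {
    "string": "string",
    "boolean": "boolean",
    "null": "null",
    "integer": "long",
    "number": "double",
    "object": "record",
    "array": "array",
}


def avro_type(json_type):
    """Map a JSON Schema `type` value (a string or a list of strings) to an Avro type name."""
    if isinstance(json_type, list):
        # A `type` array becomes an Avro union, which is written as a JSON array.
        return [JSON_TO_AVRO[t] for t in json_type]
    return JSON_TO_AVRO[json_type]


print(avro_type("integer"))
print(avro_type(["null", "number"]))
```

An unknown type name raises `KeyError`, which keeps unsupported types from being silently passed through.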
## Data-Interchange CLI
    $ aptos convert -format FORMAT SCHEMA

**Arguments:**
- **FORMAT:** Data-interchange format
- **SCHEMA:** JSON document containing the description
## Data-Interchange API
```python
import json

from aptos.parser import SchemaParser
from aptos.schema.visitor import AvroSchemaVisitor

with open('/path/to/schema') as fp:
    schema = json.load(fp)
component = SchemaParser.parse(schema)
record = component.accept(AvroSchemaVisitor())
print(json.dumps(record, indent=2))
```

The above code generates the following Avro schema:
```json
{
  "type": "record",
  "fields": [
    {
      "doc": "",
      "type": "string",
      "name": "lastName"
    },
    {
      "doc": "",
      "type": "string",
      "name": "firstName"
    },
    {
      "doc": "Age in years",
      "type": "long",
      "name": "age"
    }
  ],
  "name": "Person"
}
```

## Testing
All unit tests exist in the [tests](tests) directory.
To run tests, execute the following command:
    $ python setup.py test

## Additional Resources
- [Stop Being a "Janitorial" Data Scientist](https://medium.com/@rightlag/stop-being-a-janitorial-data-scientist-5959cccbeac) - *A blog post explaining why aptos was created*
- [Understanding JSON Schema](https://spacetelescope.github.io/understanding-json-schema/) - *An excellent guide for schema authors, from the [Space Telescope Science Institute](http://www.stsci.edu/portal/)*

## Future Considerations
- [Swagger](https://swagger.io/) support
- Additional [data-interchange](#supported-data-interchange-formats) formats

## Maintainers
| ![Jason Walsh](https://avatars3.githubusercontent.com/u/2184329?v=3&s=128) |
|:--------------------------------------------------------------------------:|
| [Jason Walsh](https://twitter.com/rightlag) |

## Contributing
Contributions welcome! Please read the [`contributing.json`](contributing.json) file first.
Join our [Slack](https://aptos-io.slack.com) channel!
## License
[Apache 2.0](LICENSE) © [Penn Signals](https://github.com/pennsignals)