https://github.com/michaelmior/eson

A normalization tool for denormalized databases

# eson

[![CI](https://github.com/michaelmior/eson/actions/workflows/ci.yml/badge.svg)](https://github.com/michaelmior/eson/actions/workflows/ci.yml)

`eson` is a work-in-progress tool to extract a normalized schema from a denormalized relational schema.
The hope is that it can be useful for understanding and managing schemas of NoSQL applications.

## Installation

If you are a Rust user, you can install eson with `cargo install eson`.
Otherwise, you can download a Linux, Windows, or Mac binary from the [latest release](https://github.com/michaelmior/eson/releases/latest).

## Input format

Example input files are available in the `examples` directory.
Input files to `eson` are split into four sections.
The first specifies the denormalized input relations in the following format:

```
users(*user_id, first_name, last_name)
```

Fields marked with a `*` compose the primary key of that relation.
The second section specifies functional dependencies for each table.
The table name is given first, followed by the left and right-hand sides of the dependency.

```
users user_id -> first_name, last_name
```
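To make the two line formats above concrete, here is a minimal Python sketch that reads a relation definition and a functional dependency into plain data structures. It is not `eson`'s own parser: the helper names `parse_relation` and `parse_fd` are hypothetical, and it assumes that multiple fields on either side of `->` are comma-separated, as the right-hand side is in the example.

```python
import re


def parse_relation(line):
    """Parse 'name(*key, field, ...)' into (name, fields, key_fields)."""
    match = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", line)
    if match is None:
        raise ValueError(f"not a relation definition: {line!r}")
    name, body = match.groups()
    fields, key = [], []
    for raw in body.split(","):
        field = raw.strip()
        if field.startswith("*"):
            # A leading '*' marks the field as part of the primary key.
            field = field[1:].strip()
            key.append(field)
        fields.append(field)
    return name, fields, key


def parse_fd(line):
    """Parse 'table lhs, ... -> rhs, ...' into (table, lhs, rhs)."""
    left, right = line.split("->")
    # The table name comes first, then the left-hand side fields.
    table, lhs_text = left.strip().split(None, 1)
    lhs = [field.strip() for field in lhs_text.split(",")]  # assumed comma-separated
    rhs = [field.strip() for field in right.split(",")]
    return table, lhs, rhs


relation = parse_relation("users(*user_id, first_name, last_name)")
fd = parse_fd("users user_id -> first_name, last_name")
print(relation)
print(fd)
```

In this sketch the left-hand side of the dependency matches the primary key parsed from the relation, which is what one would expect for a key of `users`.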

The third section specifies inclusion dependencies in a similar manner, as in the examples below:

```
employees user_id <= users user_id
users user_id <= employees user_id
```
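As an illustration of what such a dependency asserts, the following Python sketch checks an inclusion dependency against in-memory rows. It assumes `<=` reads as "every value on the left also appears on the right", which is the usual meaning of an inclusion dependency. The function `ind_holds` and the sample rows are made up for this example and are not part of `eson`.

```python
def ind_holds(child_rows, child_fields, parent_rows, parent_fields):
    """Return True if every child value tuple also appears in the parent."""
    parent_values = {
        tuple(row[field] for field in parent_fields) for row in parent_rows
    }
    return all(
        tuple(row[field] for field in child_fields) in parent_values
        for row in child_rows
    )


# Hypothetical sample data: every employee is a user, but not vice versa.
users = [{"user_id": 1}, {"user_id": 2}, {"user_id": 3}]
employees = [{"user_id": 1}, {"user_id": 2}]

# employees user_id <= users user_id
forward = ind_holds(employees, ["user_id"], users, ["user_id"])
# users user_id <= employees user_id
backward = ind_holds(users, ["user_id"], employees, ["user_id"])
print(forward, backward)
```

With this sample data only the first dependency holds, so listing both directions, as in the example above, is a stronger statement than listing one.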

Two shortcuts can be used in this section.
First, if the inclusion dependency applies in both directions, then `==` can be used instead of separately specifying two dependencies.
Second, if the fields on the right-hand side are the same as those on the left, `...` can be used to replace the fields on the right.
Employing both of these shortcuts, the two dependencies above can be written as:

```
employees user_id == users ...
```
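The two shortcuts can be read as a purely textual expansion. The Python sketch below shows one way to expand them back into explicit `<=` dependencies. The helper `expand_ind` is hypothetical and only handles the single-line forms shown in this README; how `eson` itself processes the shortcuts may differ.

```python
def expand_ind(line):
    """Expand the '==' and '...' shortcuts into explicit '<=' dependencies."""
    op = "==" if "==" in line else "<="
    left, right = (side.strip() for side in line.split(op))
    left_table, left_fields = left.split(None, 1)
    right_table, right_fields = right.split(None, 1)
    if right_fields.strip() == "...":
        # '...' means the right-hand fields repeat the left-hand fields.
        right_fields = left_fields
    forward = f"{left_table} {left_fields} <= {right_table} {right_fields}"
    if op == "<=":
        return [forward]
    # '==' means the dependency also holds in the opposite direction.
    backward = f"{right_table} {right_fields} <= {left_table} {left_fields}"
    return [forward, backward]


expanded = expand_ind("employees user_id == users ...")
for dependency in expanded:
    print(dependency)
```

Applied to the shortened line, this yields the same two dependencies as the longer form given earlier.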

The final section is optional. It specifies table and column statistics, which are used when the `--use-stats` option enables a heuristics-based approach for ordering functional dependencies.
Statistics for a relation simply list the total number of entries in the relation.
Statistics for a column list the number of unique values as well as the maximum length.

```
users 1000
users user_id 1000 1
```
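For readers who want to generate or inspect these statistics programmatically, here is a small Python sketch that separates the two kinds of lines. It assumes the column values appear in the order described above (unique values, then maximum length) and that a line has exactly two or four whitespace-separated parts. The function `parse_stats` and the dictionary keys are hypothetical names, not part of `eson`.

```python
def parse_stats(lines):
    """Split statistics lines into per-table and per-column dictionaries."""
    table_stats, column_stats = {}, {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            # Relation statistics: table name and total number of entries.
            table, count = parts
            table_stats[table] = int(count)
        elif len(parts) == 4:
            # Column statistics: table, column, unique values, maximum length.
            table, column, unique, max_length = parts
            column_stats[(table, column)] = {
                "unique_values": int(unique),
                "max_length": int(max_length),
            }
        else:
            raise ValueError(f"unrecognized statistics line: {line!r}")
    return table_stats, column_stats


table_stats, column_stats = parse_stats(["users 1000", "users user_id 1000 1"])
print(table_stats)
print(column_stats)
```

In the example, `user_id` has as many unique values as `users` has entries, which is consistent with it being the primary key.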