https://github.com/contiamo/rhombic
SQL parsing, lineage extraction and manipulation
- Host: GitHub
- URL: https://github.com/contiamo/rhombic
- Owner: contiamo
- Created: 2019-07-05T13:02:07.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-05-06T05:32:56.000Z (over 1 year ago)
- Last Synced: 2024-05-28T19:53:15.821Z (7 months ago)
- Topics: lineage, parser, postgresql, spark, sql, sql-lineage
- Language: TypeScript
- Homepage:
- Size: 2.72 MB
- Stars: 48
- Watchers: 10
- Forks: 8
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
Utilities for parsing, analysing and manipulating SQL.
## Getting started
```bash
npm install rhombic
```

## Description
Rhombic provides two parsers, each supporting a different set of operations:
- A [Chevrotain](https://sap.github.io/chevrotain)-based parser, built from the ground up to support simple statements in the ANSI SQL dialect. The parsed SQL tree can then be manipulated (adding projection items, ordering, filters) and serialized back to SQL text. For details and available operations, see [src/index.ts](src/index.ts).
- An [Antlr](https://www.antlr.org)-based parser (generated with [antlr4ts](https://github.com/tunnelvisionlabs/antlr4ts)) whose grammar is derived from the Apache Spark SQL grammar, with the goal of supporting most SQL dialects with broad functionality. Currently this mode can be used to extract SQL column-level lineage. For details and available operations, see [src/antlr/index.ts](src/antlr/index.ts).

## Antlr parser - lineage
To build SQL column-level lineage for a query using the Antlr-based parser:
```ts
import { antlr, TablePrimary } from "rhombic";

try {
  const parsingOptions = {
    // if double quotes should quote identifiers:
    doubleQuotedIdentifier: true
  };
  const q = antlr.parse("SELECT * FROM abc;", parsingOptions);

  console.log(q.getUsedTables()); // [{ tableName: "abc" }]

  const getTable = (table: TablePrimary) => {
    /* Logic to retrieve table & columns metadata */
  };

  // Whether to use "mergedLeaves" or "tree" lineage type
  const mergedLeaves = true;
  const lineageOptions = {
    positionalRefsEnabled: false
  };
  const lineage = q.getLineage(getTable, mergedLeaves, lineageOptions);
} catch (e) {
  // Parsing errors
}
```

You can then use something like [react-flow](https://github.com/wbkd/react-flow) to draw a nice visualization of your lineage:
![image](https://user-images.githubusercontent.com/200647/134165142-e6c5e50c-82a0-4eef-b9ec-8cc96c31dfcd.png)
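As a rough illustration of the visualization step, the sketch below converts a lineage description into plain `nodes` and `edges` objects of the kind react-flow consumes (`id`, `position`, `data` for nodes; `id`, `source`, `target` for edges). This README does not document the shape of the value returned by `getLineage`, so the `Lineage`, `LineageTable` and `LineageEdge` types and the `toFlowElements` helper are hypothetical and would need adapting to the real output. For simplicity it collapses column-level edges into one edge per table pair.

```ts
// Hypothetical lineage shape, for illustration only; NOT rhombic's actual return type.
interface LineageColumn { id: string; label: string; }
interface LineageTable { id: string; label: string; columns: LineageColumn[]; }
interface LineageEdge {
  sourceTableId: string;
  sourceColumnId: string;
  targetTableId: string;
  targetColumnId: string;
}
interface Lineage { tables: LineageTable[]; edges: LineageEdge[]; }

// Plain objects following the basic node/edge fields react-flow expects.
interface FlowNode { id: string; position: { x: number; y: number }; data: { label: string }; }
interface FlowEdge { id: string; source: string; target: string; }

function toFlowElements(
  lineage: Lineage,
  columnGap = 250
): { nodes: FlowNode[]; edges: FlowEdge[] } {
  // One node per table, laid out left to right; the label lists the column names.
  const nodes: FlowNode[] = lineage.tables.map((table, index) => ({
    id: table.id,
    position: { x: index * columnGap, y: 0 },
    data: { label: `${table.label} (${table.columns.map(c => c.label).join(", ")})` }
  }));

  // Collapse column-level edges into a single edge per (source table, target table) pair.
  const seen = new Set<string>();
  const edges: FlowEdge[] = [];
  for (const edge of lineage.edges) {
    const id = `${edge.sourceTableId}->${edge.targetTableId}`;
    if (seen.has(id)) continue;
    seen.add(id);
    edges.push({ id, source: edge.sourceTableId, target: edge.targetTableId });
  }
  return { nodes, edges };
}
```

The resulting arrays could then be passed to a react-flow component. Drawing individual column-to-column links would require custom node types with per-column handles, which this sketch leaves out.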
## Chevrotain parser - SQL manipulation
```ts
import rhombic from "rhombic";

try {
  const query = rhombic
    .parse("SELECT * FROM abc;")
    .addProjectionItem("city")
    .toString();

  console.log(query); // SELECT city FROM abc;
} catch (e) {
  // Parsing errors
}
```

## How to publish to npm?
Just update the `version` in `package.json`!
As soon as your branch is merged into master, a new version is automatically published to npm for you.
## History
This project was built to support the Contiamo® workbench editor (a fancy SQL editor).
## Resources
[SQL 2003-2 BNF](https://github.com/ronsavage/SQL/blob/master/sql-2003-2.bnf)