https://github.com/simonacca/dict2sql

express SQL as python data structures

# dict2sql, the missing SQL API

Build SQL queries with Python dictionaries. Composable, safe and reusable.

See [API.md](API.md) for comprehensive documentation.

# Installation

```shell
pip install git+https://github.com/simonacca/dict2sql@v3.0.0
```

# Examples

```python
from dict2sql import to_sql

query = {
    "Select": ["name", "height", "country"],
    "From": "mountains",
    "Where": {
        "Op": "AND",
        "Predicates": [
            {"Op": ">=", "Sx": "height", "Dx": 3000},
            {"Op": "=", "Sx": "has_glacier", "Dx": True}
        ],
    },
    "Limit": 3
}

sql = to_sql(query)
print(sql)
```

produces

```sql
SELECT "name" , "height" , "country" FROM "mountains" WHERE "height" >= 3000 AND "has_glacier" = TRUE LIMIT 3
```

## INSERT Statement

```python
from dict2sql import to_sql

insert_query = {
    "StatementType": "INSERT",
    "Table": "users",
    "Insert": {
        "name": "Alice Johnson",
        "email": "alice@example.com",
        "age": 28,
        "active": True
    }
}

sql = to_sql(insert_query)
print(sql)
```

produces

```sql
INSERT INTO "users" ( "name" , "email" , "age" , "active" ) VALUES ( 'Alice Johnson' , 'alice@example.com' , 28 , TRUE )
```

## UPDATE Statement

```python
from dict2sql import to_sql

update_query = {
    "StatementType": "UPDATE",
    "Table": "users",
    "Values": {
        "email": "alice.johnson@newdomain.com",
        "last_login": "2024-01-15T10:30:00"
    },
    "Where": {
        "Op": "=",
        "Sx": "id",
        "Dx": 123
    }
}

sql = to_sql(update_query)
print(sql)
```

produces

```sql
UPDATE "users" SET "email" = 'alice.johnson@newdomain.com' , "last_login" = '2024-01-15T10:30:00' WHERE "id" = 123
```

## DELETE Statement

```python
from dict2sql import to_sql

delete_query = {
    "StatementType": "DELETE",
    "Table": "users",
    "Where": {
        "Op": "AND",
        "Predicates": [
            {"Op": "=", "Sx": "active", "Dx": False},
            {"Op": "<", "Sx": "last_login", "Dx": "2023-01-01"}
        ]
    }
}

sql = to_sql(delete_query)
print(sql)
```

produces

```sql
DELETE FROM "users" WHERE "active" = FALSE AND "last_login" < '2023-01-01'
```

## Advanced SELECT Examples

### Aggregations and GROUP BY

```python
from dict2sql import to_sql

aggregation_query = {
    "Select": [
        "country",
        {"Function": "COUNT", "Column": "*", "Alias": "mountain_count"},
        {"Function": "AVG", "Column": "height", "Alias": "avg_height"},
        {"Function": "MAX", "Column": "height", "Alias": "highest_peak"}
    ],
    "From": "mountains",
    "Where": {
        "Op": ">=",
        "Sx": "height",
        "Dx": 1000
    },
    "GroupBy": "country",
    "Having": {
        "Op": ">",
        "Sx": {"Function": "COUNT", "Column": "*"},
        "Dx": 5
    },
    "OrderBy": {"Column": "avg_height", "Direction": "DESC"}
}

sql = to_sql(aggregation_query)
print(sql)
```

produces

```sql
SELECT "country" , COUNT ( * ) AS "mountain_count" , AVG ( "height" ) AS "avg_height" , MAX ( "height" ) AS "highest_peak" FROM "mountains" WHERE "height" >= 1000 GROUP BY "country" HAVING COUNT ( * ) > 5 ORDER BY "avg_height" DESC
```

### JOINs and Subqueries

```python
from dict2sql import to_sql

join_query = {
    "Select": [
        "m.name",
        "m.height",
        "c.population",
        "c.continent"
    ],
    "From": {
        "Sx": {"TableName": "mountains", "Alias": "m"},
        "Join": "INNER JOIN",
        "Dx": {"TableName": "countries", "Alias": "c"},
        "On": {"Op": "=", "Sx": "m.country_code", "Dx": "c.code"}
    },
    "Where": {
        "Op": "AND",
        "Predicates": [
            {"Op": ">", "Sx": "m.height", "Dx": 8000},
            {"Op": "=", "Sx": "c.continent", "Dx": "Asia"}
        ]
    },
    "OrderBy": [
        {"Column": "m.height", "Direction": "DESC"},
        {"Column": "c.population", "Direction": "ASC"}
    ]
}

sql = to_sql(join_query)
print(sql)
```

produces

```sql
SELECT "m"."name" , "m"."height" , "c"."population" , "c"."continent" FROM "mountains" AS "m" INNER JOIN "countries" AS "c" ON "m"."country_code" = "c"."code" WHERE "m"."height" > 8000 AND "c"."continent" = 'Asia' ORDER BY "m"."height" DESC , "c"."population" ASC
```

### Arithmetic Expressions

```python
from dict2sql import to_sql

arithmetic_query = {
    "Select": [
        "name",
        "height",
        {
            "Left": "height",
            "Op": "*",
            "Right": 3.28084,
            "Alias": "height_feet"
        },
        {
            "Left": {
                "Left": "height",
                "Op": "-",
                "Right": "base_elevation"
            },
            "Op": "/",
            "Right": "height",
            "Alias": "prominence_ratio"
        }
    ],
    "From": "mountains",
    "Where": {
        "Op": ">",
        "Sx": "height",
        "Dx": 5000
    },
    "OrderBy": {"Column": "prominence_ratio", "Direction": "DESC"},
    "Limit": 10
}

sql = to_sql(arithmetic_query)
print(sql)
```

produces

```sql
SELECT "name" , "height" , "height" * 3.28084 AS "height_feet" , ( "height" - "base_elevation" ) / "height" AS "prominence_ratio" FROM "mountains" WHERE "height" > 5000 ORDER BY "prominence_ratio" DESC LIMIT 10
```

# Notes

## Rationale

For historical reasons, relational database interfaces usually consist of domain-specific languages (mostly dialects of SQL)
rather than compositions of data structures, as is common with modern APIs (for example JSON-based REST or protobuf).
While a domain-specific language (DSL) is well suited to interactive use, such as manually exploring a dataset, it has limitations when interfacing with a database programmatically (for example from a Python script).

This library brings a modern API to SQL databases, allowing the user to express queries as compositions of basic Python data structures: dicts, lists, strings, ints, floats and booleans.

Among the primary benefits of this approach is a superior ability to reuse code. All the usual Python constructs and software engineering best practices are available to the query author, who can express queries in clean, maintainable code.
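
As a rough illustration of that reuse, the sketch below builds the first example's query from small helper functions. The helper names (`compare`, `all_of`, `mountains_query`) are hypothetical and not part of `dict2sql`; only the dictionary keys are taken from the examples above.

```python
def compare(op, column, value):
    """Build one comparison predicate in the shape used by the examples above."""
    return {"Op": op, "Sx": column, "Dx": value}


def all_of(*predicates):
    """Combine several predicates with AND."""
    return {"Op": "AND", "Predicates": list(predicates)}


def mountains_query(columns, *predicates, limit=None):
    """Assemble a SELECT over the `mountains` table from reusable parts."""
    query = {"Select": list(columns), "From": "mountains"}
    if len(predicates) == 1:
        # A single predicate is used directly, as in the UPDATE example.
        query["Where"] = predicates[0]
    elif predicates:
        query["Where"] = all_of(*predicates)
    if limit is not None:
        query["Limit"] = limit
    return query


# Predicates are plain values, so they can be named, stored and shared.
is_tall = compare(">=", "height", 3000)
has_glacier = compare("=", "has_glacier", True)

query = mountains_query(["name", "height", "country"], is_tall, has_glacier, limit=3)
```

The resulting `query` is the same dictionary as in the first example, so passing it to `to_sql` should yield the same SQL.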

Query-as-data also means compatibility with Python's type hinting system, which translates to fewer query-correctness issues, better error messages (at least compared with some query engines), and a quicker development experience.
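
To make the type-hinting point concrete, here is a minimal sketch using `typing.TypedDict`. The type names (`Comparison`, `Conjunction`, `SelectQuery`) are assumptions made for illustration and cover only a few keys from the examples above; the library's own type definitions may look different.

```python
from typing import List, TypedDict, Union

Value = Union[str, int, float, bool]


# Hypothetical types for illustration; dict2sql's real definitions may differ.
class Comparison(TypedDict):
    Op: str
    Sx: str
    Dx: Value


class Conjunction(TypedDict):
    Op: str
    Predicates: List[Comparison]


class SelectQuery(TypedDict, total=False):
    Select: List[str]
    From: str
    Where: Union[Comparison, Conjunction]
    Limit: int


# Annotating the variable lets a static checker validate keys and value types.
query: SelectQuery = {
    "Select": ["name", "height"],
    "From": "mountains",
    "Where": {"Op": ">=", "Sx": "height", "Dx": 3000},
    "Limit": 3,
}
```

With annotations like these, a static checker such as mypy can flag a misspelled key (for example `"Form"` instead of `"From"`) before the query is ever sent to a database. `TypedDict` performs no checks at runtime.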

Notably, this solution eliminates one major source of friction in handling SQL from a programming language: SQL injection and escaping. While solutions to this problem, such as parametrized queries, have been developed over time, they heavily favor safety at the expense of expressivity; composing parametrized queries at runtime is usually forbidden.
How is this accomplished? Because it has granular information about each component of a query, `dict2sql` can apply escaping where needed, resulting in safe queries.
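
The idea can be sketched as follows. This is not `dict2sql`'s actual implementation, only a simplified illustration that assumes ANSI-style quoting; it always treats `Sx` as a column name and `Dx` as a value, and all function names are hypothetical.

```python
def quote_identifier(name):
    """Quote a table or column name, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Render a Python value as a SQL literal, doubling embedded single quotes."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def render_comparison(predicate):
    """Render an {"Op", "Sx", "Dx"} predicate, escaping each part by its role."""
    allowed_ops = {"=", "<", ">", "<=", ">=", "<>"}
    if predicate["Op"] not in allowed_ops:
        # Operators are checked against a fixed set rather than escaped.
        raise ValueError(f"unsupported operator: {predicate['Op']!r}")
    return " ".join([
        quote_identifier(predicate["Sx"]),
        predicate["Op"],
        quote_literal(predicate["Dx"]),
    ])


print(render_comparison({"Op": "=", "Sx": "name", "Dx": "x' OR '1'='1"}))
# → "name" = 'x'' OR ''1''=''1'
```

Because the renderer knows that `Dx` is a value here, the injection attempt ends up as a single harmless string literal. Quoting rules vary between SQL dialects, so a real implementation would need per-dialect handling.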

Finally, this library strictly tries to do *one* job well, namely *composing SQL queries*. There are many related functionalities in this space that we explicitly avoid taking on, feeling that they are best left to other, very mature libraries in the Python ecosystem, for example connecting to the database and performing queries, or parsing query return values.
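
As one possible division of labour, the sketch below hands a generated SQL string to Python's standard `sqlite3` module. The string is copied from the UPDATE example above rather than produced by a live `to_sql` call, and the table schema is an assumption made only for this illustration.

```python
import sqlite3

# SQL copied from the UPDATE example; in practice it would come from to_sql().
sql = """UPDATE "users" SET "email" = 'alice.johnson@newdomain.com' , "last_login" = '2024-01-15T10:30:00' WHERE "id" = 123"""

# Connection handling and execution are left to the database driver.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, last_login TEXT)")
conn.execute("INSERT INTO users (id, email) VALUES (123, 'alice@example.com')")

conn.execute(sql)
row = conn.execute("SELECT email, last_login FROM users WHERE id = 123").fetchone()
conn.close()

print(row)
```

Any other driver or connection library could take the place of `sqlite3` here, since the generated query is an ordinary string.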

## Implementation details

This project currently targets ANSI SQL, with the ambition of soon targeting all major SQL dialects.

Tests are based on the [Chinook Database](https://github.com/lerocha/chinook-database).

## Contributing

Contributions and forks are welcome!

If you want to extend the current language to increase coverage of ANSI SQL, go right ahead.

If you plan to contribute major features such as support for a new dialect, it is recommended to start a PR early on in the development process to prevent duplicate work and ensure that it will be possible to merge the PR without any hiccups.