Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/zbrookle/sql_to_ibis

A Python package that parses sql and converts it to ibis expressions
https://github.com/zbrookle/sql_to_ibis

data databases dataframes etl hacktoberfest ibis sql

Last synced: 3 months ago
JSON representation

A Python package that parses sql and converts it to ibis expressions

Awesome Lists containing this project

README

        

sql_to_ibis
===========

.. image:: https://github.com/zbrookle/sql_to_ibis/workflows/CI/badge.svg?branch=master
:target: https://github.com/zbrookle/sql_to_ibis/actions?query=workflow

.. image:: https://pepy.tech/badge/sql-to-ibis
:target: https://pepy.tech/project/sql-to-ibis

.. image:: https://img.shields.io/pypi/l/sql_to_ibis.svg
:target: https://github.com/zbrookle/sql_to_ibis/blob/master/LICENSE.txt

.. image:: https://img.shields.io/pypi/status/sql_to_ibis.svg
:target: https://pypi.python.org/pypi/sql_to_ibis/

.. image:: https://img.shields.io/pypi/v/sql_to_ibis.svg
:target: https://pypi.python.org/pypi/sql_to_ibis/

.. image:: https://codecov.io/gh/zbrookle/sql_to_ibis/branch/master/graph/badge.svg?
:target: https://codecov.io/gh/zbrookle/sql_to_ibis

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
:target: https://github.com/psf/black

``sql_to_ibis`` is a Python_ package that translates SQL syntax into ibis_ expressions.
This provides the capability of using only one SQL dialect to target many different
backends

.. _Python: https://www.python.org/
.. _ibis: https://github.com/ibis-project/ibis

Installation
------------

.. code-block:: bash

pip install sql_to_ibis

Usage
-----

Registering and removing temp tables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To use an ibis table in sql_to_ibis you must register it. Note that for joins or
queries that involve more than one table you must use the same ibis client when
creating both ibis tables. Once the table is registered you can query it using SQL
with the *query* function. In the example below, we create and query a pandas DataFrame

.. code-block:: python

import ibis.pandas
import pandas
import sql_to_ibis

df = pandas.DataFrame({"column1": [1, 2, 3], "column2": ["4", "5", "6"]})
ibis_table = ibis.pandas.Backend().from_dataframe(
df, name="my_table", client=ibis.pandas.connect({})
)
sql_to_ibis.register_temp_table(ibis_table, "my_table")
sql_to_ibis.query(
"select column1, cast(column2 as integer) + 1 as my_col2 from my_table"
).execute()

This would output a DataFrame that looks like:

+---------+---------+
| column1 | my_col2 |
+=========+=========+
| 1 | 5 |
+---------+---------+
| 2 | 6 |
+---------+---------+
| 3 | 7 |
+---------+---------+

SQL Syntax
----------
The sql syntax for sql_to_ibis is as follows (Note that all syntax is case insensitive):

Select statement
~~~~~~~~~~~~~~~~

.. code-block:: SQL

SELECT [{ ALL | DISTINCT }]
{ [ ] | [ [ AS ] ] } [, ...]
[ FROM [, ...] ]
[ WHERE ]
[ GROUP BY { [, ...] } ]
[ HAVING ]

Example:

.. code-block:: SQL

SELECT
column4,
Sum(column1)
FROM
my_table
WHERE
column3 = 'yes'
AND column2 = 'no'
GROUP BY
column4

Note that columns with spaces in them can be expressed using double quotes. For example:

.. code-block:: SQL

SELECT
"my column 1",
column2 as "the second column"
FROM
my_table

Set operations
~~~~~~~~~~~~~~

.. code-block:: SQL


{UNION [DISTINCT] | UNION ALL | INTERSECT [DISTINCT] | EXCEPT [DISTINCT] | EXCEPT ALL}

Example:

.. code-block:: SQL

SELECT
*
FROM
table1
UNION
SELECT
*
FROM
table2

Joins
~~~~~

.. code-block:: SQL

INNER, CROSS, FULL OUTER, LEFT OUTER, RIGHT OUTER, FULL, LEFT, RIGHT

.. code-block:: SQL

SELECT
*
FROM
table1
CROSS JOIN
table2

.. code-block:: SQL

SELECT
*
FROM
table1
JOIN
table2
ON table1.column1 = table2.column1

Order by and limit
~~~~~~~~~~~~~~~~~~

.. code-block:: SQL


[ORDER BY ]
[LIMIT ]

Example:

.. code-block:: SQL

SELECT
*
FROM
table1
ORDER BY
column1
LIMIT 5

Windowed aggregation
~~~~~~~~~~~~~~~~~~~~

.. code-block:: SQL

() OVER(
[PARTITION BY ( [, ...)]
[ORDER_BY ( [, ...)]
[ ( ROWS | RANGE ) ( | BETWEEN AND ) ]
)

: UNBOUNDED PRECEDING | PRECEDING | CURRENT ROW
: UNBOUNDED FOLLOWING | FOLLOWING | CURRENT ROW

Supported expressions and functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: SQL

+, -, *, /

.. code-block:: SQL

CASE WHEN THEN [WHEN ...] ELSE END

.. code-block:: SQL

SUM, AVG, MIN, MAX

.. code-block:: SQL

{RANK | DENSE_RANK} OVER([PARTITION BY ( [, ...])])

.. code-block:: SQL

CAST ( AS )

.. code-block:: SQL

is null

.. code-block:: SQL

is not null

.. code-block:: SQL

COALESCE( [, ...])

* Anything in <> is meant to be some string
* Anything in [] is optional
* Anything in {} is grouped together

Supported Data Types for cast expressions include:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* VARCHAR, STRING
* INT16, SMALLINT
* INT32, INT, INTEGER
* INT64, BIGINT
* FLOAT16
* FLOAT32
* FLOAT, FLOAT64
* BOOL
* DATETIME64, TIMESTAMP
* CATEGORY
* OBJECT