https://github.com/mo42/sqlc
POC to compile SQL to C++
- Host: GitHub
- URL: https://github.com/mo42/sqlc
- Owner: mo42
- License: mit
- Created: 2024-05-20T21:26:04.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-23T21:15:09.000Z (over 1 year ago)
- Last Synced: 2025-03-06T11:06:26.869Z (over 1 year ago)
- Language: Rust
- Homepage:
- Size: 52.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# sqlc - Compile SQL to Type-Checked C++
Proof-of-concept for a compiler that translates SQL queries to type-checked C++ code.
## Motivation
- Performance: build ML pipelines that run faster than those written in Python with Pandas ([DataFrame performance](https://github.com/hosseinmoein/DataFrame?tab=readme-ov-file#performance)).
- Security: no need for a SQL runtime, just run a single-purpose C++ program that can be audited.
- Integration: run SQL on (embedded) systems that don't support a heavy DBMS but can handle self-contained C++ programs.
## Technical Details
- SQL parser based on [sqlparser-rs](https://github.com/sqlparser-rs/sqlparser-rs).
- Generated C++ code uses [DataFrame](https://github.com/hosseinmoein/DataFrame).
## Installation
```sh
git clone https://github.com/mo42/sqlc.git && cd sqlc
cargo build --release
```
## Example
Compiling the example SQL file:
```sh
cargo run -- example.sql > example.cpp
```
```sql
SELECT
  date,
  column2
FROM
  'example.csv'
  JOIN 'join.csv' USING(column2)
WHERE
  joined_string = "Join string 3"
ORDER BY date ASC, column2 DESC
```
```cpp
#include <DataFrame/DataFrame.h>
#include <iostream>

using namespace hmdf;
typedef ulong idx_t;
using SqlcDataFrame = StdDataFrame<idx_t>;

int main(int, char**) {
  SqlcDataFrame df_main;
  df_main.read("example.csv", io_format::csv2);
  SqlcDataFrame df_join0;
  df_join0.read("join.csv", io_format::csv2);
  SqlcDataFrame df = df_main.join_by_column(
      df_join0, "column2", hmdf::join_policy::inner_join);
  auto where_functor = [](const idx_t&,
                          const std::string& joined_string) -> bool {
    return (joined_string == "Join string 3");
  };
  auto where_df = df.get_data_by_sel(
      "joined_string", where_functor);
  where_df.sort(
      "date", sort_spec::ascen, "column2", sort_spec::desce);
  std::vector idx = where_df.get_index();
  std::vector date = where_df.get_column("date");
  std::vector column2 = where_df.get_column("column2");
  SqlcDataFrame select;
  select.load_index(std::move(idx));
  select.load_column("date", std::move(date));
  select.load_column("column2", std::move(column2));
  select.write(std::cout, hmdf::io_format::csv,
               5, false, 100);
  return 0;
}
```
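The `WHERE` lambda in the generated code is where the "type-checked" part becomes visible: the column's C++ type appears in the lambda signature, so a predicate that does not fit the column is rejected by the C++ compiler. The following is a minimal sketch of that idea using only the standard library. It is not the code sqlc emits; `select_rows` is a hypothetical helper, and `idx_t` is redefined here as `unsigned long` for portability.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Assumption: row index type, standing in for the idx_t of the generated code.
using idx_t = unsigned long;

// WHERE joined_string = "Join string 3" expressed as a typed predicate.
const auto where_functor = [](const idx_t&,
                              const std::string& joined_string) -> bool {
  return joined_string == "Join string 3";
};

// The predicate fits a string column ...
static_assert(std::is_invocable_r_v<bool, decltype(where_functor),
                                    const idx_t&, const std::string&>);
// ... and does not fit a numeric column, which is detected at compile time.
static_assert(!std::is_invocable_v<decltype(where_functor),
                                   const idx_t&, const double&>);

// Hypothetical helper: returns the indices of all rows accepted by pred.
template <typename T, typename Pred>
std::vector<idx_t> select_rows(const std::vector<T>& column, Pred pred) {
  static_assert(std::is_invocable_r_v<bool, Pred, const idx_t&, const T&>,
                "WHERE predicate does not match the column type");
  std::vector<idx_t> hits;
  for (idx_t i = 0; i < column.size(); ++i) {
    if (pred(i, column[i])) {
      hits.push_back(i);
    }
  }
  return hits;
}
```

Calling `select_rows` with a `std::vector<std::string>` and `where_functor` compiles and returns the matching row indices; passing a `std::vector<double>` with the same predicate fails to compile with the `static_assert` message.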
# Documentation
Order of execution of the clauses in a SQL query:
1. FROM
2. JOIN
3. WHERE
4. GROUP BY
5. HAVING
6. SELECT (column-wise computations like `2 * col1 AS twice`, window functions)
7. ORDER BY
8. LIMIT
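The order above can be illustrated with a small, standard-library-only sketch that evaluates a query step by step. This is an assumption-laden illustration rather than sqlc output: the `Row` and `Result` structs, the `run_query` function, and the query itself (`SELECT category, SUM(amount) AS total FROM rows WHERE amount > 0 GROUP BY category HAVING SUM(amount) >= 10 ORDER BY total DESC LIMIT n`) are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical input and output row types for the illustration.
struct Row {
  std::string category;
  int amount;
};
struct Result {
  std::string category;
  int total;
};

std::vector<Result> run_query(const std::vector<Row>& from, std::size_t limit) {
  // 1./2. FROM and JOIN: the input rows are assumed to be materialized already.

  // 3. WHERE amount > 0 filters individual rows before any grouping.
  std::vector<Row> filtered;
  std::copy_if(from.begin(), from.end(), std::back_inserter(filtered),
               [](const Row& r) { return r.amount > 0; });

  // 4. GROUP BY category, accumulating SUM(amount) per group.
  std::map<std::string, int> groups;
  for (const Row& r : filtered) {
    groups[r.category] += r.amount;
  }

  // 5. HAVING SUM(amount) >= 10 filters whole groups,
  // 6. SELECT category, SUM(amount) AS total builds the output rows.
  std::vector<Result> out;
  for (const auto& [category, total] : groups) {
    if (total >= 10) {
      out.push_back({category, total});
    }
  }

  // 7. ORDER BY total DESC can refer to the alias introduced by SELECT.
  std::stable_sort(out.begin(), out.end(),
                   [](const Result& a, const Result& b) {
                     return a.total > b.total;
                   });

  // 8. LIMIT truncates the sorted result.
  if (out.size() > limit) {
    out.resize(limit);
  }
  return out;
}
```

Because `WHERE` runs before `GROUP BY`, rows with a non-positive `amount` never contribute to a group's sum, whereas `HAVING` only sees the aggregated totals.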