https://github.com/traindb-project/traindb
ML model-based approximate query processing engine
- Host: GitHub
- URL: https://github.com/traindb-project/traindb
- Owner: traindb-project
- License: apache-2.0
- Created: 2021-09-08T04:09:09.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2024-12-18T07:04:15.000Z (about 1 year ago)
- Last Synced: 2024-12-18T08:18:05.127Z (about 1 year ago)
- Language: Java
- Homepage: https://traindb-project.github.io/
- Size: 12.6 MB
- Stars: 72
- Watchers: 5
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-java - TrainDB
README
# TrainDB
TrainDB is an ML model-based approximate query processing engine that aims to answer time-consuming analytical queries in a few seconds.
TrainDB will provide a SQL-like query interface and will support various DBMSs as data sources.
[Docs(English)](https://traindb-doc.readthedocs.io/en/latest/) • [Docs(Korean)](https://traindb-doc.readthedocs.io/ko/latest/) • [Tutorial(Colab)](https://colab.research.google.com/github/traindb-project/traindb/blob/main/examples/traindb_tutorial.ipynb)
## Requirements
* Java 11+
* Maven 3.x
* SQLite3 (or another DBMS supported by DataNucleus) for the catalog store
For Python environment setup, see the README in our [traindb-model](https://github.com/traindb-project/traindb-model) repository.
## Install
### Download
```console
$ git clone --recurse-submodules https://github.com/traindb-project/traindb.git
```
### Build
```console
$ cd traindb
$ mvn package
```
The build produces traindb-x.y-SNAPSHOT.tar.gz in the traindb-assembly/target directory, where x.y is the version being built. Extract it:
```console
$ tar xvfz traindb-assembly/target/traindb-x.y-SNAPSHOT.tar.gz
```
## Run
### Example
Now, you can execute SQL statements using the command line interface.\
You need to put the JDBC driver for your DBMS into a directory that is included in CLASSPATH.
```console
$ cd traindb-assembly/target/traindb-x.y-SNAPSHOT
$ bin/trsql
sqlline> !connect jdbc:traindb:<dbms>://<host>
Enter username for jdbc:traindb:<dbms>://localhost:
Enter password for jdbc:traindb:<dbms>://localhost:
0: jdbc:traindb:<dbms>://<host>>
```
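The same connection can also be opened from Java through plain JDBC. The following is a minimal sketch, not code from the TrainDB repository. It assumes that the TrainDB JDBC driver and the driver for the underlying DBMS are on the classpath, that the TrainDB driver registers itself with `DriverManager`, and that the URL has the form `jdbc:traindb:<dbms>://<host>`. The class name `TrainDbConnect`, the `mysql` sub-protocol, and the credentials are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class TrainDbConnect {
    // Assumed URL shape: jdbc:traindb:<dbms>://<host>
    static String buildUrl(String dbms, String host) {
        return "jdbc:traindb:" + dbms + "://" + host;
    }

    // Opens a connection through the standard JDBC DriverManager.
    static Connection open(String dbms, String host, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(buildUrl(dbms, host), user, password);
    }

    public static void main(String[] args) {
        // "mysql", "localhost", "user" and "password" are placeholders.
        String url = buildUrl("mysql", "localhost");
        System.out.println("Connecting to " + url);
        try (Connection conn = open("mysql", "localhost", "user", "password")) {
            System.out.println("Connected: " + !conn.isClosed());
        } catch (SQLException e) {
            // Raised, for example, when no matching driver is on the classpath.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

If no suitable driver is found, `DriverManager.getConnection` throws an `SQLException`, which is usually the first sign that a driver jar is missing from the classpath.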
You can train ML models and run approximate queries like the following example.
```
0: jdbc:traindb:<dbms>://<host>> CREATE MODELTYPE tablegan FOR SYNOPSIS AS LOCAL CLASS 'TableGAN' IN '$TRAINDB_PREFIX/models/TableGAN.py';
No rows affected (0.255 seconds)
0: jdbc:traindb:<dbms>://<host>> TRAIN MODEL tgan MODELTYPE tablegan ON <schema>.<table>(<column 1>, <column 2>, ...);
epoch 1 step 50 tensor(1.1035, grad_fn=<...>) tensor(0.7770, grad_fn=<...>) None
epoch 1 step 100 tensor(0.8791, grad_fn=<...>) tensor(0.9682, grad_fn=<...>) None
...
0: jdbc:traindb:<dbms>://<host>> CREATE SYNOPSIS <synopsis> FROM MODEL tgan LIMIT <# of rows to generate>;
...
0: jdbc:traindb:<dbms>://<host>> SELECT APPROXIMATE avg(<column>) FROM <schema>.<table>;
```
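The statement sequence above can also be assembled programmatically, which makes the order of the four steps and the values to be supplied explicit. The sketch below only builds the statements as strings; each one could then be passed to `java.sql.Statement.execute` on a TrainDB connection. The class `TrainDbWorkflow`, its parameters, and the example schema, table, column, and synopsis names are hypothetical. The `schema.table(column, ...)` training target and the placement of the synopsis name directly after `CREATE SYNOPSIS` are assumptions about the syntax.

```java
import java.util.List;

class TrainDbWorkflow {
    // Builds the four statements in execution order:
    // register a model type, train a model, create a synopsis, query approximately.
    static List<String> statements(String schema, String table, List<String> columns,
                                   String synopsis, int rows, String aggColumn) {
        String target = schema + "." + table;
        return List.of(
            "CREATE MODELTYPE tablegan FOR SYNOPSIS AS LOCAL CLASS 'TableGAN'"
                + " IN '$TRAINDB_PREFIX/models/TableGAN.py'",
            "TRAIN MODEL tgan MODELTYPE tablegan ON " + target
                + "(" + String.join(", ", columns) + ")",
            "CREATE SYNOPSIS " + synopsis + " FROM MODEL tgan LIMIT " + rows,
            "SELECT APPROXIMATE avg(" + aggColumn + ") FROM " + target);
    }

    public static void main(String[] args) {
        // All names below are illustrative placeholders.
        List<String> sqls = statements("myschema", "sales",
                List.of("price", "quantity"), "sales_syn", 1000, "price");
        for (String sql : sqls) {
            System.out.println(sql + ";");
        }
    }
}
```

Keeping the statements in a list preserves their order, which matters here: the model must be trained before a synopsis can be created from it, and the approximate query relies on that synopsis.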