https://github.com/techascent/tech.ml.dataset.sql
SQL bindings for tech.ml.dataset
- Host: GitHub
- URL: https://github.com/techascent/tech.ml.dataset.sql
- Owner: techascent
- License: epl-2.0
- Created: 2020-05-09T20:37:19.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-03-12T13:33:59.000Z (over 2 years ago)
- Last Synced: 2024-05-02T00:22:02.869Z (about 2 years ago)
- Language: Clojure
- Size: 240 KB
- Stars: 17
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# tech.ml.dataset.sql
[Clojars Project](https://clojars.org/techascent/tech.ml.dataset.sql)
* [API Documentation](https://techascent.github.io/tech.ml.dataset.sql/)
Minimal SQL bindings for
[`tech.ml.dataset`](https://github.com/techascent/tech.ml.dataset).
## Usage
This library pulls in specific versions of next.jdbc and honeysql. `tech.ml.dataset`
is expected to be transitively provided.
Recommended driver: `[org.postgresql/postgresql "42.2.12"]`
Provided in namespace `tech.v3.dataset.sql`:
* `result-set->dataset` - given a result set, read all the data into a dataset.
* `sql->dataset` - Given a string sql statement, return a dataset.
* `sanitize-dataset-names-for-sql` - Transform the dataset name and the column names
to strings and replace "-" with "_".
* `table-exists?` - Return true if the table of this name exists.
* `drop-table!` - Drop the table of this name.
* `drop-table-when-exists!` - Drop the table if it exists.
* `create-table!` - Using a dataset for the table name, column names, and
datatypes, create a new table.
* `ensure-table!` - Ensure that a given table exists.
* `insert-dataset!` - Insert/upsert a dataset into a table. Upsert is postgresql-only.
For efficiency when inserting/upserting a dataset, create the connection with
`{:auto-commit false}` so that inserts are batched.
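As a rough illustration of the renaming that `sanitize-dataset-names-for-sql` is described as doing, the sketch below converts names to strings and replaces "-" with "_". The helpers `sanitize-name` and `sanitize-names` are hypothetical and use only `clojure.string`; they are not the library's implementation, which works on a dataset rather than on plain values.

```clojure
(require '[clojure.string :as string])

(defn sanitize-name
  "Hypothetical helper: turn a keyword, symbol, or string into a string
  and replace every \"-\" with \"_\"."
  [n]
  (-> (if (instance? clojure.lang.Named n) (name n) (str n))
      (string/replace "-" "_")))

(defn sanitize-names
  "Hypothetical helper: sanitize a table name and a sequence of column names.
  Returns a plain map, not a dataset."
  [table-name column-names]
  {:table-name   (sanitize-name table-name)
   :column-names (mapv sanitize-name column-names)})

(sanitize-names "stock-data" [:symbol "close-price"])
;; => {:table-name "stock_data", :column-names ["symbol" "close_price"]}
```

Names that contain other characters SQL identifiers disallow would need further handling; this sketch covers only the hyphen case named in the list above.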
Want to see more functions above? We accept PRs :-). The sql/impl namespace
provides many utility functions (like creating connection strings for postgresql
servers) that may be helpful along with required helpers if you want to implement
bindings to a different sql update/insert pathway.
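For orientation, a PostgreSQL JDBC URL generally has the form `jdbc:postgresql://host:port/database`, with credentials passed as query parameters. The function below is a hypothetical stand-in, not the helper from the `sql/impl` namespace; its name and its argument order (server, database, user, password) are assumptions modeled on the connection call in the Example section.

```clojure
(import '(java.net URLEncoder))

(defn postgres-jdbc-url
  "Hypothetical helper (not part of tech.ml.dataset.sql): build a PostgreSQL
  JDBC URL from a \"host:port\" string, a database name, a user, and a password."
  [^String server ^String database ^String user ^String password]
  ;; URL-encode the credentials so characters such as & or = cannot
  ;; break the query string.
  (str "jdbc:postgresql://" server "/" database
       "?user=" (URLEncoder/encode user "UTF-8")
       "&password=" (URLEncoder/encode password "UTF-8")))

(postgres-jdbc-url "localhost:5432" "dev-user" "dev-user" "unsafe-bad-password")
;; => "jdbc:postgresql://localhost:5432/dev-user?user=dev-user&password=unsafe-bad-password"
```

A string of this shape can be handed to `next.jdbc/get-connection`, as the Example section does with the library's own helper.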
Included in this repo is a one-stop
[docker pathway](scripts/start-local-postgres) for development purposes that
starts a server with the settings expected by the unit tests.
## Example
```clojure
user> (require '[tech.v3.dataset :as ds])
nil
user> (def ds (ds/->dataset "https://github.com/techascent/tech.ml.dataset/raw/master/test/data/stocks.csv"))
#'user/ds
user> (ds/head ds)
https://github.com/techascent/tech.ml.dataset/raw/master/test/data/stocks.csv [5 3]:
| symbol | date       | price |
|--------|------------|-------|
| MSFT   | 2000-01-01 | 39.81 |
| MSFT   | 2000-02-01 | 36.35 |
| MSFT   | 2000-03-01 | 43.22 |
| MSFT   | 2000-04-01 | 28.37 |
| MSFT   | 2000-05-01 | 25.45 |
user> (require '[tech.v3.dataset.sql :as ds-sql])
nil
user> ;;Connections should be created with auto-commit false so that inserts are batched.
user> (require '[next.jdbc :as jdbc])
nil
user> (def dev-conn (doto (-> (ds-sql/postgre-connect-str
                               "localhost:5432" "dev-user"
                               "dev-user" "unsafe-bad-password")
                              (jdbc/get-connection {:auto-commit false}))
                      (.setCatalog "dev-user")))
#'user/dev-conn
user> dev-conn
#object[org.postgresql.jdbc.PgConnection 0x3256d7ea "org.postgresql.jdbc.PgConnection@3256d7ea"]
user> ;;set the table name and the primary keys
user> (def ds (with-meta ds
                (assoc (meta ds)
                       :name "stocks"
                       :primary-key ["symbol" "date"])))
#'user/ds
user> ;;see the sql created for this table
user> (println (ds-sql/create-sql ds))
CREATE TABLE stocks (
symbol varchar,
date date,
price float,
PRIMARY KEY (symbol, date)
);
nil
user> (ds-sql/create-table! dev-conn ds)
nil
user> (ds-sql/insert-dataset! dev-conn ds)
nil
user> (def sql-ds (ds-sql/sql->dataset
                   dev-conn "SELECT * FROM stocks"))
#'user/sql-ds
user> (ds/head sql-ds)
_unnamed [5 3]:
| symbol | date                 | price |
|--------|----------------------|-------|
| MSFT   | 2000-01-01T07:00:00Z | 39.81 |
| MSFT   | 2000-02-01T07:00:00Z | 36.35 |
| MSFT   | 2000-03-01T07:00:00Z | 43.22 |
| MSFT   | 2000-04-01T07:00:00Z | 28.37 |
| MSFT   | 2000-05-01T06:00:00Z | 25.45 |
user> (ds/head ds)
stocks [5 3]:
| symbol | date       | price |
|--------|------------|-------|
| MSFT   | 2000-01-01 | 39.81 |
| MSFT   | 2000-02-01 | 36.35 |
| MSFT   | 2000-03-01 | 43.22 |
| MSFT   | 2000-04-01 | 28.37 |
| MSFT   | 2000-05-01 | 25.45 |
```
Note that local-dates are converted to instants in UTC. The same is true for all
date/time types; they are all converted to java.sql.Date objects. Supported datatypes
are numeric types, date/time types, strings, and UUIDs.
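The `07:00:00Z` and `06:00:00Z` times in the round-tripped output are what one would get by taking local midnight in a UTC-7 zone that observes daylight saving time and expressing it in UTC. That is an inference from the output, not something stated here about the library's internals. The sketch below reproduces the values with plain `java.time`; the helper name and the `America/Denver` zone are both assumptions.

```clojure
(import '(java.time LocalDate ZoneId))

(defn local-date->instant
  "Hypothetical helper: the instant at which `date` begins in time zone `zone-id`."
  [^LocalDate date ^String zone-id]
  (-> date
      (.atStartOfDay (ZoneId/of zone-id))
      (.toInstant)))

;; "America/Denver" is an assumed zone, chosen only because it matches
;; the offsets shown in the example output.
(str (local-date->instant (LocalDate/parse "2000-01-01") "America/Denver"))
;; => "2000-01-01T07:00:00Z"
(str (local-date->instant (LocalDate/parse "2000-05-01") "America/Denver"))
;; => "2000-05-01T06:00:00Z"
```

If the date column should come back as a plain date, converting each instant back with the same zone recovers the original local-date.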
## Develop
See the `scripts` directory.
To run the tests:
`clj -M:dev:test`
## License
Copyright © 2020 TechAscent, LLC
This program and the accompanying materials are made available under the
terms of the Eclipse Public License 2.0 which is available at
http://www.eclipse.org/legal/epl-2.0.