https://github.com/xerial/silk

Simplify SQL Workflows with Scala

Host: GitHub
URL: https://github.com/xerial/silk
Owner: xerial
License: apache-2.0
Created: 2012-01-06T13:53:15.000Z (about 14 years ago)
Default Branch: master
Last Pushed: 2020-03-13T22:40:45.000Z (almost 6 years ago)
Last Synced: 2025-08-16T09:51:07.397Z (5 months ago)
Language: CSS
Homepage: http://xerial.org/silk
Size: 14.6 MB
Stars: 38
Watchers: 10
Forks: 7
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Silk: A framework for managing SQL data flows

http://xerial.org/silk

## Examples

```scala
import xerial.silk.core._
import sampledb._

// SELECT count(*) FROM nasdaq
def dataCount = nasdaq.size

// SELECT time, close FROM nasdaq WHERE symbol = 'AAPL'
def appleStock = nasdaq.filter(_.symbol is "AAPL").select(_.time, _.close)

// You can use a raw SQL statement as well:
def appleStockSQL = sql"SELECT time, close FROM nasdaq WHERE symbol = 'AAPL'"

// SELECT time, close FROM nasdaq WHERE symbol = 'AAPL' LIMIT 10
appleStock.limit(10).print

// Time-column based filtering
appleStock.between("2015-05-01", "2015-06-01")

// Build one query per company symbol
for(company <- Seq("YHOO", "GOOG", "MSFT")) yield {
  nasdaq.filter(_.symbol is company).selectAll
}
```
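
The comments in the example pair each method chain with the SQL it stands for. As a rough illustration of that correspondence, here is a minimal, self-contained sketch of a query builder that renders SQL text. It is not Silk's implementation: `Query`, `filterEq`, `toSQL`, and `QuerySketch` are hypothetical names invented for this sketch, and columns are plain strings rather than the typed accessors (`_.symbol`, `_.time`) used above.

```scala
// Hypothetical sketch, NOT part of Silk: an immutable query description
// that can be rendered to a SQL string.
final case class Query(
    table: String,
    columns: Seq[String] = Seq("*"),
    conditions: Seq[String] = Seq.empty,
    rowLimit: Option[Int] = None
) {
  // Add an equality condition; single quotes in the value are doubled to escape them
  def filterEq(column: String, value: String): Query = {
    val escaped = value.replace("'", "''")
    copy(conditions = conditions :+ s"$column = '$escaped'")
  }

  // Replace the projected columns
  def select(cols: String*): Query = copy(columns = cols)

  // Cap the number of returned rows
  def limit(n: Int): Query = copy(rowLimit = Some(n))

  // Render the accumulated description as a SQL statement
  def toSQL: String = {
    val where =
      if (conditions.isEmpty) "" else conditions.mkString(" WHERE ", " AND ", "")
    val lim = rowLimit.map(n => s" LIMIT $n").getOrElse("")
    s"SELECT ${columns.mkString(", ")} FROM $table$where$lim"
  }
}

object QuerySketch {
  def main(args: Array[String]): Unit = {
    val googleStock = Query("nasdaq").filterEq("symbol", "GOOG").select("time", "close")
    // prints: SELECT time, close FROM nasdaq WHERE symbol = 'GOOG' LIMIT 10
    println(googleStock.limit(10).toSQL)
  }
}
```

Because each method returns a new `Query`, a base query such as `googleStock` can be reused and refined (for example with `limit`) without affecting other queries derived from it.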

## Milestones

 - Build SQL + local analysis workflows
 - Submit queries to Presto / Treasure Data
 - Run scheduled queries
 - Retry upon failures
 - Cache intermediate results
 - Resume workflow
 - Partial workflow executions
 - Sampling display
    - Interactive mode
 - Split a large query into small ones
    - Differential computation for time-series data
 - Windowing for stream queries
 - Object-oriented workflow
 - Input Source: fluentd/embulk
 - Output Source:
 - Workflow Executor
    - Local-only mode
    - Register SQL part to Treasure Data
    - Run complex analysis on local cache
    - UNIX command executor
