Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eventql/eventql
Distributed "massively parallel" SQL query engine
https://github.com/eventql/eventql
analytics columnar-storage cpp cpp11 database distributed distributed-database distributed-sql distributed-storage eventql mpp sql streaming timeseries
Last synced: 27 days ago
JSON representation
Distributed "massively parallel" SQL query engine
- Host: GitHub
- URL: https://github.com/eventql/eventql
- Owner: eventql
- License: other
- Created: 2016-05-23T10:56:48.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-05-06T20:08:51.000Z (over 7 years ago)
- Last Synced: 2024-10-01T16:21:44.148Z (about 1 month ago)
- Topics: analytics, columnar-storage, cpp, cpp11, database, distributed, distributed-database, distributed-sql, distributed-storage, eventql, mpp, sql, streaming, timeseries
- Language: C++
- Homepage: http://eventql.io/
- Size: 76.8 MB
- Stars: 1,182
- Watchers: 69
- Forks: 102
- Open Issues: 113
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG
- License: LICENSE
Awesome Lists containing this project
README
# EventQL
[![Build Status](https://travis-ci.org/eventql/eventql.png?branch=master)](http://travis-ci.org/eventql/eventql)
EventQL is a distributed, columnar database built for large-scale data collection
and analytics workloads. It can handle a large volume of streaming writes and
runs super-fast SQL and MapReduce queries.**More information:**
[Documentation](http://eventql.io/),
[Download](https://eventql.io/download/),
[Architecture](https://eventql.io/documentation/internals/architecture/),
[Getting Started](https://eventql.io/documentation/getting-started/first-steps/)## Features
This is a quick run-through of EventQL's key features to get you excited. For
more detailed information on these topics and their caveats you are kindly
referred to the documentation.- **Automatic partitioning.** Tables are transparently split into partitions using
a primary key and distributed among many machines. You don't have to configure
the number of shards upfront. Just insert your data and EventQL handles the rest.- **Idempotent writes.** Supports primary-key based INSERT, UPSERT and DELETE
operations. You can use the UPSERT operation for easy exactly-once ingestion
from streaming sources.- **Compact, columnar storage.** The columnar storage engine allows EventQL to
drastically reduce its I/O footprint and execute analytical queries orders of
magnitude faster than row-oriented systems.- **Standard SQL support.** (Almost) complete SQL 2009 support. (It does JOINs!)
Queries are also automatically parallelized and executed on many machines in
parallel- **Scales to petabytes.** EventQL distributes all table partitions and queries
among a number of equally privileged servers. Given enough machines you can store
and query thousands if terrabytes of data in a single table.- **Streaming, low-latency operations.** You don't have to batch-load data
into EventQL - it can handle large volumes of streaming insert and update
operations. All mutations are immediately visible and minimal SQL query latency
is ~0.1ms.- **Timeseries and relational data.** The automatic partitioning supports
timeseries as well as relational and key value data, as long as there is a good
primary key. The storage engine also supports REPEATED and RECORD types so
arbitrary JSON objects can be inserted into rows.- **HTTP API.** The HTTP API allows you to use query results in any application
and easily send data from any application or device. EventQL also supports a
native TCP-based protocol.- **Fast range scans.** Table partitions in EventQL are ordered and have a
defined keyrange, so you can perform efficient range scans on parts of the
keyspace.- **Hardware efficient**. EventQL is implemented in modern C++ and tries to
achieve maximal performance on commodity hardware by using vectorized execution
and SSE instructions.- **Highly Available**. The shared-nothing architecture of EventQL is highly
fault tolerant. A cluster consists of many, equally privileged nodes
and has no single point of failure.- **Self-contained.** You can set up a new cluster in minutes. The EventQL server
ships as a single binary and has no external dependencies except Zookeeper or a
similar coordination service.## Use Cases
Here are a few example scenarios that are particularly well suited to EventQL's
design:- Storage and analysis of streaming event, timeseries or relational data
- High volume event and sensor data logging
- Joining and correlating of timeseries data with relational tables#### Non-goals
Note that EventQL is built around specific design choices that make it an
excellent fit for real-time data analytics processing (OLAP) tasks, but also
mean it's not well suited for most transactional (OLTP) workloads.## Build
Before we can start we need to install some build dependencies. Currently
you need a modern c++ compiler, libz, autotools and python (for spidermonkey/mozbuild)# Ubuntu
$ apt-get install clang make automake autoconf libtool zlib1g-dev# OSX
$ brew install automake autoconf libtoolTo build EventQL from a distribution tarball:
$ ./configure
$ make
$ sudo make installTo build EventQL from a git checkout:
$ git clone [email protected]:eventql/eventql.git
$ cd eventql
$ ./autogen.sh
$ ./configure
$ make V=1
$ src/evql -hTo run the full (world) test suite:
$ make test
To run the quick (smoke) test suite:
$ make smoketest