Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mhausenblas/dromedar
Apache Drill On Apache Mesos
https://github.com/mhausenblas/dromedar
Last synced: 28 days ago
JSON representation
Apache Drill On Apache Mesos
- Host: GitHub
- URL: https://github.com/mhausenblas/dromedar
- Owner: mhausenblas
- License: apache-2.0
- Created: 2015-04-06T13:27:09.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2015-04-13T08:50:40.000Z (over 9 years ago)
- Last Synced: 2024-11-15T09:44:36.862Z (about 1 month ago)
- Language: Python
- Size: 176 KB
- Stars: 7
- Watchers: 6
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-mesos - Apache Drill
README
# Dromedar: Drill on Mesos
This is a simple wrapper/enabler for running [Apache Drill](http://drill.apache.org/) on [Apache Mesos](http://mesos.apache.org/).
Dromedar (DRill On MEsos aDAptoR) gets launched via [Marathon](https://github.com/mesosphere/marathon/) and whenever a query request comes in, it launches a number of Drillbits, depending on the dataset size under query. The query-scale-factor (QSF) determines how many Drillbits are launched in relation to the dataset size and defaults to 1 Drillbit per 100MB (1:100) or `qsf=100`, for short.
Dromedar's architecture is as follows:
+----------------+ +----------------------------------------+
| Marathon | | Mesos worker node |
| | | |
| | | +-------------------------+ |
| | | | | |
| | | | Drillbit <---------[3]-------> SQL client
| | | +------------+------------+ |
| | | [2] |
| | | +------------+------------+ |
| | | | | |
| +----[2]--> drillbit.sh start | |
| | | +-------------------------+ |
| | | |
| | | |
| | | +-------------------------+ |
| | | | | |
| | |HTTP API | |
| <----[2]--+ qsf.py <---------[1]-------- [QSF]
| | | | | |
| | | +-------------------------+ |
+----------------+ +----------------------------------------+Dromedar's underlying long-runing service is `qsf.py` which itself is initially deployed through `dromedar.py`, using Marathon. Once `qsf.py` is running as a Web service it performs the following steps:
1. As an input it takes a QSF via its HTTP interface on port `9876`.
1. It uses the Marathon [HTTP API](https://mesosphere.github.io/marathon/docs/rest-api.html) to trigger on-demand Drillbits creation using the `drillbit.sh start` command.
1. The SQL client connects to (one of) the Drillbit(s) and executes the SQL query.## Dependencies
* Apache [Mesos 0.22.x](http://archive.apache.org/dist/mesos/0.22.0/mesos-0.22.0.tar.gz)
* [Marathon 0.8.1](https://downloads.mesosphere.io/marathon/v0.8.1/marathon-0.8.1.tgz)
* Apache [Drill 0.8.0](http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz)
* [marathon-python](https://github.com/thefactory/marathon-python)Note that Apache Drill and the Marathon Python package are installed via Dromedar, directly. The only two things that are assumed to be available are Mesos and Marathon itself.
## Usage
$ ./launch.sh
Then, go to the Marathon UI where you should see something like the following:
![Dromedar launched in Marathon](doc/dromedar-launched.png)
## To Do
- [x] Bootstrap (install Drill, launch Dromedar via Marathon)
- [ ] Implement QSF HTTP API
- [ ] Implement Drillbit launch/teardown based on requests
- [ ] Clarify relation/communication between QSF and SQL client (out of band??)
- [ ] Strata implementation cross-check
- [ ] Cluster deployment and testing
- [ ] HAProxy deployment?
- [ ] Examples and video walkthrough