Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eventual-inc/daft-launcher
A simple launcher for spinning up and managing Ray clusters for the Daft Query Engine.
https://github.com/eventual-inc/daft-launcher
daft distributed-computing query ray
Last synced: 2 months ago
JSON representation
A simple launcher for spinning up and managing Ray clusters for the Daft Query Engine.
- Host: GitHub
- URL: https://github.com/eventual-inc/daft-launcher
- Owner: Eventual-Inc
- License: apache-2.0
- Created: 2024-09-03T17:54:21.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2024-10-24T00:32:47.000Z (3 months ago)
- Last Synced: 2024-10-24T14:42:18.007Z (3 months ago)
- Topics: daft, distributed-computing, query, ray
- Language: Python
- Homepage: https://eventual-inc.github.io/daft-launcher
- Size: 495 KB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
[![PyPI Package](https://github.com/Eventual-Inc/daft-launcher/actions/workflows/publish-to-pypi.yaml/badge.svg)](https://github.com/Eventual-Inc/daft-launcher/actions/workflows/publish-to-pypi.yaml)
[![Deploy mdBook](https://github.com/Eventual-Inc/daft-launcher/actions/workflows/deploy-mdbook.yaml/badge.svg)](https://github.com/Eventual-Inc/daft-launcher/actions/workflows/deploy-mdbook.yaml)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
[![Latest](https://img.shields.io/github/v/tag/Eventual-Inc/daft-launcher?label=latest&logo=GitHub)](https://github.com/Eventual-Inc/daft-launcher/tags)
[![License](https://img.shields.io/badge/daft_launcher-docs-red.svg)](https://eventual-inc.github.io/daft-launcher)# Daft Launcher CLI Tool
`daft-launcher` is a simple launcher for spinning up and managing Ray clusters for [`daft`](https://github.com/Eventual-Inc/Daft).
For a deeper introduction, please refer to our [documentation](https://eventual-inc.github.io/daft-launcher/).
## Goal
Getting started with Daft in a local environment is easy.
However, getting started with Daft in a cloud environment is substantially more difficult.
So much more difficult, in fact, that users end up spending more time setting up their environment than actually playing with our query engine.Daft Launcher aims to solve this problem by providing a simple CLI tool to remove all of this unnecessary heavy-lifting.
## Capabilities
What Daft Launcher is capable of:
1. Spinning up clusters.
2. Listing all available clusters (as well as their statuses).
3. Submitting jobs to a cluster.
4. Connecting to the cluster (to view the Ray dashboard and submit jobs using the Ray protocol).
5. Spinning down clusters.
6. Creating configuration files.
7. Running raw SQL statements using Daft's SQL API.## Currently supported cloud providers
- [x] AWS
- [ ] GCP
- [ ] Azure## Usage
### Pre-requisites
You'll need some python package manager installed.
We recommend using `uv` for all things python.#### AWS
If you're using AWS, you'll need:
1. A valid AWS account with the necessary IAM role to spin up EC2 instances.
This IAM role can either be created by you (assuming you have the appropriate permissions).
Or this IAM role will need to be created by your administrator.
2. The [AWS CLI](https://aws.amazon.com/cli/) installed and configured on your machine.
3. To login using the AWS CLI.
For full instructions, please look [here](https://google.com).### Installation
Using `uv`:
```bash
# create project
mkdir my-project
cd my-project# initialize project and setup virtual env
uv init
uv venv
source .venv/bin/activate# install launcher
uv pip install daft-launcher
```### Example
All interactions with Daft Launcher are primarily communicated via a configuration file.
By default, Daft Launcher will look inside your `$CWD` for a file named `.daft.toml`.
You can override this behaviour by specifying a custom configuration file.```bash
# create a new configuration file
# will create a file named `.daft.toml` in the current working directory
daft init-config
# or optionally, pass in a custom name
daft init-config my-custom-config.toml# spin up a cluster
daft up
# or optionally, pass in a custom config file
daft up -c my-custom-config.toml# list all the active clusters (can have multiple clusters running at the same time)
daft list# submit a directory and a command to run on the cluster
daft submit --working-dir <...> -- command arg1 arg2 ...
# or optionally, pass in a custom config file
daft submit -c my-custom-config.toml --working-dir $WORKING_DIR -- command arg1 arg2 ...# run a direct SQL query against the daft query engine running in the remote cluster
daft sql -- "SELECT * FROM my_table WHERE column = 'value'"
# or optionally, pass in a custom config file
daft sql -c my-custom-config.toml -- "SELECT * FROM my_table WHERE column = 'value'"# spin down a cluster
daft down
# or optionally, pass in a custom name
daft down -c my-custom-config.toml
```