Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Apache Spark Connect Client for Golang
https://github.com/apache/spark-connect-go
- Host: GitHub
- URL: https://github.com/apache/spark-connect-go
- Owner: apache
- License: apache-2.0
- Created: 2023-05-30T10:09:28.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-11-07T20:36:20.000Z (about 2 months ago)
- Last Synced: 2024-12-27T17:10:01.988Z (9 days ago)
- Language: Go
- Homepage: https://spark.apache.org/
- Size: 465 KB
- Stars: 176
- Watchers: 26
- Forks: 34
- Open Issues: 13
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-spark - spark-connect-go - Golang bindings. (Packages / Language Bindings)
README
# Apache Spark Connect Client for Golang
This project houses the **experimental** client for [Spark
Connect](https://spark.apache.org/docs/latest/spark-connect-overview.html) for
[Apache Spark](https://spark.apache.org/) written in [Golang](https://go.dev/).

## Current State of the Project
Currently, the Spark Connect client for Golang is highly experimental and should
not be used in any production setting. In addition, the PMC of the Apache Spark
project reserves the right to withdraw and abandon the development of this project
if it is not sustainable.

## Getting started
This section explains how to run Spark Connect Go locally.
Step 1: Install Golang: https://go.dev/doc/install.
Step 2: Ensure you have the `buf` CLI installed, [more info here](https://buf.build/docs/installation/).
Step 3: Run the following commands to set up the Spark Connect client.
```
git clone https://github.com/apache/spark-connect-go.git
git submodule update --init --recursive
make gen && make test
```

Step 4: Set up the Spark Driver on localhost.
1. [Download Spark distribution](https://spark.apache.org/downloads.html) (3.5.0+), unzip the package.
2. Start the Spark Connect server with the following command (make sure to use a package version that matches your Spark distribution):
```
sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.12:3.5.2
```

Step 5: Run the example Go application.
```
go run cmd/spark-connect-example-spark-session/main.go
```

## Running a Spark Connect Go Application in a Spark Cluster
To run a Spark Connect Go application in a Spark cluster, build the Go binary and submit it to the cluster. You can find a more detailed example runner and wrapper script in the `java` directory.
See the guide here: [Sample Spark-Submit Wrapper](java/README.md).
## How to write Spark Connect Go Application in your own project
See the [Quick Start Guide](quick-start.md).
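For orientation, a minimal client program looks roughly like the following. This is a sketch in the style of the repository's bundled example: the module path (`v35`), the `sql.NewSessionBuilder` API, and the method names are taken from the 3.5-era examples and may differ in other versions, so consult the Quick Start Guide for the exact API.

```go
package main

import (
	"context"
	"flag"
	"log"

	"github.com/apache/spark-connect-go/v35/spark/sql"
)

// Remote address of the Spark Connect server; sc:// is the Connect URI scheme.
var remote = flag.String("remote", "sc://localhost:15002",
	"the remote address of the Spark Connect server")

func main() {
	flag.Parse()
	ctx := context.Background()

	// Build a Spark session backed by the Connect server.
	spark, err := sql.NewSessionBuilder().Remote(*remote).Build(ctx)
	if err != nil {
		log.Fatalf("failed to connect: %v", err)
	}
	defer spark.Stop()

	// Run a SQL query and print the resulting rows.
	df, err := spark.Sql(ctx, "SELECT 'hello' AS word, 1 AS count")
	if err != nil {
		log.Fatalf("query failed: %v", err)
	}
	if err := df.Show(ctx, 10, false); err != nil {
		log.Fatalf("show failed: %v", err)
	}
}
```

Note that the query itself executes on the server; the Go process only holds a lightweight session handle, which is the core idea of Spark Connect's client-server split.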
## High Level Design
The overall goal of the design is to strike a good balance between the principle of least surprise for
developers who are familiar with the APIs of Apache Spark and idiomatic Go usage. The high-level
structure of the packages roughly follows the PySpark guidance, but with Go idioms.

## Contributing
Please review the [Contribution to Spark guide](https://spark.apache.org/contributing.html)
for information on how to get started contributing to the project.