https://github.com/officiallysingh/spark-mongodb-arangodb-demo
Demo Spark application as Spring cloud task using MongoDB and ArangoDB data sources
- Host: GitHub
- URL: https://github.com/officiallysingh/spark-mongodb-arangodb-demo
- Owner: officiallysingh
- Created: 2024-04-26T03:47:41.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-05-20T10:06:51.000Z (8 months ago)
- Last Synced: 2024-05-21T10:13:39.739Z (8 months ago)
- Language: Java
- Size: 5.51 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# ArangoDB & MongoDB Spark Datasource Demo
Run [**`TelosMLTask`**](src/main/java/com/telos/spark/TelosMLTask.java) as a Spring Boot application.
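When launching from a terminal rather than an IDE, one possible way to apply the VM argument from the note below is the `JDK_JAVA_OPTIONS` environment variable, which the `java` launcher reads on JDK 9 and later. This is a sketch only: the `mvn spring-boot:run` command in the comment assumes the Spring Boot Maven plugin is configured in this project's `pom.xml`, which the README does not state.

```shell
# The java launcher (JDK 9+) prepends the contents of JDK_JAVA_OPTIONS
# to its command-line arguments.
export JDK_JAVA_OPTIONS="--add-exports java.base/sun.nio.ch=ALL-UNNAMED"

# Show what will be picked up by subsequent java invocations.
echo "JDK_JAVA_OPTIONS=$JDK_JAVA_OPTIONS"

# Then start the application, for example (assumption: the Spring Boot
# Maven plugin is configured in pom.xml):
#   mvn spring-boot:run
```

In an IDE, the same flag would go into the run configuration's VM options field instead.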
> [!IMPORTANT]
> Set VM argument `--add-exports java.base/sun.nio.ch=ALL-UNNAMED`

## Requirements
This demo requires:
- JDK 17
- `maven`
- `docker`

## Prepare the environment
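Before going further, it may be worth confirming that the required tools are on the `PATH`. The following is a minimal sketch; the helper name `check_tool` is made up for illustration and is not part of the project.

```shell
# Report whether a command is available on PATH; returns non-zero if not.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check each tool listed under Requirements and remember any failure.
missing=0
for tool in java mvn docker; do
  check_tool "$tool" || missing=1
done

# Print the Java version line so it can be compared against JDK 17.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
fi

if [ "$missing" -ne 0 ]; then
  echo "Install the missing tools before continuing." >&2
fi
```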
### Local ArangoDB installation
* Use `localhost:8529` as the `endpoints` value.
* The default user is `root`, so there is no need to specify it unless it's different.
* The default password is blank unless it has been set to some other value.
* The database web UI is accessible at [**http://localhost:8529/_db/test_db/_admin/aardvark/index.html#login**](http://localhost:8529/_db/test_db/_admin/aardvark/index.html#login)
* Run the Demo classes as Java main methods.

### Docker & Spark cluster
Set environment variables:

```shell
export ARANGO_SPARK_VERSION=1.6.0
```

Start ArangoDB cluster with docker:
```shell
STARTER_MODE=cluster ./docker/start_db.sh
```

The deployed cluster will be accessible at [http://172.28.0.1:8529](http://172.28.0.1:8529) with username `root` and password `test`.

Start Spark cluster:
```shell
./docker/start_spark_3.4.sh
```

## Run embedded
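Before running the tests, it can help to confirm that the ArangoDB cluster started earlier is reachable. The sketch below assumes `curl` is installed and uses the address and credentials quoted above; `/_api/version` is ArangoDB's HTTP endpoint for reporting the server version. The variable names are illustrative only.

```shell
# Address and credentials as given for the Docker-based cluster.
ARANGO_URL="http://172.28.0.1:8529"
ARANGO_USER="root"
ARANGO_PASSWORD="test"

# Query the server version; give up after 5 seconds instead of hanging.
if curl --silent --fail --max-time 5 \
     --user "$ARANGO_USER:$ARANGO_PASSWORD" \
     "$ARANGO_URL/_api/version"; then
  echo
  echo "ArangoDB is reachable at $ARANGO_URL"
else
  echo "ArangoDB is not reachable at $ARANGO_URL" >&2
fi
```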
Test the Spark application in embedded mode:
```shell
mvn test
```

Test the Spark application against an ArangoDB Oasis deployment:
```shell
mvn \
-Dpassword= \
-Dendpoints= \
-Dssl.enabled=true \
-Dssl.cert.value= \
test
```

## Submit to Spark cluster
Package the application:
```shell
mvn -DskipTests=true package
```

Submit demo program:
```shell
docker run -it --rm \
-v $(pwd):/demo \
-v $(pwd)/docker/.ivy2:/opt/bitnami/spark/.ivy2 \
--network arangodb \
docker.io/bitnami/spark:3.4.0 \
./bin/spark-submit --master spark://spark-master:7077 \
--packages="com.arangodb:arangodb-spark-datasource-3.4_2.12:$ARANGO_SPARK_VERSION" \
--class Demo /demo/target/demo-$ARANGO_SPARK_VERSION.jar
```