https://github.com/lresende/toree-gateway
A Gateway for Apache Toree
- Host: GitHub
- URL: https://github.com/lresende/toree-gateway
- Owner: lresende
- License: apache-2.0
- Created: 2017-01-10T14:30:57.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-04-20T22:47:44.000Z (over 8 years ago)
- Last Synced: 2025-01-17T09:36:51.127Z (9 months ago)
- Language: Python
- Homepage:
- Size: 16.5 MB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# A Gateway for Apache Toree
The 'toree-gateway' enables a Jupyter Notebook server to connect with an Apache Toree kernel running on a remote machine.
The 'toree-gateway' consists of two pieces:
* A Jupyter kernel, responsible for bridging communication between the Jupyter Notebook and the remote Toree Kernel.
* A Lifecycle component that controls the start and stop of remote Toree Kernel instances.

# Installing Server Side
The server side is where Spark and Toree components reside and will be processing your analytics.
* Install and deploy your Spark cluster
* Install the Toree Kernel
* Copy bin/startrun.sh to the Toree bin folder

# Installing Client Side
The following are the main steps to install toree-gateway:
* Install Anaconda 3
* Install the following pip dependencies: metakernel, paramiko, configparser
* Install the toree-gateway distribution (e.g. /opt/toree-gateway) and set TOREE_GATEWAY_HOME

```
# Unpack the distribution into the install directory
mkdir -p /opt/toree-gateway
tar -xvf /opt/toree-gateway-2.0-bin.tgz -C /opt/toree-gateway --strip-components=1

# Replace any previously installed kernel spec with the one from the distribution
rm -rf /root/.local/share/jupyter/kernels/toree-gateway
mkdir -p /root/.local/share/jupyter/kernels/toree-gateway
cp /opt/toree-gateway/kernel.json /root/.local/share/jupyter/kernels/toree-gateway/kernel.json
```

* Initialize the set of Toree profiles (coming soon)
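
The kernel registration part of the steps above can also be scripted. The following is a minimal Python sketch, not part of the toree-gateway distribution: the function name `install_kernelspec` and its parameters are hypothetical, and it assumes only that `kernel.json` sits at the top level of the unpacked distribution, as the `cp` command above implies.

```python
import shutil
from pathlib import Path


def install_kernelspec(gateway_home: Path, kernels_dir: Path) -> Path:
    """Copy kernel.json from the gateway distribution into a clean kernel directory.

    Hypothetical helper: mirrors the rm -rf / mkdir -p / cp sequence shown above.
    """
    source = gateway_home / "kernel.json"
    if not source.is_file():
        raise FileNotFoundError(f"kernel.json not found in {gateway_home}")

    target_dir = kernels_dir / "toree-gateway"
    # Drop any previous registration so stale files do not linger.
    shutil.rmtree(target_dir, ignore_errors=True)
    target_dir.mkdir(parents=True, exist_ok=True)

    # shutil.copy returns the destination path it wrote to.
    return Path(shutil.copy(source, target_dir / "kernel.json"))
```

With the paths used in this README, the call would be `install_kernelspec(Path("/opt/toree-gateway"), Path("/root/.local/share/jupyter/kernels"))`. For a non-root user, the kernels directory normally lives under that user's home directory instead.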
# Troubleshooting
## Check logs
* Enable logs on the client side ($TOREE_GATEWAY_HOME/conf/toree-gateway.properties)
* Check logs on the remote side (e.g. $TOREE_HOME/logs)

## Cleanup after failure or crash
* Kill all Toree process instances on the server (Spark) side (ps -ef | grep toree)
* Kill the Jupyter server
* Delete all 'toree.pid' files from the profiles folder (e.g. find $TOREE_GATEWAY_HOME | grep toree.pid)
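
The last cleanup step can be sketched in Python as follows. This is an illustration under stated assumptions rather than a tool shipped with toree-gateway: the function name `remove_stale_pid_files` and the `dry_run` flag are hypothetical, and it matches files named exactly `toree.pid`, which is slightly stricter than the `grep toree.pid` pattern above.

```python
from pathlib import Path
from typing import List


def remove_stale_pid_files(gateway_home: Path, dry_run: bool = True) -> List[Path]:
    """Find every toree.pid below gateway_home and delete it unless dry_run is set.

    Hypothetical helper: returns the list of matching files in sorted order.
    """
    pid_files = sorted(p for p in gateway_home.rglob("toree.pid") if p.is_file())
    if not dry_run:
        for pid_file in pid_files:
            # Remove the leftover marker so a new kernel instance can start cleanly.
            pid_file.unlink()
    return pid_files
```

Calling it first with the default `dry_run=True` lists what would be removed. Passing `dry_run=False` performs the deletion, and should only be done after the Toree processes and the Jupyter server have been stopped as described in the previous steps.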