Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/newfront/odsc-east-2020-decision-intelligence
This is the home of the 2020 Open Data Science Conference workshop (Creating Streaming Predictive Analytics and Decision Intelligence Systems with Apache Spark)
- Host: GitHub
- URL: https://github.com/newfront/odsc-east-2020-decision-intelligence
- Owner: newfront
- License: apache-2.0
- Created: 2020-03-09T21:57:59.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-16T05:54:42.000Z (about 2 years ago)
- Last Synced: 2023-03-11T10:42:17.576Z (almost 2 years ago)
- Topics: decision-intelligence-systems, odsc, odsc-east-2020, spark
- Language: Scala
- Size: 23.6 MB
- Stars: 9
- Watchers: 2
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# odsc-east-2020-decision-intelligence
This is the home of the 2020 Open Data Science Conference workshop (Creating Streaming Predictive Analytics and Decision Intelligence Systems with Apache Spark).
## Workshop Material
http://bit.ly/learn-spark-ml

The workshop material is a step-by-step series of Zeppelin notebooks. Everything can be installed and run inside the Docker environment described in the README.md of Learn Spark ML (linked above).
## Slides and Presentation Notes
Slides are in the `presentation` directory.
# Streaming Predictions
## Predicting KidSafe Content from Netflix Movies
The source code is under `/spark-structured-streaming`.
## Prerequisites for running the streaming example
1. You have already run all the notebooks from http://bit.ly/learn-spark-ml. This primes the Redis keys needed to run the streaming example.
2. Maven installed (I run Maven `3.3.9`); install with Homebrew (`brew install maven@3.3`)
3. Java version 1.8.0 (I run `1.8.0_241`)
4. Scala version 2.11 (I run `2.11.12`)
### Build the Jar
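Before building, it may help to confirm that the installed tools match the versions listed in the prerequisites. The following is a minimal sketch, not part of the workshop repository: `version_matches` and `report_version` are hypothetical helper names, and the commented usage assumes `java -version` prints a quoted version string on stderr, which can vary by vendor.

```shell
#!/bin/sh
# Succeeds when version string $1 equals $2 or extends it with "." or "_",
# so 2.11.12 matches 2.11 but 2.12.10 does not.
version_matches() {
  case "$1" in
    "$2" | "$2".* | "$2"_*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report one tool: $1 = label, $2 = detected version, $3 = expected version prefix.
report_version() {
  if version_matches "$2" "$3"; then
    echo "ok: $1 $2"
  else
    echo "mismatch: $1 is '$2', expected $3"
    return 1
  fi
}

# Example usage (left commented out so the sketch runs without Java installed):
# java_version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p')
# report_version java "$java_version" 1.8.0
```

The prefix comparison is deliberately loose: it accepts any patch or update level under the expected version, which mirrors how the prerequisites list a series (for example Scala 2.11) alongside one concrete version.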
~~~
mvn clean verify
~~~
### Run the Spark App
~~~bash
export SPARK_HOME=/path/to/spark-2.4.5
$SPARK_HOME/bin/spark-submit \
--master "local[8]" \
--class "com.twilio.learn.PredictionStream" \
target/spark-redis-predict.jar \
conf/app.yaml
~~~
Alternatively, if `SPARK_HOME` is set and you have Spark 2.4.5 installed:
~~~
scripts/run.sh
~~~
### Send Movies to be Predicted
First, open a new terminal window and connect to the Redis Docker instance to monitor Redis:
~~~
docker exec -it redis5 redis-cli monitor
~~~
Next, open another terminal window to use the redis-cli:
~~~
docker exec -it redis5 redis-cli
~~~
Lastly, paste the following commands into the redis-cli prompt:
~~~
xadd v1:movies:test:kidSafe * show_id 80115338
xadd v1:movies:test:kidSafe * show_id 80196367
~~~
You should see the following in the monitor window:
~~~
1586918227.329652 [0 172.23.0.1:42906] "HMSET" "v1:movies:test:kidSafe:predict:80196367" "category" "Thrillers" "prediction" "0.0022742774331638237" "rating" "TV-MA"
1586918227.329962 [0 172.23.0.1:42862] "HMSET" "v1:movies:test:kidSafe:predict:80115338" "category" "Kids' TV" "rating" "TV-Y" "prediction" "0.9772088004695866"
~~~
Now you are a machine learning expert.
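
To pull the scores out of the monitor output without reading it by eye, the `HMSET` lines can be parsed with standard tools. This is a rough sketch under stated assumptions: `extract_prediction` and `label_prediction` are hypothetical helpers, the line format is taken from the sample output above, and the `0.5` cutoff is an illustrative choice rather than a threshold defined by the workshop.

```shell
#!/bin/sh
# Print "<show_id> <prediction>" for one line of `redis-cli monitor` output.
# Returns non-zero for lines that are not kidSafe prediction writes.
extract_prediction() {
  show_id=$(printf '%s\n' "$1" |
    sed -n 's/.*"v1:movies:test:kidSafe:predict:\([0-9][0-9]*\)".*/\1/p')
  prediction=$(printf '%s\n' "$1" |
    sed -n 's/.*"prediction" "\([^"]*\)".*/\1/p')
  if [ -z "$show_id" ] || [ -z "$prediction" ]; then
    return 1
  fi
  printf '%s %s\n' "$show_id" "$prediction"
}

# Label a score against a cutoff ($2, default 0.5; the default is an assumption).
label_prediction() {
  awk -v p="$1" -v t="${2:-0.5}" \
    'BEGIN { if (p + 0 >= t + 0) print "kidSafe"; else print "not kidSafe" }'
}

# Example usage (commented out; assumes the redis5 container is running):
# docker exec redis5 redis-cli monitor | while IFS= read -r line; do
#   extract_prediction "$line"
# done
```

The field names are matched by key rather than by position because the two sample lines list `prediction` and `rating` in different orders.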