https://github.com/kiendang/sparkr-naivebayes-example
apache-spark mllib r scala sparkr
- Host: GitHub
- URL: https://github.com/kiendang/sparkr-naivebayes-example
- Owner: kiendang
- Created: 2014-11-28T10:35:50.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2017-07-03T15:28:27.000Z (over 7 years ago)
- Last Synced: 2024-10-06T12:24:16.267Z (about 1 month ago)
- Topics: apache-spark, mllib, r, scala, sparkr
- Language: R
- Size: 230 KB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
This is an example of using Spark MLlib's Naive Bayes model from R. I used it as a demo at the first meetup of the Singapore Spark user group: http://www.meetup.com/Spark-Singapore/events/218794905/
Slides: http://www.slideshare.net/KienDang5/introduction-to-sparkr
Data source: https://archive.ics.uci.edu/ml/datasets/Spambase
# Notes
Access to MLlib from SparkR is still in development. Use this method to run MLlib from R until MLlib is officially integrated into SparkR.
# Setup
1. Download SparkR:
$ git clone https://github.com/amplab-extras/SparkR-pkg.git
2. Add the following entries to libraryDependencies in SparkR-pkg/pkg/src/build.sbt:
   "org.apache.spark" % "spark-mllib_2.10" % "1.1.0",
   "org.scalanlp" % "breeze_2.10" % "0.10",
   "net.rforge" % "Rserve" % "0.6-8.1"
3. Copy src/RToScalaRDD.scala from this repo to SparkR-pkg/pkg/src/src
4. Install SparkR:
R
devtools::install_local("path/to/SparkR-pkg/pkg")
# Run example:
$ path/to/SparkR/sparkR
> source("./R/1_naivebayes.R")
> source("./R/2_spam_naivebayes.R")
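
For step 2 of the setup, the dependency change could look roughly like the fragment below. This is only a sketch: it assumes the existing SparkR-pkg/pkg/src/build.sbt accepts an extra `libraryDependencies ++= Seq(...)` setting, which has not been checked against that file. If the file already builds its own dependency list, adding the three entries to that list is equivalent. The coordinates and versions are the ones listed in step 2.

```scala
// Fragment for SparkR-pkg/pkg/src/build.sbt (assumed layout, not copied from the actual file).
// Appends the MLlib, Breeze and Rserve artifacts named in step 2 to the existing dependencies.
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-mllib_2.10" % "1.1.0",
  "org.scalanlp" % "breeze_2.10" % "0.10",
  "net.rforge" % "Rserve" % "0.6-8.1"
)
```

Older sbt releases require a blank line between settings in a .sbt file, so leave one before and after this block if the build complains.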