https://github.com/shink/spark-ml-algorithm-docker

Spark ML algorithms on docker

# Spark ML Algorithms on Docker







Chinese documentation | Docker Hub | GitHub Packages

## Algorithms

- [KMeans](kmeans)
- [Latent Dirichlet Allocation](lda)
- [Gaussian Mixture Model](gmm)
- [Binomial Logistic Regression](binomial-logistic-regression)
- [Multinomial Logistic Regression](multinomial-logistic-regression)
- [Decision Tree Classification](decision-tree-classification)
- [Random Forest Classification](random-forest-classification)
- [Gradient-boosted Tree Classification](gradient-boosted-tree-classification)
- [Isotonic Regression](isotonic-regression)
- [Factorization Machines Regression](factorization-machines-regression)
- [Naive Bayes](naive-bayes)
- [Linear Regression](linear-regression)

## Development

Requirements:

- JDK 8+
- Maven 3+
- Docker 19+
- Hadoop 2+
- Spark 3+
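Before building, it can help to confirm that these tools are on the `PATH`. The following is a minimal sketch, not part of the project: the helper name `have` is hypothetical, and the command names (`java`, `mvn`, `docker`, `hadoop`, `spark-submit`) are the usual ones for each tool, assumed here rather than taken from the project. It only checks presence, not versions.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
have() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

# Assumed command names for the requirements listed above.
for tool in java mvn docker hadoop spark-submit; do
    have "$tool"
done
```

Any tool reported as missing would need to be installed or added to the `PATH` first; version checks (for example `java -version` or `mvn -version`) are left to the reader.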

Compile and package (tests are skipped):

```shell
mvn clean package -DskipTests
```
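After packaging, Maven normally writes each module's artifact into that module's `target/` directory. The sketch below lists the resulting JAR files; it assumes the project keeps Maven's default output layout, and the function name `list_jars` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list JAR files found under any target/ directory
# below the given root (defaults to the current directory).
list_jars() {
    find "${1:-.}" -type f -path '*/target/*' -name '*.jar' | sort
}

list_jars
```

Run from the repository root, this should print one path per built JAR if the default layout holds.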

Build the Docker image by activating the `docker` Maven profile:

```shell
mvn clean package -DskipTests -Pdocker
```

## References

- [Spark 3.1.2 ML guide](https://spark.apache.org/docs/3.1.2/ml-guide.html)
- [Spark ML examples on GitHub](https://github.com/apache/spark/tree/master/examples/src/main/scala/org/apache/spark/examples/ml)

## License

[MIT](LICENSE)