:cake: doddle-model: machine learning in Scala.
https://github.com/picnicml/doddle-model

breeze data-science doddle-model machine-learning scala

Last synced: about 2 months ago
JSON representation

:cake: doddle-model: machine learning in Scala.

Awesome Lists containing this project

README

        

# doddle-model

---


Latest Release | Build Status | Coverage | Code Quality | License | Chat

---

`doddle-model` is an in-memory machine learning library that can be summed up with three main characteristics:
* it is built on top of [Breeze](https://github.com/scalanlp/breeze)
* it provides [immutable estimators](https://en.wikipedia.org/wiki/Immutable_object) that are a _doddle_ to use in parallel code
* it exposes its functionality through a [scikit-learn](https://github.com/scikit-learn/scikit-learn)-like API [2] in idiomatic Scala using [typeclasses](https://en.wikipedia.org/wiki/Type_class)
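
To make the combination of immutable estimators and typeclasses more concrete, here is a minimal, library-free sketch of the pattern. It is not `doddle-model`'s actual API: `MeanRegressor`, `Predictor`, `PredictorOps` and `Features` are hypothetical names chosen for illustration, and plain `Vector`s stand in for Breeze matrices.

```scala
// Illustrative sketch only; every name here is hypothetical, not doddle-model's API.
object EstimatorSketch {

  type Features = Vector[Vector[Double]]

  // An immutable model that predicts the mean of the training targets.
  // `mean` is None until the model has been fitted.
  final case class MeanRegressor(mean: Option[Double] = None)

  // Typeclass describing what it means for a type A to be a predictor.
  trait Predictor[A] {
    def fit(model: A, x: Features, y: Vector[Double]): A
    def predict(model: A, x: Features): Vector[Double]
  }

  object Predictor {
    // Typeclass instance for MeanRegressor.
    implicit val meanRegressorPredictor: Predictor[MeanRegressor] =
      new Predictor[MeanRegressor] {
        def fit(model: MeanRegressor, x: Features, y: Vector[Double]): MeanRegressor = {
          require(y.nonEmpty, "cannot fit on an empty target vector")
          // Return a new fitted copy; the input model is left untouched.
          model.copy(mean = Some(y.sum / y.length))
        }

        def predict(model: MeanRegressor, x: Features): Vector[Double] = {
          val m = model.mean.getOrElse(
            throw new IllegalStateException("model is not fitted"))
          Vector.fill(x.length)(m)
        }
      }
  }

  // Syntax enrichment so that calls read as model.fit(x, y), scikit-learn style.
  implicit class PredictorOps[A](private val model: A) extends AnyVal {
    def fit(x: Features, y: Vector[Double])(implicit ev: Predictor[A]): A =
      ev.fit(model, x, y)
    def predict(x: Features)(implicit ev: Predictor[A]): Vector[Double] =
      ev.predict(model, x)
  }

  // Usage: fitting returns a new value, so `unfitted` stays unfitted.
  val unfitted: MeanRegressor = MeanRegressor()
  val fitted: MeanRegressor =
    unfitted.fit(Vector(Vector(1.0), Vector(2.0), Vector(3.0)), Vector(1.0, 2.0, 3.0))
  val predictions: Vector[Double] =
    fitted.predict(Vector(Vector(10.0), Vector(20.0)))
}
```

Because `fit` returns a new value instead of mutating its receiver, an unfitted estimator can be reused as a template and a fitted one can be shared freely between threads.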

#### How does it compare to existing solutions?
`doddle-model` aims to fill the role in Scala that scikit-learn fills in Python and, as a consequence, it is much more lightweight than, for example, Spark ML. Fitted models can be deployed anywhere, from simple applications to concurrent, distributed systems built with Akka, Apache Beam or a framework of your choice. Estimators are trained in memory, which is advantageous unless your dataset is too large to fit into RAM.
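
As a rough illustration of why immutable fitted models suit concurrent deployments, the sketch below scores several batches in parallel with standard-library `Future`s. `FittedLine` and `scoreInParallel` are hypothetical stand-ins, not part of `doddle-model`; the point is only that a model which is never mutated can be read from many threads without locks.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Illustrative sketch only; FittedLine is a hypothetical stand-in for a fitted model.
object ParallelScoringSketch {

  // Immutable fitted model: y = slope * x + intercept.
  final case class FittedLine(slope: Double, intercept: Double) {
    def predict(x: Double): Double = slope * x + intercept
  }

  // Scores each batch on its own Future. The model is only read, never
  // mutated, so no synchronization is needed.
  def scoreInParallel(model: FittedLine,
                      batches: Vector[Vector[Double]]): Vector[Vector[Double]] = {
    val scored: Future[Vector[Vector[Double]]] =
      Future.traverse(batches)(batch => Future(batch.map(model.predict)))
    // Blocking here keeps the example short; real code would compose the Future.
    Await.result(scored, 10.seconds)
  }
}
```

For example, `ParallelScoringSketch.scoreInParallel(ParallelScoringSketch.FittedLine(2.0, 1.0), Vector(Vector(0.0, 1.0), Vector(2.0, 3.0)))` returns the batches in their original order, since `Future.traverse` preserves ordering.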

### Installation
The project is published for Scala versions 2.11, 2.12 and 2.13. Add the dependency to your SBT project definition:
```scala
libraryDependencies ++= Seq(
  // replace <latest_version> with the latest release number (without the "v" prefix)
  "io.github.picnicml" %% "doddle-model" % "<latest_version>",
  // optionally add to use native libraries for a significant performance boost
  "org.scalanlp" %% "breeze-natives" % "1.0"
)
```
The latest version is shown in the _Latest Release_ badge above; omit the leading _v_ when copying it into the SBT definition.

### Getting Started
For a complete list of code examples see [doddle-model-examples](https://github.com/picnicml/doddle-model-examples).

### Contributing
Want to help us? :raised_hands: We have a [document](https://github.com/picnicml/doddle-model/blob/master/.github/CONTRIBUTING.md) that will make deciding how to do that much easier.

### Performance
The performance of the implementations is described on the [Performance wiki page](https://github.com/picnicml/doddle-model/wiki/Performance). Consult the same page if you encounter `java.lang.OutOfMemoryError: Java heap space`.

### Core Maintainers
This is a collaborative project which wouldn't be possible without all the [awesome contributors](https://github.com/picnicml/doddle-model/graphs/contributors). The core team currently consists of the following developers:
- [@inejc](https://github.com/inejc)
- [@matejklemen](https://github.com/matejklemen)

### Resources
* [1] [Pattern Recognition and Machine Learning, Christopher Bishop](http://www.springer.com/gp/book/9780387310732)
* [2] [API design for machine learning software: experiences from the scikit-learn project, L. Buitinck et al.](https://arxiv.org/abs/1309.0238)
* [3] [UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science, Dua, D. and Karra Taniskidou, E.](http://archive.ics.uci.edu/ml)