https://github.com/feedzai/fos-core
Feedzai Open Scoring Server
https://github.com/feedzai/fos-core
Last synced: about 1 year ago
JSON representation
Feedzai Open Scoring Server
- Host: GitHub
- URL: https://github.com/feedzai/fos-core
- Owner: feedzai
- License: apache-2.0
- Created: 2014-01-20T17:05:17.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2016-03-09T19:22:35.000Z (over 10 years ago)
- Last Synced: 2025-04-30T04:48:54.583Z (about 1 year ago)
- Language: Java
- Size: 131 KB
- Stars: 7
- Watchers: 32
- Forks: 4
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# feedzai Open Scoring Server (FOS)
FOS is machine learning model training and scoring server in Java.
[](https://feedzaios.ci.cloudbees.com/job/fos-core/)
[](http://www.cloudbees.com/dev)
## Why FOS
There are pretty good machine learning training and scoring frameworks/libraries out there, but they don't provide the
following benefits:
1. Common API: fos provides a common abstraction for model attributes, model training and model scoring. Using a [Weka]
based classifier will use have exactly the same API as using a R based classifier.
1. Scoring & Training as a remote service: Training and scoring can be farmed to dedicated servers in the network
enabling both vertical and horizontal scaling.
1. Import and Export models: A model could be trained in a development box and imported seamlessly into a remote server
1. Scalable and low latency scoring: Marshalling and Unmarshalling scoring requests/responses can be responsible
for a significant amount of overhead. Along with the slow RMI based interface, fos also supports scoring using [Kryo].
## Compiling FOS
You need:
1. [Java SDK]: Java 7
1. [Maven]: Tested with Maven 3
1. Access to maven central repo (or a local proxy)
After both the [Java SDK] and [Maven] are installed run the following command
`mvn clean install`
This should compile fos-core, ran all the tests and install all modules into your local maven repo.
# FOS Quickstart
## Running FOS
In order to start a FOS sever you need to create a bundle that contains both the core components and one or more fos backend implementations.
## Creating a FOS server bundle
To create a bundle type
`make package` on the `fos-core` project root. This will:
1. Build fos core
2. Copy all dependencies listed in `non-core-dependencies.xml` into fos-server lib directory (this includes the weka and R API implementations)
3. Create a tar.gz bundle with all the necessary code plus shell scripts to bootstrap the process. The file will be available as `fos-server/target/fos-server-bin.tar.gz`.
## Running FOS
1. Untar the server bundle on your install location. Let's assume FOS will be installed in the home dir
```
$ cd ~
$ tar xf /fos-server/target/fos-server-bin.tar.gz
```
By now you should have a `fos-server` on your home.
```
$ cd fos-server
$ ~/fos-server $ ls
bin conf lib LICENSE.txt log models README.txt
```
Inside there are a couple directories:
1. `bin` with startup scripts
1. `conf` with FOS configuration scripts
1. `models` where trained models are going to be found
To start fos, run the bundled startup script `bin/startup.sh`
```
$ bin/startup.sh
03-Feb 04:28:31 INFO com.feedzai.fos.server.Runner Starting fos server using configuration from conf/fos.properties
03-Feb 04:28:31 INFO com.feedzai.fos.server.Runner FOS Server started in 272ms
```
We've just started a FOS server with [Weka] support built into the [fos-weka] fos module.
Currently FOS can only support an active module per runtime instance. The active server is
set in the `fos.factoryName` configuration option inside `conf/fos.properties` file:
```
# the fos implementation to launch
fos.factoryName=com.feedzai.fos.impl.weka.WekaManagerFactory
```
## Training and scoring my first model
We've prepared a couple [FOS samples]. Check them out
## Implementing a FOS Module
### Creating your own manager
fos-core does not provide any concrete implementation. However, a bundled fos server includes both a [fos-weka] (active by default) and a [fos-r] implementation. It is pretty easy to create a new implementation if you want to leverage existing code.
Your first step will be to understand [ManagerFactory] interface. A [ManagerFactory] should perform all the necessary boostrapping for a given [Manager] implementation, which must provide implementations for:
1. Model training
2. Model management (add, removal, import and export)
3. A [Scorer] implementation
Lets dissect a real example:
```java
public class WekaManagerFactory implements ManagerFactory {
@Override
public Manager createManager(FosConfig configuration) {
WekaManagerConfig wekaManagerConfig = new WekaManagerConfig(configuration);
return new WekaManager(wekaManagerConfig);
}
}
```
The first step is parse [Manager] specific configuration parameters
```java
WekaManagerConfig wekaManagerConfig = new WekaManagerConfig(configuration);
```
Now that we have specific config, we can create a new [WekaManager]:
```java
return new WekaManager(wekaManagerConfig);
```
Most of the heavy lifting is done by [WekaManager] implementation. Since most of them are Weka specifc, it is not worthwhile to go into implementation details here. The following operations are performed:
1. Search for previously saved models and load their configuration.
1. Create a `fos-weka` [Scorer] implementation.
1. Start listening to requests via RMI and [Kryo].
1. Start a thread pool to allow parallel scoring.
### Implementing model training
In order to implement [model training], you need to supply the [model configuration] and training instances. You can see a practical example in [fos training sample].
A model configuration is composed by:
1. A set of key-value properties relevant from each implementation. We recomend to define all configuration options in a dedicated class (see [weka model configuration] for example).
2. An [attribute] list. The number of atributes in the training data must match the configuration attribute list. An attribute can be:
1. [Numerical attribute]: Numeric attributes can be real or integer types.
2. [Categorical attribute]: For attributes with a limited set of valid values.
You'll have to convert these FOS abstractions to a format your implementation understands.
There are multiple training entry points:
```java
UUID trainAndAdd(ModelConfig config,List instances) throws FOSException;
```
`traindAndAdd` must train a new classifier with the given configuration and using the given `instances`. It should return the serialized classifier and automatically make it avaiable for scoring
```java
UUID trainAndAddFile(ModelConfig config,String path) throws FOSException;
```
`trainAndAddFile` Same as above, but instances are read from a CSV file.
```java
Model train(ModelConfig config,List instances) throws FOSException;
```
`train` Trains a model and returns a Model. The Model implementation can be either a ModelBinary, which contains its serialized representation, or a ModelPMML, which contains a String with its representation in PMML. The model is not made available for scoring.
```java
Model trainFile(ModelConfig config, String path) throws FOSException;
```
`trainFile` same as above, but instances are read from a CSV file.
### Implementing model Scoring
Along with training models, the manager is also responsible for providing a [Scorer] implementation. There is a weka [weka scoring] example for reference.
### Managing models
[Kryo]: https://github.com/EsotericSoftware/kryo
[fos-r]: https://github.com/feedzai/fos-r
[fos-weka]: https://github.com/feedzai/fos-weka
[Weka]: http://www.cs.waikato.ac.nz/ml/weka/
[R]: http://www.r-project.org/
[Maven]: http://maven.apache.org/
[Java SDK]: http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
[FOS samples]: https://github.com/feedzai/FosSample
[ManagerFactory]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/ManagerFactory.java?source=cc
[Manager]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/Manager.java?source=cc
[Scorer]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/Scorer.java?source=cc
[WekaManager]: https://github.com/feedzai/fos-weka/blob/master/src/main/java/com/feedzai/fos/impl/weka/WekaManager.java?source=cc
[model training]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/Manager.java?source=c#L120
[fos training sample]: https://github.com/feedzai/fos-sample/blob/master/src/main/java/com/feedzai/fos/samples/weka/WekaTraining.java#L65
[attribute]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/Attribute.java?source=cc#L37
[Numerical attribute]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/NumericAttribute.java?source=c#L32
[Categorical attribute]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/CategoricalAttribute.java#L46
[model configuration]: https://github.com/feedzai/fos-core/blob/master/fos-api/src/main/java/com/feedzai/fos/api/ModelConfig.java?source=cc#L45
[weka model configuration]: https://github.com/feedzai/fos-weka/blob/master/src/main/java/com/feedzai/fos/impl/weka/config/WekaModelConfig.java?source=c#L45
[weka scoring]: https://github.com/feedzai/fos-sample/blob/master/src/main/java/com/feedzai/fos/samples/weka/WekaScoring.java#L45