Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/mcapuccini/scala-cp

Conformal Prediction in Scala
https://github.com/mcapuccini/scala-cp

Last synced: 3 months ago
JSON representation

Conformal Prediction in Scala

Awesome Lists containing this project

README

        

# Scala-CP

[![Build Status](https://travis-ci.org/mcapuccini/scala-cp.svg?branch=master)](https://travis-ci.org/mcapuccini/scala-cp)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/810ed0d38e6f47079eab3426f6bf6f95)](https://www.codacy.com/app/m-capuccini/scala-cp?utm_source=github.com&utm_medium=referral&utm_content=mcapuccini/scala-cp&utm_campaign=Badge_Grade)

Scala-CP is a Scala implementation of the Conformal Prediction (CP) framework, introduced by Vovk *et. al.* in the book Algorithmic Learning in a Random World. When assigning confidence to machine learning models, CP is a nice alternative to cross-validation. Instead of predicting a value for a certain feature vector, a conformal predictor outputs a prediction set/region that contains the correct prediction with probability *1-𝜺*, where *𝜺* is a user-defined significance level. The choose of the significance level will of course influence the size of the prediction set/region. In alternative, using CP one can predict object-specific p-values for unseen examples.

## Table of Contents
- [Getting started](#getting-started)
- [Documentation](#documentation)
- [Examples](#examples)
- [Scala-CP with Spark MLlib](https://github.com/mcapuccini/scala-cp/blob/master/cp/src/test/scala/se/uu/it/cp/SparkTest.scala)
- [Scala-CP with LIBLINEAR](https://github.com/mcapuccini/scala-cp/blob/master/cp/src/test/scala/se/uu/it/cp/LibLinTest.scala)
- [ZeppelinHub: Scala-CP with Spark MLlib](https://www.zepl.com/viewer/notebooks/bm90ZTovL21hcmNvY2FwL3plcHBlbGluLWNwLzUyNjVlOGQyYjkxOTRmNGU4MWM4OGJjMzQyMDMzZDk5L25vdGUuanNvbg)
- [List of publications](#list-of-publications)
- [Roadmap](#roadmap)

## Getting started
Scala-CP can be used along with any Scala/Java machine learning library and algorithm. All you have to do is to add the Scala-CP dependency to your *pom.xml* file:

```xml

...

se.uu.it
cp
0.1.0

...

```

## Documentation
The API documentation is available at: https://mcapuccini.github.io/scala-cp/scaladocs/.

## Examples
For some usage examples please refer to the unit tests:

- [Scala-CP with Spark MLlib](https://github.com/mcapuccini/scala-cp/blob/master/cp/src/test/scala/se/uu/it/cp/SparkTest.scala)
- [Scala-CP with LIBLINEAR](https://github.com/mcapuccini/scala-cp/blob/master/cp/src/test/scala/se/uu/it/cp/LibLinTest.scala)

You can also refer to this Apache Zeppelin notebooks for more examples:

- [ZeppelinHub: Scala-CP with Spark MLlib](https://www.zepl.com/viewer/notebooks/bm90ZTovL21hcmNvY2FwL3plcHBlbGluLWNwLzUyNjVlOGQyYjkxOTRmNGU4MWM4OGJjMzQyMDMzZDk5L25vdGUuanNvbg)

## List of publications
- [M. Capuccini, L. Carlsson, U. Norinder and O. Spjuth, "Conformal Prediction in Spark: Large-Scale Machine Learning with Confidence," 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), Limassol, 2015, pp. 61-67.](http://ieeexplore.ieee.org/document/7406330/)
- [Ahmed, L., Georgiev, V., Capuccini, M., Toor, S., Schaal, W., Laure, E., & Spjuth, O. (2018). Efficient iterative virtual screening with Apache Spark and conformal prediction. Journal of cheminformatics, 10(1), 1-8.](https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0265-z)

## Roadmap

### Inductive Conformal Prediction
- [x] Classification
- [ ] Regression

### Transductive Conformal Prediction
- [ ] Classification
- [ ] Regression