Distributed Linear Programming Solver on top of Apache Spark
- Host: GitHub
- URL: https://github.com/ehsanmok/spark-lp
- Owner: ehsanmok
- License: apache-2.0
- Created: 2016-11-14T22:15:14.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2021-01-04T19:07:27.000Z (about 4 years ago)
- Last Synced: 2025-01-06T17:14:31.719Z (5 days ago)
- Topics: distributed-computing, distributed-optimization, high-performance, linear-programming, scala, spark
- Language: Scala
- Homepage: https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340337
- Size: 38.1 KB
- Stars: 75
- Watchers: 7
- Forks: 23
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# spark-lp
![Project unmaintained](https://img.shields.io/badge/project-unmaintained-red.svg)
This package offers an implementation of [Mehrotra's predictor-corrector interior point algorithm](https://en.wikipedia.org/wiki/Mehrotra_predictor%E2%80%93corrector_method), described in my thesis [Distributed linear programming with Apache Spark](https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340337), to solve **large-scale** [linear programming](https://en.wikipedia.org/wiki/Linear_programming) problems at the *lowest cost* using Apache Spark.
Linear programming has the following standard form:
```
minimize    c^T x
subject to  Ax = b, x >= 0
```

where `c, b` are given vectors (`(.)^T` denotes the transpose), `A` is a given `m` by `n` matrix, and `x` is the vector of unknowns. We assume that the number of rows (equations) of `A` is at most the number of columns (unknowns), i.e. `m <= n`, and that `A` has full row rank, so `AA^T` is invertible.
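For orientation, the following is a high-level sketch of one iteration of Mehrotra's predictor-corrector method applied to this standard form; it is the standard textbook formulation (see the Wikipedia link above and the thesis), not code taken from this repository. Here `y` and `s >= 0` are the dual variables satisfying `A^T y + s = c`, `X = diag(x)`, `S = diag(s)`, and `e` is the all-ones vector.

```
1. Predictor: solve the Newton system for the affine-scaling direction
       A dx         = b - A x
       A^T dy + ds  = c - A^T y - s
       S dx + X ds  = -X S e
2. Compute the largest steps a_p, a_d keeping x + a_p dx >= 0 and s + a_d ds >= 0,
   then set
       mu     = x^T s / n
       mu_aff = (x + a_p dx)^T (s + a_d ds) / n
       sigma  = (mu_aff / mu)^3        (adaptive centering parameter)
3. Corrector: re-solve the same system with the last right-hand side replaced by
       -X S e - dX dS e + sigma mu e
   where dX = diag(dx), dS = diag(ds) come from the predictor step.
4. Update x, y, s along the corrector direction, stepping a fixed fraction
   (e.g. 0.99) of the maximal feasible step, and repeat until the duality
   measure x^T s / n is sufficiently small.
```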
## Example

The following is an example of using spark-lp *locally* to solve a linear programming problem in parallel with 2 cores and 2 partitions:
```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{DenseVector, Vector, Vectors}
import org.apache.spark.mllib.optimization.lp.VectorSpace._
import org.apache.spark.mllib.optimization.lp.vs.dvector.DVectorSpace
import org.apache.spark.mllib.optimization.lp.vs.vector.DenseVectorSpace
import org.apache.spark.mllib.optimization.lp.LP

// Run Spark locally with 2 cores and split the distributed data into 2 partitions.
val sparkConf = new SparkConf().setMaster("local[2]").setAppName("TestLPSolver")
val sc = new SparkContext(sparkConf)
val numPartitions = 2

// Problem data: objective coefficients, matrix entries and right-hand side.
val cArray = Array(2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0)
val BArray = Array(
  Array(12.0, 16.0, 30.0, 1.0, 0.0),
  Array(24.0, 16.0, 12.0, 0.0, 1.0),
  Array(-1.0, 0.0, 0.0, 0.0, 0.0),
  Array(0.0, -1.0, 0.0, 0.0, 0.0),
  Array(0.0, 0.0, -1.0, 0.0, 0.0),
  Array(0.0, 0.0, 0.0, 1.0, 0.0),
  Array(0.0, 0.0, 0.0, 0.0, 1.0))
val bArray = Array(120.0, 120.0, 120.0, 15.0, 15.0)

// Distribute the objective vector and the matrix (stored as an RDD of rows);
// the right-hand side stays as a small local DenseVector.
val c: DVector = sc.parallelize(cArray, numPartitions).glom.map(new DenseVector(_))
val rows: DMatrix = sc.parallelize(BArray, numPartitions).map(Vectors.dense(_))
val b: DenseVector = new DenseVector(bArray)

// Solve the LP; returns the optimal objective value and the optimal vector.
val (v, x): (Double, DVector) = LP.solve(c, rows, b, sc=sc)
val xx = Vectors.dense(x.flatMap(_.toArray).collect())
println(s"optimal vector is $xx")
println("optimal min value: " + v)
```
## Software Architecture Overview

A detailed description of the design is given in Chapter 4 of the [thesis](https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340337).
## Advantages
* spark-lp is unique in that it is **open-source** and can solve large-scale LP problems in a distributed, **fault-tolerant** way on **commodity clusters** of machines. It therefore offers the *lowest-cost* option for such applications. See page 42 of the [thesis](https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340337) for cluster results.
* spark-lp is at least ~10X *faster* and more accurate than spark-tfocs for solving large-scale LP problems. See page 38 of the [thesis](https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340337) for local results. Our benchmarks show that spark-tfocs is *not* suitable even for small LP problems.
## Future plans:
* Add preprocessing to capture more general LP formats.
* Add infeasibility detection.
* Extend to QP solver.
* Add GPU support, as described on page 47 of the [thesis](https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340337), using the INDArray class provided by the [ND4J](http://nd4j.org/) library.