# Math::SparseMatrix
Raku package for sparse matrix algorithms:
- Implements (some of) the algorithms described (and spelled out in FORTRAN) in the book
  "Sparse Matrix Technology" by S. Pissanetzky, [SP1].
- Provides a convenient interface for accessing sparse matrix elements, rows, columns, and sub-matrices.
-----
## Motivation
Sparse Matrix Algebra (SMA) is a "must have" for many computational workflows.
Here is a (non-exhaustive) list, given in the order of _my_ preferences:

- Recommendation Systems (RS)
- I make recommenders often during Exploratory Data Analysis (EDA).
- For me, RS are "first order regression."
- I also specialize in the making of RS.
- I implemented a Raku recommender without SMA,
["ML::StreamsBlendingRecommender"](https://github.com/antononcube/Raku-ML-StreamsBlendingRecommender), [AAp1],
but it is too slow for "serious" datasets.
- Still useful; see [AAv1].
- Latent Semantic Analysis (LSA)
- LSA is one of my favorite Unsupervised Machine Learning (ML) workflows.
- That means this SMA package should have algorithms that facilitate the programming of:
- Singular Value Decomposition (SVD)
- Non-Negative Matrix Factorization (NNMF)
- Graphs
- There is a natural (representation) connection between sparse matrices and graphs.
- Many graph algorithms can leverage (fast) SMA.
- So far (2024-09-25) the algorithms in "Graph", [AAp2], do not use SMA, and that is feature- and speed-limiting.
- Optimization
- For large-scale optimization problems, using SMA is a must.
- This is because their constraints are typically given with sparse matrices.
- Partial Differential Equations (PDE) solving

-----
## Usage examples
Here is a _simple_ sparse matrix in Compressed Sparse Row (CSR) format:
```perl6
use Math::SparseMatrix;
use Math::SparseMatrix::Utilities;

my $nrow = 5;
my $ncol = 8;
my $density = 0.2;
my $tol = 0.01;
my $type = 'CSR';

my $matrix1 = generate-random-sparse-matrix($nrow, $ncol, :$density, :$tol, :$type):!decorated;
say $matrix1;
```
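For orientation, here is a small hand-built illustration of what a CSR representation stores; the array names mirror the `@row-ptr`, `@col-index`, and `@values` attributes listed in the Design section below, but the snippet itself is illustrative and does not use the package:

```perl6
# Hand-built CSR illustration for the 3x3 matrix [[0, 5, 0], [0, 0, 0], [7, 0, 2]]:
my @values    = 5, 7, 2;      # non-zero entries, scanned row by row
my @col-index = 1, 0, 2;      # column index of each stored value
my @row-ptr   = 0, 1, 1, 3;   # row i occupies @values[@row-ptr[i] ..^ @row-ptr[i+1]]

# Recover the value at row 2, column 0:
for @row-ptr[2] ..^ @row-ptr[3] -> $k {
    say @values[$k] if @col-index[$k] == 0;   # prints 7
}
```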
Here it is "pretty printed":

```perl6
$matrix1.print;
```

Here `10` is multiplied with all elements:
```perl6
my $matrix2 = $matrix1.multiply(10);

$matrix2.print;
```

Here is the dot product of the original matrix with its transpose:
```perl6
my $matrix3 = $matrix1.dot($matrix1.transpose);

$matrix3.print;
```

-----
## Special features
Here are a few features that other SMA packages typically do not provide.
### Named rows and columns
It is very convenient to have named rows and columns that are respected (or preserved)
in the typical SMA operations.

Here is an example:
```perl6
my $smat = Math::SparseMatrix.new($matrix1, row-names => 'a' .. 'e', column-names => 'A' .. 'H');

$smat.print;
```

Here is the dot product of that matrix with its transpose:
```perl6
my $smat2 = $smat.dot($smat.transpose);

$smat2.round(0.02).print;
```

### Implicit value
The sparse matrices can have an _implicit value_ that is different from 0.
For example, adding a number to a sparse matrix produces another sparse matrix
with a different implicit value:

```perl6
my $matrix3 = $matrix1.add(10);
```

```perl6
$matrix3.implicit-value
```

Here is the pretty print:
```perl6
$matrix3.print(:iv)
```

**Remark:** Currently, the implicit values are ignored in `dot`.
-----
## Design
### General
- There should be a "main" class, `Math::SparseMatrix`, that:
- Provides the SMA functionalities
- Delegates to concrete sparse matrix classes that are based on different representation formats
- Can have named rows, columns, and dimensions
- Gives access to sparse matrix elements, rows, columns, and sub-matrices
- The default or "main" core sparse matrix class should use Compressed Sparse Row (CSR) format.
- Also, a class using Dictionary Of Keys (DOK) format should be provided.
- The core sparse matrix classes do not have named rows, columns, or dimensions.
- Ideally, a class using `NativeCall` should be implemented at some point.
- It looks like this is "a must", since the CSR and DOK classes are fairly slow.
- Both "plain C" and macOS [Accelerate](https://developer.apple.com/accelerate/) implementations should be made.
- The _most important operation_ is Matrix-Vector Dot Product.
- The current design is to use one-row or one-column matrices for the vectors.
- Dense vectors are (of course) also supported (see the sketch below).
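Here is a minimal sketch of that design, reusing `generate-random-sparse-matrix` from the usage examples above; the dimensions and densities are arbitrary:

```perl6
use Math::SparseMatrix;
use Math::SparseMatrix::Utilities;

# Under this design a "vector" is just a one-column sparse matrix:
my $mat = generate-random-sparse-matrix(5, 8, density => 0.2, tol => 0.01, type => 'CSR');
my $vec = generate-random-sparse-matrix(8, 1, density => 0.5, tol => 0.01, type => 'CSR');

# Matrix-vector dot product: 5x8 dot 8x1 gives a 5x1 one-column matrix.
my $res = $mat.dot($vec);
$res.print;
```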
### Object-Oriented Programming (OOP) architecture

- The OOP [Decorator Design Pattern](https://en.wikipedia.org/wiki/Decorator_pattern) is used to organize the SMA functionalities.
- In that pattern:
- The _Component_ is played by the class [`Math::SparseMatrix::Abstract`](./lib/Math/SparseMatrix/Abstract.rakumod).
- The _ConcreteComponent_ is played by the classes:
- [`Math::SparseMatrix::CSR`](./lib/Math/SparseMatrix/CSR.rakumod)
- [`Math::SparseMatrix::DOK`](./lib/Math/SparseMatrix/DOK.rakumod)
- The concrete component classes provide the core SMA operations.
- The _Decorator_ is played by [`Math::SparseMatrix`](./lib/Math/SparseMatrix.rakumod).
- That is a "top level", interface class.
- Allows access using named rows and columns.
- "Hides" the actual component class used.Here is a corresponding diagram:
```mermaid
classDiagram
class Abstract["Math::SparseMatrix::Abstract"] {
<<abstract>>
+value-at()
+row-at()
+column-at()
+row-slice()
+AT-POS()
+print()
+transpose()
+add()
+multiply()
+dot()
}
class CSRStruct {
<<C struct>>
}
class NativeCSR["Math::SparseMatrix::CSR::Native"] {
$row_ptr
$col_index
@values
nrow
ncol
implicit_value
}
class NativeAdapter["Math::SparseMatrix::NativeAdapter"] {
+row-ptr
+col-index
+values
+nrow
+ncol
+implicit-value
}
class CSR["Math::SparseMatrix::CSR"] {
@row-ptr
@col-index
@values
nrow
ncol
implicit-value
}
class DOK["Math::SparseMatrix::DOK"] {
%adjacency-map
nrow
ncol
implicit-value
}
class SparseMatrix["Math::SparseMatrix"] {
Abstract core-matrix
+AT-POS()
+print()
+transpose()
+add()
+multiply()
+dot()
}
CSR --> Abstract : implements
DOK --> Abstract : implements
NativeAdapter --> Abstract : implements
SparseMatrix --> Abstract : Hides actual component class
SparseMatrix *--> Abstract
NativeAdapter *--> NativeCSR
NativeCSR -- CSRStruct : represents
```
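To make the delegation concrete, here is a minimal sketch of the decorator idea; the class and attribute names below are illustrative only, not the package's actual implementation:

```perl6
# Illustrative decorator sketch (not the package's actual code):
# the top-level class holds a core matrix and translates row/column
# names into integer indices before forwarding the operation.
class NamedSparseMatrix {
    has $.core-matrix;     # e.g. a Math::SparseMatrix::CSR or ::DOK object
    has %.row-names;       # e.g. 'a' => 0, 'b' => 1, ...
    has %.column-names;    # e.g. 'A' => 0, 'B' => 1, ...

    method value-at($row, $col) {
        # Map names to indices; pass integer indices through unchanged.
        $!core-matrix.value-at(
                %!row-names{$row}    // $row,
                %!column-names{$col} // $col);
    }
}
```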
### Implementation details

- Again, the most important operation is Matrix-Vector Dot Product.
- It has to be as fast as possible.
- There are two Dot Product implementations for CSR:
- Direct
- Symbolic-&-numeric
- (Currently) the direct one is 20-50% faster.
- It seems a good idea to provide, for some operations, _symbolic_ (i.e., sparsity-pattern) methods.
- For example:
- `add-pattern` / `add`
- `dot-pattern` / `dot-numeric`
- It is important to have access methods / operators.
- The accessor implementation uses all three of `AT-POS`, `postcircumfix:<[ ]>`, and `postcircumfix:<[; ]>` (see the example below).
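For example, with the named matrix `$smat` from the "Named rows and columns" section above, the accessors can be exercised along these lines (a sketch; actual return values are not shown):

```perl6
say $smat[0];          # first row, via AT-POS
say $smat[0; 1];       # a single element, via postcircumfix:<[; ]>
say $smat['a'; 'B'];   # the same element, addressed by row and column names
```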
-----

## Performance
- Performance of the CSR and DOK sparse matrices is not good: between 40 and 150 times slower than Wolfram Language.
- (Using the same matrices, of course.)
- It is somewhat surprising that DOK is faster than CSR.
- (Using pure-Raku.)
- `NativeCall`-based implementations are ≈ 100 times faster.
- See ["Math::SparseMatrix::Native"](https://github.com/antononcube/Raku-Math-SparseMatrix-Native), [AAp3].
-----

## Acknowledgements
Thanks to [@lizmat](https://github.com/lizmat) and [@tony-o](https://github.com/tony-o) for helping to figure out the proper use of `postcircumfix:<[ ]>` and `postcircumfix:<[; ]>`
in order to have the named rows and columns functionality.

-----
## References
### Books
[SP1] Sergio Pissanetzky, "Sparse Matrix Technology", Academic Press, 1984. ISBN-10: 0125575807; ISBN-13: 978-0125575805.
### Packages
[AAp1] Anton Antonov,
[ML::StreamsBlendingRecommender Raku package](https://github.com/antononcube/Raku-ML-StreamsBlendingRecommender),
(2021-2024),
[GitHub/antononcube](https://github.com/antononcube).

[AAp2] Anton Antonov,
[Graph Raku package](https://github.com/antononcube/Raku-Graph),
(2024),
[GitHub/antononcube](https://github.com/antononcube).

[AAp3] Anton Antonov,
[Math::SparseMatrix::Native Raku package](https://github.com/antononcube/Raku-Math-SparseMatrix-Native),
(2024),
[GitHub/antononcube](https://github.com/antononcube).

### Videos
[AAv1] Anton Antonov,
["TRC 2022 Implementation of ML algorithms in Raku"](https://youtu.be/efRHfjYebs4?si=J5P8pK1TgGSxdlmD&t=193),
(2022),
[YouTube/antononcube](https://www.youtube.com/@AAA4prediction).