https://github.com/cornellius-gp/linear_operator

A LinearOperator implementation to wrap the numerical nuts and bolts of GPyTorch
Host: GitHub
URL: https://github.com/cornellius-gp/linear_operator
Owner: cornellius-gp
License: mit
Created: 2022-05-23T21:09:54.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-09-06T22:30:17.000Z (4 months ago)
Last Synced: 2024-12-09T03:59:52.236Z (14 days ago)
Language: Python
Size: 2.79 MB
Stars: 95
Watchers: 8
Forks: 29
Open Issues: 25
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

README

# LinearOperator

[![Test](https://github.com/cornellius-gp/linear_operator/actions/workflows/run_test_suite.yml/badge.svg)](https://github.com/cornellius-gp/linear_operator/actions/workflows/run_test_suite.yml)

[![Documentation](https://readthedocs.org/projects/linear-operator/badge/?version=latest)](https://linear-operator.readthedocs.io/en/latest/?badge=latest)

[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

[![Python Version](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

[![Conda](https://img.shields.io/conda/v/gpytorch/linear_operator.svg)](https://anaconda.org/gpytorch/linear_operator)

[![PyPI](https://img.shields.io/pypi/v/linear_operator.svg)](https://pypi.org/project/linear_operator)

LinearOperator is a PyTorch package for abstracting away the linear algebra routines needed for structured matrices (or operators).

**This package is in beta.**

Currently, most of the functionality only supports positive semi-definite and triangular matrices.

Package development TODOs:

 - [x] Support PSD operators

 - [x] Support triangular operators

 - [ ] Interface to specify structure (i.e. symmetric, triangular, PSD, etc.)

 - [ ] Add algebraic routines for symmetric operators

 - [ ] Add algebraic routines for generic square operators

 - [ ] Add algebraic routines for generic rectangular operators

 - [ ] Add sparse operators

To get started, run either

```sh

pip install linear_operator

# or

conda install linear_operator -c gpytorch

```

or [see below](#installation) for more detailed instructions.

## Why LinearOperator

Before describing what linear operators are and why they make a useful abstraction, it's easiest to see an example.

Let's say you wanted to compute a matrix solve:

$$\boldsymbol A^{-1} \boldsymbol b.$$

If you didn't know anything about the matrix $\boldsymbol A$, the simplest (and best) way to accomplish this in code is:

```python

# A = torch.randn(1000, 1000)

# b = torch.randn(1000)

torch.linalg.solve(A, b)  # computes A^{-1} b

```

While this is easy, the `solve` routine is $\mathcal O(N^3)$, which gets very slow as $N$ grows large.

However, let's imagine that we knew that $\boldsymbol A$ was equal to a low rank matrix plus a diagonal

(i.e. $\boldsymbol A = \boldsymbol C \boldsymbol C^\top + \boldsymbol D$

for some skinny matrix $\boldsymbol C$ and some diagonal matrix $\boldsymbol D$).

There's now a very efficient $\mathcal O(N)$ routine to compute $\boldsymbol A^{-1} \boldsymbol b$ (the [Woodbury formula](https://en.wikipedia.org/wiki/Woodbury_matrix_identity)).

**In general**, if we know that $\boldsymbol A$ has structure,

we want to use efficient linear algebra routines - rather than the general routines -

that exploit this structure.
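
For reference, the identity behind this routine can be written out for the low-rank-plus-diagonal case. Assuming $\boldsymbol C$ is $N \times k$ with $k \ll N$ and $\boldsymbol D$ is invertible (the symbol $k$ is introduced here for illustration), the Woodbury formula gives

$$\boldsymbol A^{-1} \boldsymbol b = \left( \boldsymbol C \boldsymbol C^\top + \boldsymbol D \right)^{-1} \boldsymbol b = \boldsymbol D^{-1} \boldsymbol b - \boldsymbol D^{-1} \boldsymbol C \left( \boldsymbol I_k + \boldsymbol C^\top \boldsymbol D^{-1} \boldsymbol C \right)^{-1} \boldsymbol C^\top \boldsymbol D^{-1} \boldsymbol b.$$

The only dense solve involves the $k \times k$ matrix $\boldsymbol I_k + \boldsymbol C^\top \boldsymbol D^{-1} \boldsymbol C$, so the total cost is $\mathcal O(N k^2 + k^3)$, which is linear in $N$ when $k$ is fixed.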

### Without LinearOperator

Implementing the efficient solve that exploits $\boldsymbol A$'s low-rank-plus-diagonal structure would look something like this:

```python

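import torch

def low_rank_plus_diagonal_solve(C, d, b):
    # A = C C^T + diag(d)
    # A^{-1} b = D^{-1} b - D^{-1} C (I + C^T D^{-1} C)^{-1} C^T D^{-1} b
    #   where D = diag(d)
    b = b.unsqueeze(-1)  # torch.cholesky_solve expects a matrix right-hand side
    D_inv_b = b / d.unsqueeze(-1)
    D_inv_C = C / d.unsqueeze(-1)
    # I + C^T D^{-1} C is k x k, where k = C.size(-1) is the rank of the root
    eye = torch.eye(C.size(-1), dtype=C.dtype, device=C.device)
    chol = torch.linalg.cholesky(eye + C.mT @ D_inv_C, upper=False)
    return (
        D_inv_b - D_inv_C @ torch.cholesky_solve(C.mT @ D_inv_b, chol, upper=False)
    ).squeeze(-1)

# C = torch.randn(1000, 20)
# d = torch.rand(1000) + 0.1  # positive diagonal, so the Cholesky factorization exists
# b = torch.randn(1000)
low_rank_plus_diagonal_solve(C, d, b)  # computes A^{-1} b in O(N) time, instead of O(N^3)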

```

While this is efficient code, it's not ideal for a number of reasons:

1. It's a lot more complicated than `torch.linalg.solve(A, b)`.

2. There's no object that represents $\boldsymbol A$.

   To perform any math with $\boldsymbol A$, we have to pass around the matrix `C` and the vector `d`.
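
As a sanity check, a Woodbury-based solve can be compared against a dense `torch.linalg.solve` on a small problem. The sketch below is illustrative only: the helper name `woodbury_solve`, the problem sizes, and the strictly positive diagonal are assumptions chosen for the example, and it uses `float64` to keep round-off small.

```python
import torch

def woodbury_solve(C, d, b):
    """Solve (C C^T + diag(d)) x = b for a vector b via the Woodbury identity."""
    D_inv_b = b / d                      # D^{-1} b, shape (N,)
    D_inv_C = C / d.unsqueeze(-1)        # D^{-1} C, shape (N, k)
    k = C.size(-1)
    # Small k x k matrix I + C^T D^{-1} C
    cap = torch.eye(k, dtype=C.dtype) + C.mT @ D_inv_C
    # Solve the k x k system instead of the N x N one
    small = torch.linalg.solve(cap, C.mT @ D_inv_b)
    return D_inv_b - D_inv_C @ small

torch.manual_seed(0)
N, k = 200, 5
C = torch.randn(N, k, dtype=torch.float64)
d = torch.rand(N, dtype=torch.float64) + 0.5   # strictly positive diagonal (assumption)
b = torch.randn(N, dtype=torch.float64)

A_dense = C @ C.mT + torch.diag(d)             # formed here only to check the answer
x_fast = woodbury_solve(C, d, b)
x_dense = torch.linalg.solve(A_dense, b)
max_abs_diff = (x_fast - x_dense).abs().max().item()
```

The two results should agree to within floating-point error, while only the structured version avoids ever building the `N x N` matrix.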

### With LinearOperator

The LinearOperator package offers the best of both worlds:

```python

from linear_operator.operators import DiagLinearOperator, LowRankRootLinearOperator

# C = torch.randn(1000, 20)

# d = torch.rand(1000) + 0.1  # positive diagonal, so A is positive definite

# b = torch.randn(1000)

A = LowRankRootLinearOperator(C) + DiagLinearOperator(d)  # represents C C^T + diag(d)

```

It provides an interface that lets us treat $\boldsymbol A$ as if it were a generic tensor,

using the standard PyTorch API:

```python

torch.linalg.solve(A, b)  # computes A^{-1} b efficiently!

```

Under the hood, the `LinearOperator` object keeps track of the algebraic structure of $\boldsymbol A$ (low rank plus diagonal)

and determines the most efficient routine to use (the Woodbury formula).

This way, we can get an efficient $\mathcal O(N)$ solve while abstracting away all of the details.

Crucially, $\boldsymbol A$ is never explicitly instantiated as a matrix, which makes it possible to scale

to very large operators without running out of memory:

```python

# C = torch.randn(10000000, 20)

# d = torch.rand(10000000) + 0.1  # positive diagonal, so A is positive definite

# b = torch.randn(10000000)

A = LowRankRootLinearOperator(C) + DiagLinearOperator(d)  # represents a 10M x 10M matrix!

torch.linalg.solve(A, b)  # computes A^{-1} b efficiently!

```

## What is a Linear Operator?

A linear operator is a generalization of a matrix.

It is a linear function that is defined by its application to a vector.

The most common linear operators are (potentially structured) matrices,

where the function applying them to a vector is a (potentially efficient)

matrix-vector multiplication routine.

In code, a `LinearOperator` is a class that

1. specifies the tensor(s) needed to define the LinearOperator,

1. specifies a `_matmul` function (how the LinearOperator is applied to a vector),

1. specifies a `_size` function (how big the LinearOperator is if it is represented as a matrix, or batch of matrices),

1. specifies a `_transpose_nonbatch` function (the adjoint of the LinearOperator), and

1. (optionally) defines other functions (e.g. `logdet`, `eigh`, etc.) to accelerate computations for which efficient structure-exploiting routines exist.

For example:

```python

import torch
import linear_operator

class DiagLinearOperator(linear_operator.LinearOperator):
    r"""
    A LinearOperator representing a diagonal matrix.
    """
    def __init__(self, diag):
        # diag: the vector that defines the diagonal of the matrix
        super().__init__(diag)  # register the defining tensor with the base class
        self.diag = diag

    def _matmul(self, v):
        return self.diag.unsqueeze(-1) * v

    def _size(self):
        return torch.Size([*self.diag.shape, self.diag.size(-1)])

    def _transpose_nonbatch(self):
        return self  # Diagonal matrices are symmetric

    # this function is optional, but it will accelerate computation
    def logdet(self):
        return self.diag.log().sum(dim=-1)

# ...
D = DiagLinearOperator(torch.tensor([1., 2., 3.]))
# Represents the matrix
#   [[1., 0., 0.],
#    [0., 2., 0.],
#    [0., 0., 3.]]
torch.matmul(D, torch.tensor([4., 5., 6.]))
# Returns [4., 10., 18.]

```
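
To see why an optional override such as `logdet` pays off, the following plain-PyTorch sketch compares the diagonal shortcut with the generic dense routine. It does not use the `linear_operator` package; the variable names are illustrative.

```python
import torch

diag = torch.tensor([1.0, 2.0, 3.0])

# Structure-exploiting version: O(N) work, no N x N matrix is formed
fast_logdet = diag.log().sum(dim=-1)

# Generic version: build the dense matrix and call the general routine
dense_logdet = torch.logdet(torch.diag(diag))
```

Both values equal $\log(1 \cdot 2 \cdot 3) = \log 6$, but only the first scales linearly with the size of the diagonal.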

While `_matmul`, `_size`, and `_transpose_nonbatch` might seem like a limited set of functions,

it turns out that most functions on the `torch` and `torch.linalg` namespaces can be efficiently implemented

using only these three primitive functions.

Moreover, because `_matmul` is a linear function, it is very easy to compose linear operators in various ways.

For example: adding two linear operators (`SumLinearOperator`) just requires adding the output of their `_matmul` functions.

This makes it possible to define very complex compositional structures that still yield efficient linear algebraic routines.

Finally, `LinearOperator` objects can be composed with one another, yielding new `LinearOperator` objects and automatically keeping track of algebraic structure after each computation.

As a result, users never need to reason about what efficient linear algebra routines to use (so long as the input elements defined by the user encode known input structure).

See the [using LinearOperator objects](#using-linearoperator-objects) section for more details.
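
The composition argument above can be sketched without the package at all. The class `MatvecOperator` below is hypothetical and written only for this illustration (it is not part of `linear_operator`): an operator is represented by a shape and a matmul closure, and the sum of two operators simply adds their matmul outputs.

```python
import torch

class MatvecOperator:
    """Hypothetical minimal operator: defined only by its shape and a matmul closure."""

    def __init__(self, shape, matmul_fn):
        self.shape = shape
        self._matmul = matmul_fn

    def __matmul__(self, rhs):
        return self._matmul(rhs)

    def __add__(self, other):
        assert self.shape == other.shape
        # Linearity: (A + B) v = A v + B v, so the sum only needs both matmuls
        return MatvecOperator(self.shape, lambda rhs: self @ rhs + other @ rhs)

torch.manual_seed(0)
N, k = 6, 2
C = torch.randn(N, k)
d = torch.rand(N) + 1.0

low_rank = MatvecOperator((N, N), lambda rhs: C @ (C.mT @ rhs))       # C C^T v without forming C C^T
diagonal = MatvecOperator((N, N), lambda rhs: d.unsqueeze(-1) * rhs)  # diag(d) v in O(N)
A = low_rank + diagonal

V = torch.randn(N, 3)
result = A @ V
expected = (C @ C.mT + torch.diag(d)) @ V   # dense reference, for checking only
```

The real package tracks much more (batch shapes, structure-specific solves, autograd), but the core idea is the same: composite operators are built from the matmuls of their parts.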

## Use Cases

There are several use cases for the LinearOperator package.

Here we highlight two general themes:

### Modular Code for Structured Matrices

For example, let's say that you have a generative model that involves

sampling from a high-dimensional multivariate Gaussian.

This sampling operation will require storing and manipulating a large covariance matrix,

so to speed things up you might want to experiment with different structured

approximations of that covariance matrix.

This is easy with the LinearOperator package.

```python

import torch
from gpytorch.distributions import MultivariateNormal
from linear_operator.operators import DiagLinearOperator

# variance = torch.rand(10000)  # variances must be positive
cov = DiagLinearOperator(variance)
# or
# cov = LowRankRootLinearOperator(...) + DiagLinearOperator(...)
# or
# cov = KroneckerProductLinearOperator(...)
# or
# cov = ToeplitzLinearOperator(...)
# or
# ...

mvn = MultivariateNormal(torch.zeros(cov.size(-1)), cov)  # 10000-dimensional MVN
mvn.rsample()  # returns a 10000-dimensional vector

```
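
To make the benefit of a structured covariance concrete, here is a plain-PyTorch sketch of sampling from zero-mean Gaussians with diagonal and low-rank-plus-diagonal covariances without ever forming the covariance matrix. The function names and sizes are assumptions for this example, and this is not necessarily how `MultivariateNormal` draws its samples internally.

```python
import torch

torch.manual_seed(0)
N, k = 10_000, 20
C = torch.randn(N, k) / k ** 0.5
d = torch.rand(N) + 0.1          # positive variances (assumption)

def sample_diag(d, num_samples):
    # x = sqrt(d) * z has covariance diag(d)
    z = torch.randn(num_samples, d.size(-1))
    return d.sqrt() * z

def sample_low_rank_plus_diag(C, d, num_samples):
    # x = C z1 + sqrt(d) * z2 has covariance C C^T + diag(d)
    z1 = torch.randn(num_samples, C.size(-1))
    z2 = torch.randn(num_samples, C.size(-2))
    return z1 @ C.mT + d.sqrt() * z2

samples_a = sample_diag(d, 4)                   # 4 x 10000
samples_b = sample_low_rank_plus_diag(C, d, 4)  # 4 x 10000
```

Each sampler needs only $\mathcal O(N)$ or $\mathcal O(Nk)$ memory, whereas a dense `10000 x 10000` covariance would need a Cholesky factorization of the full matrix.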

### Efficient Routines for Complex Operators

Many of the efficient linear algebra routines in LinearOperator are iterative algorithms

based on matrix-vector multiplication.

Since matrix-vector multiplication obeys many nice compositional properties,

it is possible to obtain efficient routines for extremely complex compositional LinearOperators:

```python

from linear_operator.operators import KroneckerProductLinearOperator, RootLinearOperator, ToeplitzLinearOperator

# mat1 = 200 x 200 PSD matrix

# mat2 = 100 x 100 PSD matrix

# vec3 = 20000 vector

A = KroneckerProductLinearOperator(mat1, mat2) + RootLinearOperator(ToeplitzLinearOperator(vec3))

# represents a 20000 x 20000 matrix

torch.linalg.solve(A, torch.randn(20000))  # Sub O(N^3) routine!

```
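
As an illustration of an iterative routine that needs nothing but matrix-vector products, here is a minimal conjugate gradients sketch in plain PyTorch. It is a textbook version under the assumption that the operator is symmetric positive definite; the function name, tolerances, and test problem are chosen for this example, and it should not be read as the package's actual solver.

```python
import torch

def conjugate_gradients(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = torch.zeros_like(b)
    r = b - matvec(x)          # residual
    p = r.clone()              # search direction
    rs_old = torch.dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs_old / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = torch.dot(r, r)
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

torch.manual_seed(0)
N, k = 500, 10
C = torch.randn(N, k, dtype=torch.float64)
d = torch.rand(N, dtype=torch.float64) + 1.0
b = torch.randn(N, dtype=torch.float64)

# The composite operator C C^T + diag(d) is only ever touched through its matvec
matvec = lambda v: C @ (C.mT @ v) + d * v
x = conjugate_gradients(matvec, b)
residual_norm = (matvec(x) - b).norm().item()
```

Because the solver only calls `matvec`, any composition of operators with a cheap matrix-vector product inherits a cheap solve.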

## Using LinearOperator Objects

LinearOperator objects share (mostly) the same API as `torch.Tensor` objects.

Under the hood, these objects use `__torch_function__` to dispatch all efficient linear algebra operations

to the `torch` and `torch.linalg` namespaces.

This includes

- `torch.add`

- `torch.cat`

- `torch.clone`

- `torch.diagonal`

- `torch.dim`

- `torch.div`

- `torch.expand`

- `torch.logdet`

- `torch.matmul`

- `torch.numel`

- `torch.permute`

- `torch.prod`

- `torch.squeeze`

- `torch.sub`

- `torch.sum`

- `torch.transpose`

- `torch.unsqueeze`

- `torch.linalg.cholesky`

- `torch.linalg.eigh`

- `torch.linalg.eigvalsh`

- `torch.linalg.solve`

- `torch.linalg.svd`

Each of these functions will either return a `torch.Tensor`, or a new `LinearOperator` object,

depending on the function.

For example:

```python

# A = RootLinearOperator(...)

# B = ToeplitzLinearOperator(...)

# d = vec

C = torch.matmul(A, B)  # A new LinearOperator representing the product of A and B

torch.linalg.solve(C, d)  # A torch.Tensor

```

For more examples, see the [examples folder](https://github.com/cornellius-gp/linear_operator/blob/main/examples/).

### Batch Support and Broadcasting

`LinearOperator` objects operate naturally in batch mode.

For example, to represent a batch of 3 `100 x 100` diagonal matrices:

```python

# d = torch.randn(3, 100)

D = DiagLinearOperator(d)  # Represents an operator of size 3 x 100 x 100

```

These objects fully support broadcasted operations:

```python

D @ torch.randn(100, 2)  # Returns a tensor of size 3 x 100 x 2

D2 = DiagLinearOperator(torch.randn([2, 1, 100]))  # Represents an operator of size 2 x 1 x 100 x 100

D2 + D  # Represents an operator of size 2 x 3 x 100 x 100

```
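
The resulting shapes follow the usual PyTorch broadcasting rules applied to the batch dimensions. The sketch below checks them with dense stand-ins built by `torch.diag_embed` (a smaller matrix size is used to keep it cheap); it does not use the `linear_operator` package.

```python
import torch

batch_a = torch.Size([3])       # batch shape of D  (3 x 100 x 100)
batch_b = torch.Size([2, 1])    # batch shape of D2 (2 x 1 x 100 x 100)
print(torch.broadcast_shapes(batch_a, batch_b))  # torch.Size([2, 3])

# Dense check with explicit diagonal matrices (small n to keep it cheap)
n = 4
dense_a = torch.diag_embed(torch.randn(3, n))      # 3 x 4 x 4
dense_b = torch.diag_embed(torch.randn(2, 1, n))   # 2 x 1 x 4 x 4
summed = dense_a + dense_b                         # 2 x 3 x 4 x 4
product = dense_a @ torch.randn(n, 2)              # 3 x 4 x 2
```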

### Indexing

`LinearOperator` objects can be indexed in ways similar to torch Tensors. This includes:

- Integer indexing (get a row, column, or batch)

- Slice indexing (get a subset of rows, columns, or batches)

- LongTensor indexing (get a set of individual entries by index)

- Ellipses (support indexing operations with arbitrary batch dimensions)

```python

D = DiagLinearOperator(torch.randn(2, 3, 100))  # Represents an operator of size 2 x 3 x 100 x 100

D[-1]  # Returns a 3 x 100 x 100 operator

D[..., :10, -5:]  # Returns a 2 x 3 x 10 x 5 operator

D[..., torch.LongTensor([0, 1, 2, 3]), torch.LongTensor([0, 1, 2, 3])]  # Returns a 2 x 3 x 4 tensor

```
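
For comparison, the same indexing expressions applied to a dense stand-in (built with `torch.diag_embed`, without the `linear_operator` package) give the shapes listed above. This is only a shape check on ordinary tensors, not a statement about how the package implements indexing.

```python
import torch

torch.manual_seed(0)
diag = torch.randn(2, 3, 100)
dense = torch.diag_embed(diag)          # dense 2 x 3 x 100 x 100 stand-in

last_batch = dense[-1]                  # 3 x 100 x 100
block = dense[..., :10, -5:]            # 2 x 3 x 10 x 5
idx = torch.tensor([0, 1, 2, 3])
entries = dense[..., idx, idx]          # 2 x 3 x 4, the first four diagonal entries
```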

### Composition and Decoration

LinearOperators can be composed with one another in various ways.

This includes

- Addition (`LinearOpA + LinearOpB`)

- Matrix multiplication (`LinearOpA @ LinearOpB`)

- Concatenation (`torch.cat([LinearOpA, LinearOpB], dim=-2)`)

- Kronecker product (`torch.kron(LinearOpA, LinearOpB)`)

In addition, there are many ways to "decorate" LinearOperator objects.

This includes:

- Elementwise multiplying by constants (`torch.mul(2., LinearOpA)`)

- Summing over batches (`torch.sum(LinearOpA, dim=-3)`)

- Elementwise multiplying over batches (`torch.prod(LinearOpA, dim=-3)`)

See the documentation for a [full list of supported composition and decoration operations](https://linear-operator.readthedocs.io/en/latest/composition_decoration_operators.html).
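
To illustrate why compositions such as the Kronecker product stay cheap, the sketch below multiplies a Kronecker product by a vector without forming it, using the standard identity $(\boldsymbol A \otimes \boldsymbol B)\,\mathrm{vec}(\boldsymbol X) = \mathrm{vec}(\boldsymbol A \boldsymbol X \boldsymbol B^\top)$ for row-major flattening. The helper `kron_matvec` is a name made up for this example and is not claimed to be the package's implementation.

```python
import torch

def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product."""
    n, m = A.size(-1), B.size(-1)
    X = x.reshape(n, m)              # row-major reshape of the vector
    return (A @ X @ B.mT).reshape(-1)

torch.manual_seed(0)
A = torch.randn(4, 4, dtype=torch.float64)
B = torch.randn(3, 3, dtype=torch.float64)
x = torch.randn(12, dtype=torch.float64)

fast = kron_matvec(A, B, x)
dense = torch.kron(A, B) @ x         # forms the full 12 x 12 matrix, for checking only
```

For an `n x n` and an `m x m` factor, the structured product costs $\mathcal O(nm(n + m))$ instead of the $\mathcal O(n^2 m^2)$ needed by the dense matrix.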

## Installation

LinearOperator requires Python >= 3.8.

### Standard Installation (Most Recent Stable Version)

We recommend installing via `pip` or Anaconda:

```sh

pip install linear_operator

# or

conda install linear_operator -c gpytorch

```

The installation requires the following packages:

- PyTorch >= 1.11

- Scipy

You can customize your PyTorch installation (i.e. CUDA version, CPU only option)

by following the [PyTorch installation instructions](https://pytorch.org/get-started/locally/).

### Installing from the `main` Branch (Latest Unstable Version)

To install what is currently on the `main` branch (potentially buggy and unstable):

```sh

pip install --upgrade git+https://github.com/cornellius-gp/linear_operator.git

```

### Development Installation

If you are contributing a pull request, it is best to perform a manual installation:

```sh

git clone https://github.com/cornellius-gp/linear_operator.git

cd linear_operator

pip install -e ".[dev,docs,test]"

```

## Contributing

See the contributing guidelines [CONTRIBUTING.md](https://github.com/cornellius-gp/linear_operator/blob/main/CONTRIBUTING.md)

for information on submitting issues and pull requests.

## License

LinearOperator is [MIT licensed](https://github.com/cornellius-gp/linear_operator/blob/main/LICENSE).