{"id":17336847,"url":"https://github.com/wesselb/stheno","last_synced_at":"2025-04-05T12:05:01.527Z","repository":{"id":45537433,"uuid":"94204636","full_name":"wesselb/stheno","owner":"wesselb","description":"Gaussian process modelling in Python","archived":false,"fork":false,"pushed_at":"2024-01-20T14:11:37.000Z","size":12094,"stargazers_count":218,"open_issues_count":4,"forks_count":18,"subscribers_count":10,"default_branch":"master","last_synced_at":"2024-12-06T22:36:11.572Z","etag":null,"topics":["gaussian-processes","machine-learning","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wesselb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-06-13T11:15:22.000Z","updated_at":"2024-11-07T11:44:11.000Z","dependencies_parsed_at":"2024-01-20T15:26:25.877Z","dependency_job_id":"fa5613eb-7ac3-45f3-bf8c-a4793ce3013e","html_url":"https://github.com/wesselb/stheno","commit_stats":{"total_commits":661,"total_committers":2,"mean_commits":330.5,"dds":"0.013615733736762503","last_synced_commit":"f6a9ca3a60c82c8d7d61d117708428133bf6f7d0"},"previous_names":[],"tags_count":38,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wesselb%2Fstheno","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wesselb%2Fstheno/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wesselb%2Fstheno/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wesselb%2Fstheno/manifests","owner_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wesselb","download_url":"https://codeload.github.com/wesselb/stheno/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247332602,"owners_count":20921853,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gaussian-processes","machine-learning","python"],"created_at":"2024-10-15T15:32:43.618Z","updated_at":"2025-04-05T12:05:01.505Z","avatar_url":"https://github.com/wesselb.png","language":"Python","funding_links":[],"categories":["📦 Packages"],"sub_categories":["Python"],"readme":"# [Stheno](https://github.com/wesselb/stheno)\n\n[![CI](https://github.com/wesselb/stheno/workflows/CI/badge.svg?branch=master)](https://github.com/wesselb/stheno/actions?query=workflow%3ACI)\n[![Coverage Status](https://coveralls.io/repos/github/wesselb/stheno/badge.svg?branch=master)](https://coveralls.io/github/wesselb/stheno?branch=master)\n[![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](https://wesselb.github.io/stheno)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\nStheno is an implementation of Gaussian process modelling in Python. 
See \nalso [Stheno.jl](https://github.com/willtebbutt/Stheno.jl).\n\n[Check out our post about linear models with Stheno and JAX.](https://wesselb.github.io/2021/01/19/linear-models-with-stheno-and-jax.html)\n\nContents:\n\n* [Nonlinear Regression in 20 Seconds](#nonlinear-regression-in-20-seconds)\n* [Installation](#installation)\n* [Manual](#manual)\n    - [AutoGrad, TensorFlow, PyTorch, or JAX? Your Choice!](#autograd-tensorflow-pytorch-or-jax-your-choice)\n    - [Model Design](#model-design)\n    - [Finite-Dimensional Distributions](#finite-dimensional-distributions)\n    - [Prior and Posterior Measures](#prior-and-posterior-measures)\n    - [Inducing Points](#inducing-points)\n    - [Kernels and Means](#kernels-and-means)\n    - [Batched Computation](#batched-computation)\n    - [Important Remarks](#important-remarks)\n* [Examples](#examples)\n    - [Simple Regression](#simple-regression)\n    - [Hyperparameter Optimisation with Varz](#hyperparameter-optimisation-with-varz)\n    - [Hyperparameter Optimisation with PyTorch](#hyperparameter-optimisation-with-pytorch)\n    - [Decomposition of Prediction](#decomposition-of-prediction)\n    - [Learn a Function, Incorporating Prior Knowledge About Its Form](#learn-a-function-incorporating-prior-knowledge-about-its-form)\n    - [Multi-Output Regression](#multi-output-regression)\n    - [Approximate Integration](#approximate-integration)\n    - [Bayesian Linear Regression](#bayesian-linear-regression)\n    - [GPAR](#gpar)\n    - [A GP-RNN Model](#a-gp-rnn-model)\n    - [Approximate Multiplication Between GPs](#approximate-multiplication-between-gps)\n    - [Sparse Regression](#sparse-regression)\n    - [Smoothing with Nonparametric Basis Functions](#smoothing-with-nonparametric-basis-functions)\n\n## Nonlinear Regression in 20 Seconds\n\n```python\n\u003e\u003e\u003e import numpy as np\n\n\u003e\u003e\u003e from stheno import GP, EQ\n\n\u003e\u003e\u003e x = np.linspace(0, 2, 10)           # Some points to predict 
at\n\n\u003e\u003e\u003e y = x ** 2                          # Some observations\n\n\u003e\u003e\u003e f = GP(EQ())                        # Construct Gaussian process.\n\n\u003e\u003e\u003e f_post = f | (f(x), y)              # Compute the posterior.\n\n\u003e\u003e\u003e pred = f_post(np.array([1, 2, 3]))  # Predict!\n\n\u003e\u003e\u003e pred.mean\n\u003cdense matrix: shape=3x1, dtype=float64\n mat=[[1.   ]\n      [4.   ]\n      [8.483]]\u003e\n\n\u003e\u003e\u003e pred.var\n\u003cdense matrix: shape=3x3, dtype=float64\n mat=[[ 8.032e-13  7.772e-16 -4.577e-09]\n      [ 7.772e-16  9.999e-13  2.773e-10]\n      [-4.577e-09  2.773e-10  3.313e-03]]\u003e\n```\n\n[These custom matrix types are there to accelerate the underlying linear algebra.](#important-remarks)\nTo get vanilla NumPy/AutoGrad/TensorFlow/PyTorch/JAX arrays, use `B.dense`:\n\n```python\n\u003e\u003e\u003e from lab import B\n\n\u003e\u003e\u003e B.dense(pred.mean)\narray([[1.00000068],\n       [3.99999999],\n       [8.4825932 ]])\n\n\u003e\u003e\u003e B.dense(pred.var)\narray([[ 8.03246358e-13,  7.77156117e-16, -4.57690943e-09],\n       [ 7.77156117e-16,  9.99866856e-13,  2.77333267e-10],\n       [-4.57690943e-09,  2.77333267e-10,  3.31283378e-03]])\n```\n\nMoar?! Then read on!\n\n## Installation\n\n```\npip install stheno\n```\n\n## Manual\n\nNote: [here](https://wesselb.github.io/stheno) is a nicely rendered and more\nreadable version of the docs.\n\n### AutoGrad, TensorFlow, PyTorch, or JAX? 
Your Choice!\n\n```python\nfrom stheno.autograd import GP, EQ\n```\n\n```python\nfrom stheno.tensorflow import GP, EQ\n```\n\n```python\nfrom stheno.torch import GP, EQ\n```\n\n```python\nfrom stheno.jax import GP, EQ\n```\n\n### Model Design\n\nThe basic building block is a `f = GP(mean=0, kernel, measure=prior)`, which takes\nin [a _mean_, a _kernel_](#kernels-and-means), and a _measure_.\nThe mean and kernel of a GP can be extracted with `f.mean` and `f.kernel`.\nThe measure should be thought of as a big joint distribution that assigns a mean and\na kernel to every variable `f`.\nA measure can be created with `prior = Measure()`.\nA GP `f` can have different means and kernels under different measures.\nFor example, under some _prior_ measure, `f` can have an `EQ()` kernel; but, under some\n_posterior_ measure, `f` has a kernel that is determined by the posterior distribution\nof a GP.\n[We will see later how posterior measures can be constructed.](#prior-and-posterior-measures)\nThe measure with which a `f = GP(kernel, measure=prior)` is constructed can be\nextracted with `f.measure == prior`.\nIf the keyword argument `measure` is not set, then automatically a new measure is\ncreated, which afterwards can be extracted with `f.measure`.\n\nDefinition, where `prior = Measure()`:\n\n```python\nf = GP(kernel)\n\nf = GP(mean, kernel)\n\nf = GP(kernel, measure=prior)\n\nf = GP(mean, kernel, measure=prior)\n```\n\nGPs that are associated to the same measure can be combined into new GPs, which is\nthe primary mechanism used to build cool models.\n\nHere's an example model:\n\n```python\n\u003e\u003e\u003e prior = Measure()\n\n\u003e\u003e\u003e f1 = GP(lambda x: x ** 2, EQ(), measure=prior)\n\n\u003e\u003e\u003e f1\nGP(\u003clambda\u003e, EQ())\n\n\u003e\u003e\u003e f2 = GP(Linear(), measure=prior)\n\n\u003e\u003e\u003e f2\nGP(0, Linear())\n\n\u003e\u003e\u003e f_sum = f1 + f2\n\n\u003e\u003e\u003e f_sum\nGP(\u003clambda\u003e, EQ() + Linear())\n\n\u003e\u003e\u003e 
f_sum + GP(EQ())  # Not valid: `GP(EQ())` belongs to a new measure!\nAssertionError: Processes GP(\u003clambda\u003e, EQ() + Linear()) and GP(0, EQ()) are associated to different measures.\n```\n\nTo avoid setting the keyword `measure` for every `GP` that you create, you can enter\na measure as a context:\n\n```python\n\u003e\u003e\u003e with Measure() as prior:\n        f1 = GP(lambda x: x ** 2, EQ())\n        f2 = GP(Linear())\n        f_sum = f1 + f2\n\n\u003e\u003e\u003e prior == f1.measure == f2.measure == f_sum.measure\nTrue\n```\n\n\n\n#### Compositional Design\n\n* Add and subtract GPs and other objects.\n\n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(EQ(), measure=prior) + GP(Exp(), measure=prior)\n    GP(0, EQ() + Exp())\n\n    \u003e\u003e\u003e GP(EQ(), measure=prior) + GP(EQ(), measure=prior)\n    GP(0, 2 * EQ())\n  \n    \u003e\u003e\u003e GP(EQ()) + 1\n    GP(1, EQ())\n  \n    \u003e\u003e\u003e GP(EQ()) + 0\n    GP(0, EQ())\n  \n    \u003e\u003e\u003e GP(EQ()) + (lambda x: x ** 2)\n    GP(\u003clambda\u003e, EQ())\n\n    \u003e\u003e\u003e GP(2, EQ(), measure=prior) - GP(1, EQ(), measure=prior)\n    GP(1, 2 * EQ())\n    ```\n    \n* Multiply GPs and other objects.\n\n    *Warning:*\n    The product of two GPs is *not* a Gaussian process.\n    Stheno approximates the resulting process by moment matching.\n\n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(1, EQ(), measure=prior) * GP(1, Exp(), measure=prior)\n    GP(\u003clambda\u003e + \u003clambda\u003e + -1 * 1, \u003clambda\u003e * Exp() + \u003clambda\u003e * EQ() + EQ() * Exp())\n  \n    \u003e\u003e\u003e 2 * GP(EQ())\n    GP(2, 2 * EQ())\n  \n    \u003e\u003e\u003e 0 * GP(EQ())\n    GP(0, 0)\n\n    \u003e\u003e\u003e (lambda x: x) * GP(EQ())\n    GP(0, \u003clambda\u003e * EQ())\n    ```\n    \n* Shift GPs.\n\n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(EQ()).shift(1)\n    GP(0, EQ() shift 1) \n    ```\n    \n* Stretch GPs.\n\n    Example:\n    \n  
  ```python\n    \u003e\u003e\u003e GP(EQ()).stretch(2)\n    GP(0, EQ() \u003e 2)\n    ```\n    \n* Select particular input dimensions.\n\n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(EQ()).select(1, 3)\n    GP(0, EQ() : [1, 3])\n    ```\n    \n* Transform the input.\n\n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(EQ()).transform(f)\n    GP(0, EQ() transform f)\n    ```\n    \n* Numerically take the derivative of a GP.\n    The argument specifies which dimension to take the derivative with respect\n    to.\n    \n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(EQ()).diff(1)\n    GP(0, d(1) EQ())\n    ```\n    \n* Construct a finite difference estimate of the derivative of a GP.\n    See `Measure.diff_approx` for a description of the arguments.\n    \n    Example:\n    \n    ```python\n    \u003e\u003e\u003e GP(EQ()).diff_approx(deriv=1, order=2)\n    GP(50000000.0 * (0.5 * EQ() + 0.5 * ((-0.5 * (EQ() shift (0.0001414213562373095, 0))) shift (0, -0.0001414213562373095)) + 0.5 * ((-0.5 * (EQ() shift (0, 0.0001414213562373095))) shift (-0.0001414213562373095, 0))), 0)\n    ```\n    \n* Construct the Cartesian product of a collection of GPs.\n\n    Example:\n    \n    ```python\n    \u003e\u003e\u003e prior = Measure()\n\n    \u003e\u003e\u003e f1, f2 = GP(EQ(), measure=prior), GP(EQ(), measure=prior)\n\n    \u003e\u003e\u003e cross(f1, f2)\n    GP(MultiOutputMean(0, 0), MultiOutputKernel(EQ(), EQ()))\n    ```\n\n#### Displaying GPs\n\nGPs have a `display` method that accepts a formatter.\n\nExample:\n\n```python\n\u003e\u003e\u003e print(GP(2.12345 * EQ()).display(lambda x: f\"{x:.2f}\"))\nGP(2.12 * EQ(), 0)\n```\n\n#### Properties of GPs\n\n[Properties of kernels](https://github.com/wesselb/mlkernels#properties-of-kernels-and-means)\ncan be queried on GPs directly.\n\nExample:\n\n```python\n\u003e\u003e\u003e GP(EQ()).stationary\nTrue\n```\n\n#### Naming GPs\n\nIt is possible to give a name to a GP.\nNames must be 
strings.\nA measure then behaves like a two-way dictionary between GPs and their names.\n\nExample:\n\n```python\n\u003e\u003e\u003e prior = Measure()\n\n\u003e\u003e\u003e p = GP(EQ(), name=\"name\", measure=prior)\n\n\u003e\u003e\u003e p.name\n'name'\n\n\u003e\u003e\u003e p.name = \"alternative_name\"\n\n\u003e\u003e\u003e prior[\"alternative_name\"]\nGP(0, EQ())\n\n\u003e\u003e\u003e prior[p]\n'alternative_name'\n```\n\n### Finite-Dimensional Distributions\n\nSimply call a GP to construct a finite-dimensional distribution at some inputs.\nYou can give a second argument, which specifies the variance of additional additive\nnoise.\nAfter constructing a finite-dimensional distribution, you can compute the mean,\nthe variance, sample, or compute a logpdf.\n\nDefinition, where `f` is a `GP`:\n\n```python\nf(x)         # No additional noise\n\nf(x, noise)  # Additional noise with variance `noise`\n```\n\nThings you can do with a finite-dimensional distribution:\n\n* \n    Use `f(x).mean` to compute the mean.\n    \n* \n    Use `f(x).var` to compute the variance.\n \n* \n    Use `f(x).mean_var` to simultaneously compute the mean and variance.\n    This can be substantially more efficient than calling first `f(x).mean` and then\n    `f(x).var`.\n\n* \n    Use `Normal.sample` to sample.\n\n    Definition:\n  \n    ```python\n    f(x).sample()                # Produce one sample.\n  \n    f(x).sample(n)               # Produce `n` samples.\n  \n    f(x).sample(noise=noise)     # Produce one sample with additional noise variance `noise`.\n  \n    f(x).sample(n, noise=noise)  # Produce `n` samples with additional noise variance `noise`.\n    ```\n  \n* \n    Use `f(x).logpdf(y)` to compute the logpdf of some data `y`.\n    \n* \n    Use `means, variances = f(x).marginals()` to efficiently compute the marginal means\n    and marginal variances.\n    \n    Example:\n\n    ```python\n    \u003e\u003e\u003e f(x).marginals()\n    (array([0., 0., 0.]), array([1., 1., 
1.]))\n    ```\n  \n* \n    Use `means, lowers, uppers = f(x).marginal_credible_bounds()` to efficiently compute\n    the means and the marginal lower and upper 95% central credible region bounds.\n\n    Example:\n\n    ```python\n    \u003e\u003e\u003e f(x).marginal_credible_bounds()\n    (array([0., 0., 0.]), array([-1.96, -1.96, -1.96]), array([1.96, 1.96, 1.96]))\n    ```\n  \n* \n    Use `Measure.logpdf` to compute the joint logpdf of multiple observations.\n\n    Definition, where `prior = Measure()`:\n\n    ```python\n    prior.logpdf(f(x), y)\n\n    prior.logpdf((f1(x1), y1), (f2(x2), y2), ...)\n    ```\n  \n* \n    Use `Measure.sample` to jointly sample multiple observations.\n\n    Definition, where `prior = Measure()`:\n\n    ```python\n    sample = prior.sample(f(x))\n\n    sample1, sample2, ... = prior.sample(f1(x1), f2(x2), ...)\n    ```\n\nExample:\n\n```python\n\u003e\u003e\u003e prior = Measure()\n\n\u003e\u003e\u003e f = GP(EQ(), measure=prior)\n\n\u003e\u003e\u003e x = np.array([0., 1., 2.])\n\n\u003e\u003e\u003e f(x)       # FDD without noise.\n\u003cFDD:\n process=GP(0, EQ()),\n input=array([0., 1., 2.]),\n noise=\u003czero matrix: shape=3x3, dtype=float64\u003e\u003e\n\n\u003e\u003e\u003e f(x, 0.1)  # FDD with noise.\n\u003cFDD:\n process=GP(0, EQ()),\n input=array([0., 1., 2.]),\n noise=\u003cdiagonal matrix: shape=3x3, dtype=float64\n        diag=[0.1 0.1 0.1]\u003e\u003e\n\n\u003e\u003e\u003e f(x).mean\narray([[0.],\n       [0.],\n       [0.]])\n\n\u003e\u003e\u003e f(x).var\n\u003cdense matrix: shape=3x3, dtype=float64\n mat=[[1.    0.607 0.135]\n      [0.607 1.    0.607]\n      [0.135 0.607 1.   
]]\u003e\n       \n\u003e\u003e\u003e y1 = f(x).sample()\n\n\u003e\u003e\u003e y1\narray([[-0.45172746],\n       [ 0.46581948],\n       [ 0.78929767]])\n       \n\u003e\u003e\u003e f(x).logpdf(y1)\n-2.811609567720761\n\n\u003e\u003e\u003e y2 = f(x).sample(2)\n\n\u003e\u003e\u003e y2\narray([[-0.43771276, -2.36741858],\n       [ 0.86080043, -1.22503079],\n       [ 2.15779126, -0.75319405]])\n\n\u003e\u003e\u003e f(x).logpdf(y2)\narray([-4.82949038, -5.40084225])\n```\n\n### Prior and Posterior Measures\n\nConditioning a _prior_ measure on observations gives a _posterior_ measure.\nTo condition a measure on observations, use `Measure.__or__`.\n\nDefinition, where `prior = Measure()` and `f*` are `GP`s:\n\n```python\npost = prior | (f(x, [noise]), y)\n\npost = prior | ((f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)\n```\n\nYou can then obtain a posterior process with `post(f)` and a finite-dimensional\ndistribution under the posterior with `post(f(x))`.\nAlternatively, the posterior of a process `f` can be obtained by conditioning `f`\ndirectly.\n\nDefinition, where `f*` are `GP`s:\n\n```python\nf_post = f | (f(x, [noise]), y)\n\nf_post = f | ((f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)\n```\n\nLet's consider an example.\nFirst, build a model and sample some values.\n\n```python\n\u003e\u003e\u003e prior = Measure()\n\n\u003e\u003e\u003e f = GP(EQ(), measure=prior)\n\n\u003e\u003e\u003e x = np.array([0., 1., 2.])\n\n\u003e\u003e\u003e y = f(x).sample()\n```\n\nThen compute the posterior measure.\n\n```python\n\u003e\u003e\u003e post = prior | (f(x), y)\n\n\u003e\u003e\u003e post(f)\nGP(PosteriorMean(), PosteriorKernel())\n\n\u003e\u003e\u003e post(f).mean(x)\n\u003cdense matrix: shape=3x1, dtype=float64\n mat=[[ 0.412]\n      [-0.811]\n      [-0.933]]\u003e\n\n\u003e\u003e\u003e post(f).kernel(x)\n\u003cdense matrix: shape=3x3, dtype=float64\n mat=[[1.e-12 0.e+00 0.e+00]\n      [0.e+00 1.e-12 0.e+00]\n      [0.e+00 0.e+00 1.e-12]]\u003e\n\n\u003e\u003e\u003e 
post(f(x))\n\u003cFDD:\n process=GP(PosteriorMean(), PosteriorKernel()),\n input=array([0., 1., 2.]),\n noise=\u003czero matrix: shape=3x3, dtype=float64\u003e\u003e\n\n\u003e\u003e\u003e post(f(x)).mean\n\u003cdense matrix: shape=3x1, dtype=float64\n mat=[[ 0.412]\n      [-0.811]\n      [-0.933]]\u003e\n\n\u003e\u003e\u003e post(f(x)).var\n\u003cdense matrix: shape=3x3, dtype=float64\n mat=[[1.e-12 0.e+00 0.e+00]\n      [0.e+00 1.e-12 0.e+00]\n      [0.e+00 0.e+00 1.e-12]]\u003e\n```\n\nWe can also obtain the posterior by conditioning `f` directly:\n\n```python\n\u003e\u003e\u003e f_post = f | (f(x), y)\n\n\u003e\u003e\u003e f_post\nGP(PosteriorMean(), PosteriorKernel())\n\n\u003e\u003e\u003e f_post.mean(x)\n\u003cdense matrix: shape=3x1, dtype=float64\n mat=[[ 0.412]\n      [-0.811]\n      [-0.933]]\u003e\n\n\u003e\u003e\u003e f_post.kernel(x)\n\u003cdense matrix: shape=3x3, dtype=float64\n mat=[[1.e-12 0.e+00 0.e+00]\n      [0.e+00 1.e-12 0.e+00]\n      [0.e+00 0.e+00 1.e-12]]\u003e\n\n\u003e\u003e\u003e f_post(x)\n\u003cFDD:\n process=GP(PosteriorMean(), PosteriorKernel()),\n input=array([0., 1., 2.]),\n noise=\u003czero matrix: shape=3x3, dtype=float64\u003e\u003e\n\n\u003e\u003e\u003e f_post(x).mean\n\u003cdense matrix: shape=3x1, dtype=float64\n mat=[[ 0.412]\n      [-0.811]\n      [-0.933]]\u003e\n\n\u003e\u003e\u003e f_post(x).var\n\u003cdense matrix: shape=3x3, dtype=float64\n mat=[[1.e-12 0.e+00 0.e+00]\n      [0.e+00 1.e-12 0.e+00]\n      [0.e+00 0.e+00 1.e-12]]\u003e\n```\n\nWe can further extend our model by building on the posterior.\n\n```python\n\u003e\u003e\u003e g = GP(Linear(), measure=post)\n\n\u003e\u003e\u003e f_sum = post(f) + g\n\n\u003e\u003e\u003e f_sum\nGP(PosteriorMean(), PosteriorKernel() + Linear())\n```\n\nHowever, we cannot mix the prior and the posterior.\n\n```python\n\u003e\u003e\u003e f + g\nAssertionError: Processes GP(0, EQ()) and GP(0, Linear()) are associated to different measures.\n```\n\n### Inducing 
Points\n\nStheno supports pseudo-point approximations of posterior distributions with\nvarious approximation methods:\n\n1. The Variational Free Energy (VFE;\n    [Titsias, 2009](http://proceedings.mlr.press/v5/titsias09a/titsias09a.pdf))\n    approximation.\n    To use the VFE approximation, use `PseudoObs`.\n\n2. The Fully Independent Training Conditional (FITC;\n    [Snelson \u0026 Ghahramani, 2006](http://www.gatsby.ucl.ac.uk/~snelson/SPGP_up.pdf))\n    approximation. \n    To use the FITC approximation, use `PseudoObsFITC`.\n \n3. The Deterministic Training Conditional (DTC;\n   [Csato \u0026 Opper, 2002](https://direct.mit.edu/neco/article/14/3/641/6594/Sparse-On-Line-Gaussian-Processes);\n   [Seeger et al., 2003](http://proceedings.mlr.press/r4/seeger03a/seeger03a.pdf))\n   approximation.\n   To use the DTC approximation, use `PseudoObsDTC`.\n\nThe VFE approximation (`PseudoObs`) is the recommended approximation.\nThe following definitions and examples will use the VFE approximation with `PseudoObs`,\nbut every instance of `PseudoObs` can be swapped out for `PseudoObsFITC` or \n`PseudoObsDTC`.\n\nDefinition:\n\n```python\nobs = PseudoObs(\n    u(z),               # FDD of inducing points\n    (f(x, [noise]), y)  # Observed data\n)\n                \nobs = PseudoObs(u(z), f(x, [noise]), y)\n\nobs = PseudoObs(u(z), (f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)\n\nobs = PseudoObs((u1(z1), u2(z2), ...), f(x, [noise]), y)\n\nobs = PseudoObs((u1(z1), u2(z2), ...), (f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)\n```\n\nThe approximate posterior measure can be constructed with `prior | obs`\nwhere `prior = Measure()` is the measure of your model.\nTo quantify the quality of the approximation, you can compute the ELBO with \n`obs.elbo(prior)`.\n\nLet's consider an example.\nFirst, build a model and sample some noisy observations.\n\n```python\n\u003e\u003e\u003e prior = Measure()\n\n\u003e\u003e\u003e f = GP(EQ(), 
measure=prior)\n\n\u003e\u003e\u003e x_obs = np.linspace(0, 10, 2000)\n\n\u003e\u003e\u003e y_obs = f(x_obs, 1).sample()\n```\n\nOuch, computing the logpdf is quite slow:\n\n```python\n\u003e\u003e\u003e %timeit f(x_obs, 1).logpdf(y_obs)\n219 ms ± 35.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n```\n\nLet's try to use inducing points to speed this up.\n\n```python\n\u003e\u003e\u003e x_ind = np.linspace(0, 10, 100)\n\n\u003e\u003e\u003e u = f(x_ind)   # FDD of inducing points.\n\n\u003e\u003e\u003e %timeit PseudoObs(u, f(x_obs, 1), y_obs).elbo(prior)\n9.8 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n```\n\nMuch better.\nAnd the approximation is good:\n\n```python\n\u003e\u003e\u003e PseudoObs(u, f(x_obs, 1), y_obs).elbo(prior) - f(x_obs, 1).logpdf(y_obs)\n-3.537934389896691e-10\n```\n\nWe finally construct the approximate posterior measure:\n\n```python\n\u003e\u003e\u003e post_approx = prior | PseudoObs(u, f(x_obs, 1), y_obs)\n\n\u003e\u003e\u003e post_approx(f(x_obs)).mean\n\u003cdense matrix: shape=2000x1, dtype=float64\n mat=[[0.469]\n      [0.468]\n      [0.467]\n      ...\n      [1.09 ]\n      [1.09 ]\n      [1.091]]\u003e\n```\n\n\n### Kernels and Means\n\nSee [MLKernels](https://github.com/wesselb/mlkernels).\n\n\n### Batched Computation\n\nStheno supports batched computation.\nSee [MLKernels](https://github.com/wesselb/mlkernels/#usage) for a description of how\nmeans and kernels work with batched computation.\n\nExample:\n\n```python\n\u003e\u003e\u003e f = GP(EQ())\n\n\u003e\u003e\u003e x = np.random.randn(16, 100, 1)\n\n\u003e\u003e\u003e y = f(x, 1).sample()\n\n\u003e\u003e\u003e logpdf = f(x, 1).logpdf(y)\n\n\u003e\u003e\u003e y.shape\n(16, 100, 1)\n\n\u003e\u003e\u003e f(x, 1).logpdf(y).shape\n(16,)\n```\n\n\n### Important Remarks\n\nStheno uses [LAB](https://github.com/wesselb/lab) to provide an implementation that is\nbackend agnostic.\nMoreover, Stheno uses [an extension of 
LAB](https://github.com/wesselb/matrix) to\naccelerate linear algebra with structured linear algebra primitives.\nYou will encounter these primitives:\n\n```python\n\u003e\u003e\u003e k = 2 * Delta()\n\n\u003e\u003e\u003e x = np.linspace(0, 5, 10)\n\n\u003e\u003e\u003e k(x)\n\u003cdiagonal matrix: shape=10x10, dtype=float64\n diag=[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]\u003e\n```\n\nIf you're using [LAB](https://github.com/wesselb/lab) to further process these matrices,\nthen there is absolutely no need to worry:\nthese structured matrix types know how to add, multiply, and do other linear algebra\noperations.\n\n```python\n\u003e\u003e\u003e import lab as B\n\n\u003e\u003e\u003e B.matmul(k(x), k(x))\n\u003cdiagonal matrix: shape=10x10, dtype=float64\n diag=[4. 4. 4. 4. 4. 4. 4. 4. 4. 4.]\u003e\n```\n\nIf you're not using [LAB](https://github.com/wesselb/lab), you can convert these\nstructured primitives to regular NumPy/TensorFlow/PyTorch/JAX arrays by calling\n`B.dense` (`B` is from [LAB](https://github.com/wesselb/lab)):\n\n```python\n\u003e\u003e\u003e import lab as B\n\n\u003e\u003e\u003e B.dense(k(x))\narray([[2., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n       [0., 2., 0., 0., 0., 0., 0., 0., 0., 0.],\n       [0., 0., 2., 0., 0., 0., 0., 0., 0., 0.],\n       [0., 0., 0., 2., 0., 0., 0., 0., 0., 0.],\n       [0., 0., 0., 0., 2., 0., 0., 0., 0., 0.],\n       [0., 0., 0., 0., 0., 2., 0., 0., 0., 0.],\n       [0., 0., 0., 0., 0., 0., 2., 0., 0., 0.],\n       [0., 0., 0., 0., 0., 0., 0., 2., 0., 0.],\n       [0., 0., 0., 0., 0., 0., 0., 0., 2., 0.],\n       [0., 0., 0., 0., 0., 0., 0., 0., 0., 2.]])\n```\n\nFurthermore, before computing a Cholesky decomposition, Stheno always adds a minuscule\ndiagonal to prevent the Cholesky decomposition from failing due to a loss of positive\ndefiniteness caused by numerical noise.\nYou can change the magnitude of this diagonal by changing `B.epsilon`:\n\n```python\n\u003e\u003e\u003e import lab as B\n\n\u003e\u003e\u003e B.epsilon = 1e-12   # 
Default regularisation\n\n\u003e\u003e\u003e B.epsilon = 1e-8    # Strong regularisation\n```\n\n\n## Examples\n\nThe examples make use of [Varz](https://github.com/wesselb/varz) and some\nutility from [WBML](https://github.com/wesselb/wbml).\n\n\n### Simple Regression\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example1_simple_regression.png)\n\n```python\nimport matplotlib.pyplot as plt\nfrom wbml.plot import tweak\n\nfrom stheno import B, GP, EQ\n\n# Define points to predict at.\nx = B.linspace(0, 10, 100)\nx_obs = B.linspace(0, 7, 20)\n\n# Construct a prior.\nf = GP(EQ().periodic(5.0))\n\n# Sample a true, underlying function and noisy observations.\nf_true, y_obs = f.measure.sample(f(x), f(x_obs, 0.5))\n\n# Now condition on the observations to make predictions.\nf_post = f | (f(x_obs, 0.5), y_obs)\nmean, lower, upper = f_post(x).marginal_credible_bounds()\n\n# Plot result.\nplt.plot(x, f_true, label=\"True\", style=\"test\")\nplt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\nplt.savefig(\"readme_example1_simple_regression.png\")\nplt.show()\n```\n\n### Hyperparameter Optimisation with Varz\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example12_optimisation_varz.png)\n\n```python\nimport lab as B\nimport matplotlib.pyplot as plt\nimport torch\nfrom varz import Vars, minimise_l_bfgs_b, parametrised, Positive\nfrom wbml.plot import tweak\n\nfrom stheno.torch import EQ, GP\n\n# Increase regularisation because PyTorch defaults to 32-bit floats.\nB.epsilon = 1e-6\n\n# Define points to predict at.\nx = torch.linspace(0, 2, 100)\nx_obs = torch.linspace(0, 2, 50)\n\n# Sample a true, underlying function and observations with observation noise `0.05`.\nf_true = torch.sin(5 * x)\ny_obs = torch.sin(5 * x_obs) + 0.05**0.5 * torch.randn(50)\n\n\ndef model(vs):\n    
\"\"\"Construct a model with learnable parameters.\"\"\"\n    p = vs.struct  # Varz handles positivity (and other) constraints.\n    kernel = p.variance.positive() * EQ().stretch(p.scale.positive())\n    return GP(kernel), p.noise.positive()\n\n\n@parametrised\ndef model_alternative(vs, scale: Positive, variance: Positive, noise: Positive):\n    \"\"\"Equivalent to :func:`model`, but with `@parametrised`.\"\"\"\n    kernel = variance * EQ().stretch(scale)\n    return GP(kernel), noise\n\n\nvs = Vars(torch.float32)\nf, noise = model(vs)\n\n# Condition on observations and make predictions before optimisation.\nf_post = f | (f(x_obs, noise), y_obs)\nprior_before = f, noise\npred_before = f_post(x, noise).marginal_credible_bounds()\n\n\ndef objective(vs):\n    f, noise = model(vs)\n    evidence = f(x_obs, noise).logpdf(y_obs)\n    return -evidence\n\n\n# Learn hyperparameters.\nminimise_l_bfgs_b(objective, vs)\n\nf, noise = model(vs)\n\n# Condition on observations and make predictions after optimisation.\nf_post = f | (f(x_obs, noise), y_obs)\nprior_after = f, noise\npred_after = f_post(x, noise).marginal_credible_bounds()\n\n\ndef plot_prediction(prior, pred):\n    f, noise = prior\n    mean, lower, upper = pred\n    plt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\n    plt.plot(x, f_true, label=\"True\", style=\"test\")\n    plt.plot(x, mean, label=\"Prediction\", style=\"pred\")\n    plt.fill_between(x, lower, upper, style=\"pred\")\n    plt.ylim(-2, 2)\n    plt.text(\n        0.02,\n        0.02,\n        f\"var = {f.kernel.factor(0):.2f}, \"\n        f\"scale = {f.kernel.factor(1).stretches[0]:.2f}, \"\n        f\"noise = {noise:.2f}\",\n        transform=plt.gca().transAxes,\n    )\n    tweak()\n\n\n# Plot result.\nplt.figure(figsize=(10, 4))\nplt.subplot(1, 2, 1)\nplt.title(\"Before optimisation\")\nplot_prediction(prior_before, pred_before)\nplt.subplot(1, 2, 2)\nplt.title(\"After optimisation\")\nplot_prediction(prior_after, 
pred_after)\nplt.savefig(\"readme_example12_optimisation_varz.png\")\nplt.show()\n```\n\n### Hyperparameter Optimisation with PyTorch\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example13_optimisation_torch.png)\n\n```python\nimport lab as B\nimport matplotlib.pyplot as plt\nimport torch\nfrom wbml.plot import tweak\n\nfrom stheno.torch import EQ, GP\n\n# Increase regularisation because PyTorch defaults to 32-bit floats.\nB.epsilon = 1e-6\n\n# Define points to predict at.\nx = torch.linspace(0, 2, 100)\nx_obs = torch.linspace(0, 2, 50)\n\n# Sample a true, underlying function and observations with observation noise `0.05`.\nf_true = torch.sin(5 * x)\ny_obs = torch.sin(5 * x_obs) + 0.05**0.5 * torch.randn(50)\n\n\nclass Model(torch.nn.Module):\n    \"\"\"A GP model with learnable parameters.\"\"\"\n\n    def __init__(self, init_var=0.3, init_scale=1, init_noise=0.2):\n        super().__init__()\n        # Ensure that the parameters are positive and make them learnable.\n        self.log_var = torch.nn.Parameter(torch.log(torch.tensor(init_var)))\n        self.log_scale = torch.nn.Parameter(torch.log(torch.tensor(init_scale)))\n        self.log_noise = torch.nn.Parameter(torch.log(torch.tensor(init_noise)))\n\n    def construct(self):\n        self.var = torch.exp(self.log_var)\n        self.scale = torch.exp(self.log_scale)\n        self.noise = torch.exp(self.log_noise)\n        kernel = self.var * EQ().stretch(self.scale)\n        return GP(kernel), self.noise\n\n\nmodel = Model()\nf, noise = model.construct()\n\n# Condition on observations and make predictions before optimisation.\nf_post = f | (f(x_obs, noise), y_obs)\nprior_before = f, noise\npred_before = f_post(x, noise).marginal_credible_bounds()\n\n# Perform optimisation.\nopt = torch.optim.Adam(model.parameters(), lr=5e-2)\nfor _ in range(1000):\n    opt.zero_grad()\n    f, noise = model.construct()\n    loss = -f(x_obs, noise).logpdf(y_obs)\n    loss.backward()\n    
opt.step()\n\nf, noise = model.construct()\n\n# Condition on observations and make predictions after optimisation.\nf_post = f | (f(x_obs, noise), y_obs)\nprior_after = f, noise\npred_after = f_post(x, noise).marginal_credible_bounds()\n\n\ndef plot_prediction(prior, pred):\n    f, noise = prior\n    mean, lower, upper = pred\n    plt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\n    plt.plot(x, f_true, label=\"True\", style=\"test\")\n    plt.plot(x, mean, label=\"Prediction\", style=\"pred\")\n    plt.fill_between(x, lower, upper, style=\"pred\")\n    plt.ylim(-2, 2)\n    plt.text(\n        0.02,\n        0.02,\n        f\"var = {f.kernel.factor(0):.2f}, \"\n        f\"scale = {f.kernel.factor(1).stretches[0]:.2f}, \"\n        f\"noise = {noise:.2f}\",\n        transform=plt.gca().transAxes,\n    )\n    tweak()\n\n\n# Plot result.\nplt.figure(figsize=(10, 4))\nplt.subplot(1, 2, 1)\nplt.title(\"Before optimisation\")\nplot_prediction(prior_before, pred_before)\nplt.subplot(1, 2, 2)\nplt.title(\"After optimisation\")\nplot_prediction(prior_after, pred_after)\nplt.savefig(\"readme_example13_optimisation_torch.png\")\nplt.show()\n```\n\n### Decomposition of Prediction\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example2_decomposition.png)\n\n```python\nimport matplotlib.pyplot as plt\nfrom wbml.plot import tweak\n\nfrom stheno import Measure, GP, EQ, RQ, Linear, Delta, Exp, B\n\nB.epsilon = 1e-10\n\n# Define points to predict at.\nx = B.linspace(0, 10, 200)\nx_obs = B.linspace(0, 7, 50)\n\n\nwith Measure() as prior:\n    # Construct a latent function consisting of four different components.\n    f_smooth = GP(EQ())\n    f_wiggly = GP(RQ(1e-1).stretch(0.5))\n    f_periodic = GP(EQ().periodic(1.0))\n    f_linear = GP(Linear())\n    f = f_smooth + f_wiggly + f_periodic + 0.2 * f_linear\n\n    # Let the observation noise consist of a bit of exponential noise.\n    e_indep = GP(Delta())\n    e_exp = GP(Exp())\n   
 e = e_indep + 0.3 * e_exp\n\n    # Sum the latent function and observation noise to get a model for the observations.\n    y = f + 0.5 * e\n\n# Sample a true, underlying function and observations.\n(\n    f_true_smooth,\n    f_true_wiggly,\n    f_true_periodic,\n    f_true_linear,\n    f_true,\n    y_obs,\n) = prior.sample(f_smooth(x), f_wiggly(x), f_periodic(x), f_linear(x), f(x), y(x_obs))\n\n# Now condition on the observations and make predictions for the latent function and\n# its various components.\npost = prior | (y(x_obs), y_obs)\n\npred_smooth = post(f_smooth(x))\npred_wiggly = post(f_wiggly(x))\npred_periodic = post(f_periodic(x))\npred_linear = post(f_linear(x))\npred_f = post(f(x))\n\n\n# Plot results.\ndef plot_prediction(x, f, pred, x_obs=None, y_obs=None):\n    plt.plot(x, f, label=\"True\", style=\"test\")\n    if x_obs is not None:\n        plt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\n    mean, lower, upper = pred.marginal_credible_bounds()\n    plt.plot(x, mean, label=\"Prediction\", style=\"pred\")\n    plt.fill_between(x, lower, upper, style=\"pred\")\n    tweak()\n\n\nplt.figure(figsize=(10, 6))\n\nplt.subplot(3, 1, 1)\nplt.title(\"Prediction\")\nplot_prediction(x, f_true, pred_f, x_obs, y_obs)\n\nplt.subplot(3, 2, 3)\nplt.title(\"Smooth Component\")\nplot_prediction(x, f_true_smooth, pred_smooth)\n\nplt.subplot(3, 2, 4)\nplt.title(\"Wiggly Component\")\nplot_prediction(x, f_true_wiggly, pred_wiggly)\n\nplt.subplot(3, 2, 5)\nplt.title(\"Periodic Component\")\nplot_prediction(x, f_true_periodic, pred_periodic)\n\nplt.subplot(3, 2, 6)\nplt.title(\"Linear Component\")\nplot_prediction(x, f_true_linear, pred_linear)\n\nplt.savefig(\"readme_example2_decomposition.png\")\nplt.show()\n```\n\n### Learn a Function, Incorporating Prior Knowledge About Its Form\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example3_parametric.png)\n\n```python\nimport matplotlib.pyplot as plt\nimport 
tensorflow as tf\nimport wbml.out as out\nfrom varz.spec import parametrised, Positive\nfrom varz.tensorflow import Vars, minimise_l_bfgs_b\nfrom wbml.plot import tweak\n\nfrom stheno.tensorflow import B, Measure, GP, EQ, Delta\n\n# Define points to predict at.\nx = B.linspace(tf.float64, 0, 5, 100)\nx_obs = B.linspace(tf.float64, 0, 3, 20)\n\n\n@parametrised\ndef model(\n    vs,\n    u_var: Positive = 0.5,\n    u_scale: Positive = 0.5,\n    noise: Positive = 0.5,\n    alpha: Positive = 1.2,\n):\n    with Measure():\n        # Random fluctuation:\n        u = GP(u_var * EQ().stretch(u_scale))\n        # Construct model.\n        f = u + (lambda x: x**alpha)\n    return f, noise\n\n\n# Sample a true, underlying function and observations.\nvs = Vars(tf.float64)\nf_true = x**1.8 + B.sin(2 * B.pi * x)\nf, y = model(vs)\npost = f.measure | (f(x), f_true)\ny_obs = post(f(x_obs)).sample()\n\n\ndef objective(vs):\n    f, noise = model(vs)\n    evidence = f(x_obs, noise).logpdf(y_obs)\n    return -evidence\n\n\n# Learn hyperparameters.\nminimise_l_bfgs_b(objective, vs, jit=True)\nf, noise = model(vs)\n\n# Print the learned parameters.\nout.kv(\"Prior\", f.display(out.format))\nvs.print()\n\n# Condition on the observations to make predictions.\nf_post = f | (f(x_obs, noise), y_obs)\nmean, lower, upper = f_post(x).marginal_credible_bounds()\n\n# Plot result.\nplt.plot(x, B.squeeze(f_true), label=\"True\", style=\"test\")\nplt.scatter(x_obs, B.squeeze(y_obs), label=\"Observations\", style=\"train\", s=20)\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.savefig(\"readme_example3_parametric.png\")\nplt.show()\n```\n\n### Multi-Output Regression\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example4_multi-output.png)\n\n```python\nimport matplotlib.pyplot as plt\nfrom wbml.plot import tweak\n\nfrom stheno import B, Measure, GP, EQ, Delta\n\n\nclass VGP:\n    \"\"\"A 
vector-valued GP.\"\"\"\n\n    def __init__(self, ps):\n        self.ps = ps\n\n    def __add__(self, other):\n        return VGP([f + g for f, g in zip(self.ps, other.ps)])\n\n    def lmatmul(self, A):\n        m, n = A.shape\n        ps = [0 for _ in range(m)]\n        for i in range(m):\n            for j in range(n):\n                ps[i] += A[i, j] * self.ps[j]\n        return VGP(ps)\n\n\n# Define points to predict at.\nx = B.linspace(0, 10, 100)\nx_obs = B.linspace(0, 10, 10)\n\n# Model parameters:\nm = 2\np = 4\nH = B.randn(p, m)\n\n\nwith Measure() as prior:\n    # Construct latent functions.\n    us = VGP([GP(EQ()) for _ in range(m)])\n\n    # Construct multi-output prior.\n    fs = us.lmatmul(H)\n\n    # Construct noise.\n    e = VGP([GP(0.5 * Delta()) for _ in range(p)])\n\n    # Construct observation model.\n    ys = e + fs\n\n# Sample a true, underlying function and observations.\nsamples = prior.sample(*(p(x) for p in fs.ps), *(p(x_obs) for p in ys.ps))\nfs_true, ys_obs = samples[:p], samples[p:]\n\n# Compute the posterior and make predictions.\npost = prior.condition(*((p(x_obs), y_obs) for p, y_obs in zip(ys.ps, ys_obs)))\npreds = [post(p(x)) for p in fs.ps]\n\n\n# Plot results.\ndef plot_prediction(x, f, pred, x_obs=None, y_obs=None):\n    plt.plot(x, f, label=\"True\", style=\"test\")\n    if x_obs is not None:\n        plt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\n    mean, lower, upper = pred.marginal_credible_bounds()\n    plt.plot(x, mean, label=\"Prediction\", style=\"pred\")\n    plt.fill_between(x, lower, upper, style=\"pred\")\n    tweak()\n\n\nplt.figure(figsize=(10, 6))\nfor i in range(4):\n    plt.subplot(2, 2, i + 1)\n    plt.title(f\"Output {i + 1}\")\n    plot_prediction(x, fs_true[i], preds[i], x_obs, ys_obs[i])\nplt.savefig(\"readme_example4_multi-output.png\")\nplt.show()\n```\n\n### Approximate 
Integration\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example5_integration.png)\n\n```python\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport tensorflow as tf\nimport wbml.plot\n\nfrom stheno.tensorflow import B, Measure, GP, EQ, Delta\n\n# Define points to predict at.\nx = B.linspace(tf.float64, 0, 10, 200)\nx_obs = B.linspace(tf.float64, 0, 10, 10)\n\nwith Measure() as prior:\n    # Construct a model.\n    f = 0.7 * GP(EQ()).stretch(1.5)\n    e = 0.2 * GP(Delta())\n\n    # Construct derivatives.\n    df = f.diff()\n    ddf = df.diff()\n    dddf = ddf.diff() + e\n\n# Fix the integration constants.\nzero = B.cast(tf.float64, 0)\none = B.cast(tf.float64, 1)\nprior = prior | ((f(zero), one), (df(zero), zero), (ddf(zero), -one))\n\n# Sample observations.\ny_obs = B.sin(x_obs) + 0.2 * B.randn(*x_obs.shape)\n\n# Condition on the observations to make predictions.\npost = prior | (dddf(x_obs), y_obs)\n\n# And make predictions.\npred_iiif = post(f)(x)\npred_iif = post(df)(x)\npred_if = post(ddf)(x)\npred_f = post(dddf)(x)\n\n\n# Plot result.\ndef plot_prediction(x, f, pred, x_obs=None, y_obs=None):\n    plt.plot(x, f, label=\"True\", style=\"test\")\n    if x_obs is not None:\n        plt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\n    mean, lower, upper = pred.marginal_credible_bounds()\n    plt.plot(x, mean, label=\"Prediction\", style=\"pred\")\n    plt.fill_between(x, lower, upper, style=\"pred\")\n    wbml.plot.tweak()\n\n\nplt.figure(figsize=(10, 6))\n\nplt.subplot(2, 2, 1)\nplt.title(\"Function\")\nplot_prediction(x, np.sin(x), pred_f, x_obs=x_obs, y_obs=y_obs)\n\nplt.subplot(2, 2, 2)\nplt.title(\"Integral of Function\")\nplot_prediction(x, -np.cos(x), pred_if)\n\nplt.subplot(2, 2, 3)\nplt.title(\"Second Integral of Function\")\nplot_prediction(x, -np.sin(x), pred_iif)\n\nplt.subplot(2, 2, 4)\nplt.title(\"Third Integral of Function\")\nplot_prediction(x, np.cos(x), 
pred_iiif)\n\nplt.savefig(\"readme_example5_integration.png\")\nplt.show()\n```\n\n### Bayesian Linear Regression\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example6_blr.png)\n\n```python\nimport matplotlib.pyplot as plt\nimport wbml.out as out\nfrom wbml.plot import tweak\n\nfrom stheno import B, Measure, GP\n\nB.epsilon = 1e-10  # Very slightly regularise.\n\n# Define points to predict at.\nx = B.linspace(0, 10, 200)\nx_obs = B.linspace(0, 10, 10)\n\nwith Measure() as prior:\n    # Construct a linear model.\n    slope = GP(1)\n    intercept = GP(5)\n    f = slope * (lambda x: x) + intercept\n\n# Sample a slope, intercept, underlying function, and observations.\ntrue_slope, true_intercept, f_true, y_obs = prior.sample(\n    slope(0), intercept(0), f(x), f(x_obs, 0.2)\n)\n\n# Condition on the observations to make predictions.\npost = prior | (f(x_obs, 0.2), y_obs)\nmean, lower, upper = post(f(x)).marginal_credible_bounds()\n\nout.kv(\"True slope\", true_slope[0, 0])\nout.kv(\"Predicted slope\", post(slope(0)).mean[0, 0])\nout.kv(\"True intercept\", true_intercept[0, 0])\nout.kv(\"Predicted intercept\", post(intercept(0)).mean[0, 0])\n\n# Plot result.\nplt.plot(x, f_true, label=\"True\", style=\"test\")\nplt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.savefig(\"readme_example6_blr.png\")\nplt.show()\n```\n\n### GPAR\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example7_gpar.png)\n\n```python\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport tensorflow as tf\nfrom varz.spec import parametrised, Positive\nfrom varz.tensorflow import Vars, minimise_l_bfgs_b\nfrom wbml.plot import tweak\n\nfrom stheno.tensorflow import B, GP, EQ\n\n# Define points to predict at.\nx = B.linspace(tf.float64, 0, 10, 200)\nx_obs1 = B.linspace(tf.float64, 0, 10, 
30)\ninds2 = np.random.permutation(len(x_obs1))[:10]\nx_obs2 = B.take(x_obs1, inds2)\n\n# Construct functions to predict and observations.\nf1_true = B.sin(x)\nf2_true = B.sin(x) ** 2\n\ny1_obs = B.sin(x_obs1) + 0.1 * B.randn(*x_obs1.shape)\ny2_obs = B.sin(x_obs2) ** 2 + 0.1 * B.randn(*x_obs2.shape)\n\n\n@parametrised\ndef model(\n    vs,\n    var1: Positive = 1,\n    scale1: Positive = 1,\n    noise1: Positive = 0.1,\n    var2: Positive = 1,\n    scale2: Positive = 1,\n    noise2: Positive = 0.1,\n):\n    # Build layers:\n    f1 = GP(var1 * EQ().stretch(scale1))\n    f2 = GP(var2 * EQ().stretch(scale2))\n    return (f1, noise1), (f2, noise2)\n\n\ndef objective(vs):\n    (f1, noise1), (f2, noise2) = model(vs)\n    x1 = x_obs1\n    x2 = B.stack(x_obs2, B.take(y1_obs, inds2), axis=1)\n    evidence = f1(x1, noise1).logpdf(y1_obs) + f2(x2, noise2).logpdf(y2_obs)\n    return -evidence\n\n\n# Learn hyperparameters.\nvs = Vars(tf.float64)\nminimise_l_bfgs_b(objective, vs)\n\n# Compute posteriors.\n(f1, noise1), (f2, noise2) = model(vs)\nx1 = x_obs1\nx2 = B.stack(x_obs2, B.take(y1_obs, inds2), axis=1)\nf1_post = f1 | (f1(x1, noise1), y1_obs)\nf2_post = f2 | (f2(x2, noise2), y2_obs)\n\n# Predict first output.\nmean1, lower1, upper1 = f1_post(x).marginal_credible_bounds()\n\n# Predict second output with Monte Carlo.\nsamples = [\n    f2_post(B.stack(x, f1_post(x).sample()[:, 0], axis=1)).sample()[:, 0]\n    for _ in range(100)\n]\nmean2 = np.mean(samples, axis=0)\nlower2 = np.percentile(samples, 2.5, axis=0)\nupper2 = np.percentile(samples, 100 - 2.5, axis=0)\n\n# Plot result.\nplt.figure()\n\nplt.subplot(2, 1, 1)\nplt.title(\"Output 1\")\nplt.plot(x, f1_true, label=\"True\", style=\"test\")\nplt.scatter(x_obs1, y1_obs, label=\"Observations\", style=\"train\", s=20)\nplt.plot(x, mean1, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower1, upper1, style=\"pred\")\ntweak()\n\nplt.subplot(2, 1, 2)\nplt.title(\"Output 2\")\nplt.plot(x, f2_true, label=\"True\", 
style=\"test\")\nplt.scatter(x_obs2, y2_obs, label=\"Observations\", style=\"train\", s=20)\nplt.plot(x, mean2, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower2, upper2, style=\"pred\")\ntweak()\n\nplt.savefig(\"readme_example7_gpar.png\")\nplt.show()\n```\n\n### A GP-RNN Model\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example8_gp-rnn.png)\n\n```python\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport tensorflow as tf\nfrom varz.spec import parametrised, Positive\nfrom varz.tensorflow import Vars, minimise_adam\nfrom wbml.net import rnn as rnn_constructor\nfrom wbml.plot import tweak\n\nfrom stheno.tensorflow import B, Measure, GP, EQ\n\n# Increase regularisation because we are dealing with `tf.float32`s.\nB.epsilon = 1e-6\n\n# Construct points to predict at.\nx = B.linspace(tf.float32, 0, 1, 100)[:, None]\ninds_obs = B.range(0, int(0.75 * len(x)))  # Train on the first 75% only.\nx_obs = B.take(x, inds_obs)\n\n# Construct function and observations.\n#   Draw random modulation functions.\na_true = GP(1e-2 * EQ().stretch(0.1))(x).sample()\nb_true = GP(1e-2 * EQ().stretch(0.1))(x).sample()\n#   Construct the true, underlying function.\nf_true = (1 + a_true) * B.sin(2 * np.pi * 7 * x) + b_true\n#   Add noise.\ny_true = f_true + 0.1 * B.randn(*f_true.shape)\n\n# Normalise and split.\nf_true = (f_true - B.mean(y_true)) / B.std(y_true)\ny_true = (y_true - B.mean(y_true)) / B.std(y_true)\ny_obs = B.take(y_true, inds_obs)\n\n\n@parametrised\ndef model(vs, a_scale: Positive = 0.1, b_scale: Positive = 0.1, noise: Positive = 0.01):\n    # Construct an RNN.\n    f_rnn = rnn_constructor(\n        output_size=1, widths=(10,), nonlinearity=B.tanh, final_dense=True\n    )\n\n    # Set the weights for the RNN.\n    num_weights = f_rnn.num_weights(input_size=1)\n    weights = Vars(tf.float32, source=vs.get(shape=(num_weights,), name=\"rnn\"))\n    f_rnn.initialise(input_size=1, vs=weights)\n\n    with 
Measure():\n        # Construct GPs that modulate the RNN.\n        a = GP(1e-2 * EQ().stretch(a_scale))\n        b = GP(1e-2 * EQ().stretch(b_scale))\n\n        # GP-RNN model:\n        f_gp_rnn = (1 + a) * (lambda x: f_rnn(x)) + b\n\n    return f_rnn, f_gp_rnn, noise, a, b\n\n\ndef objective_rnn(vs):\n    f_rnn, _, _, _, _ = model(vs)\n    return B.mean((f_rnn(x_obs) - y_obs) ** 2)\n\n\ndef objective_gp_rnn(vs):\n    _, f_gp_rnn, noise, _, _ = model(vs)\n    evidence = f_gp_rnn(x_obs, noise).logpdf(y_obs)\n    return -evidence\n\n\n# Pretrain the RNN.\nvs = Vars(tf.float32)\nminimise_adam(objective_rnn, vs, rate=5e-3, iters=1000, trace=True, jit=True)\n\n# Jointly train the RNN and GPs.\nminimise_adam(objective_gp_rnn, vs, rate=1e-3, iters=1000, trace=True, jit=True)\n\n_, f_gp_rnn, noise, a, b = model(vs)\n\n# Condition.\npost = f_gp_rnn.measure | (f_gp_rnn(x_obs, noise), y_obs)\n\n# Predict and plot results.\nplt.figure(figsize=(10, 6))\n\nplt.subplot(2, 1, 1)\nplt.title(\"$(1 + a)\\\\cdot {}$RNN${} + b$\")\nplt.plot(x, f_true, label=\"True\", style=\"test\")\nplt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\nmean, lower, upper = post(f_gp_rnn(x)).marginal_credible_bounds()\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.subplot(2, 2, 3)\nplt.title(\"$a$\")\nmean, lower, upper = post(a(x)).marginal_credible_bounds()\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.subplot(2, 2, 4)\nplt.title(\"$b$\")\nmean, lower, upper = post(b(x)).marginal_credible_bounds()\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.savefig(f\"readme_example8_gp-rnn.png\")\nplt.show()\n```\n\n### Approximate Multiplication Between 
GPs\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example9_product.png)\n\n```python\nimport matplotlib.pyplot as plt\nfrom wbml.plot import tweak\n\nfrom stheno import B, Measure, GP, EQ\n\n# Define points to predict at.\nx = B.linspace(0, 10, 100)\n\nwith Measure() as prior:\n    f1 = GP(3, EQ())\n    f2 = GP(3, EQ())\n\n    # Compute the approximate product.\n    f_prod = f1 * f2\n\n# Sample two functions.\ns1, s2 = prior.sample(f1(x), f2(x))\n\n# Predict.\nf_prod_post = f_prod | ((f1(x), s1), (f2(x), s2))\nmean, lower, upper = f_prod_post(x).marginal_credible_bounds()\n\n# Plot result.\nplt.plot(x, s1, label=\"Sample 1\", style=\"train\")\nplt.plot(x, s2, label=\"Sample 2\", style=\"train\", ls=\"--\")\nplt.plot(x, s1 * s2, label=\"True product\", style=\"test\")\nplt.plot(x, mean, label=\"Approximate posterior\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.savefig(\"readme_example9_product.png\")\nplt.show()\n```\n\n### Sparse Regression\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example10_sparse.png)\n\n```python\nimport matplotlib.pyplot as plt\nimport wbml.out as out\nfrom wbml.plot import tweak\n\nfrom stheno import B, GP, EQ, PseudoObs\n\n# Define points to predict at.\nx = B.linspace(0, 10, 100)\nx_obs = B.linspace(0, 7, 50_000)\nx_ind = B.linspace(0, 10, 20)\n\n# Construct a prior.\nf = GP(EQ().periodic(2 * B.pi))\n\n# Sample a true, underlying function and observations.\nf_true = B.sin(x)\ny_obs = B.sin(x_obs) + B.sqrt(0.5) * B.randn(*x_obs.shape)\n\n# Compute a pseudo-point approximation of the posterior.\nobs = PseudoObs(f(x_ind), (f(x_obs, 0.5), y_obs))\n\n# Compute the ELBO.\nout.kv(\"ELBO\", obs.elbo(f.measure))\n\n# Compute the approximate posterior.\nf_post = f | obs\n\n# Make predictions with the approximate posterior.\nmean, lower, upper = f_post(x).marginal_credible_bounds()\n\n# Plot result.\nplt.plot(x, f_true, label=\"True\", 
style=\"test\")\nplt.scatter(\n    x_obs,\n    y_obs,\n    label=\"Observations\",\n    style=\"train\",\n    c=\"tab:green\",\n    alpha=0.35,\n)\nplt.scatter(\n    x_ind,\n    obs.mu(f.measure)[:, 0],\n    label=\"Inducing Points\",\n    style=\"train\",\n    s=20,\n)\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, style=\"pred\")\ntweak()\n\nplt.savefig(\"readme_example10_sparse.png\")\nplt.show()\n```\n\n### Smoothing with Nonparametric Basis Functions\n\n![Prediction](https://raw.githubusercontent.com/wesselb/stheno/master/readme_example11_nonparametric_basis.png)\n\n```python\nimport matplotlib.pyplot as plt\nfrom wbml.plot import tweak\n\nfrom stheno import B, Measure, GP, EQ\n\n# Define points to predict at.\nx = B.linspace(0, 10, 100)\nx_obs = B.linspace(0, 10, 20)\n\nwith Measure() as prior:\n    w = lambda x: B.exp(-(x**2) / 0.5)  # Basis function\n    b = [(w * GP(EQ())).shift(xi) for xi in x_obs]  # Weighted basis functions\n    f = sum(b)\n\n# Sample a true, underlying function and observations.\nf_true, y_obs = prior.sample(f(x), f(x_obs, 0.2))\n\n# Condition on the observations to make predictions.\npost = prior | (f(x_obs, 0.2), y_obs)\n\n# Plot result.\nfor i, bi in enumerate(b):\n    mean, lower, upper = post(bi(x)).marginal_credible_bounds()\n    kw_args = {\"label\": \"Basis functions\"} if i == 0 else {}\n    plt.plot(x, mean, style=\"pred2\", **kw_args)\nplt.plot(x, f_true, label=\"True\", style=\"test\")\nplt.scatter(x_obs, y_obs, label=\"Observations\", style=\"train\", s=20)\nmean, lower, upper = post(f(x)).marginal_credible_bounds()\nplt.plot(x, mean, label=\"Prediction\", style=\"pred\")\nplt.fill_between(x, lower, upper, 
style=\"pred\")\ntweak()\n\nplt.savefig(\"readme_example11_nonparametric_basis.png\")\nplt.show()\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwesselb%2Fstheno","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwesselb%2Fstheno","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwesselb%2Fstheno/lists"}