💾 Archived View for epi.benthic.zone › posts › 2021-07-23_GaussianRandomSampling.gmi captured on 2022-01-08 at 13:53:50. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Drawing samples from a Gaussian process, part 1

2021-07-23

Gaussian processes are useful models for seafloor texture, and they are commonly used by seafloor modelers because it is quite easy to draw random textures from a Gaussian process. I wanted to document a few of the ways one can sample a Gaussian process in two dimensions, and touch on some of the advantages and disadvantages of different methods.

Intro to Gaussian processes

The way I like to think of a Gaussian process is as a random function of space: Y(s). However, we are only ever going to observe the Gaussian process at a collection of points [s₀,s₁,...,sₙ], which means that Y is a vector [Y₀,Y₁,...,Yₙ]. The basic property of any Gaussian process is that all of its finite-dimensional marginal distributions are multivariate Gaussians, which means that any vector Y consisting of observations from a Gaussian process has a multivariate normal distribution: Y ~ N(μ,Σ). We can ignore the mean of this distribution because we can always add it back in later, and we can focus on the covariance matrix.

The covariance matrix has elements, Σᵢⱼ = R(sᵢ,sⱼ), where R is the covariance function that defines the Gaussian process. There is a lot of mathematical subtlety about the definition of the covariance function, but let's say for now that we are going to work with one of a few well-known forms of a covariance function such as the squared exponential:

R(s,t;σ=1.0,L=1.0) = σ^2 * exp(-0.5*sum(abs2,s .- t)/L^2)

If we know the covariance function, we can compute the covariance matrix by applying the function to every pair of points. Drawing samples from a Gaussian process with this covariance function just means drawing a sample from a multivariate Gaussian distribution with a given covariance.

The Cholesky decomposition method

The most general way to draw samples from a multivariate Gaussian distribution is based on the following identity. If Z ~ N(0,R) and Y = A*Z for a matrix A, then Y ~ N(0,A*R*A'). If Z ~ N(0,I) (it is a standard normal multivariate random variable), then Y ~ N(0,A*A'). This means that if you can find a matrix A such that A*A' = Σ, then you can draw a standard normal vector Z and compute Y = A*Z to sample Y from the desired distribution.

Covariance matrices are positive semidefinite, and well-behaved ones are usually positive definite, which means that a covariance matrix has a unique decomposition into a matrix L that is lower triangular (all the entries above the main diagonal are zero). This is called the Cholesky decomposition and you can compute it in Julia using the `cholesky

` function from the LinearAlgebra standard library module.

To sample from a Gaussian process with a desired covariance function

Construct the covariance matrix Σᵢⱼ = R(sᵢ,sⱼ) by evaluating the covariance function at every pair of observation points.
Compute the Cholesky decomposition of the covariance matrix Σ = L*L'
Sample a standard normal random vector Z
Multiply Y = L*Z

using LinearAlgebra

R(s,t;σ=1.0,L=1.0) = σ^2 * exp(-0.5*sum(abs2,s .- t)/L^2)
N = 1024
s = range(0,10.0,length=N)
Σ = [R(s,t) for s in s, t in s]
L = cholesky(Σ).L
Z = randn(N)
Y = L*Z

Advantages of the Cholesky decomposition method

It works for basically any Gaussian process
It is straightforward to implement

Disadvantage

The general Cholesky decomposition has a computational cost of O(N^3), which means that it quickly becomes infeasible for large numbers of observations.

Working with precision

Some Gaussian processes are specified in terms of their precision matrix Λ = Σ⁻¹. If you have a precision matrix, the Cholesky decomposition method works similarly. You find the Cholesky decomposition of Λ = L*L', only this time you solve the linear system of equations LY = Z for Y. In Julia

L = cholesky(Λ).L
Z = randn(N)
Y = L\Z

This has all the advantage and disadvantages of the Cholesky decomposition of the covariance matrix. However, there is an additional advantage if precision matrix is sparse. It is possible to calculate the Cholesky decomposition of a sparse matrix faster than for a dense matrix, so the curse of dimensionality is not such a problem. You obtain a sparse precision matrix when the Gaussian process has a Markov structure such that the process only depends on the process at neighboring points and not points that are farther away. We'll come back to Gaussian Markov random fields a bit later.

Summary

Gaussian processes are just multivariate Gaussian distributions with a particular covariance matrix. The most general way to sample from a multivariate Gaussian distribution relies on the Cholesky decomposition of the covariance matrix. This works for all Gaussian processes, but it scales poorly with the number of observations. Other methods of sampling Gaussian processes have to make some assumptions on the covariance function but can be much faster. We'll look at some other methods in future posts.

Posts

Home

This page is licensed under a CC-BY-SA license.