Gaussian Process

By Prof. Seungchul Lee
Industrial AI Lab at POSTECH

Table of Contents

1. Gaussian Random Vectors

Suppose $x \sim N(\mu, \Sigma)$, where $\Sigma = \Sigma^T$ and $\Sigma > 0$ (i.e., $\Sigma$ is symmetric and positive definite).

$$ p(x) = \frac{1}{(2\pi)^{\frac{n}{2}}(\text{det}\,\Sigma)^{\frac{1}{2}}} \exp \left( -\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu) \right)$$
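This density can be evaluated directly from the formula. A minimal NumPy sketch (the particular $\mu$ and $\Sigma$ below are illustrative choices, not from the notes); as a sanity check, for a diagonal $\Sigma$ the joint density should factor into a product of univariate normal densities:

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Evaluate the multivariate normal density in closed form."""
    n = len(mu)
    d = x - mu
    quad = d @ np.linalg.solve(Sigma, d)          # (x-mu)^T Sigma^{-1} (x-mu)
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

# Arbitrary symmetric, positive-definite example
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Sanity check: with a diagonal Sigma the joint density factors into
# a product of univariate normal densities
mu_d = np.array([1.0, -1.0])
Sigma_d = np.diag([4.0, 0.25])
x = np.array([0.5, -0.5])
p_joint = gaussian_pdf(x, mu_d, Sigma_d)
p_prod = np.prod([np.exp(-(x[i] - mu_d[i])**2 / (2 * Sigma_d[i, i])) /
                  np.sqrt(2 * np.pi * Sigma_d[i, i]) for i in range(2)])
print(np.isclose(p_joint, p_prod))  # True
```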

1.1. The marginal pdf of a Gaussian (is Gaussian)

Suppose $x \sim N(\mu, \Sigma)$, and

$$ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} $$

Let's look at the component $x_1 = \begin{bmatrix} I & 0 \end{bmatrix} x$

  • mean of $x_1$

$$\mathbf{E}\, x_1 = \begin{bmatrix} I & 0 \end{bmatrix} \mu= \mu_1$$

  • covariance of $x_1$

$$\text{cov}(x_1) = \begin{bmatrix} I & 0 \end{bmatrix} \Sigma \begin{bmatrix} I \\ 0 \end{bmatrix} = \Sigma_{11}$$
  • In fact, the random variable $x_1$ is Gaussian (this is not obvious).
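The claim can be illustrated numerically: draw samples of the full vector $x$ and keep only the first block; its empirical mean and covariance match $\mu_1$ and $\Sigma_{11}$. A sketch with NumPy (the 3-dimensional $\mu$ and $\Sigma$ are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 1.5]])

# Draw samples of x and keep only the first block x1 = (x[0], x[1])
X = rng.multivariate_normal(mu, Sigma, size=200_000)
X1 = X[:, :2]

print(X1.mean(axis=0))           # ≈ mu[:2]       = mu_1
print(np.cov(X1, rowvar=False))  # ≈ Sigma[:2,:2] = Sigma_11
```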

1.2. Linear transformation of Gaussian (is Gaussian)

Suppose $x \sim N(\mu_x, \Sigma_x)$. Consider the linear function of $x$

$$y = Ax + b$$

  • We already know how means and covariances transform. We have

$$\mathbf{E}y = A \, \mathbf{E}x + b \qquad \mathbf{cov}(y) = A \, \mathbf{cov}(x) \, A^{T}$$

$$\mu_y = A \mu_x + b \quad \qquad \Sigma_y = A \Sigma_x A^{T}$$

  • The amazing fact is that $y$ is Gaussian.
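The moment formulas above can be verified by Monte Carlo: transform samples of $x$ and compare the empirical moments of $y$ with $\mu_y$ and $\Sigma_y$. A sketch with NumPy ($A$, $b$, $\mu_x$, $\Sigma_x$ below are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(1)
mu_x = np.array([0.0, 1.0])
Sigma_x = np.array([[1.0, 0.4],
                    [0.4, 2.0]])
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])       # maps R^2 -> R^3
b = np.array([1.0, 0.0, -1.0])

# Closed-form parameters of y = Ax + b
mu_y = A @ mu_x + b
Sigma_y = A @ Sigma_x @ A.T

# Monte Carlo check: transform samples of x and compare moments
X = rng.multivariate_normal(mu_x, Sigma_x, size=200_000)
Y = X @ A.T + b
print(np.allclose(Y.mean(axis=0), mu_y, atol=0.05))             # True
print(np.allclose(np.cov(Y, rowvar=False), Sigma_y, atol=0.15)) # True
```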

1.3. Components of a Gaussian random vector (is Gaussian)

Suppose $x \sim N(0, \Sigma)$, and let $c \in \mathbb{R}^n$ be a unit vector

Let $y = c^T x$

  • $y$ is the component of $x$ in the direction $c$
  • $y$ is Gaussian, with $\mathbf{E} y = 0$ and $\mathbf{cov}(y) = c^T \Sigma c$
  • So $\mathbf{E}\left(y^2\right) = c^T \Sigma c$
  • (PCA) The unit vector $c$ that maximizes $c^T \Sigma c$ is the eigenvector of $\Sigma$ with the largest eigenvalue. Then

$$\mathbf{E}\left(y^2\right) = \lambda_{\text{max}}$$
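This PCA fact is easy to check numerically: the variance $c^T \Sigma c$ along the top eigenvector equals $\lambda_{\text{max}}$, and any other unit direction gives less. A sketch with NumPy (the $2\times 2$ $\Sigma$ is an arbitrary example):

```python
import numpy as np

Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

# Eigen-decomposition of the symmetric covariance matrix
eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigenvalues in ascending order
c = eigvecs[:, -1]                        # unit eigenvector of lambda_max

# Variance of the component y = c^T x equals c^T Sigma c = lambda_max
var_y = c @ Sigma @ c
print(np.isclose(var_y, eigvals[-1]))  # True

# Any other unit direction gives smaller (or equal) variance
d = np.array([1.0, 0.0])
print(d @ Sigma @ d <= var_y)  # True
```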

1.4. Conditional pdf for a Gaussian (is Gaussian)

Suppose $x \sim N(\mu, \Sigma)$, and

$$ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} $$

Suppose we measure $x_2 = y$. We would like to find the conditional pdf of $x_1$ given $x_2 = y$

  • Is it Gaussian?
  • What is the conditional mean $\mathbf{E}(x_1 \mid x_2= y)$?
  • What is the conditional covariance $\mathbf{cov}(x_1 \mid x_2= y)$?

By the completion of squares formula

$$\Sigma^{-1} = \begin{bmatrix} I & 0 \\ -\Sigma^{-1}_{22}\Sigma_{21} & I \end{bmatrix} \begin{bmatrix} \left( \Sigma_{11} - \Sigma_{12}\Sigma^{-1}_{22}\Sigma_{21} \right)^{-1} & 0 \\ 0 & \Sigma^{-1}_{22} \end{bmatrix} \begin{bmatrix} I & -\Sigma_{12}\Sigma^{-1}_{22} \\ 0 & I \end{bmatrix}$$
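This block factorization (the middle factor contains the Schur complement $\Sigma_{11} - \Sigma_{12}\Sigma^{-1}_{22}\Sigma_{21}$) can be checked numerically. A sketch with NumPy, using an arbitrary $3\times 3$ $\Sigma$ partitioned with $x_1 \in \mathbb{R}^1$, $x_2 \in \mathbb{R}^2$:

```python
import numpy as np

Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])
S11, S12 = Sigma[:1, :1], Sigma[:1, 1:]
S21, S22 = Sigma[1:, :1], Sigma[1:, 1:]

S22_inv = np.linalg.inv(S22)
schur = S11 - S12 @ S22_inv @ S21   # Schur complement of Sigma_22

n1, n2 = 1, 2
L = np.block([[np.eye(n1), np.zeros((n1, n2))],
              [-S22_inv @ S21, np.eye(n2)]])
M = np.block([[np.linalg.inv(schur), np.zeros((n1, n2))],
              [np.zeros((n2, n1)), S22_inv]])
U = np.block([[np.eye(n1), -S12 @ S22_inv],
              [np.zeros((n2, n1)), np.eye(n2)]])

# The three-factor product reproduces the full inverse of Sigma
print(np.allclose(L @ M @ U, np.linalg.inv(Sigma)))  # True
```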

If $x \sim N(0, \Sigma)$, then the conditional pdf of $x_1$ given $x_2 = y$ is Gaussian

  • The conditional mean is

$$\mathbf{E}(x_1 \mid x_2= y) = \Sigma_{12} \Sigma^{-1}_{22}\,y$$

$\quad \;$ It is a linear function of $y$.

  • The conditional covariance is

$$\mathbf{cov}(x_1 \mid x_2= y) = \Sigma_{11} - \Sigma_{12}\Sigma^{-1}_{22}\Sigma_{21} \leq \Sigma_{11}$$

$\quad \;$ It is not a function of $y$; it is a constant.

$\quad \;$ Conditional confidence intervals are narrower, i.e., measuring $x_2$ gives information about $x_1$.
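Both formulas can be illustrated by Monte Carlo: exact conditioning on $x_2 = y$ has probability zero, so as a rough approximation we keep samples whose $x_2$ lands in a narrow window around $y$. A sketch with NumPy (the 2D zero-mean $\Sigma$ and the value $y$ are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(2)
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
y = 1.0

# Closed-form conditional parameters (zero-mean case)
cond_mean = Sigma[0, 1] / Sigma[1, 1] * y                  # Sigma_12 Sigma_22^{-1} y
cond_var = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]    # Schur complement

# Monte Carlo: approximate conditioning by a narrow window around y
X = rng.multivariate_normal(np.zeros(2), Sigma, size=1_000_000)
sel = X[np.abs(X[:, 1] - y) < 0.05, 0]
print(sel.mean())  # ≈ cond_mean = 0.8
print(sel.var())   # ≈ cond_var = 1.36, smaller than Sigma_11 = 2
```

Note that the empirical conditional variance is smaller than $\Sigma_{11}$, matching the claim that measuring $x_2$ narrows the uncertainty about $x_1$.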