Ellipse and Gaussian Distribution

By Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

# 1. CoordinatesÂ¶

• Coordinates with basis
• basis $\{\hat x_1 \;\hat x_2 \}$ or basis $\{\hat y_1 \;\hat y_2 \}$

\begin{align*} \vec{r}_X &= \begin{bmatrix}x_1\\x_2\end{bmatrix} \text{: coordinate of} \; \vec r \text{ in basis} \; \{\hat x_1 \;\hat x_2 \} \\ \vec{r}_Y &= \begin{bmatrix}y_1\\y_2\end{bmatrix} \text{: coordinate of} \;\vec r \text{ in basis} \;\{\hat y_1 \;\hat y_2 \} \end{align*}

• Coordinate transformation
\begin{align*} \vec r = x_1\hat{x}_1 + x_2\hat{x}_2 & = y_1\hat{y}_1 + y_2\hat{y}_2 \\\\ \begin{bmatrix}\hat{x}_1&\hat{x}_2\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} &=\begin{bmatrix}\hat{y}_1&\hat{y}_2\end{bmatrix}\begin{bmatrix}y_1\\y_2\end{bmatrix} \\ \\ U\begin{bmatrix}x_1\\x_2\end{bmatrix} &= \begin{bmatrix}y_1\\y_2\end{bmatrix} \quad \left( U = \begin{bmatrix}\hat{x}_1 & \hat{x}_2\end{bmatrix}, I = \begin{bmatrix}\hat{y}_1 & \hat{y}_2\end{bmatrix} \right)\\\\ \begin{bmatrix}x_1\\x_2\end{bmatrix} &= U^{-1}\begin{bmatrix}y_1\\y_2\end{bmatrix}= U^{T}\begin{bmatrix}y_1\\y_2\end{bmatrix} \end{align*}
• Coordinate change

\begin{align*} \begin{bmatrix}x_1\\x_2\end{bmatrix} \quad &\underrightarrow{U} \quad \begin{bmatrix}y_1\\y_2\end{bmatrix} \\\\ \begin{bmatrix}y_1\\y_2\end{bmatrix} \quad &\underrightarrow{U^T} \quad \begin{bmatrix}x_1\\x_2\end{bmatrix} \end{align*}

# 2. Equation of an EllipseÂ¶

• Unit circle

\begin{align*} & x_1^2 + x_2^2 = 1 \implies \\ \\ &\begin{bmatrix}x_1 & x_2 \end{bmatrix} \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = 1 \end{align*}
          
• Independent ellipse

\begin{align*} \frac{x_1^2}{a^2} + \frac{x_2^2}{b^2} = 1 &\implies \begin{bmatrix}x_1 & x_2 \end{bmatrix} \begin{bmatrix}\frac{1}{a^2} & 0 \\ 0 & \frac{1}{b^2} \end{bmatrix} \begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = 1 \\ & \implies \begin{bmatrix}x_1 & x_2 \end{bmatrix} \Sigma_x^{-1} \begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = 1\\ \\ &\text{where} \; \Sigma_x^{-1} =\begin{bmatrix}\frac{1}{a^2} & 0\\ 0 & \frac{1}{b^2} \end{bmatrix}, \,\Sigma_x = \begin{bmatrix}a^2 & 0\\ 0 & b^2 \end{bmatrix} \end{align*}
• Dependent ellipse (Rotated ellipse)
• Coordinate changes

\begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = U^T \begin{bmatrix}y_1 \\ y_2 \end{bmatrix},\quad \begin{array}{Icr} \begin{align} &x= U^Ty \\ &Ux=y \end{align} \end{array}
• Now we know in basis $\{\hat x_1 \;\hat x_2 \}$

$$x^T \Sigma_x^{-1}x = 1 \quad \text{and} \quad \Sigma_x = \begin{bmatrix}a^2 & 0\\ 0 & b^2 \end{bmatrix}$$

• Then, we can find $\Sigma_y$ such that

\begin{align*} y^T \Sigma_y^{-1}y &= 1 \quad \text{and} \quad \Sigma_y = ?\\\\ \implies x^T \Sigma_x^{-1} x &= y^T U\Sigma_x^{-1} U^Ty = 1 \quad (\Sigma_y^{-1} \; \text{: similar matrix to } \Sigma_x^{-1}) \\\\ \therefore \;\Sigma_y^{-1} &= U\Sigma_x^{-1}U^T \quad \text{or}\\\\ \Sigma_y &= U\Sigma_x U^T \end{align*}

• The quation of dependent ellipse

$$y^T \Sigma_y^{-1}y = 1 \quad \text{where} \quad \Sigma_y = U\Sigma_x U^T$$
InÂ [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

InÂ [2]:
theta = np.arange(0,2*np.pi,0.01)
x1 = np.cos(theta)
x2 = np.sin(theta)

plt.figure(figsize=(6,6))
plt.plot(x1, x2)
plt.xlim([-4,4])
plt.ylim([-4,4])
plt.show()

InÂ [3]:
x1 = 3*np.cos(theta);
x2 = np.sin(theta);

plt.figure(figsize=(6,6))
plt.plot(x1, x2)
plt.xlim([-4,4])
plt.ylim([-4,4])
plt.show()

$$U = \left[\hat x_1 \;\hat x_2 \right] = \begin{bmatrix}\frac{1}{2} & -\frac{\sqrt{3}}{2}\\ \frac{\sqrt{3}}{2} & \frac{1}{2} \end{bmatrix}$$
InÂ [4]:
U = np.matrix([[1/2, -np.sqrt(3)/2],
[np.sqrt(3)/2, 1/2]])

x = np.matrix([x1, x2])

y = U*x
print("U = \n", U)

plt.figure(figsize=(6,6))
plt.plot(y[0,:].T, y[1,:].T)
plt.xlim([-4,4])
plt.ylim([-4,4])
plt.show()

U =
[[ 0.5       -0.8660254]
[ 0.8660254  0.5      ]]

InÂ [5]:
Sx = np.matrix([[9, 0],
[0, 1]])

Sy = U*Sx*U.T

print ("Sx = \n", Sx, "\n")
print ("Sy = \n", Sy)

Sx =
[[9 0]
[0 1]]

Sy =
[[3.         3.46410162]
[3.46410162 7.        ]]


## 2.1. Question (Reverse Problem)Â¶

• Given $\Sigma_y^{-1}$ (or $\Sigma_y)$,
• how to find $a$ (major axis) and $b$ (minor axis)
• how to find the $\Sigma_x$
• how to find the proper matrix $U$
• eigenvectors of $\Sigma$

$$A=S \Lambda S^T \qquad \text{where } S=[\upsilon_1 \; \upsilon_2] \text{ eigenvector of } A, \; \text{and }\Lambda=\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

$$\text{here, }\, \Sigma_y = U\Sigma_x U^T = U \Lambda U^T \qquad \text{where } U = \begin{bmatrix}\hat{x}_1 & \hat{x}_2 \end{bmatrix} \text{ eigenvector of } \Sigma_y, \; \text{and }\Lambda=\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} = \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix}$$

\begin{align*} \text{eigen-analysis} \begin{cases}\Sigma_y\hat{x}_1=\lambda_1 \hat{x}_1\\ \Sigma_y\hat{x}_2=\lambda_2 \hat{x}_2\\ \end{cases} \; \implies \; & \Sigma_y \underbrace{\begin{bmatrix}\hat{x}_1 & \hat{x}_2 \end{bmatrix}}_{U} = \underbrace{\begin{bmatrix} \hat{x}_1 & \hat{x}_2 \end{bmatrix}}_{U} \underbrace{\begin{bmatrix}\lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}}_{\Lambda} \\ \\ & \Sigma_y U = U\Lambda \\ & \Sigma_y = U \Lambda U^T = U \Sigma_x U^T \end{align*}

\begin{array}{Icr} \\x=U^Ty \\\\ \begin{bmatrix}x_1 \\ x_2 \end{bmatrix} = U^T \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \end{array} \quad \quad \begin{array}{Icr} \begin{align*} & a=\sqrt{\lambda_1}\\ & b = \sqrt{\lambda_2}\\\\ & \text{major axis}=\hat{x}_1\\ & \text{minor axis}=\hat{x}_2 \end{align*} \end{array}

• Again, it is an eigenvalue and eigenvector problem.
InÂ [6]:
D, U = np.linalg.eig(Sy)

idx = np.argsort(-D)
D = D[idx]
U = U[:,idx]

print ("D = \n", np.diag(D))
print ("U = \n", U)

D =
[[9. 0.]
[0. 1.]]
U =
[[-0.5       -0.8660254]
[-0.8660254  0.5      ]]


## 2.2. SummaryÂ¶

\begin{align*} x &=U^Ty \\\\ U &=\begin{bmatrix}\hat{x}_1 & \hat{x}_2\end{bmatrix} \end{align*}
• Independent ellipse in $\{\hat{x}_1, \hat{x}_2 \}$

• Dependent ellipse in $\{ \hat{y}_1, \hat{y}_2 \}$

• Decouple

• Diagonalize
• Eigen-analysis

# 3. Gaussian DistributionÂ¶

## 3.1. Standard Univariate Normal Distribution $\sim \mathcal{N}(0,1^2)$Â¶

$$P_Y\left(Y=y\right)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}y^2\right)$$

$$\frac{1}{2} y^2 = \text{const} \implies \text{prob. contour}$$

InÂ [7]:
y = np.arange(-10,10,0.1)

ProbG = 1/np.sqrt(2*np.pi)*np.exp(-1/2*y**2)

plt.figure(figsize=(10,6))
plt.plot(y, ProbG)
plt.show()

InÂ [8]:
from scipy.stats import norm

ProbG2 = norm.pdf(y)

plt.figure(figsize=(10,6))
plt.plot(y, ProbG2)
plt.show()

InÂ [9]:
x = np.random.randn(5000,1)

plt.figure(figsize=(10,6))
plt.hist(x, bins=51, normed=True)
plt.plot(y, ProbG2, label='G2')
plt.show()

c:\users\seungchul\appdata\local\programs\python\python35\lib\site-packages\matplotlib\axes\_axes.py:6521: MatplotlibDeprecationWarning:
The 'normed' kwarg was deprecated in Matplotlib 2.1 and will be removed in 3.1. Use 'density' instead.
alternative="'density'", removal="3.1")


## 3.2. Univariate Normal Distribution $\sim \mathcal{N}\left(\mu,\sigma^2\right)$Â¶

• Gaussian or normal distribution, 1D (mean $\mu$, variance $\sigma^2$)

$$\mathcal{N}(x;\,\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{1}{2}\frac{\left(x-\mu\right)^2}{\sigma^2}\right)$$

\begin{align*}x &\sim \mathcal{N}\left(\mu,\sigma^2\right) \\ & \implies P_Y\left(y\right) = P_X\left(x\right), \quad y = \frac{x-\mu}{\sigma}, \quad x=\sigma y + \mu\\\\ P_X\left(X=x\right) &\sim \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right) \\ &= \exp\left(-\frac{1}{2}\frac{\left(x-\mu\right)^2}{\sigma^2}\right) \\\\ E[x] &= \mu \\ \text{var}(x) &= E[(x-\mu)^2] = \sigma^2 \end{align*}

InÂ [10]:
mu = 2
sigma = 3

x = np.arange(-10, 10, 0.1)

ProbG3 = 1/(np.sqrt(2*np.pi)*sigma) * np.exp(-1/2*(x-mu)**2/(sigma**2))

plt.figure(figsize=(10,6))
plt.plot(y,ProbG, label='G1')
plt.plot(x,ProbG3, label='G3')
plt.legend()
plt.show()

InÂ [11]:
ProbG2 = norm.pdf(x, 2, 3)

plt.figure(figsize=(10,6))
plt.plot(y,ProbG, label='G1')
plt.plot(x,ProbG2, label='G2')
plt.legend()
plt.show()

InÂ [12]:
x = mu + sigma*np.random.randn(5000,1)

plt.figure(figsize=(10,6))
plt.hist(x, bins=51, normed=True)
plt.plot(y,ProbG2, label='G2')
plt.ylim([0,0.4])
plt.show()


## 3.3. Multivariate Gaussian ModelsÂ¶

• Similar to a univariate case, but in a matrix form

\begin{align*} \mathcal{N}\big(x;\, \mu,\Sigma\big) &= \frac{1}{(2\pi)^{\frac{n}{2}}\lvert\Sigma\rvert^{\frac{1}{2}}}\exp\left(-\frac{1}{2}\left(x-\mu\right)^T\Sigma^{-1}\left(x-\mu\right)\right) \\ \\ \mu &= \text{length}\,\, n \,\,\text{column vector} \\ \Sigma &= n \times n \,\,\text{matrix (covariance matrix)} \\ \lvert\Sigma\rvert &= \text{matrix determinant}\\\\ E[x] &= \mu \\ \text{cov}(x) &= E[(x-\mu)(x-\mu)^T] = \Sigma \end{align*}

• Multivariate Gaussian models and ellipse
• Ellipse shows constant $\Delta^2$ value...

$$\Delta^2 = (x-\mu)^T\Sigma^{-1} (x-\mu)$$

### 3.3.1. Two Independent VariablesÂ¶

\begin{align*} P\left(X_1=x_1,X_2=x_2\right) &= P_{X_1}\left(x_1\right)P_{X_2}\left(x_2\right) \\\\ &\sim \exp \left(-\frac{1}{2} \frac{\left(x_1-\mu_{x_1}\right)^2}{\sigma_{x_1}^2}\right) \cdot \exp \left(-\frac{1}{2} \frac{\left(x_2-\mu_{x_2}\right)^2}{\sigma_{x_2}^2}\right) \\\\ &\sim\exp \left(-\frac{1}{2}\left(\frac{\left(x_1-\mu_{x_1}\right)^2}{\sigma_{x_1}^2} + \frac{\left(x_2-\mu_{x_2}\right)^2}{\sigma_{x_2}^2}\right)\right)\\ \end{align*}

• In a matrix form
$$P(x_1) \cdot P(x_2)=\frac{1}{Z_1Z_2}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)$$

$$\left ( x=\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad \Sigma = \begin{bmatrix}\sigma_{x_1}^2 & 0 \\ 0 & \sigma_{x_2}^2 \end{bmatrix} \right )$$

\begin{align*}\frac{x_1^2}{\sigma_{x_1}^2} + \frac{x_2^2}{\sigma_{x_2}^2} &= c \;\;\;\text{(ellipse)} \\ \\ \begin{bmatrix} x_1 & x_2\end{bmatrix} \begin{bmatrix} \frac{1}{\sigma_{x_1}^2} & 0 \\ 0 & \frac{1}{\sigma_{x_2}^2} \end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} &= c \qquad (\sigma_{x_1} < \sigma_{x_2}) \end{align*}

• Summary in a matrix form
\begin{align*} \mathcal{N}\left(0,\Sigma_x\right) &\sim \exp\left(-\frac{1}{2}x^T \Sigma_x^{-1} x\right) \\\\ \mathcal{N}\left(\mu_x,\Sigma_x\right) &\sim \exp\left(-\frac{1}{2}\left(x-\mu_x\right)^T \Sigma_x^{-1} \left(x-\mu_x\right)\right) \end{align*}
InÂ [13]:
mu = np.array([0, 0])
sigma = np.eye(2)

m = 1000
x = np.random.multivariate_normal(mu, sigma, m)
print(x.shape)

plt.figure(figsize=(10,6))
plt.plot(x[:,0], x[:,1], '.')
plt.axis('equal')
plt.ylim([-8,8])
plt.show()

(1000, 2)

InÂ [14]:
mu = np.array([0, 0])
sigma = np.array([[1, 0], [0, 4]])

m = 1000
x = np.random.multivariate_normal(mu, sigma, m)

plt.figure(figsize=(10,6))
plt.plot(x[:,0], x[:,1], '.')
plt.axis('equal')
plt.ylim([-8,8])
plt.show()


### 3.3.2. Two Dependent VariablesÂ¶

• Compute $P_Y(y)$ from $P_X(x)$
\begin{align*} P_X(x) = P_Y(y) \;\ \text{where}\;\; & x = \begin{bmatrix}x_1 \\ x_2\end{bmatrix}, y = \begin{bmatrix}y_1 \\ y_2 \end{bmatrix} \end{align*}
• Relationship between $y$ and $x$
$$x = \begin{bmatrix}\hat{x}_1 & \hat{x}_2 \end{bmatrix}^T y = U^Ty$$

$$x^T\Sigma_x^{-1} x = y^T U\Sigma_x^{-1}U^Ty = y^T \Sigma_y^{-1} y\\$$

\begin{align*} \therefore \;\; \Sigma_y^{-1} &= U\Sigma_x^{-1} U^T \\\\ \Sigma_y &= U \Sigma_x U^T \end{align*}
• $\Sigma_x$ : covariance matrix of $x$

• $\Sigma_y$ : covariance matrix of $y$

• If $U$ is an eigenvector matrix of $\Sigma_y$, then $\Sigma_x$ is a diagonal matrix
InÂ [15]:
mu = np.array([0, 0])
sigma = 1./2.*np.array([[5, -3], [-3, 5]])

m = 1000
x = np.random.multivariate_normal(mu, sigma, m)

plt.figure(figsize=(10,6))
plt.plot(x[:,0], x[:,1], '.')
plt.axis('equal')
plt.ylim([-8,8])
plt.show()


Remark

\begin{align*} & x \sim \mathcal{N}(\mu_x,\Sigma_x) \text{ and } y = Ax + b \; \text{affine transformation}\\ \\ \implies \; & y \sim \mathcal{N}(\mu_y,\Sigma_y) = \mathcal{N}(A\mu_x+b,A\Sigma_xA^T)\\\\ \implies \; & y \;\; \text{is also Gaussian with} \;\; \mu_y = A\mu_x+b, \;\; \Sigma_y = A \Sigma_x A^T \end{align*}

## 3.4. Decouple using Covariance MatrixÂ¶

Given data, how to find $\Sigma_y$ and major (or minor) axis (assume $\mu_y = 0$)

• This is a statistical problem
• Estimating a Gussian model to given data

$$\Sigma_y = \begin{bmatrix} \text{var}(y_1) & \text{cov}(y_1,y_2)\\ \text{cov}(y_2,y_1) & \text{var}(y_2) \end{bmatrix}$$

$$\begin{array}{Icr} \text{eigen-analysis} \\ \\ \Sigma_y\hat{x}_1 = \lambda_1\hat{x}_1 \\ \Sigma_y\hat{x}_2 = \lambda_2\hat{x}_2 \end{array} \quad \quad \begin{array}{Icr} \Sigma_x^{-1} = \begin{bmatrix} \frac{1}{\sqrt{\lambda_1}^2} & 0 \\ 0 & \frac{1}{\sqrt{\lambda_2}^2} \end{bmatrix} \\ \Sigma_x = \begin{bmatrix}\sqrt{\lambda_1}^2 & 0 \\ 0 & \sqrt{\lambda_2}^2 \end{bmatrix} \end{array}$$

\begin{array}{Icr} \begin{align*} \Sigma_y \begin{bmatrix} \hat{x}_1 & \hat{x}_2 \end{bmatrix} &= \begin{bmatrix} \hat{x}_1 & \hat{x}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}\\ &= \begin{bmatrix}\hat{x}_1 & \hat{x}_2 \end{bmatrix} \Sigma_{x} \\ \\ \Sigma_y &= U \Sigma_x U^T \end{align*} \end{array} \quad \quad \quad \begin{array}{Icr} y = U x \implies U^Ty = x \\ \begin{bmatrix} \hat{x}_1 & \hat{x}_2 \end{bmatrix} = U \\ \\ \\ \end{array}
InÂ [16]:
S = np.cov(x.T)
print ("S = \n", S)

S =
[[ 2.52002523 -1.52473469]
[-1.52473469  2.5267346 ]]

InÂ [17]:
D, U = np.linalg.eig(S)

idx = np.argsort(-D)
D = D[idx]
U = U[:,idx]

print ("U = \n", U)
print ("D = \n", D)

U =
[[ 0.70632848 -0.70788423]
[-0.70788423 -0.70632848]]
D =
[4.0481183  0.99864153]

InÂ [18]:
xp = np.arange(-10, 10)

plt.figure(figsize=(10,6))
plt.plot(x[:,0],x[:,1],'.')
plt.plot(xp, U[1,0]/U[0,0]*xp, label='u1')
plt.plot(xp, U[1,1]/U[0,1]*xp, label='u2')
plt.axis('equal')
plt.ylim([-8, 8])
plt.legend()
plt.show()


# 4. Properties of Gaussian DistributionÂ¶

• Uncorrelated $\Longrightarrow$ independent
• Conditioning (slice) $\Longrightarrow$ Highly related to inference