Dimension Reduction


By Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Table of Contents

1. Multivariate Statistics
2. Principal Component Analysis (PCA)

Correlation of Two Random Variables


$$ \begin{align*} \text{Sample variance} : S_x &= \frac{1}{m-1} \sum\limits_{i=1}^{m}\left(x^{(i)}-\bar x\right)^2 \\ \text{Sample covariance} : S_{xy} &= \frac{1}{m-1} \sum\limits_{i=1}^{m}\left(x^{(i)}-\bar x\right)\left(y^{(i)}-\bar y \right)\\ \text{Sample covariance matrix} : S &= \begin{bmatrix} S_x & S_{xy} \\ S_{yx} & S_y \end{bmatrix}\\ \text{Sample correlation coefficient} : r &= \frac{S_{xy}}{ \sqrt {S_{x}\cdot S_{y}} } \end{align*}$$
  • Strength of linear relationship between two variables, $x$ and $y$
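The definitions above can be sketched directly in NumPy. The data below are illustrative (not from the lecture); the sample statistics are computed from their formulas and checked against NumPy's built-ins.

```python
import numpy as np

# Hypothetical 1-D samples x and y (illustrative data only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

m = len(x)
S_x = np.sum((x - x.mean()) ** 2) / (m - 1)                # sample variance of x
S_y = np.sum((y - y.mean()) ** 2) / (m - 1)                # sample variance of y
S_xy = np.sum((x - x.mean()) * (y - y.mean())) / (m - 1)   # sample covariance

r = S_xy / np.sqrt(S_x * S_y)                              # sample correlation coefficient

# Sanity check against NumPy's implementations
assert np.isclose(S_xy, np.cov(x, y)[0, 1])
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```

Note that `np.cov` and `np.corrcoef` return 2×2 matrices for two inputs; the off-diagonal entry is the pairwise statistic.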

Correlation Coefficient Plot



Covariance Matrix


$$ \Sigma = \begin{bmatrix} E[(X_1-\mu_1)(X_1-\mu_1)]& E[(X_1-\mu_1)(X_2-\mu_2)] & \cdots &E[(X_1-\mu_1)(X_n-\mu_n)]\\ E[(X_2-\mu_2)(X_1-\mu_1)]& E[(X_2-\mu_2)(X_2-\mu_2)] & \cdots &E[(X_2-\mu_2)(X_n-\mu_n)]\\ \vdots & \vdots & \ddots & \vdots\\ E[(X_n-\mu_n)(X_1-\mu_1)]& E[(X_n-\mu_n)(X_2-\mu_2)] & \cdots &E[(X_n-\mu_n)(X_n-\mu_n)]\\ \end{bmatrix}$$
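The matrix above is defined with expectations (the population covariance); for a finite data set the sample version replaces each expectation with an average over $m-1$. A minimal sketch with synthetic data (the data matrix `X` is an assumption for illustration):

```python
import numpy as np

# Hypothetical data matrix: m samples (rows) x n features (columns)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

Xc = X - X.mean(axis=0)                 # subtract the mean mu_j from each feature
Sigma = Xc.T @ Xc / (X.shape[0] - 1)    # (n x n) sample covariance matrix

# Matches NumPy's built-in (rowvar=False: columns are variables)
assert np.allclose(Sigma, np.cov(X, rowvar=False))
```

The $(j,k)$ entry of `Sigma` is the sample analogue of $E[(X_j-\mu_j)(X_k-\mu_k)]$, so the diagonal holds the feature variances.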

2. Principal Component Analysis (PCA)


2.1. Motivation

  • Can we describe high-dimensional data in a "simpler" way?

$\quad \rightarrow$ Dimension reduction without losing too much information
$\quad \rightarrow$ Find a low-dimensional, yet useful representation of the data

  • How?

Idea: highly correlated data contain redundant features




  • Now consider a change of axes

    • Each example $x$ has 2 features $\{u_1,u_2\}$

    • Consider ignoring the feature $u_2$ for each example

    • Each 2-dimensional example $x$ now becomes 1-dimensional $x = \{u_1\}$

    • Are we losing much information by throwing away $u_2$ ?

    • No. Most of the data spread is along $u_1$ (very little variance along $u_2$)
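The claim above can be checked numerically: project the data onto two orthogonal unit directions and compare the variances. The data and the directions $u_1, u_2$ below are assumptions for illustration.

```python
import numpy as np

# Hypothetical correlated 2-D data, spread mostly along the (1, 1) direction
rng = np.random.default_rng(1)
t = rng.normal(size=200)
X = np.column_stack([t, t + 0.1 * rng.normal(size=200)])
X = X - X.mean(axis=0)

u1 = np.array([1.0, 1.0]) / np.sqrt(2)    # assumed dominant direction (unit vector)
u2 = np.array([-1.0, 1.0]) / np.sqrt(2)   # orthogonal direction

z1 = X @ u1   # 1-D representation kept
z2 = X @ u2   # component thrown away

# Almost all the spread lives along u1, so little information is lost
print(z1.var(), z2.var())
```

Dropping $z_2$ keeps one number per example while discarding only the small-variance component.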



  • Data $\rightarrow$ projection onto unit vector $\hat{u}_1$
    • PCA is used when we want projections capturing maximum variance directions
    • Principal Components (PC): directions of maximum variability in the data
    • Roughly speaking, PCA does a change of axes that can represent the data in a succinct manner
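The change of axes described above can be sketched via the eigendecomposition of the sample covariance matrix: the principal components are its eigenvectors, ordered by eigenvalue. The synthetic data below is an assumption for illustration; any centered data matrix works the same way.

```python
import numpy as np

# Illustrative 2-D data with a dominant direction
rng = np.random.default_rng(2)
t = rng.normal(size=(300, 1))
X = np.hstack([t, 2 * t]) + 0.1 * rng.normal(size=(300, 2))

Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (len(X) - 1)           # sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)   # eigh: symmetric matrix, ascending eigenvalues
order = np.argsort(eigvals)[::-1]      # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc1 = eigvecs[:, 0]                    # first PC: direction of maximum variance
Z = Xc @ pc1                           # 1-D projection of the data

# The variance captured along PC1 equals the largest eigenvalue
assert np.isclose(Z.var(ddof=1), eigvals[0])
```

Keeping only the top-$k$ eigenvectors gives the $k$-dimensional representation that preserves the most variance, which is exactly the succinct change of axes described above.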