Deep Learning for Mechanical Engineering

Homework 01

Due Monday, 09/18/2023, 4:00 PM

Instructor: Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST
  • For your handwritten solutions, scan them or take pictures of them.

  • For your code, only the .ipynb file will be graded.

    • Please include your NAME and student ID in your .ipynb file name. ex) KangsanLee_20202467_HW02.ipynb
  • Please compress all the files into a single .zip file.

    • Please include your NAME and student ID in your .zip file name. ex) NamjeongLee_20202467_HW02.zip
    • Submit it to KLMS.
  • Do not submit a printed version of your code. It will not be graded.

Problem 1: Optimization and Gradient Descent

You will find an optimal solution using gradient descent.

(0) Run the cell below first. It defines utility functions used in this problem.

In [ ]:
# Do NOT change this cell !!

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def plotHistory(hist):
    for i in range(len(hist)-1):
        plt.arrow(hist[i][0][0], hist[i][1][0], hist[i+1][0][0]-hist[i][0][0], hist[i+1][1][0]-hist[i][1][0], alpha=0.3)
        plt.scatter(hist[i][0][0], hist[i][1][0])
    plt.title('Initial (x1, x2): (0.0, 0.0)'); plt.grid('on'); plt.xlabel('x1'), plt.ylabel('x2'); plt.show()

def update(x):
    pass

def train(alpha, n_iter):
    def saveHistory(a=None, hist=[]):
        if a is not None: hist.append(a.tolist())
        return hist

    x = np.zeros((2,1)) # initial value
    saveHistory(x)

    for i in range(n_iter):    
        x = update(x)
        saveHistory(x)

    hist = saveHistory()
    plotHistory(hist)
    print('(x1, x2) = ({:.3f}, {:.3f})'.format(hist[-1][0][0], hist[-1][1][0]))
    print('f(x1, x2) = {:.3f}'.format((0.5*x.T*H*x + g.T*x + 1).tolist()[0][0]))

# Do NOT change this cell !!

Here is the objective function you need to minimize.


$$\min_{x_1, \,x_2} \, (x_1 - 1)^2 + (2x_1 - x_2)^2$$

(a) Find $H$ and $g$ to transform the objective function into the matrix form below. Use np.matrix().


$$ f = \frac{1}{2} X^T H X + g^T X + c $$
In [ ]:
# your code here
H = 
g = 
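
If you want a quick (optional, ungraded) sanity check of your answer, you can compare the matrix form against the original objective at a few random points. The cell below is only a sketch and assumes $H$ and $g$ are the np.matrix objects defined above, with $c = 1$.

In [ ]:
# Optional sanity check (sketch only, not graded): the matrix form should match
# the original objective f(x1, x2) = (x1 - 1)^2 + (2*x1 - x2)^2 at any point.
def f_original(x):
    return (x[0, 0] - 1)**2 + (2*x[0, 0] - x[1, 0])**2

def f_matrix(x, c=1.0):
    return float(0.5*x.T*H*x + g.T*x + c)

for _ in range(5):
    x_test = np.matrix(np.random.randn(2, 1))
    assert np.isclose(f_original(x_test), f_matrix(x_test))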

(b) Define a function to update $x$ based on the following equations:


$$ \nabla{f} = HX+g $$

$$ X_{i+1} = X_i - \alpha_i \nabla f (X_i) $$
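
As a small illustration of this update rule (separate from the graded problem), the cell below applies the same iteration to the 1-D quadratic $f(x) = (x-3)^2$; your update function should implement the 2-D matrix version above.

In [ ]:
# Illustration only: gradient descent on f(x) = (x - 3)**2, with f'(x) = 2*(x - 3).
x_toy, alpha_toy = 0.0, 0.1
for _ in range(100):
    grad = 2*(x_toy - 3)              # gradient at the current point
    x_toy = x_toy - alpha_toy*grad    # x_{i+1} = x_i - alpha * f'(x_i)
print(x_toy)                          # approaches the minimizer x = 3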

In [ ]:
def update(x):
    # Your code here
    new_x = 
    
    return new_x

(c) Find a learning rate $\alpha$ that makes it converge within 150 iterations. (Round to the 3rd decimal place.)
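
(Hint: for a quadratic objective, gradient descent is stable roughly when $0 < \alpha < 2/\lambda_{\max}(H)$. If it helps, the cell below is a purely diagnostic sketch that assumes $H$ from part (a) is already defined.)

In [ ]:
# Diagnostic sketch only (assumes H from part (a) is defined): the largest
# eigenvalue of H bounds the stable step sizes, roughly alpha < 2 / lambda_max.
eigenvalues = np.linalg.eigvalsh(np.asarray(H, dtype=float))
print(eigenvalues, 2.0/eigenvalues.max())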

(d) Adjust the training parameters to obtain figures like the ones below:

(d-1) Stable convergence

In [ ]:
# Your code here
alpha = 
n_iter = 

train(alpha, n_iter)

(d-2) Unstable convergence

In [ ]:
# Your code here
alpha = 
n_iter = 

train(alpha, n_iter)

(d-3) Divergence

In [ ]:
# Your code here
alpha = 
n_iter = 

train(alpha, n_iter)

Problem 2: Optimization and Gradient Descent with Constraints

You will find an optimal solution using gradient descent with box constraints.


$$ \begin{aligned} \max_{x} \quad & 3x_1 + \frac{3}{2}x_2 \\ \text{subject to} \quad & -1 \leq x_1 \leq 2 \\ & \phantom{-}0 \leq x_2 \leq 3 \end{aligned} \quad\implies\quad \begin{aligned} \min_{x} \quad & -\begin{bmatrix} 3 \\ 3/2 \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\ \text{subject to} \quad & \begin{bmatrix} -1 \\ 0 \end{bmatrix} \leq \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \leq \begin{bmatrix} 2 \\ 3 \end{bmatrix} \end{aligned} $$
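
Before filling in the cell below, it may help to see the projection step in isolation. The following is a generic illustration (with made-up numbers) of pushing a point back inside box constraints; the cell after it walks through the same idea with boolean masks.

In [ ]:
# Illustration only (made-up numbers): project a point back inside box constraints.
import numpy as np
p = np.array([[3.0], [-0.5]])      # a point that violates both bounds
lb = np.array([[-1.0], [0.0]])
ub = np.array([[2.0], [3.0]])
print(np.clip(p, lb, ub))          # clipped to [[2.], [0.]]
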
In [ ]:
df = 
x = 
alpha = 

lower_bound = 
upper_bound = 

for i in range(25): 
    x = 
    
    # lb constraints 
    lb_TF = lower_bound < x
    x = 
    
    # ub constraints 
    ub_TF = x < upper_bound
    x = 
    
print(x)

Problem 3: Image Panorama with Regression

We want to demonstrate an image panorama as an example of linear regression. A panorama is any wide-angle view or representation of a physical space.



In [1]:
%%html
<center><iframe src="https://www.youtube.com/embed/86rnwu3ZFbE?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

You need to install the OpenCV module (cv2) and download the images.

In [ ]:
# import library
import numpy as np
import matplotlib.pyplot as plt
import cv2
In [ ]:
# load images
imag1 = cv2.imread('./data_files/1.jpg')
imag1 = cv2.cvtColor(imag1, cv2.COLOR_BGR2RGB)
imag2 = cv2.imread('./data_files/2.jpg')
imag2 = cv2.cvtColor(imag2, cv2.COLOR_BGR2RGB)
imag3 = cv2.imread('./data_files/3.jpg')
imag3 = cv2.cvtColor(imag3, cv2.COLOR_BGR2RGB)
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag1)
plt.axis('off')
plt.show()
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag2)
plt.axis('off')
plt.show()
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag3)
plt.axis('off')
plt.show()

Here, we explain the basic concept of a homography (i.e., a perspective transformation).

  • Any wide-angle view or representation of a physical space

  • images with horizontally elongated fields of view

  • idea: projecting images onto a common plane



  • Camera rotating about its center
  • Two image planes are related by a homography $H$

Do not worry about the details of the homography transformation (it is outside this course's scope).

$$ \begin{bmatrix} x'\\y'\\1 \end{bmatrix} \sim \begin{bmatrix} \omega x'\\\omega y'\\\omega \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x\\y\\1 \end{bmatrix} $$
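
To make the notation concrete, here is a small sketch (with a made-up matrix) of applying a homography to one point in homogeneous coordinates and dividing by $\omega$ to recover $(x', y')$.

In [ ]:
# Illustration only (made-up homography): map one point and normalize by omega.
H_demo = np.array([[1.0, 0.1, 5.0],
                   [0.0, 1.2, 3.0],
                   [1e-4, 0.0, 1.0]])
p = np.array([100.0, 50.0, 1.0])   # (x, y, 1) in homogeneous coordinates
wx, wy, w = H_demo @ p             # (omega*x', omega*y', omega)
print(wx/w, wy/w)                  # the transformed point (x', y')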


Find key points between two images
  • Suppose these matching points are given.

    • We have manually found the matching points for you, although there is a technique to do this automatically.
  • pos1 and pos2 are matching points between imag1 and imag2

  • pos3 and pos4 are matching points between imag2 and imag3

In [ ]:
pos1 = np.array([[2121, 2117, 2749, 3095, 3032, 3375, 3677, 3876], 
                 [1431, 2034, 2033, 1885, 2017, 2037, 1885, 2279]], dtype=np.int64)
pos2 = np.array([[188, 58, 828, 1203, 1121, 1437, 1717, 1817], 
                 [1217, 1909, 1952, 1827, 1952, 1991, 1870, 2226]], dtype=np.int64)
pos3 = np.array([[2338, 2379, 2658, 2899, 2977, 3272, 2716, 2786], 
                 [1948, 1874, 2000, 1837, 1964, 1966, 2143, 2317]], dtype=np.int64)
pos4 = np.array([[109, 178, 497, 795, 851, 1144, 534, 580], 
                 [1907, 1828, 1988, 1834, 1971, 1993, 2145, 2333]], dtype=np.int64)

(a) Visualization of key points

In [ ]:
## your code here
## Write down your own code to mark the key points (red dots) on the locations of the given images
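
One possible way to do this (a sketch, assuming the images and pos arrays loaded above) is to overlay a scatter plot on each image; repeat for the other images and key-point sets.

In [ ]:
# Sketch only: mark the key points of image 1 as red dots.
plt.figure(figsize=(10, 6))
plt.imshow(imag1)
plt.scatter(pos1[0], pos1[1], c='r', s=30)   # row 0: x-coordinates, row 1: y-coordinates
plt.axis('off')
plt.show()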

Estimation of homography H


$$ X' = HX $$

where $X$ and $X'$ are position vectors of key points, and $H$ is a perspective transformation matrix.

Goal: estimate the homography $H$ from the matching points between two images.


$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim \begin{bmatrix} \omega x' \\ \omega y' \\ \omega \end{bmatrix} = \begin{bmatrix} \theta_{1} & \theta_{2} & \theta_{3} \\ \theta_{4} & \theta_{5} & \theta_{6} \\ \theta_{7} & \theta_{8} & 1 \end{bmatrix} \begin{bmatrix} x\\ y\\ 1 \end{bmatrix}$$
(b) Starting from the homography $H$ above, show the following equations:
$$ \begin{align*} x' &= \frac{\theta_1 x+\theta_2 y+\theta_3}{\theta_7 x+\theta_8 y+1} \\ y' &= \frac{\theta_4 x+\theta_5 y+\theta_6}{\theta_7 x+\theta_8 y+1} \end{align*} $$
$$ \begin{align*} \theta_1 x+\theta_2 y+\theta_3 -\theta_7 x'x-\theta_8 x'y-x' &= 0 \\ \theta_4 x+\theta_5 y+\theta_6 -\theta_7 y'x-\theta_8 y'y-y' &= 0 \end{align*} $$
(c) For $m$ pairs of matching points, show that the feature matrix $\Phi$ can be expressed as follows:
  • $ \Phi $ is a feature matrix
$$ \Phi = \begin{bmatrix} x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x'_{1}x_{1} & -x'_{1}y_{1}\\ 0 & 0 & 0 & x_{1} & y_{1} & 1 & -y'_{1}x_{1} & -y'_{1}y_{1}\\ \vdots &\vdots &\vdots &\vdots &\vdots &\vdots &\vdots &\vdots\\ x_{m} & y_{m} & 1 & 0 & 0 & 0 & -x'_{m}x_{m} & -x'_{m}y_{m}\\ 0 & 0 & 0 & x_{m} & y_{m} & 1 & -y'_{m}x_{m} & -y'_{m}y_{m}\end{bmatrix} $$
  • $ \theta $ is a column vector for unknown parameters in a perspective transformation $H$

$$ \theta = \begin{bmatrix} \theta_{1} \\ \theta_{2} \\ \theta_{3} \\ \theta_{4} \\ \theta_{5} \\ \theta_{6} \\ \theta_{7} \\ \theta_{8} \end{bmatrix} $$
  • $b$ is a column vector for corresponding positions in the base image

$$ b = \begin{bmatrix} x'_{1} \\ y'_{1} \\ x'_{2} \\ y'_{2} \\ \vdots \\ x'_{m} \\ y'_{m} \end{bmatrix} $$
  • This ends up being a linear regression problem:

$$ \min\limits_{\theta} \lVert \Phi\theta - b \rVert _2^2 $$
$$ \theta^* = (\Phi^T\Phi)^{-1}\Phi^T b $$
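
As a generic reminder of how such a least-squares problem is solved in NumPy (the names below are random placeholders, not the graded solution):

In [ ]:
# Illustration only: solve min_theta ||Phi*theta - b||^2 with a least-squares solver.
# Phi_demo and b_demo are random placeholders just to show the call pattern.
Phi_demo = np.random.randn(16, 8)
b_demo = np.random.randn(16, 1)
theta_demo, *_ = np.linalg.lstsq(Phi_demo, b_demo, rcond=None)
# Equivalently: theta_demo = np.linalg.pinv(Phi_demo) @ b_demo
print(theta_demo.shape)            # (8, 1)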

(d) Perspective homography for image 1 and image 2

In [ ]:
## your code here
## Construct the feature matrix Phi from the matching points, and the vector b of key-point positions in image 2



## your code here
## Define perspective_theta using linear regression

perspective_theta = 

(e) Perspective homography for image 2 and image 3

In [ ]:
## your code here
## Construct the feature matrix Phi from the matching points, and the vector b of key-point positions in image 2



## your code here
## Define perspective_theta3 using linear regression

perspective_theta3 = 

Image warping

  • Again, do not worry about the image warping details (they are outside this lecture's scope)
In [ ]:
cv2.warpPerspective?
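
One practical detail: cv2.warpPerspective expects a full 3×3 matrix, while the regression in parts (d) and (e) estimates only the eight parameters $\theta_1, \dots, \theta_8$ ($\theta_9$ is fixed to 1). If your perspective_theta is still an 8×1 vector, one possible reshape is sketched below; the warping cell that follows assumes perspective_theta and perspective_theta3 are already such 3×3 matrices.

In [ ]:
# Sketch only (assumes perspective_theta holds the eight estimated parameters):
# append the fixed ninth entry (1) and reshape into a 3x3 np.matrix so it can be
# multiplied with the translation matrix in the next cell.
perspective_theta = np.matrix(
    np.append(np.asarray(perspective_theta).ravel(), 1.0).reshape(3, 3))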
In [ ]:
## Apply image warping to image 1 and image 3 using the cv2.warpPerspective function

## Apply a translation so the warped images fit onto an (18000, 6500) screen.
translation = np.matrix([[1, 0, 6000],
                         [0, 1, 2500],
                         [0, 0, 1]])

warpedImage = cv2.warpPerspective(imag1, translation*perspective_theta, (18000, 6500))
warpedImage3 = cv2.warpPerspective(imag3, translation*perspective_theta3, (18000, 6500))
In [ ]:
screen = warpedImage.copy()                      # start from warped image 1
screen[screen==0] = warpedImage3[screen==0]      # fill empty (zero) pixels with warped image 3
screen[2500:3024+2500,6000:4032+6000] = imag2    # paste image 2 at its translated location

## Visualize panorama image
plt.figure(figsize=(20, 12))
plt.imshow(screen)
plt.show()