Deep Learning for Mechanical Engineering

Homework 01

Due Monday, 09/18/2023, 4:00 PM

Instructor: Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST
• For your handwritten solutions, scan or photograph them.

• Compress all files into a single .zip file.

• Submit it to KLSM.
• Do not submit a printed version of your code. It will not be graded.

# Problem 1: Optimization and Gradient Descent

You will find an optimal solution using gradient descent.

(0) Run the cell below first. It defines the utility functions.

In [ ]:
# Do NOT change this cell !!

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def plotHistory(hist):
    for i in range(len(hist)-1):
        plt.arrow(hist[i][0][0], hist[i][1][0], hist[i+1][0][0]-hist[i][0][0], hist[i+1][1][0]-hist[i][1][0], alpha=0.3)
        plt.scatter(hist[i][0][0], hist[i][1][0])
    plt.title('Initial (x1, x2): (0.0, 0.0)'); plt.grid('on'); plt.xlabel('x1'), plt.ylabel('x2'); plt.show()

def update(x):
    pass

def train(alpha, n_iter):
    def saveHistory(a=None, hist=[]):
        if a is not None: hist.append(a.tolist())
        return hist

    x =  np.zeros((2,1)) # initial value
    saveHistory(x)

    for i in range(n_iter):
        x = update(x)
        saveHistory(x)

    hist = saveHistory()
    plotHistory(hist)
    print('(x1, x2) = ({:.3f}, {:.3f})'.format(hist[-1][0][0], hist[-1][1][0]))
    print('f(x1, x2) = {:.3f}'.format((0.5*x.T*H*x + g.T*x + 1).tolist()[0][0]))

# Do NOT change this cell !!


Here is the objective function you need to solve.

$$\min_{x_1, \,x_2} \, (x_1 - 1)^2 + (2x_1 - x_2)^2$$

(a) Find $H$ and $g$ to rewrite the objective function in the matrix form below. Use np.matrix()

$$f = \frac{1}{2} X^T H X + g^T X + c$$
In [ ]:
# your code here
H =
g =
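As an illustration of the matrix form on a different quadratic (not the homework objective), the sketch below expands $q(x_1, x_2) = (x_1 + x_2 - 2)^2 + x_2^2$ by hand and checks numerically that it matches $\frac{1}{2} X^T H X + g^T X + c$. The names `H_demo`, `g_demo`, `c_demo`, `q_scalar`, and `q_matrix` are hypothetical, and plain arrays with `@` are used here instead of np.matrix.

```python
import numpy as np

# Illustrative quadratic (an assumption for this sketch, not the homework objective):
# q(x1, x2) = (x1 + x2 - 2)^2 + x2^2
#           = x1^2 + 2*x1*x2 + 2*x2^2 - 4*x1 - 4*x2 + 4
def q_scalar(x1, x2):
    return (x1 + x2 - 2.0) ** 2 + x2 ** 2

# Coefficients read off from the expansion above.
# The factor 1/2 in the matrix form means H holds twice the squared-term coefficients.
H_demo = np.array([[2.0, 2.0],
                   [2.0, 4.0]])
g_demo = np.array([[-4.0], [-4.0]])
c_demo = 4.0

def q_matrix(x):
    """Evaluate 0.5 * x^T H x + g^T x + c for a (2, 1) column vector x."""
    return (0.5 * x.T @ H_demo @ x + g_demo.T @ x + c_demo).item()

# Compare both forms at a few random points.
rng = np.random.default_rng(0)
max_gap = 0.0
for _ in range(5):
    x = rng.normal(size=(2, 1))
    max_gap = max(max_gap, abs(q_scalar(x[0, 0], x[1, 0]) - q_matrix(x)))
print("largest mismatch between the two forms:", max_gap)
```

Checking a hand-derived $H$ and $g$ against the scalar expression at random points is a quick way to catch a missing factor of 2 or a sign error.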


(b) Define a function to update $x$ based on the following equations

$$\nabla{f} = HX+g$$

$$X_{i+1} = X_i - \alpha_i \nabla f (X_i)$$

In [ ]:
def update(x):
    new_x =

    return new_x
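A minimal sketch of the same update rule on an illustrative quadratic, assuming a hypothetical positive definite `H_demo` and vector `g_demo` (these are not the homework values). It iterates $X_{i+1} = X_i - \alpha \nabla f(X_i)$ with a fixed step size and compares the result with the exact minimizer obtained from $HX + g = 0$.

```python
import numpy as np

# Hypothetical quadratic for illustration only: f = 0.5 x^T H x + g^T x + c
H_demo = np.array([[2.0, 2.0],
                   [2.0, 4.0]])
g_demo = np.array([[-4.0], [-4.0]])

def gd_step(x, alpha):
    """One gradient descent step using the gradient H x + g."""
    grad = H_demo @ x + g_demo
    return x - alpha * grad

x = np.zeros((2, 1))          # start from the origin
for _ in range(500):
    x = gd_step(x, alpha=0.1)

# Exact minimizer: the gradient vanishes where H x = -g
x_star = np.linalg.solve(H_demo, -g_demo)
print("gradient descent:", x.ravel())
print("closed form     :", x_star.ravel())
```

Unlike the `update(x)` signature in the cell above, this sketch passes `alpha` explicitly so that it does not depend on a global variable.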


(c) Find a learning rate $\alpha$ that makes the iteration converge within 150 iterations (round to the third decimal place).

(d) Adjust the training parameters to obtain the figures below:

(d-1) Stably converge

In [ ]:
# Your code here
alpha =
n_iter =

train(alpha, n_iter)


(d-2) Unstably converge

In [ ]:
# Your code here
alpha =
n_iter =

train(alpha, n_iter)


(d-3) Diverge

In [ ]:
# Your code here
alpha =
n_iter =

train(alpha, n_iter)
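For a quadratic with a symmetric positive definite $H$, the error along each eigenvector of $H$ is multiplied by $(1 - \alpha\lambda)$ at every step, so a fixed step size converges when $0 < \alpha < 2/\lambda_{\max}$, overshoots and oscillates along the steepest direction once $\alpha > 1/\lambda_{\max}$, and diverges beyond $2/\lambda_{\max}$. The sketch below shows the three regimes on a hypothetical `H_demo` and `g_demo` (not the homework values); `final_error` is an illustrative helper.

```python
import numpy as np

# Hypothetical positive definite quadratic, for illustration only
H_demo = np.array([[2.0, 2.0],
                   [2.0, 4.0]])
g_demo = np.array([[-4.0], [-4.0]])

lam_max = np.linalg.eigvalsh(H_demo).max()      # largest eigenvalue of H
x_star = np.linalg.solve(H_demo, -g_demo)       # exact minimizer

def final_error(alpha, n_iter=200):
    """Distance to the minimizer after n_iter fixed-step gradient descent updates."""
    x = np.zeros((2, 1))
    for _ in range(n_iter):
        x = x - alpha * (H_demo @ x + g_demo)
    return float(np.linalg.norm(x - x_star))

regimes = [
    ("smooth convergence", 0.5 / lam_max),
    ("oscillating convergence", 1.9 / lam_max),
    ("divergence", 2.1 / lam_max),
]
for label, alpha in regimes:
    print(f"{label:>24s}: alpha = {alpha:.3f}, final error = {final_error(alpha):.3e}")
```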


# Problem 2: Optimization and Gradient Descent with Constraints

You will find an optimal solution using gradient descent.

$$\begin{aligned} \max_{x} \quad & 3x_1 + \frac{3}{2}x_2 \\ \text{subject to} \quad & -1 \leq x_1 \leq 2 \\ & \quad 0 \leq x_2 \leq 3 \end{aligned} \quad\implies\quad \begin{aligned} \min_{x} \quad & - \begin{bmatrix} 3 \\ 3 / 2 \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\ \text{subject to} \quad & \begin{bmatrix} -1 \\ 0 \end{bmatrix} \leq \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \leq \begin{bmatrix} 2 \\ 3 \end{bmatrix} \end{aligned}$$
In [ ]:
df =
x =
alpha =

lower_bound =
upper_bound =

for i in range(25):
    x =

    # lb constraints
    lb_TF = lower_bound < x
    x =

    # ub constraints
    ub_TF = x < upper_bound
    x =

print(x)
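The loop above alternates a gradient step with a correction that pushes any out-of-range component back onto its bound. A minimal sketch of that idea on a different toy problem (maximize $2x_1 + x_2$ subject to $0 \leq x_1 \leq 1$, $-2 \leq x_2 \leq 4$) is given below; `c_demo`, `lb_demo`, `ub_demo`, and `project_box` are hypothetical names, and `np.where` is used here in place of the boolean-mask arithmetic of the skeleton.

```python
import numpy as np

# Hypothetical toy problem: maximize 2*x1 + x2, i.e. minimize -c^T x,
# subject to 0 <= x1 <= 1 and -2 <= x2 <= 4
c_demo = np.array([[2.0], [1.0]])
lb_demo = np.array([[0.0], [-2.0]])
ub_demo = np.array([[1.0], [4.0]])

def project_box(x, lb, ub):
    """Replace components below lb by lb and components above ub by ub."""
    x = np.where(x < lb, lb, x)
    x = np.where(x > ub, ub, x)
    return x

x = np.zeros((2, 1))
alpha = 0.5
for _ in range(25):
    x = x - alpha * (-c_demo)             # the gradient of -c^T x is -c
    x = project_box(x, lb_demo, ub_demo)  # enforce the box constraints
print(x.ravel())
```

Because the objective is linear, the iterate moves in a fixed direction until each component hits its bound, so the result is a corner of the box.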


# Problem 3: Image Panorama with Regression

We use image panorama stitching as an example of linear regression. A panorama is any wide-angle view or representation of a physical space.



• Install OpenCV if it is not already available (for example, `pip install opencv-python`)

In [ ]:
# import library
import numpy as np
import matplotlib.pyplot as plt
import cv2

In [ ]:
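# load images
# imag1, imag2, imag3 must already hold the BGR arrays returned by cv2.imread (file paths are not given here)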
imag1 = cv2.cvtColor(imag1, cv2.COLOR_BGR2RGB)
imag2 = cv2.cvtColor(imag2, cv2.COLOR_BGR2RGB)
imag3 = cv2.cvtColor(imag3, cv2.COLOR_BGR2RGB)

In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag1)
plt.axis('off')
plt.show()

In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag2)
plt.axis('off')
plt.show()

In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag3)
plt.axis('off')
plt.show()


Here we explain the basic concept of homography (i.e., perspective transformation).

• Panorama: any wide-angle view or representation of a physical space

• Images with horizontally elongated fields of view

• Idea: project the images onto a common plane

• The camera rotates about its center
• The two image planes are related by a homography $H$

Do not worry about the details of the homography transformation; it is outside the scope of this course.

$$\begin{bmatrix} x'\\y'\\1 \end{bmatrix} \sim \begin{bmatrix} \omega x'\\\omega y'\\\omega \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x\\y\\1 \end{bmatrix}$$
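The relation above maps a point in homogeneous coordinates and then divides by the third component $\omega$ to recover pixel coordinates. A minimal sketch, assuming a hypothetical matrix `H_demo` and helper `apply_homography` (neither is part of the assignment):

```python
import numpy as np

def apply_homography(H, pts):
    """Map a 2 x N array of (x, y) points through a 3 x 3 homography H."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones((1, pts.shape[1]))
    mapped = H @ np.vstack([pts, ones])   # rows: omega*x', omega*y', omega
    return mapped[:2] / mapped[2]         # divide by omega to get (x', y')

# Hypothetical homography: a shift plus a small perspective term in the last row
H_demo = np.array([[1.0,   0.0, 10.0],
                   [0.0,   1.0, -5.0],
                   [0.001, 0.0,  1.0]])
pts_demo = np.array([[0.0, 100.0],
                     [0.0,  50.0]])
print(apply_homography(H_demo, pts_demo))
```

For the point $(100, 50)$ the third component is $\omega = 1.1$, so the shifted coordinates $(110, 45)$ are scaled down by that factor; for $(0, 0)$ we have $\omega = 1$ and only the shift remains.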

Find key points between two images
• Suppose these matching points are given.

• We have manually found the matching points for you, although there is a technique to do this automatically.
• pos1 and pos2 are matching points between img01 and img02

• pos3 and pos4 are matching points between img02 and img03

In [ ]:
pos1 = np.array([[2121, 2117, 2749, 3095, 3032, 3375, 3677, 3876],
[1431, 2034, 2033, 1885, 2017, 2037, 1885, 2279]], dtype=np.int64)
pos2 = np.array([[188, 58, 828, 1203, 1121, 1437, 1717, 1817],
[1217, 1909, 1952, 1827, 1952, 1991, 1870, 2226]], dtype=np.int64)
pos3 = np.array([[2338, 2379, 2658, 2899, 2977, 3272, 2716, 2786],
[1948, 1874, 2000, 1837, 1964, 1966, 2143, 2317]], dtype=np.int64)
pos4 = np.array([[109, 178, 497, 795, 851, 1144, 534, 580],
[1907, 1828, 1988, 1834, 1971, 1993, 2145, 2333]], dtype=np.int64)


(a) Visualization of key points

In [ ]:
## your code here
## Write down your own code to mark the key points (red dots) on the locations of the given images
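One possible way to overlay key points on an image is sketched below on a synthetic black image, since the photographs themselves are not available here. The helper `show_keypoints`, the array `demo_image`, and the coordinates `demo_pos` are hypothetical; the position array is assumed to follow the same layout as `pos1` (first row x, second row y).

```python
import numpy as np
import matplotlib.pyplot as plt

def show_keypoints(image, pos, title=""):
    """Draw an image and mark the 2 x N key points in pos with red dots."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.imshow(image)
    ax.plot(pos[0], pos[1], 'ro')   # row 0 holds x (columns), row 1 holds y (rows)
    ax.set_title(title)
    ax.axis('off')
    return fig, ax

# Synthetic stand-in for a photograph and a few made-up key points
demo_image = np.zeros((300, 400, 3), dtype=np.uint8)
demo_pos = np.array([[50, 200, 350],
                     [40, 150, 260]])
fig, ax = show_keypoints(demo_image, demo_pos, "demo key points")
```

In a notebook with `%matplotlib inline` the figure is rendered when the cell finishes; in a script, a call to `plt.show()` would be needed.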


Estimation of homography H

$$X' = HX$$

where $X$ and $X'$ are position vectors of key points, and $H$ is a perspective transformation.

Goal: estimate the homography $H$ from the matching points between two images

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim \begin{bmatrix} \omega x' \\ \omega y' \\ \omega \end{bmatrix} = \begin{bmatrix} \theta_{1} & \theta_{2} & \theta_{3} \\ \theta_{4} & \theta_{5} & \theta_{6} \\ \theta_{7} & \theta_{8} & 1 \end{bmatrix} \begin{bmatrix} x\\ y\\ 1 \end{bmatrix}$$
(b) Show the following equations from the above homography $H$
\begin{align*} x' &= \frac{\theta_1 x+\theta_2 y+\theta_3}{\theta_7 x+\theta_8 y+1} \\ y' &= \frac{\theta_4 x+\theta_5 y+\theta_6}{\theta_7 x+\theta_8 y+1} \end{align*}
\begin{align*} \theta_1 x+\theta_2 y+\theta_3 -\theta_7 x'x-\theta_8 x'y-x' &= 0 \\ \theta_4 x+\theta_5 y+\theta_6 -\theta_7 y'x-\theta_8 y'y-y' &= 0 \end{align*}
(c) For $m$ pairs of matching points, show that a feature matrix $\Phi$ can be expressed as follows:
• $\Phi$ is a feature matrix
$$\Phi = \begin{bmatrix} x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x'_{1}x_{1} & -x'_{1}y_{1}\\ 0 & 0 & 0 & x_{1} & y_{1} & 1 & -y'_{1}x_{1} & -y'_{1}y_{1}\\ \vdots &\vdots &\vdots &\vdots &\vdots &\vdots &\vdots &\vdots\\ x_{m} & y_{m} & 1 & 0 & 0 & 0 & -x'_{m}x_{m} & -x'_{m}y_{m}\\ 0 & 0 & 0 & x_{m} & y_{m} & 1 & -y'_{m}x_{m} & -y'_{m}y_{m}\end{bmatrix}$$
• $\theta$ is a column vector for unknown parameters in a perspective transformation $H$

$$\theta = \begin{bmatrix} \theta_{1} \\ \theta_{2} \\ \theta_{3} \\ \theta_{4} \\ \theta_{5} \\ \theta_{6} \\ \theta_{7} \\ \theta_{8} \end{bmatrix}$$
• $b$ is a column vector for corresponding positions in the base image

$$b = \begin{bmatrix} x'_{1} \\ y'_{1} \\ x'_{2} \\ y'_{2} \\ \vdots \\ x'_{m} \\ y'_{m} \end{bmatrix}$$
• It ends up becoming a linear regression problem

$$\min\limits_{\theta} \lVert \Phi\theta - b \rVert _2^2$$
$$\theta^* = (\Phi^T\Phi)^{-1}\Phi^T b$$
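The regression above can be sketched on synthetic data: points are mapped through a known, made-up homography `H_true`, the matrix $\Phi$ and vector $b$ are assembled row by row as defined above, and $\theta$ is recovered. All names here (`build_system`, `H_true`, `src`, `dst`) are hypothetical, and `np.linalg.lstsq` is swapped in for the explicit inverse; it minimizes the same $\lVert \Phi\theta - b \rVert_2^2$ and is usually better conditioned.

```python
import numpy as np

def build_system(src, dst):
    """Assemble Phi (2m x 8) and b (2m,) from 2 x m arrays of matching points."""
    m = src.shape[1]
    Phi = np.zeros((2 * m, 8))
    b = np.zeros(2 * m)
    for i in range(m):
        x, y = src[:, i]      # point in the image being warped
        xp, yp = dst[:, i]    # matching point (x', y') in the base image
        Phi[2 * i] = [x, y, 1, 0, 0, 0, -xp * x, -xp * y]
        Phi[2 * i + 1] = [0, 0, 0, x, y, 1, -yp * x, -yp * y]
        b[2 * i], b[2 * i + 1] = xp, yp
    return Phi, b

# Synthetic ground truth (an assumption for this sketch, not homework data)
H_true = np.array([[1.2,   0.1,    5.0],
                   [-0.05, 0.9,   -3.0],
                   [0.001, 0.002,  1.0]])
src = np.array([[0.0, 100.0, 100.0,  0.0, 50.0, 30.0],
                [0.0,   0.0,  80.0, 80.0, 20.0, 60.0]])
proj = H_true @ np.vstack([src, np.ones((1, src.shape[1]))])
dst = proj[:2] / proj[2]

Phi, b = build_system(src, dst)
theta, *_ = np.linalg.lstsq(Phi, b, rcond=None)
H_est = np.append(theta, 1.0).reshape(3, 3)   # the ninth entry is fixed to 1
print(np.round(H_est, 4))
```

Each matching pair contributes two rows, so at least four pairs in general position are needed to determine the eight unknowns; extra pairs are averaged in the least-squares sense.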

(d) Perspective homography for image 1 and image 2

In [ ]:
## your code here
## Construct feature matrix using homography H, and a vector having entries of matching points in image 2

## Define perspective_theta using linear regression

perspective_theta =


(e) Perspective homography for image 2 and image 3

In [ ]:
## your code here
## Construct feature matrix using homography H, and a vector having entries of matching points in image 2

## Define perspective_theta3 using linear regression

perspective_theta3 =


## Image warping

• Again, do not worry about the details of image warping; it is outside the scope of the lecture
In [ ]:
cv2.warpPerspective?

In [ ]:
## Apply image warping to image 1 and image 3 using the cv2.warpPerspective function

## Translate the warped images so that they fit on a canvas of size (18000, 6500).
translation = np.matrix([[1, 0, 6000],
[0, 1, 2500],
[0, 0, 1]])

warpedImage = cv2.warpPerspective(imag1, translation*perspective_theta, (18000, 6500))
warpedImage3 = cv2.warpPerspective(imag3, translation*perspective_theta3, (18000, 6500))
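Left-multiplying a homography by the translation matrix shifts every mapped point by the same offset, which is what places the warped images inside the larger canvas. The sketch below checks this on points only, without OpenCV; `map_points`, `H_demo`, and `T_demo` are hypothetical, and `@` is used because these are plain arrays rather than np.matrix objects.

```python
import numpy as np

def map_points(H, pts):
    """Map a 2 x N array of points through a 3 x 3 homography H."""
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])
    out = np.asarray(H @ pts_h)
    return out[:2] / out[2]

# Hypothetical homography and the same kind of translation as in the cell above
H_demo = np.array([[1.1,    0.0,  3.0],
                   [0.0,    0.9, -2.0],
                   [0.0005, 0.0,  1.0]])
T_demo = np.array([[1.0, 0.0, 6000.0],
                   [0.0, 1.0, 2500.0],
                   [0.0, 0.0,    1.0]])
pts = np.array([[0.0, 200.0, 400.0],
                [0.0, 100.0, 300.0]])

plain = map_points(H_demo, pts)
shifted = map_points(T_demo @ H_demo, pts)
print(shifted - plain)   # every column differs by the offset (6000, 2500)
```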

In [ ]:
screen = warpedImage.copy()
screen[screen==0] = warpedImage3[screen==0]
screen[2500:3024+2500,6000:4032+6000] = imag2

## Visualize panorama image
plt.figure(figsize=(20, 12))
plt.imshow(screen)
plt.show()
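The compositing step fills entries of one warped image that are still zero with the corresponding entries of the other. A tiny sketch with made-up arrays (`canvas_a`, `canvas_b`, and `screen_demo` are hypothetical) shows the effect of the boolean mask:

```python
import numpy as np

# Two small stand-ins for warped images; 0 means "nothing was drawn here"
canvas_a = np.array([[5, 0, 0],
                     [0, 7, 0]], dtype=np.uint8)
canvas_b = np.array([[1, 2, 0],
                     [3, 4, 0]], dtype=np.uint8)

screen_demo = canvas_a.copy()
empty = screen_demo == 0                # entries not covered by canvas_a
screen_demo[empty] = canvas_b[empty]    # fill them from canvas_b
print(screen_demo)
```

The comparison is made per array element, so on a colour image a channel value that is genuinely zero is also treated as empty and overwritten.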