Deep Learning for Mechanical Engineering

Homework 01

Due Monday, 09/18/2023, 4:00 PM

Instructor: Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST
  • For your handwritten solutions, scan them or take pictures of them.

  • For your code, only the .ipynb file will be graded.

    • Please include your NAME and student ID in your .ipynb file name. ex) KangsanLee_20202467_HW02.ipynb
  • Please compress all the files into a single .zip file.

    • Please include your NAME and student ID in your .zip file name. ex) NamjeongLee_20202467_HW02.zip
    • Submit it to KLMS.
  • Do not submit a printed version of your code. It will not be graded.

Problem 1: Optimization and Gradient Descent

You will find an optimal solution using gradient descent.

(0) Run the cell below first. It defines utility functions used in this problem.

In [ ]:
# Do NOT change this cell !!

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def plotHistory(hist):
    for i in range(len(hist)-1):
        plt.arrow(hist[i][0][0], hist[i][1][0], hist[i+1][0][0]-hist[i][0][0], hist[i+1][1][0]-hist[i][1][0], alpha=0.3)
        plt.scatter(hist[i][0][0], hist[i][1][0])
    plt.title('Initial (x1, x2): (0.0, 0.0)'); plt.grid('on'); plt.xlabel('x1'), plt.ylabel('x2'); plt.show()

def update(x):
    pass

def train(alpha, n_iter):
    def saveHistory(a=None, hist=[]):
        if a is not None: hist.append(a.tolist())
        return hist

    x = np.zeros((2,1)) # initial value
    saveHistory(x)

    for i in range(n_iter):    
        x = update(x)
        saveHistory(x)

    hist = saveHistory()
    plotHistory(hist)
    print('(x1, x2) = ({:.3f}, {:.3f})'.format(hist[-1][0][0], hist[-1][1][0]))
    print('f(x1, x2) = {:.3f}'.format((0.5*x.T*H*x + g.T*x + 1).tolist()[0][0]))

# Do NOT change this cell !!

Here is the objective function you need to minimize.


$$\min_{x_1, \,x_2} \, (x_1 - 1)^2 + (2x_1 - x_2)^2$$

(a) Find $H$ and $g$ to transform the objective function into the matrix form below. Use np.matrix().


$$ f = \frac{1}{2} X^T H X + g^T X + c $$
In [ ]:
# your code here
H = 
g = 
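
If you want a quick (optional, ungraded) sanity check of your answer, you can compare the matrix form against the original objective at a few random points. The cell below is only a sketch and assumes $H$ and $g$ are the np.matrix objects defined above, with $c = 1$.

In [ ]:
# Optional sanity check (sketch only, not graded): the matrix form should match
# the original objective f(x1, x2) = (x1 - 1)^2 + (2*x1 - x2)^2 at any point.
def f_original(x):
    return (x[0, 0] - 1)**2 + (2*x[0, 0] - x[1, 0])**2

def f_matrix(x, c=1.0):
    return float(0.5*x.T*H*x + g.T*x + c)

for _ in range(5):
    x_test = np.matrix(np.random.randn(2, 1))
    assert np.isclose(f_original(x_test), f_matrix(x_test))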

(b) Define a function to update $x$ based on the following equations:


$$ \nabla{f} = HX+g $$

$$ X_{i+1} = X_i - \alpha_i \nabla f (X_i) $$
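
As a small illustration of this update rule (separate from the graded problem), the cell below applies the same iteration to the 1-D quadratic $f(x) = (x-3)^2$; your update function should implement the 2-D matrix version above.

In [ ]:
# Illustration only: gradient descent on f(x) = (x - 3)**2, with f'(x) = 2*(x - 3).
x_toy, alpha_toy = 0.0, 0.1
for _ in range(100):
    grad = 2*(x_toy - 3)              # gradient at the current point
    x_toy = x_toy - alpha_toy*grad    # x_{i+1} = x_i - alpha * f'(x_i)
print(x_toy)                          # approaches the minimizer x = 3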

In [ ]:
def update(x):
    # Your code here
    new_x = 
    
    return new_x

(c) Find a learning rate $\alpha$ that makes it converge within 150 iterations. (Round to the 3rd decimal place.)
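
(Hint: for a quadratic objective, gradient descent is stable roughly when $0 < \alpha < 2/\lambda_{\max}(H)$. If it helps, the cell below is a purely diagnostic sketch that assumes $H$ from part (a) is already defined.)

In [ ]:
# Diagnostic sketch only (assumes H from part (a) is defined): the largest
# eigenvalue of H bounds the stable step sizes, roughly alpha < 2 / lambda_max.
eigenvalues = np.linalg.eigvalsh(np.asarray(H, dtype=float))
print(eigenvalues, 2.0/eigenvalues.max())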

(d) Adjust the training parameters to obtain figures like the ones below:

(d-1) Stable convergence

In [ ]:
# Your code here
alpha = 
n_iter = 

train(alpha, n_iter)

(d-2) Unstable convergence

In [ ]:
# Your code here
alpha = 
n_iter = 

train(alpha, n_iter)

(d-3) Divergence

In [ ]:
# Your code here
alpha = 
n_iter = 

train(alpha, n_iter)

Problem 2: Optimization and Gradient Descent with Constraints

You will find an optimal solution using gradient descent with box constraints.


$$ \begin{aligned} \max_{x} \quad & 3x_1 + \frac{3}{2}x_2 \\ \text{subject to} \quad & -1 \leq x_1 \leq 2 \\ & \phantom{-}0 \leq x_2 \leq 3 \end{aligned} \quad\implies\quad \begin{aligned} \min_{x} \quad & -\begin{bmatrix} 3 \\ 3/2 \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\ \text{subject to} \quad & \begin{bmatrix} -1 \\ 0 \end{bmatrix} \leq \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \leq \begin{bmatrix} 2 \\ 3 \end{bmatrix} \end{aligned} $$
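
Before filling in the cell below, it may help to see the projection step in isolation. The following is a generic illustration (with made-up numbers) of pushing a point back inside box constraints; the cell after it walks through the same idea with boolean masks.

In [ ]:
# Illustration only (made-up numbers): project a point back inside box constraints.
import numpy as np
p = np.array([[3.0], [-0.5]])      # a point that violates both bounds
lb = np.array([[-1.0], [0.0]])
ub = np.array([[2.0], [3.0]])
print(np.clip(p, lb, ub))          # clipped to [[2.], [0.]]
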
In [ ]:
df = 
x = 
alpha = 

lower_bound = 
upper_bound = 

for i in range(25): 
    x = 
    
    # lb constraints 
    lb_TF = lower_bound < x
    x = 
    
    # ub constraints 
    ub_TF = x < upper_bound
    x = 
    
print(x)

Problem 3: Image Panorama with Regression

We want to demonstrate an image panorama as an example of linear regression. A panorama is any wide-angle view or representation of a physical space.



In [1]:
%%html
<center><iframe src="https://www.youtube.com/embed/86rnwu3ZFbE?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

You need to install the OpenCV module (cv2) and download the images.

In [ ]:
# import library
import numpy as np
import matplotlib.pyplot as plt
import cv2
In [ ]:
# load images
imag1 = cv2.imread('./data_files/1.jpg')
imag1 = cv2.cvtColor(imag1, cv2.COLOR_BGR2RGB)
imag2 = cv2.imread('./data_files/2.jpg')
imag2 = cv2.cvtColor(imag2, cv2.COLOR_BGR2RGB)
imag3 = cv2.imread('./data_files/3.jpg')
imag3 = cv2.cvtColor(imag3, cv2.COLOR_BGR2RGB)
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag1)
plt.axis('off')
plt.show()
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag2)
plt.axis('off')
plt.show()
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag3)
plt.axis('off')
plt.show()

Here, we explain the basic concept of a homography (i.e., a perspective transformation).

  • Any wide-angle view or representation of a physical space

  • images with horizontally elongated fields of view

  • idea: projecting images onto a common plane



  • Camera rotating about its center
  • Two image planes are related by a homography $H$

Do not worry about the details of the homography transformation (it is outside this course's scope).

$$ \begin{bmatrix} x'\\y'\\1 \end{bmatrix} \sim \begin{bmatrix} \omega x'\\\omega y'\\\omega \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x\\y\\1 \end{bmatrix} $$
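
To make the notation concrete, here is a small sketch (with a made-up matrix) of applying a homography to one point in homogeneous coordinates and dividing by $\omega$ to recover $(x', y')$.

In [ ]:
# Illustration only (made-up homography): map one point and normalize by omega.
H_demo = np.array([[1.0, 0.1, 5.0],
                   [0.0, 1.2, 3.0],
                   [1e-4, 0.0, 1.0]])
p = np.array([100.0, 50.0, 1.0])   # (x, y, 1) in homogeneous coordinates
wx, wy, w = H_demo @ p             # (omega*x', omega*y', omega)
print(wx/w, wy/w)                  # the transformed point (x', y')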


Find key points between two images
  • Suppose these matching points are given.

    • We have manually found the matching points for you, although there is a technique to do this automatically.
  • pos1 and pos2 are matching points between imag1 and imag2

  • pos3 and pos4 are matching points between imag2 and imag3

In [ ]:
pos1 = np.array([[2121, 2117, 2749, 3095, 3032, 3375, 3677, 3876], 
                 [1431, 2034, 2033, 1885, 2017, 2037, 1885, 2279]], dtype=np.int64)
pos2 = np.array([[188, 58, 828, 1203, 1121, 1437, 1717, 1817], 
                 [1217, 1909, 1952, 1827, 1952, 1991, 1870, 2226]], dtype=np.int64)
pos3 = np.array([[2338, 2379, 2658, 2899, 2977, 3272, 2716, 2786], 
                 [1948, 1874, 2000, 1837, 1964, 1966, 2143, 2317]], dtype=np.int64)
pos4 = np.array([[109, 178, 497, 795, 851, 1144, 534, 580], 
                 [1907, 1828, 1988, 1834, 1971, 1993, 2145, 2333]], dtype=np.int64)

(a) Visualization of key points

In [ ]:
## your code here
## Write down your own code to mark the key points (red dots) on the locations of the given images
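
One possible way to do this (a sketch, assuming the images and pos arrays loaded above) is to overlay a scatter plot on each image; repeat for the other images and key-point sets.

In [ ]:
# Sketch only: mark the key points of image 1 as red dots.
plt.figure(figsize=(10, 6))
plt.imshow(imag1)
plt.scatter(pos1[0], pos1[1], c='r', s=30)   # row 0: x-coordinates, row 1: y-coordinates
plt.axis('off')
plt.show()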

Estimation of homography H


$$ X' = HX $$

where $X$ and $X'$ are position vectors of key points, and $H$ is a perspective transformation matrix.

Goal: estimate the homography $H$ from the matching points between two images.


$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim \begin{bmatrix} \omega x' \\ \omega y' \\ \omega \end{bmatrix} = \begin{bmatrix} \theta_{1} & \theta_{2} & \theta_{3} \\ \theta_{4} & \theta_{5} & \theta_{6} \\ \theta_{7} & \theta_{8} & 1 \end{bmatrix} \begin{bmatrix} x\\ y\\ 1 \end{bmatrix}$$
(b) Starting from the homography $H$ above, show the following equations:
$$ \begin{align*} x' &= \frac{\theta_1 x+\theta_2 y+\theta_3}{\theta_7 x+\theta_8 y+1} \\ y' &= \frac{\theta_4 x+\theta_5 y+\theta_6}{\theta_7 x+\theta_8 y+1} \end{align*} $$
$$ \begin{align*} \theta_1 x+\theta_2 y+\theta_3 -\theta_7 x'x-\theta_8 x'y-x' &= 0 \\ \theta_4 x+\theta_5 y+\theta_6 -\theta_7 y'x-\theta_8 y'y-y' &= 0 \end{align*} $$
(c) For $m$ pairs of matching points, show that the feature matrix $\Phi$ can be expressed as follows:
  • $ \Phi $ is a feature matrix
$$ \Phi = \begin{bmatrix} x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x'_{1}x_{1} & -x'_{1}y_{1}\\ 0 & 0 & 0 & x_{1} & y_{1} & 1 & -y'_{1}x_{1} & -y'_{1}y_{1}\\ \vdots &\vdots &\vdots &\vdots &\vdots &\vdots &\vdots &\vdots\\ x_{m} & y_{m} & 1 & 0 & 0 & 0 & -x'_{m}x_{m} & -x'_{m}y_{m}\\ 0 & 0 & 0 & x_{m} & y_{m} & 1 & -y'_{m}x_{m} & -y'_{m}y_{m}\end{bmatrix} $$
  • $ \theta $ is a column vector for unknown parameters in a perspective transformation $H$

$$ \theta = \begin{bmatrix} \theta_{1} \\ \theta_{2} \\ \theta_{3} \\ \theta_{4} \\ \theta_{5} \\ \theta_{6} \\ \theta_{7} \\ \theta_{8} \end{bmatrix} $$
  • $b$ is a column vector for corresponding positions in the base image

$$ b = \begin{bmatrix} x'_{1} \\ y'_{1} \\ x'_{2} \\ y'_{2} \\ \vdots \\ x'_{m} \\ y'_{m} \end{bmatrix} $$
  • This ends up being a linear regression problem:

$$ \min\limits_{\theta} \lVert \Phi\theta - b \rVert _2^2 $$
$$ \theta^* = (\Phi^T\Phi)^{-1}\Phi^T b $$
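
As a generic reminder of how such a least-squares problem is solved in NumPy (the names below are random placeholders, not the graded solution):

In [ ]:
# Illustration only: solve min_theta ||Phi*theta - b||^2 with a least-squares solver.
# Phi_demo and b_demo are random placeholders just to show the call pattern.
Phi_demo = np.random.randn(16, 8)
b_demo = np.random.randn(16, 1)
theta_demo, *_ = np.linalg.lstsq(Phi_demo, b_demo, rcond=None)
# Equivalently: theta_demo = np.linalg.pinv(Phi_demo) @ b_demo
print(theta_demo.shape)            # (8, 1)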

(d) Perspective homography for image 1 and image 2

In [ ]:
## your code here
## Construct the feature matrix Phi from the matching points, and the vector b of key-point positions in image 2



## your code here
## Define perspective_theta using linear regression

perspective_theta = 

(e) Perspective homography for image 2 and image 3

In [ ]:
## your code here
## Construct the feature matrix Phi from the matching points, and the vector b of key-point positions in image 2



## your code here
## Define perspective_theta3 using linear regression

perspective_theta3 = 

Image warping

  • Again, do not worry about the image warping details (they are outside this lecture's scope)
In [ ]:
cv2.warpPerspective?
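
One practical detail: cv2.warpPerspective expects a full 3×3 matrix, while the regression in parts (d) and (e) estimates only the eight parameters $\theta_1, \dots, \theta_8$ ($\theta_9$ is fixed to 1). If your perspective_theta is still an 8×1 vector, one possible reshape is sketched below; the warping cell that follows assumes perspective_theta and perspective_theta3 are already such 3×3 matrices.

In [ ]:
# Sketch only (assumes perspective_theta holds the eight estimated parameters):
# append the fixed ninth entry (1) and reshape into a 3x3 np.matrix so it can be
# multiplied with the translation matrix in the next cell.
perspective_theta = np.matrix(
    np.append(np.asarray(perspective_theta).ravel(), 1.0).reshape(3, 3))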
In [ ]:
## Apply image warping to image 1 and image 3 using the cv2.warpPerspective function

## Apply a translation so the warped images fit onto an (18000, 6500) screen.
translation = np.matrix([[1, 0, 6000],
                         [0, 1, 2500],
                         [0, 0, 1]])

warpedImage = cv2.warpPerspective(imag1, translation*perspective_theta, (18000, 6500))
warpedImage3 = cv2.warpPerspective(imag3, translation*perspective_theta3, (18000, 6500))
In [ ]:
screen = warpedImage.copy()                      # start from warped image 1
screen[screen==0] = warpedImage3[screen==0]      # fill empty (zero) pixels with warped image 3
screen[2500:3024+2500,6000:4032+6000] = imag2    # paste image 2 at its translated location

## Visualize panorama image
plt.figure(figsize=(20, 12))
plt.imshow(screen)
plt.show()