Deep Learning for Mechanical Engineering

Problem Set 01


Instructor: Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST
  • For your handwritten solutions, please scan or take a picture of them. Alternatively, you can write them in markdown if you prefer.

  • Only .ipynb files will be graded for your code.

    • Ensure that your NAME and student ID are included in your .ipynb files. For example, 'SeungchulLee_20231234_HW01.ipynb'
  • Compress all the files into a single .zip file.

    • In the .zip file's name, include your NAME and student ID. For example, 'SeungchulLee_20231234_HW01.zip'
    • Submit this .zip file on KLMS
  • Do not submit a printed version of your code, as it will not be graded.

Problem 1: Optimization and Gradient Descent

You will find the optimal solution using gradient descent.

  • First, run the cell below. It contains pre-defined utility functions.
In [ ]:
# Do NOT change this cell !!

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

def plotHistory(hist):
    for i in range(len(hist)-1):
        plt.arrow(hist[i][0][0], hist[i][1][0], hist[i+1][0][0]-hist[i][0][0], hist[i+1][1][0]-hist[i][1][0], alpha = 0.3)
        plt.scatter(hist[i][0][0], hist[i][1][0])
    plt.title('Initial (x1, x2): (0.0, 0.0)')
    plt.grid('on')
    plt.xlabel('x1')
    plt.ylabel('x2')
    plt.show()

def update(x):
    pass

def train(alpha, n_iter):
    def saveHistory(a = None, hist = []):
        if a is not None: hist.append(a.tolist())
        return hist

    x =  np.zeros((2,1)) # initial value
    saveHistory(x)

    for i in range(n_iter):
        x = update(x)
        saveHistory(x)

    hist = saveHistory()
    plotHistory(hist)
    print('(x1, x2) = ({:.3f}, {:.3f})'.format(hist[-1][0][0], hist[-1][1][0]))
    print('f(x1, x2) = {:.3f}'.format((0.5*x.T*H*x + g.T*x + 1).tolist()[0][0]))

# Do NOT change this cell !!

Here is the objective function you need to solve.


$$\min_{x_1,\,x_2} \; (x_1 - 1)^2 + (2x_1 - x_2)^2$$

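(a) Find H and g that put the objective function into the matrix form shown below, with g as a 2×1 column vector and c = 1 (the utility cell above evaluates 0.5*x.T*H*x + g.T*x + 1). Use np.matrix().


$$f = \frac{1}{2}X^T H X + g^T X + c$$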
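As a warm-up on a different function, the sketch below rewrites a hypothetical quadratic (not the assigned objective) in matrix form and compares the two expressions at a few random points. The function `f_direct`, its coefficients, and the names `H_demo`, `g_demo`, `c_demo` are illustrative assumptions.

```python
import numpy as np

# Hypothetical quadratic (NOT the assigned objective):
#   f(x1, x2) = 2*x1^2 + 2*x1*x2 + 3*x2^2 - 4*x1 + 6*x2 + 5
def f_direct(x1, x2):
    return 2*x1**2 + 2*x1*x2 + 3*x2**2 - 4*x1 + 6*x2 + 5

# Matrix form f = 0.5 * X^T H X + g^T X + c with a symmetric H.
# Diagonal entries are twice the squared-term coefficients; each
# off-diagonal entry equals the cross-term coefficient.
H_demo = np.matrix([[4, 2],
                    [2, 6]])
g_demo = np.matrix([[-4],
                    [6]])
c_demo = 5

def f_matrix(x1, x2):
    X = np.matrix([[x1], [x2]])
    return (0.5 * X.T * H_demo * X + g_demo.T * X + c_demo).item()

# Largest disagreement between the two forms over five random points
rng = np.random.default_rng(0)
max_gap = max(abs(f_direct(a, b) - f_matrix(a, b))
              for a, b in rng.normal(size=(5, 2)))
print(max_gap)
```

If the two forms agree, `max_gap` is at the level of floating-point rounding. The same check can be reused on your own H and g for the assigned objective.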
In [ ]:
# your code here
H =
g =

(b) Define a function to update x based on the following equations:


$$\nabla f = HX + g$$

$$X_{i+1} = X_i - \alpha_i \nabla f(X_i)$$
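A minimal sketch of this update rule on a hypothetical quadratic (again not the assigned one), using a fixed step size. The names `gd_step`, `H_demo`, `g_demo`, and the choice `alpha = 0.1` are assumptions made for illustration. For a positive definite H the minimizer solves HX = -g, which gives a reference value to compare against.

```python
import numpy as np

H_demo = np.array([[4.0, 2.0],
                   [2.0, 6.0]])   # hypothetical positive definite H
g_demo = np.array([[-4.0],
                   [6.0]])        # hypothetical g (column vector)

def gd_step(x, alpha):
    """One gradient descent step: x - alpha * (H x + g)."""
    grad = H_demo @ x + g_demo
    return x - alpha * grad

x = np.zeros((2, 1))              # same starting point as the utility cell
for _ in range(200):
    x = gd_step(x, alpha=0.1)

# Reference solution of H x = -g
x_exact = np.linalg.solve(H_demo, -g_demo)
print(x.ravel(), x_exact.ravel())
```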

In [ ]:
def update(x):
    # Your code here
    new_x =

    return new_x

(c) Find a learning rate α that makes gradient descent converge within 150 iterations. Round α to the third decimal place.
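For a quadratic with a symmetric positive definite H and a fixed step size, each iteration multiplies the error by (I - αH), so the error shrinks exactly when 0 < α < 2/λ_max(H). The sketch below checks the three regimes on a hypothetical H; the values are illustrative and are not the answer to (c).

```python
import numpy as np

H_demo = np.array([[4.0, 2.0],
                   [2.0, 6.0]])   # hypothetical positive definite H
g_demo = np.array([[-4.0], [6.0]])
x_exact = np.linalg.solve(H_demo, -g_demo)

# Largest eigenvalue sets the step-size limit 2 / lambda_max
lam_max = np.linalg.eigvalsh(H_demo).max()
alpha_limit = 2.0 / lam_max

def final_error(alpha, n_iter=150):
    """Distance to the minimizer after n_iter fixed-step updates."""
    x = np.zeros((2, 1))
    for _ in range(n_iter):
        x = x - alpha * (H_demo @ x + g_demo)
    return float(np.linalg.norm(x - x_exact))

err_small = final_error(0.50 * alpha_limit)  # well inside the limit
err_near = final_error(0.95 * alpha_limit)   # overshoots each step, still shrinks
err_over = final_error(1.05 * alpha_limit)   # past the limit, error grows

print(alpha_limit, err_small, err_near, err_over)
```

Just below the limit the iterate jumps back and forth across the minimizer along the steepest direction while slowly closing in; just above it the same zigzag grows instead.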

(d) Adjust the training parameters to obtain the figures shown below:

(d-1) Stable convergence

In [ ]:
# Your code here
alpha =
n_iter =

train(alpha, n_iter)

(d-2) Unstable convergence

In [ ]:
# Your code here
alpha =
n_iter =

train(alpha, n_iter)

(d-3) Divergence

In [ ]:
# Your code here
alpha =
n_iter =

train(alpha, n_iter)

Problem 2: Optimization and Gradient Descent with Constraints

You will find the optimal solution using gradient descent.


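$$\begin{array}{rl} \max\limits_{x} & 3x_1 + \frac{3}{2}x_2 \\ \text{subject to} & -1 \leq x_1 \leq 2 \\ & 0 \leq x_2 \leq 3 \end{array} \quad \Longrightarrow \quad \begin{array}{rl} \min\limits_{x} & -\begin{bmatrix} 3 \\ 3/2 \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\ \text{subject to} & \begin{bmatrix} -1 \\ 0 \end{bmatrix} \leq \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \leq \begin{bmatrix} 2 \\ 3 \end{bmatrix} \end{array}$$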
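One common way to handle box constraints of this kind is projected gradient descent: take an ordinary gradient step, then move any coordinate that left the box back onto the nearest bound. The sketch below applies the idea to a hypothetical linear objective with hypothetical bounds (not the ones assigned above); `c_demo`, `lb_demo`, `ub_demo`, the starting point, and the step size are all assumptions.

```python
import numpy as np

# Hypothetical problem: minimize c^T x  subject to  lb <= x <= ub
c_demo = np.array([[-1.0], [2.0]])   # gradient of c^T x is the constant c
lb_demo = np.array([[0.0], [-1.0]])
ub_demo = np.array([[4.0], [5.0]])

x = np.array([[1.0], [1.0]])         # feasible starting point
alpha = 0.5

for _ in range(25):
    x = x - alpha * c_demo           # plain gradient step
    # Projection: keep x where it satisfies the bound, else use the bound
    x = np.where(lb_demo < x, x, lb_demo)
    x = np.where(x < ub_demo, x, ub_demo)

print(x.ravel())
```

Because the objective is linear, the iterate slides until it hits a bound in each coordinate and then stays there, so the result lands on a corner of the box.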
In [ ]:
df =
x =
alpha =

lower_bound =
upper_bound =

for i in range(25):
    x =

    # lb constraints
    lb_TF = lower_bound < x
    x =

    # ub constraints
    ub_TF = x < upper_bound
    x =

print(x)

Problem 3: Image Panorama with Regression

We want to demonstrate an image panorama as an example of linear regression. A panorama is any wide-angle view or representation of a physical space.



In [ ]:
%%html
<center><iframe src="https://www.youtube.com/embed/86rnwu3ZFbE?rel=0"
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

You need to install the OpenCV module and download the images.

In [ ]:
# import library
import numpy as np
import matplotlib.pyplot as plt
import cv2
In [ ]:
# load images
imag1 = cv2.imread('./data_files/1.jpg')
imag1 = cv2.cvtColor(imag1, cv2.COLOR_BGR2RGB)
imag2 = cv2.imread('./data_files/2.jpg')
imag2 = cv2.cvtColor(imag2, cv2.COLOR_BGR2RGB)
imag3 = cv2.imread('./data_files/3.jpg')
imag3 = cv2.cvtColor(imag3, cv2.COLOR_BGR2RGB)
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag1)
plt.axis('off')
plt.show()
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag2)
plt.axis('off')
plt.show()
In [ ]:
plt.figure(figsize=(10, 6))
plt.imshow(imag3)
plt.axis('off')
plt.show()

The basic concept of homography (i.e., perspective transformation) is outlined below.

  • Any wide-angle view or representation of a physical space

  • images with horizontally elongated fields of view

  • idea: projecting images onto a common plane



  • Camera rotating about its center
  • Two image planes are related by a homography H

You do not need to understand the homography transformation in detail; it is outside this course's scope.

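$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim \begin{bmatrix} \omega x' \\ \omega y' \\ \omega \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$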
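To make the role of ω concrete, the sketch below applies a hypothetical 3×3 homography to a single pixel: multiply in homogeneous coordinates, then divide by the third component. The entries of `H_demo` and the helper name `apply_homography` are made up for illustration.

```python
import numpy as np

# Hypothetical homography; the bottom-right entry is fixed to 1
H_demo = np.array([[1.0,   0.1,   5.0],
                   [0.0,   1.2,  -3.0],
                   [0.001, 0.002, 1.0]])

def apply_homography(H, x, y):
    """Map (x, y) to (x', y') by dividing out the scale factor omega."""
    wx, wy, w = H @ np.array([x, y, 1.0])
    return wx / w, wy / w

x_new, y_new = apply_homography(H_demo, 100.0, 50.0)
print(x_new, y_new)
```

When the last row is (0, 0, 1), ω is always 1 and the map reduces to an affine transformation; nonzero entries there are what produce the perspective effect.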

Find key points between two images
  • Suppose these matching points are given.

    • We have manually found the matching points for you, although there is a technique to do this automatically.
  • pos1 and pos2 are matching points between imag1 and imag2, respectively

  • pos3 and pos4 are matching points between imag2 and imag3, respectively

In [ ]:
pos1 = np.array([[2121, 2117, 2749, 3095, 3032, 3375, 3677, 3876],
                 [1431, 2034, 2033, 1885, 2017, 2037, 1885, 2279]], dtype=np.int64)
pos2 = np.array([[188, 58, 828, 1203, 1121, 1437, 1717, 1817],
                 [1217, 1909, 1952, 1827, 1952, 1991, 1870, 2226]], dtype=np.int64)
pos3 = np.array([[2338, 2379, 2658, 2899, 2977, 3272, 2716, 2786],
                 [1948, 1874, 2000, 1837, 1964, 1966, 2143, 2317]], dtype=np.int64)
pos4 = np.array([[109, 178, 497, 795, 851, 1144, 534, 580],
                 [1907, 1828, 1988, 1834, 1971, 1993, 2145, 2333]], dtype=np.int64)

(a) Visualization of key points

In [ ]:
## your code here
## Write your own code to mark the key points (red dots) at their locations on the given images

Estimation of homography H


$$X' = HX$$

where $X'$ and $X$ are the position vectors of matching key points, and $H$ is a perspective transformation.

Goal: estimate the homography H from matching points between two images.


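$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim \begin{bmatrix} \omega x' \\ \omega y' \\ \omega \end{bmatrix} = \begin{bmatrix} \theta_1 & \theta_2 & \theta_3 \\ \theta_4 & \theta_5 & \theta_6 \\ \theta_7 & \theta_8 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
(b) Show that the following equations follow from the homography H above:
$$x' = \frac{\theta_1 x + \theta_2 y + \theta_3}{\theta_7 x + \theta_8 y + 1}, \qquad y' = \frac{\theta_4 x + \theta_5 y + \theta_6}{\theta_7 x + \theta_8 y + 1}$$
$$\begin{aligned} \theta_1 x + \theta_2 y + \theta_3 - \theta_7 x' x - \theta_8 x' y - x' &= 0 \\ \theta_4 x + \theta_5 y + \theta_6 - \theta_7 y' x - \theta_8 y' y - y' &= 0 \end{aligned}$$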
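A quick numerical sanity check (not a proof) of the two linearized equations: pick hypothetical values for θ1 through θ8 and a point (x, y), compute (x', y') from the ratio form, and confirm that both left-hand sides vanish up to rounding. All numbers below are made up.

```python
import numpy as np

# Hypothetical parameters theta_1 ... theta_8
theta = np.array([1.1, 0.05, 20.0, -0.02, 0.95, -10.0, 1e-4, 2e-4])
t1, t2, t3, t4, t5, t6, t7, t8 = theta
x, y = 320.0, 240.0

# Ratio form: divide by the shared denominator theta7*x + theta8*y + 1
w = t7 * x + t8 * y + 1.0
xp = (t1 * x + t2 * y + t3) / w
yp = (t4 * x + t5 * y + t6) / w

# Linearized form: both residuals should be zero up to rounding
res_x = t1 * x + t2 * y + t3 - t7 * xp * x - t8 * xp * y - xp
res_y = t4 * x + t5 * y + t6 - t7 * yp * x - t8 * yp * y - yp
print(res_x, res_y)
```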
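(c) For m pairs of matching points, show that a feature matrix Φ can be expressed as follows:
  • Φ is a feature matrix
$$\Phi = \begin{bmatrix} x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1' x_1 & -x_1' y_1 \\ 0 & 0 & 0 & x_1 & y_1 & 1 & -y_1' x_1 & -y_1' y_1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_m & y_m & 1 & 0 & 0 & 0 & -x_m' x_m & -x_m' y_m \\ 0 & 0 & 0 & x_m & y_m & 1 & -y_m' x_m & -y_m' y_m \end{bmatrix}$$
  • θ is a column vector of the unknown parameters in the perspective transformation H

$$\theta = \begin{bmatrix} \theta_1 & \theta_2 & \theta_3 & \theta_4 & \theta_5 & \theta_6 & \theta_7 & \theta_8 \end{bmatrix}^T$$
  • b is a column vector of the corresponding positions in the base image

$$b = \begin{bmatrix} x_1' & y_1' & x_2' & y_2' & \cdots & x_m' & y_m' \end{bmatrix}^T$$
  • It ends up becoming a linear regression problem

$$\min_{\theta} \lVert \Phi\theta - b \rVert_2^2$$
$$\theta^{*} = (\Phi^T \Phi)^{-1} \Phi^T b$$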
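A minimal sketch of this regression on synthetic data: generate matches from a known, hypothetical homography, stack two rows of Φ per match, and solve the least-squares problem. The helper `build_system`, the values in `theta_true`, and the use of `np.linalg.lstsq` in place of the explicit normal equations are choices made here for illustration; both routes give the same minimizer when Φ has full column rank.

```python
import numpy as np

def build_system(src, dst):
    """src, dst: 2 x m arrays holding (x, y) and (x', y') per column."""
    rows, b = [], []
    for (x, y), (xp, yp) in zip(src.T, dst.T):
        rows.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        rows.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.extend([xp, yp])
    return np.array(rows, dtype=float), np.array(b, dtype=float)

# Hypothetical ground-truth parameters, used only to create test data
theta_true = np.array([1.1, 0.05, 20.0, -0.02, 0.95, -10.0, 1e-4, 2e-4])
H_true = np.append(theta_true, 1.0).reshape(3, 3)

# Eight synthetic source points and their exact images under H_true
rng = np.random.default_rng(0)
src = rng.uniform(0, 500, size=(2, 8))
proj = H_true @ np.vstack([src, np.ones(8)])
dst = proj[:2] / proj[2]

# Least-squares estimate of theta, then reshape into a 3 x 3 matrix
Phi, b = build_system(src, dst)
theta_hat, *_ = np.linalg.lstsq(Phi, b, rcond=None)
H_hat = np.append(theta_hat, 1.0).reshape(3, 3)
print(np.abs(theta_hat - theta_true).max())
```

With noise-free matches the estimate reproduces the generating parameters up to rounding. With hand-picked matches, as in this problem, the residual will not be zero and the estimate is the best fit in the least-squares sense.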

(d) Perspective homography for image 1 and image 2

In [ ]:
## your code here
## Construct the feature matrix from the homography equations, and a vector holding the matching points in image 2



## your code here
## Define perspective_theta using linear regression

perspective_theta =

(e) Perspective homography for image 2 and image 3

In [ ]:
## your code here
## Construct the feature matrix from the homography equations, and a vector holding the matching points in image 2



## your code here
## Define perspective_theta3 using linear regression

perspective_theta3 =

Image warping

  • Again, you do not need to understand image warping in detail; it is outside the lecture's scope.
In [ ]:
cv2.warpPerspective?
In [ ]:
## Apply image warping on image 1 & image 3 using cv2.warpPerspective function

## Translate the warped images so that they fit on a screen of size (18000, 6500).
translation = np.matrix([[1, 0, 6000],
                         [0, 1, 2500],
                         [0, 0, 1]])

warpedImage = cv2.warpPerspective(imag1, translation*perspective_theta, (18000, 6500))
warpedImage3 = cv2.warpPerspective(imag3, translation*perspective_theta3, (18000, 6500))
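Left-multiplying a homography by a translation matrix shifts every warped point by the same offset, which is how the warped images are moved away from the top-left corner of the larger screen. The sketch below checks this on one point with a hypothetical homography; `H_demo` and the helper `warp_point` are assumptions, while the offsets (6000, 2500) are the ones used in the translation matrix of this cell.

```python
import numpy as np

# Hypothetical homography (made-up entries)
H_demo = np.array([[1.0,   0.1,   5.0],
                   [0.0,   1.2,  -3.0],
                   [0.001, 0.002, 1.0]])
# Translation by (6000, 2500) in homogeneous coordinates
T = np.array([[1.0, 0.0, 6000.0],
              [0.0, 1.0, 2500.0],
              [0.0, 0.0, 1.0]])

def warp_point(M, x, y):
    """Apply a 3 x 3 projective map to one point."""
    wx, wy, w = M @ np.array([x, y, 1.0])
    return np.array([wx / w, wy / w])

p_plain = warp_point(H_demo, 100.0, 50.0)
p_shifted = warp_point(T @ H_demo, 100.0, 50.0)
print(p_shifted - p_plain)
```

The order matters: the translation is applied after the homography, so it must be the left factor of the product.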
In [ ]:
screen = warpedImage.copy()
screen[screen==0] = warpedImage3[screen==0]
screen[2500:3024+2500,6000:4032+6000] = imag2

## Visualize panorama image
plt.figure(figsize=(20, 12))
plt.imshow(screen)
plt.show()
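The compositing step relies on boolean-mask assignment: entries that are still exactly 0 in the first warped image are filled from the second one. A tiny sketch on hypothetical 1×4 single-channel arrays (the values are made up):

```python
import numpy as np

first = np.array([[7, 0, 0, 9]], dtype=np.uint8)    # 0 marks "empty"
second = np.array([[1, 2, 0, 4]], dtype=np.uint8)

canvas = first.copy()
empty = canvas == 0                 # boolean mask of unfilled entries
canvas[empty] = second[empty]       # fill only those entries
print(canvas)
```

On an RGB image the comparison is made per channel, so a channel value that is genuinely 0 in the first image is also overwritten by the second image.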