For your handwritten solutions, please scan or take a picture of them. Alternatively, you can write them in markdown if you prefer.
Only .ipynb files will be graded for your code.
Compress all the files into a single .zip file.
e.g., SeungchulLee_20241234_HW01.zip
Do not submit a printed version of your code, as it will not be graded.
Suppose you receive a binary signal that has been corrupted by noise while being transmitted through a channel. We want to remove the noise from the received signal using $L_1$ optimization. The mathematical problem statement is given below:
(a) First, generate the original signal of 200 data points as shown in the figure below. This can be done simply using the ones and zeros functions in NumPy.
# Write your code here
#
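A minimal sketch of one way to do this (assuming four equal blocks of 50 samples; adjust the block lengths so the plot matches the figure in the handout):

```python
import numpy as np
import matplotlib.pyplot as plt

# 200-point binary signal built from blocks of zeros and ones.
# The block lengths below are illustrative; match them to the given figure.
x = np.concatenate([np.zeros(50), np.ones(50), np.zeros(50), np.ones(50)])

plt.figure(figsize=(10, 4))
plt.plot(x)
plt.ylim(-0.5, 1.5)
plt.title('Original binary signal (200 points)')
plt.show()
```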
(b) Next, corrupt the original signal with Gaussian noise. This can be done using the randn function in NumPy. One realization of the corrupted signal is shown below:
# Write your code here
#
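Continuing the sketch above; the noise level of 0.1 is an assumption, so tune it until the plot resembles the given realization:

```python
# Add zero-mean Gaussian noise to the binary signal with np.random.randn.
np.random.seed(0)                              # for reproducibility
x_cor = x + 0.1 * np.random.randn(x.shape[0])

plt.figure(figsize=(10, 4))
plt.plot(x_cor)
plt.title('Corrupted signal')
plt.show()
```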
Note that the signal is sparse (in the sense of the frequency domain). Therefore we can apply $L_1$ optimization: we will optimize an $L_1$ loss function with $L_2$ constraints.
(c) Plot the reconstructed signal. In addition, by changing $\beta$ from the lecture slide, explain how $\beta$ affects the optimization process. One reconstructed signal with a proper value of $\beta$ is shown below:
# Write your code here
#
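A sketch with cvxpy, assuming the penalized form $\text{minimize } \lVert \hat{x} - x_{cor} \rVert_2^2 + \beta \lVert D\hat{x} \rVert_1$ with $D$ the first-difference operator; if the lecture slide uses a constrained formulation instead, adapt the objective accordingly:

```python
import cvxpy as cvx

n = x_cor.shape[0]
beta = 0.8                                   # try several values and compare

# First-difference operator: (D @ xhat)[i] = xhat[i+1] - xhat[i]
D = np.diff(np.eye(n), axis=0)

xhat = cvx.Variable(n)
obj = cvx.Minimize(cvx.sum_squares(xhat - x_cor) + beta * cvx.norm(D @ xhat, 1))
cvx.Problem(obj).solve()

plt.figure(figsize=(10, 4))
plt.plot(x_cor, alpha=0.4, label='corrupted')
plt.plot(xhat.value, label='reconstructed')
plt.legend()
plt.show()
```

Under this formulation, a larger $\beta$ pushes the reconstruction toward a piecewise-constant signal (eventually flattening even the true jumps), while a smaller $\beta$ keeps it close to the noisy measurement.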
The regularized least-squares problem has the form
(a) Show that the solution is given by
$$ \hat{\theta} = \left( A^T A + \lambda I_n \right)^{-1} A^T y $$
(b) Write down a gradient descent algorithm for the given optimization problem.
(c) Based on the result of (b), describe the role of the regularizer term.
(d) Explain how the results of (a) and (b) have the same meaning.
(e) Find and draw an approximating curve for the given data points below in Python using your gradient descent algorithm.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# 10 data points
n = 10
x = np.linspace(-4.5, 4.5, 10).reshape(-1, 1)
y = np.array([0.9819, 0.7973, 1.9737, 0.1838, 1.3180, -0.8361, -0.6591, -2.4701, -2.8122, -6.2512]).reshape(-1, 1)
plt.figure(figsize = (10, 8))
plt.plot(x, y, 'o', label = 'Data')
plt.xlabel('X', fontsize = 15)
plt.ylabel('Y', fontsize = 15)
plt.grid(alpha = 0.3)
plt.show()
# Write your code here
#
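A minimal gradient-descent sketch for the ridge objective $J(\theta) = \lVert A\theta - y \rVert^2 + \lambda \lVert \theta \rVert^2$, whose gradient is $2A^T(A\theta - y) + 2\lambda\theta$. The degree-3 polynomial basis and $\lambda = 1$ are illustrative assumptions, not values prescribed by the problem:

```python
d = 3                                                  # assumed polynomial degree
A = np.hstack([x**k for k in range(d + 1)])            # columns 1, x, x^2, x^3
lam = 1.0                                              # assumed regularization weight

theta = np.zeros((d + 1, 1))
L = np.linalg.norm(A.T @ A + lam * np.eye(d + 1), 2)   # largest eigenvalue
alpha = 0.5 / L                                        # safe constant step size

for _ in range(100000):
    grad = 2 * A.T @ (A @ theta - y) + 2 * lam * theta
    theta = theta - alpha * grad

xp = np.linspace(-4.5, 4.5, 200).reshape(-1, 1)
yp = np.hstack([xp**k for k in range(d + 1)]) @ theta

plt.figure(figsize=(10, 8))
plt.plot(x, y, 'o', label='Data')
plt.plot(xp, yp, label='Ridge fit (gradient descent)')
plt.legend()
plt.grid(alpha=0.3)
plt.show()
```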
We want to demonstrate an image panorama as an example of linear regression. A panorama is any wide-angle view or representation of a physical space.
%%html
<center><iframe src="https://www.youtube.com/embed/86rnwu3ZFbE?rel=0"
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
You need to install the opencv module and download the images.
# import library
import numpy as np
import matplotlib.pyplot as plt
import cv2
# load images
imag1 = cv2.imread('./data_files/1.jpg')
imag1 = cv2.cvtColor(imag1, cv2.COLOR_BGR2RGB)
imag2 = cv2.imread('./data_files/2.jpg')
imag2 = cv2.cvtColor(imag2, cv2.COLOR_BGR2RGB)
imag3 = cv2.imread('./data_files/3.jpg')
imag3 = cv2.cvtColor(imag3, cv2.COLOR_BGR2RGB)
## your code here
#
## your code here
#
## your code here
#
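A minimal sketch, assuming these three cells are meant to display the loaded images:

```python
# Display the three input images (one figure each).
for idx, img in enumerate([imag1, imag2, imag3], start=1):
    plt.figure(figsize=(8, 6))
    plt.imshow(img)
    plt.title('Image {}'.format(idx))
    plt.axis('off')
    plt.show()
```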
Here we explain the basic concept of a homography (i.e., a perspective transformation):
- A panorama is any wide-angle view or representation of a physical space, built from images with horizontally elongated fields of view.
- Idea: project the images onto a common plane.
- Do not worry about the homography transformation itself (it is out of this course's scope).
%%html
<center><iframe src="https://www.youtube.com/embed/pU4NorC7lb0?rel=0"
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
pos1 = np.array([[2121, 2117, 2749, 3095, 3032, 3375, 3677, 3876],
                 [1431, 2034, 2033, 1885, 2017, 2037, 1885, 2279]], dtype=np.int64)
pos2 = np.array([[188, 58, 828, 1203, 1121, 1437, 1717, 1817],
                 [1217, 1909, 1952, 1827, 1952, 1991, 1870, 2226]], dtype=np.int64)
pos3 = np.array([[2338, 2379, 2658, 2899, 2977, 3272, 2716, 2786],
                 [1948, 1874, 2000, 1837, 1964, 1966, 2143, 2317]], dtype=np.int64)
pos4 = np.array([[109, 178, 497, 795, 851, 1144, 534, 580],
                 [1907, 1828, 1988, 1834, 1971, 1993, 2145, 2333]], dtype=np.int64)
(a) Visualization of key points
## your code here
## Write your own code to mark the key points (red dots) at the given locations on the images
#
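A sketch of marking the key points. The pairing of the pos arrays with the images (pos1 on image 1, pos2 and pos3 on image 2, pos4 on image 3) is an assumption inferred from the coordinate ranges:

```python
# Plot the given key-point coordinates as red dots on top of each image.
def show_keypoints(img, pos, title):
    plt.figure(figsize=(8, 6))
    plt.imshow(img)
    plt.plot(pos[0, :], pos[1, :], 'r.', markersize=12)
    plt.title(title)
    plt.axis('off')
    plt.show()

show_keypoints(imag1, pos1, 'Key points on image 1')
show_keypoints(imag2, pos2, 'Key points on image 2')
show_keypoints(imag2, pos3, 'Key points on image 2 (second set)')
show_keypoints(imag3, pos4, 'Key points on image 3')
```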
Estimation of homography H
$$ X' = HX $$
where $X$ and $X'$ are the position vectors of the key points and $H$ is the perspective transformation (homography) matrix.
Goal: estimate the homography $H$ from the matching points between two images.
$$
\begin{align*}
x' &= \frac{\theta_1 x+\theta_2 y+\theta_3}{\theta_7 x+\theta_8 y+1} \\
y' &= \frac{\theta_4 x+\theta_5 y+\theta_6}{\theta_7 x+\theta_8 y+1}
\end{align*}
$$
(d) Perspective homography for image 1 and image 2
## your code here
## Construct the feature matrix from the homography model and the target vector whose entries are the matching points in image 2
## your code here
## Define perspective_theta using linear regression
perspective_theta = np.vstack([perspective_theta, [1]])
perspective_theta = perspective_theta.reshape(3, 3)
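A sketch of how the cell above could be filled in, including the reshaping step already shown. It assumes pos1 holds the matched points in image 1 and pos2 the corresponding points in image 2; each match contributes two linear equations in the eight unknown parameters:

```python
# Build the feature matrix A and target vector b from the homography equations
#   x2 = (t1*x1 + t2*y1 + t3) / (t7*x1 + t8*y1 + 1)
#   y2 = (t4*x1 + t5*y1 + t6) / (t7*x1 + t8*y1 + 1)
A = []
b = []
for k in range(pos1.shape[1]):
    x1, y1 = pos1[0, k], pos1[1, k]        # point in image 1
    x2, y2 = pos2[0, k], pos2[1, k]        # matching point in image 2
    A.append([x1, y1, 1, 0, 0, 0, -x1*x2, -y1*x2])
    A.append([0, 0, 0, x1, y1, 1, -x1*y2, -y1*y2])
    b.append([x2])
    b.append([y2])

A = np.array(A, dtype=np.float64)
b = np.array(b, dtype=np.float64)

# Linear regression (least squares) for the 8 unknown parameters
perspective_theta = np.linalg.lstsq(A, b, rcond=None)[0]

# Append theta_9 = 1 and reshape into the 3x3 homography, as above
perspective_theta = np.vstack([perspective_theta, [1]])
perspective_theta = perspective_theta.reshape(3, 3)
```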
(e) Perspective homography for image 2 and image 3
## your code here
## Construct the feature matrix from the homography model and the target vector whose entries are the matching points in image 2
## your code here
## Define perspective_theta using linear regression
perspective_theta3 = np.vstack([perspective_theta3, [1]])
perspective_theta3 = perspective_theta3.reshape(3, 3)
cv2.warpPerspective?
## Apply image warping to image 1 and image 3 using the cv2.warpPerspective function,
## and apply a translation so that the warped images fit on an (18000, 6500) screen.
translation = np.matrix([[1, 0, 6000],
                         [0, 1, 2500],
                         [0, 0, 1]])
warpedImage = cv2.warpPerspective(imag1, translation*perspective_theta, (18000, 6500))
warpedImage3 = cv2.warpPerspective(imag3, translation*perspective_theta3, (18000, 6500))
screen = warpedImage.copy()
screen[screen==0] = warpedImage3[screen==0]
screen[2500:3024+2500,6000:4032+6000] = imag2
## Visualize panorama image
plt.figure(figsize=(20, 12))
plt.imshow(screen)
plt.show()
We saw many optimization problems in a machine learning context and witnessed that $L_1$ norm regularization provides not only small but also sparse decision variables. In this problem we will use linear programming to solve $L_1$ norm optimization problems instead of simply using cvx.
(a) Show that
(b) Show that, by introducing slack variables $t_i$, the absolute value optimization problem can be converted to
(c) Suppose $A\theta-b = r$ and $t = \begin{bmatrix} t_1 & t_2 & \cdots & t_m \end{bmatrix}^T$; then show that the $L_1$ norm fitting/approximation problem, i.e., the sum of absolute residuals, can be converted to
(d) Rewrite the above optimization problem as a linear program (LP) in matrix form, where $\mathbb{1} = \begin{bmatrix} 1& \cdots & 1\end{bmatrix}^T$ and $\mathbb{0} = \begin{bmatrix} 0& \cdots & 0\end{bmatrix}^T$
(e) Use cvxpy in Python to find and plot a linear regression for the data points below with the above formulation.
import numpy as np
x = np.array([0.1, 0.4, 0.7, 1.2, 1.3, 1.7, 2.2, 2.8, 3.0, 4.0, 4.3, 4.4, 4.9])
y = np.array([0.5, 0.9, 1.1, 1.5, 1.5, 2.0, 2.2, 2.8, 2.7, 3.0, 3.5, 3.7, 3.9])
## your code here
#
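A minimal cvxpy sketch of the LP from (d), assuming an affine model $\hat{y} = \theta_0 + \theta_1 x$:

```python
import cvxpy as cvx
import matplotlib.pyplot as plt

m = x.shape[0]
A = np.vstack([np.ones(m), x]).T           # feature matrix with columns [1, x]
theta = cvx.Variable(2)
t = cvx.Variable(m)

# minimize 1^T t  subject to  -t <= A theta - y <= t
constraints = [A @ theta - y <= t, -(A @ theta - y) <= t]
cvx.Problem(cvx.Minimize(cvx.sum(t)), constraints).solve()

xp = np.linspace(0, 5, 100)
plt.figure(figsize=(10, 8))
plt.plot(x, y, 'o', label='Data')
plt.plot(xp, theta.value[0] + theta.value[1]*xp, label='$L_1$ fit (LP)')
plt.legend()
plt.grid(alpha=0.3)
plt.show()
```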
import cv2
import matplotlib.pyplot as plt
imbw = cv2.imread('./image_files/dog.jpg', 0)
row = 150
col = 150
resized_imbw = cv2.resize(imbw, (row, col))
plt.figure(figsize = (8,8))
plt.imshow(resized_imbw, 'gray')
plt.axis('off')
plt.show()
(a) Apply $L_1$ norm regularization to the image by solving
n = row*col
imbws = resized_imbw.reshape(-1, 1)
mu = 100
## your code here
#
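A sketch assuming the objective in (a) is the 1D analogue of the 2D objective in (b), i.e. $\sum_i (x_i - x_i^{cor})^2 + \mu \sum_i \lvert x_{i+1} - x_i \rvert$ applied to the flattened image (the solve can take a while for $150 \times 150$ pixels):

```python
import numpy as np
import cvxpy as cvx

x1d = cvx.Variable((n, 1))
fidelity = cvx.sum_squares(x1d - imbws.astype(np.float64))
tv_1d = cvx.sum(cvx.abs(x1d[1:] - x1d[:-1]))      # |x_{i+1} - x_i| summed over i
cvx.Problem(cvx.Minimize(fidelity + mu * tv_1d)).solve()

plt.figure(figsize=(8, 8))
plt.imshow(x1d.value.reshape(row, col), 'gray')
plt.axis('off')
plt.show()
```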
(b) However, the output image is not exactly what we want, since the 2D image (matrix) was flattened into a 1D column vector (signal). We now treat the image as 2D. Apply a method similar to (a) by solving the following objective function.
$$ \text{minimize} \quad \sum_{i=1}^{n} \sum_{j=1}^{n} \left( x_{i,j} - x_{i,j}^{cor} \right)^2 + \mu \left[ \sum_{i=1}^{n-1} \sum_{j=1}^{n} \lvert x_{i+1,j}-x_{i,j} \rvert + \sum_{i=1}^{n}\sum_{j=1}^{n-1} \lvert x_{i,j+1}-x_{i,j} \rvert \right]$$
mu = 1500
x = cvx.Variable([row,col])
## your code here
#
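A sketch implementing the 2D objective above with cvxpy, continuing from the variable x and the value of mu defined in the cell (again, the solve can take a while):

```python
import numpy as np
import cvxpy as cvx

Xcor = resized_imbw.astype(np.float64)

# Squared deviations plus the anisotropic total-variation penalty from the formula above
tv_2d = cvx.sum(cvx.abs(x[1:, :] - x[:-1, :])) + cvx.sum(cvx.abs(x[:, 1:] - x[:, :-1]))
cvx.Problem(cvx.Minimize(cvx.sum_squares(x - Xcor) + mu * tv_2d)).solve()

plt.figure(figsize=(8, 8))
plt.imshow(x.value, 'gray')
plt.axis('off')
plt.show()
```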
Consider the elastic-net regression problem:
[6pt] Show how one can turn this into a Lasso problem, using an augmented version of $A$ and $y$.
Suppose we have a linear regression model with two weights and no bias term:
and the usual loss function $\ell(y,\hat y) = \frac{1}{2}(y - \hat y)^2$ and cost $\mathcal{C}(\omega_1, \omega_2) = \frac{1}{m} \sum_{i}\ell(y^{(i)},\hat y^{(i)})$. Suppose we have a training set consisting of $m = 3$ examples:
$x^{(1)} = [1,0]^T,\; y^{(1)} = 2$
$x^{(2)} = [0,1]^T,\; y^{(2)} = 2$
$x^{(3)} = [\sqrt{3},0]^T,\; y^{(3)} = 0$
Let's sketch one of the contours in weight space.
(a) Write the cost in the form below using $\ell(y,\hat y) = \frac{1}{2}(y - \hat y)^2$ and $\mathcal{C}(\omega_1, \omega_2) = \frac{1}{m} \sum_{i}\ell(y^{(i)},\hat y^{(i)})$, and find $c_1, d_1, c_2, d_2, c_0$.
(b) Since $c_1, c_2 > 0$, this corresponds to an axis-aligned ellipse in $\omega_1$ and $\omega_2$. Sketch the ellipse by hand for $\mathcal{C} = 4$. Label the center and radii of the ellipse.
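If you want to double-check the hand sketch afterwards, the following evaluates the cost directly from the three training examples on a grid and draws the $\mathcal{C} = 4$ level set (a numerical check only, not a substitute for the hand sketch):

```python
import numpy as np
import matplotlib.pyplot as plt

# C(w1, w2) = (1/3) * [ 1/2*(2 - w1)^2 + 1/2*(2 - w2)^2 + 1/2*(sqrt(3)*w1 - 0)^2 ]
w1, w2 = np.meshgrid(np.linspace(-5, 5, 400), np.linspace(-5, 8, 400))
C = (0.5*(2 - w1)**2 + 0.5*(2 - w2)**2 + 0.5*(np.sqrt(3)*w1)**2) / 3

plt.figure(figsize=(8, 8))
plt.contour(w1, w2, C, levels=[4])
plt.xlabel(r'$\omega_1$', fontsize=15)
plt.ylabel(r'$\omega_2$', fontsize=15)
plt.gca().set_aspect('equal')
plt.grid(alpha=0.3)
plt.show()
```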
This problem is motivated by response surface methodology. Response surface methodology (RSM) is a collection of statistical and mathematical techniques used for:
- Setting up a series of experiments (a design) that give adequate predictions of a response $y$.
- Fitting a hypothesized (empirical) model to data obtained under the chosen design.
- Determining the optimum settings of the model's input (control) variables that lead to the maximum or minimum response within a region of interest.
(a) Find and plot an approximating curve for the given data points below in Python using your non-linear regression algorithm. Use polynomial basis functions of degree 9. Your curve should be a non-convex function with several local minima. Do not worry about overfitting.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
'''
Given data x,y
'''
y = np.array([10, 8.5, 7, 8, 6.5, 3, 2, 4, 4.5, 3, 4, 5.2, 4.4, 7, 9])
x = np.linspace(-4.5, 4.5, y.shape[0])
'''
plot x and y
'''
plt.figure(figsize = (10, 8))
plt.plot(x, y, 'o', label = 'Data')
plt.xlabel('X', fontsize = 15)
plt.ylabel('Y', fontsize = 15)
plt.grid(alpha = 0.3)
plt.show()
# Write your code here
'''
gradient descent method
Fill in your code only at the lines marked with #;
do not modify anything else.
'''
m = np.shape(x)[0]
x = np.reshape(x,(-1,1))
y = np.reshape(y,(-1,1))
X = np.hstack([x**i for i in range(10)])
X = np.asmatrix(X)
y = np.asmatrix(y)
theta = np.ones([10,1])
## solve theta
theta =
print(theta)
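As a sanity check only (this is not the gradient-descent solution the scaffold asks for), the closed-form least-squares fit below should be close to the theta your gradient descent converges to, and plotting it shows the curve requested in (a):

```python
# Closed-form least-squares fit of the degree-9 polynomial
theta_ls, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)
print(theta_ls)

xp = np.linspace(-4.5, 4.5, 200).reshape(-1, 1)
yp = np.hstack([xp**i for i in range(10)]) @ theta_ls

plt.figure(figsize=(10, 8))
plt.plot(x, y, 'o', label='Data')
plt.plot(xp, yp, label='Degree-9 fit')
plt.legend()
plt.grid(alpha=0.3)
plt.show()
```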
(b) [6pt] We would then like to find the optimal point $x^*$ on the estimated response surface. Find the minimum point by the gradient descent algorithm in this non-convex problem. Remember that there may exist several local minima.
def output(x, theta):
    y = theta[0,0] + theta[1,0]*x + theta[2,0]*x**2 + theta[3,0]*x**3 + \
        theta[4,0]*x**4 + theta[5,0]*x**5 + theta[6,0]*x**6 + theta[7,0]*x**7 + \
        theta[8,0]*x**8 + theta[9,0]*x**9
    return y
# Write your code here
#
'''
gradient descent method
Fill in your code only at the lines marked with #;
do not modify anything else.
'''
alpha = 0.001
y = 20

for i in range(10):
    x = -4.5 + 9*np.random.rand()

    for j in range(1000):
        gradient = 0

        for k in range(9):
            # Update your gradient using k and theta
            gradient +=

        ## Update your x using alpha and gradient
        x =

    result = output(x, theta)
    if result < y:
        y = result

print('y: ')
print(y)
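Once theta is available from part (a), here is one way to package the multi-start gradient descent above as a self-contained sketch. The helper grad_p and the clipping of x to $[-4.5, 4.5]$ are my additions (the clip keeps iterates inside the data range so the degree-9 polynomial cannot blow up):

```python
# Derivative of the fitted polynomial: p'(x) = sum_k k * theta_k * x^(k-1)
def grad_p(x, theta):
    return sum(k * theta[k, 0] * x**(k - 1) for k in range(1, 10))

def minimize_with_restarts(theta, n_starts=10, n_iter=1000, alpha=0.001):
    best_y = np.inf
    for _ in range(n_starts):
        x = -4.5 + 9 * np.random.rand()          # random start in [-4.5, 4.5]
        for _ in range(n_iter):
            x = x - alpha * grad_p(x, theta)     # gradient step
            x = np.clip(x, -4.5, 4.5)            # stay inside the data range
        best_y = min(best_y, output(x, theta))
    return best_y

print('y:', minimize_with_restarts(theta))
```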
Load lasso_input_data.pkl and lasso_target_data.pkl using the following command.
from six.moves import cPickle
x = cPickle.load(open('./data_files/lasso_input_data.pkl', 'rb'))
y = cPickle.load(open('./data_files/lasso_target_data.pkl', 'rb'))
# x.shape
# y.shape
In this problem, you are asked to predict a target value $\hat y$ from 13 features (input data). Assume that your model has the following relationship ($\omega_0 = 0$):
(a) Build the linear regression model using the Lasso regularizer.
(Hint: you can use the cvxpy module to solve the above optimization problem.)
# Write your code here
#
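A hedged cvxpy sketch of the Lasso objective $\text{minimize } \lVert y - X\omega \rVert_2^2 + \lambda \lVert \omega \rVert_1$, assuming x is an (n_samples, 13) array; the value of $\lambda$ below is illustrative, not prescribed:

```python
import numpy as np
import cvxpy as cvx

X = np.asarray(x, dtype=np.float64)        # (n_samples, 13) feature matrix
Y = np.asarray(y, dtype=np.float64).ravel()
lam = 10.0                                 # illustrative regularization weight

w = cvx.Variable(X.shape[1])
obj = cvx.Minimize(cvx.sum_squares(Y - X @ w) + lam * cvx.norm(w, 1))
cvx.Problem(obj).solve()

print(w.value)
```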
(b) After applying the Lasso, we can select meaningful features. Plot the magnitudes of all the $\omega$'s and identify which features are important.
# Write your code here
#
(c) Update the linear regression model using only the selected features.
# Write your code here
#