Jessica has been working on modeling the temperature (= $y$) of a semiconductor. She discovered that a semiconductor's temperature is related to the distance between atoms (= $x$). She assumed that the temperature has an exponential relationship with the distance between atoms (i.e., $\exp(x)$), and also a periodic relationship with it (i.e., $\sin(x)$ or $\cos(x)$). Therefore, she formulates the semiconductor temperature as a linear combination of $\sin(x)$, $\cos(x)$, and $\exp(x)$.
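In other words, the model to fit is
$$\hat{y} = \theta_0 + \theta_1 e^{x} + \theta_2 \sin(x) + \theta_3 \cos(x),$$
where the intercept $\theta_0$ enters through the constant column $x^0$ of the design matrix in the code below.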
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-5,5,30)
y = np.array([15.16043501, 14.93790538, 14.3694888, 13.52210637, 12.49552255,
11.41059962, 10.39506827, 9.56848955, 9.02817864, 8.8377477,
9.01961664, 9.55237357, 10.37329569, 11.38573378, 12.47049097,
13.49985608, 14.35263938, 14.92844055, 15.15946909, 15.01852543,
14.52220321, 13.72893569, 12.73211624, 11.64910292, 10.60740159,
9.72965436, 9.1192006, 8.84791056, 8.94772391, 9.40688936])
plt.figure(figsize = (6, 4))
plt.plot(x, y, 'o', label = "Data")
plt.xlabel("X", fontsize = 12)
plt.ylabel("Y", fontsize = 12)
plt.grid(alpha = 0.3)
plt.show()
m = np.shape(x)[0]
x = np.reshape(x,(-1,1))
y = np.reshape(y,(-1,1))
## write your code here to get coefficients of regression (theta)
#
A = np.hstack([x**0, np.exp(x), np.sin(x), np.cos(x)])
A = np.asmatrix(A)
y = np.asmatrix(y)
theta = (A.T*A).I*A.T*y
print(theta)
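The line above computes the closed-form least-squares estimate via the normal equations:
$$\theta = \left(A^\top A\right)^{-1} A^\top y$$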
The estimated coefficients $ \theta_0 $, $ \theta_1 $, $ \theta_2 $, $ \theta_3 $ quantify the strength and type of contribution of each term in the nonlinear regression model for predicting the temperature of a semiconductor. These coefficients correspond to the following terms, respectively:
$ 1 $: The coefficient $ \theta_0 $ is the intercept term, representing a constant baseline contribution to $y$.
$ e^{x} $: The coefficient $ \theta_1 $ scales the exponential effect of the distance between atoms. A larger absolute value of $ \theta_1 $ suggests a stronger exponential influence on the temperature.
$ \sin(x) $: The coefficient $ \theta_2 $ is associated with the sine of the distance between atoms. Since sine oscillates between -1 and 1, $ \theta_2 $ reflects the amplitude of this periodic influence.
$ \cos(x) $: Similarly, the coefficient $ \theta_3 $ corresponds to the cosine of the distance. The magnitude of $ \theta_3 $ indicates the significance of the cosine's oscillation effect on temperature.
Compute the prediction yp over the grid xp below.
xp = np.linspace(-5,5,30).reshape(-1,1)
## write down your code
#
yp = theta[0,0] + theta[1,0]*np.exp(xp) + theta[2,0]*np.sin(xp) + theta[3,0]*np.cos(xp)
plt.figure(figsize = (6, 4))
plt.plot(x, y, 'o', label = "Data")
plt.plot(xp, yp, 'r')
plt.xlabel("X", fontsize = 12)
plt.ylabel("y", fontsize = 12)
plt.grid(alpha = 0.3)
plt.show()
y_true = 14.822
xp = 8
## write your code here
#
yp = theta[0,0] + theta[1,0]*np.exp(xp) + theta[2,0]*np.sin(xp) + theta[3,0]*np.cos(xp)
abs(((yp-y_true)/y_true))*100
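This evaluates the percentage relative error of the extrapolated prediction at $x = 8$:
$$E = \left\lvert \frac{\hat{y} - y_{\text{true}}}{y_{\text{true}}} \right\rvert \times 100 \;(\%)$$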
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-5,5,30)
y = np.array([15.16043501, 14.93790538, 14.3694888, 13.52210637, 12.49552255,
11.41059962, 10.39506827, 9.56848955, 9.02817864, 8.8377477,
9.01961664, 9.55237357, 10.37329569, 11.38573378, 12.47049097,
13.49985608, 14.35263938, 14.92844055, 15.15946909, 15.01852543,
14.52220321, 13.72893569, 12.73211624, 11.64910292, 10.60740159,
9.72965436, 9.1192006, 8.84791056, 8.94772391, 9.40688936])
m = np.shape(x)[0]
x = np.reshape(x,(-1,1))
y = np.reshape(y,(-1,1))
## write down your code to get coefficients of polynomial regression (theta)
#
B = np.hstack([x**i for i in range(7)]) # design matrix for a degree-6 polynomial
B = np.asmatrix(B)
y = np.asmatrix(y)
theta = (B.T*B).I*B.T*y
print(theta)
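As an optional sanity check (not part of the required solution), the same degree-6 fit can be reproduced with np.polyfit; it returns coefficients from the highest degree down, so they are reversed here to match theta.
# optional cross-check; np.polyfit orders coefficients from highest to lowest degree
theta_check = np.polyfit(np.asarray(x).ravel(), np.asarray(y).ravel(), deg = 6)[::-1]
print(theta_check)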
Compute the prediction yp over the grid xp below.
xp = np.linspace(-5,5,30).reshape(-1,1)
## write down your code
#
yp = theta[0,0] + theta[1,0]*xp + theta[2,0]*xp**2 + theta[3,0]*xp**3 + theta[4,0]*xp**4 + theta[5,0]*xp**5 + theta[6,0]*xp**6
plt.figure(figsize = (6, 4))
plt.plot(x, y, 'o', label = "Data")
plt.plot(xp, yp, 'r')
plt.xlabel("X", fontsize = 12)
plt.ylabel("y", fontsize = 12)
plt.grid(alpha = 0.3)
plt.show()
y_true = 14.822
xp = 8
## write your code here
#
yp = theta[0,0] + theta[1,0]*xp + theta[2,0]*xp**2 + theta[3,0]*xp**3 + theta[4,0]*xp**4 + theta[5,0]*xp**5 + theta[6,0]*xp**6
print(abs(((yp-y_true)/y_true))*100)
Load lasso_input_data.pkl and lasso_target_data.pkl using the following command.
from google.colab import drive
drive.mount('/content/drive')
from six.moves import cPickle
x = cPickle.load(open('/content/drive/MyDrive/ML_Colab/ML_data/lasso_input_data.pkl', 'rb'))
y = cPickle.load(open('/content/drive/MyDrive/ML_Colab/ML_data/lasso_target_data.pkl', 'rb'))
print(x.shape)
print(y.shape)
In this problem, you are asked to predict a target value $\hat y$ from 13 features (input data). Assume that your model has the following relationship ($\omega_0 = 0$):
$$\hat{y} = \omega_1 x_1 + \omega_2 x_2 + \cdots + \omega_{13} x_{13}$$
The coefficients are found by solving the lasso problem with $\lambda = 20$:
$$\min_{\omega}\; \lVert X\omega - y \rVert_2^2 + \lambda\,\lVert \omega \rVert_1$$
(Hint: you can use the cvxpy module to solve the above optimization problem.)
# Write your code here
#
import cvxpy as cvx
A = x
y = y.reshape(-1, 1)
d = 13
lamb = 20
theta = cvx.Variable([d, 1])
obj = cvx.Minimize(cvx.sum_squares(A@theta - y) + lamb*cvx.norm(theta, 1))
prob = cvx.Problem(obj, [])
result = prob.solve()
print(theta.value)
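For reference, a cross-check with scikit-learn is sketched below (assuming sklearn is available); its Lasso minimizes $\frac{1}{2m}\lVert y - X\omega \rVert_2^2 + \alpha\lVert \omega \rVert_1$, so $\alpha = \lambda/(2m)$ should approximately reproduce the cvxpy solution.
# optional cross-check (assumes scikit-learn is installed)
# sklearn's Lasso objective is (1/(2m))*||y - X w||^2 + alpha*||w||_1,
# so alpha = lamb/(2m) matches the cvxpy objective up to a constant factor
from sklearn.linear_model import Lasso
lasso = Lasso(alpha = lamb/(2*A.shape[0]), fit_intercept = False)
lasso.fit(A, np.ravel(y))
print(lasso.coef_)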
# Write your code here
#
plt.figure(figsize = (6, 4))
plt.title(r'magnitude of $\theta$', fontsize = 12)
plt.xlabel(r'$\theta$', fontsize = 12)
plt.ylabel('magnitude', fontsize = 12)
plt.stem(np.arange(13), np.ravel(theta.value))
plt.xlim([-0.5, 12.5])
plt.grid(alpha = 0.3)
plt.show()
# Write your code here
#
d = 4
features = [3, 5, 7, 10] # indices of the four features to keep (e.g., those with nonzero lasso coefficients above)
low_A = A[:, features]
theta = cvx.Variable([d, 1])
obj = cvx.Minimize(cvx.sum_squares(low_A@theta - y))
prob = cvx.Problem(obj, [])
result = prob.solve()
print(theta.value)
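Equivalently, this unregularized refit on the selected columns is an ordinary least-squares problem; a quick numpy-only cross-check is sketched below.
# direct least-squares solution on the reduced feature set (optional cross-check)
theta_ls, res, rank, sv = np.linalg.lstsq(low_A, y, rcond = None)
print(theta_ls)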
We are going to classify some fashion items by FDA (LDA). Let's load the dataset.
from google.colab import drive
drive.mount('/content/drive')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
data = np.load('/content/drive/MyDrive/ML_Colab/ML_data/fashion_data.npy')
data_labels = np.load('/content/drive/MyDrive/ML_Colab/ML_data/fashion_data_labels.npy')
target_dict = {
0: 'T-shirt/top',
1: 'Trouser',
2: 'Pullover',
3: 'Dress',
4: 'Coat',
5: 'Sandal',
6: 'Shirt',
7: 'Sneaker',
8: 'Bag',
9: 'Ankle boot',
}
plt.figure(figsize = (8, 8))
for i in range(0, 20):
    plt.subplot(5, 5, i+1)
    plt.imshow(data[i], 'gray')
    plt.title(target_dict[(data_labels[i])], fontsize = 10)
    plt.xticks([])
    plt.yticks([])
## fill out the blank
#
T_shirt_top_data = data[data_labels == 0]
Trouser_data = data[data_labels == 1]
## Write your code here
#
T_shirt_top_data = T_shirt_top_data[:1000]
Trouser_data = Trouser_data[:1000]
print(Trouser_data.shape)
## Write your code here
#
plt.figure(figsize = (8, 8))
for i in range(0, 20):
    plt.subplot(5, 5, i+1)
    plt.imshow(T_shirt_top_data[i], 'gray')
    plt.title(target_dict[0])
    plt.xticks([])
    plt.yticks([])
## Write your code here
#
plt.figure(figsize = (8, 8))
for i in range(0, 20):
    plt.subplot(5, 5, i+1)
    plt.imshow(Trouser_data[i], 'gray')
    plt.title(target_dict[1])
    plt.xticks([])
    plt.yticks([])
Let's extract two features from T_shirt_top and Trouser. The following two features are recommended:
(feature 1) the variance of the pixel values over the entire image,
(feature 2) the average of the pixel values at the center of the image (img[15:25, 13:15]).
## fill out the blank
#
feature_T_shirt_top_var = np.var(T_shirt_top_data, axis = (1, 2))
feature_Trouser_var = np.var(Trouser_data, axis = (1, 2))
feature_T_shirt_top_mean = np.mean(T_shirt_top_data[:,15:25,13:15], axis = (1, 2))
feature_Trouser_mean = np.mean(Trouser_data[:,15:25,13:15], axis = (1, 2))
## Write your code here
#
feature_T_shirt_top_var = feature_T_shirt_top_var.reshape(len(T_shirt_top_data),-1)
feature_Trouser_var = feature_Trouser_var.reshape(len(Trouser_data),-1)
feature_T_shirt_top_mean = feature_T_shirt_top_mean.reshape(len(T_shirt_top_data),-1)
feature_Trouser_mean = feature_Trouser_mean.reshape(len(Trouser_data),-1)
## Write your code here
#
plt.figure(figsize = (6, 4))
plt.title('Feature Space', fontsize = 12)
plt.plot(feature_T_shirt_top_var, feature_T_shirt_top_mean, '.', label = 'T_shirt_top', color = 'b')
plt.plot(feature_Trouser_var, feature_Trouser_mean, '.', label = 'Trouser', color = 'hotpink')
plt.grid(alpha = 0.3)
plt.legend(fontsize = 12)
plt.xlabel('Feature 1', fontsize = 12)
plt.ylabel('Feature 2', fontsize = 12)
plt.show()
from sklearn import discriminant_analysis
## Write your code here
#
Theta0 = np.hstack([feature_T_shirt_top_var, feature_T_shirt_top_mean])
Theta1 = np.hstack([feature_Trouser_var, feature_Trouser_mean])
X = np.vstack([Theta0, Theta1])
y = np.vstack([np.ones([Theta0.shape[0], 1]), np.zeros([Theta1.shape[0], 1])])
# projection line
n0 = len(T_shirt_top_data)
n1 = len(Trouser_data)
mu0 = np.mean(Theta0.T, axis = 1)
mu1 = np.mean(Theta1.T, axis = 1)
S0 = 1/(n0 - 1)*np.matrix(Theta0.T - mu0.reshape(2,1))*np.matrix(Theta0.T - mu0.reshape(2,1)).T
S1 = 1/(n1 - 1)*np.matrix(Theta1.T - mu1.reshape(2,1))*np.matrix(Theta1.T - mu1.reshape(2,1)).T
prj_w = (n0*S0 + n1*S1).I*(mu0.reshape(2,1) - mu1.reshape(2,1))
y1 = Theta0*prj_w
y2 = Theta1*prj_w
plt.figure(figsize = (6, 4))
plt.title('Histogram', fontsize = 12)
plt.hist(y1, 50, label = 'T_shirt_top', color = 'b', rwidth = 0.6)
plt.hist(y2, 50, label = 'Trouser', color = 'Hotpink', rwidth = 0.6)
plt.legend()
plt.show()
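The direction prj_w computed above is the Fisher discriminant solution, which maximizes the between-class separation relative to the within-class scatter:
$$\omega \;\propto\; \left(n_0 S_0 + n_1 S_1\right)^{-1}\left(\mu_0 - \mu_1\right)$$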
## Write your code here
#
clf = discriminant_analysis.LinearDiscriminantAnalysis()
clf.fit(X, np.ravel(y))
# classifier boundary, not a projection line
clf_w = clf.coef_[0]
clf_w0 = clf.intercept_[0]
print(clf_w)
print(clf_w0)
xp = np.arange(-0.5, 1, 0.1)
prj_yp = prj_w[1,0]/prj_w[0,0]*xp
clf_yp = - clf_w[0]/clf_w[1]*xp - clf_w0/clf_w[1]
plt.figure(figsize = (6, 6))
plt.plot(Theta0[:,0], Theta0[:,1], '.', color = 'b', label = 'T_shirt_top')
plt.plot(Theta1[:,0], Theta1[:,1], '.', color = 'hotpink', label ='Trouser')
plt.plot(xp, prj_yp, 'k', label = 'FDA(LDA) projection line', linewidth = 2.5)
plt.plot(xp, clf_yp, 'g', label = 'FDA(LDA) classification boundary', linewidth = 2.5)
plt.legend(fontsize = 10)
plt.grid(alpha = 0.3)
plt.axis('equal')
plt.ylim([-0.2, 1])
plt.xlim([-0.05, 0.24])
plt.xlabel('Feature 1', fontsize = 12)
plt.ylabel('Feature 2', fontsize = 12)
plt.show()
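The green boundary above is the zero level set of the LDA decision function, which is different from the projection line:
$$\omega_1 x_1 + \omega_2 x_2 + \omega_0 = 0 \quad\Longrightarrow\quad x_2 = -\frac{\omega_1}{\omega_2}\,x_1 - \frac{\omega_0}{\omega_2}$$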
Let's consider a situation where Sam has a dataset (denoted as $Z$ below) that consists of two types of battery data: lithium battery data (denoted as $X$ below) and aluminum battery data (denoted as $Y$ below), where $Z = X \cup Y$.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
X = np.random.uniform(low = -1, high = 1, size = (100, 2))
X[:, 1] = 5*X[:, 0]**2 + np.random.normal(loc = 0, scale = 1, size = 100) + 5
Y = np.random.uniform(low = -1, high = 1, size = (100, 2))
Y[:, 1] = -8*(Y[:, 0]-0.4)**2 + np.random.normal(loc = 0, scale = 1, size = 100)
Z = np.concatenate([X, Y])
# X: lithium battery data, Y: aluminum battery data, Z: battery dataset
# X.shape = (100,2)
# Y.shape = (100,2)
# Z.shape = (200,2)
## your code here for non-linear regression algorithm
#
X_values = Z[:, 0]
Y_values = Z[:, 1]
X_values = np.reshape(X_values,(-1,1))
Y_values = np.reshape(Y_values,(-1,1))
m = np.shape(X_values)[0]
X_poly = np.hstack([np.ones([m,1]), X_values, X_values**2, X_values**3, X_values**4, X_values**5])
y = np.asmatrix(Y_values)
X_poly = np.asmatrix(X_poly)
theta = np.ones([6, 1])
lamda = 1
alpha = 1e-3
n_iter = 100000
for i in range(n_iter):
    grad = 2*X_poly.T*X_poly*theta - 2*X_poly.T*y + 2*lamda*theta
    theta = theta - alpha*grad
print(theta)
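The loop above runs gradient descent on the ridge-regularized least-squares objective (with $X$ = X_poly):
$$J(\theta) = \lVert X\theta - y \rVert_2^2 + \lambda\lVert \theta \rVert_2^2, \qquad \nabla J(\theta) = 2X^\top X\theta - 2X^\top y + 2\lambda\theta, \qquad \theta \leftarrow \theta - \alpha\,\nabla J(\theta)$$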
## your code here for the plot
#
xp = np.linspace(-1, 1, 1000)
yp = theta[0,0] + theta[1,0]*xp + theta[2,0]*xp**2 + theta[3,0]*xp**3 + theta[4,0]*xp**4 + theta[5,0]*xp**5
plt.figure(figsize = (6, 4))
plt.plot(X_values, Y_values, 'o')
plt.plot(xp,yp, 'r')
plt.xlabel('feature 0', fontsize = 12)
plt.ylabel('feature 1', fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.show()
## your code here for the classification algorithm
#
X0 = np.hstack([np.ones([X.shape[0],1]), X])
X1 = np.hstack([np.ones([Y.shape[0],1]), Y])
T = np.vstack([X0, X1])
T = np.asmatrix(T)
import cvxpy as cvx
m = 100
d = 2
n = 100
g = 1
w = cvx.Variable((d+1,1))
u = cvx.Variable((m, 1))
v = cvx.Variable((m, 1))
obj = cvx.Minimize(cvx.norm(w, 2) + g*(np.ones([1, 100])@u + np.ones([1, 100])@v))
const = [X0@w >= 1-u, X1@w <= -(1-v), u >= 0, v >= 0 ]
prob = cvx.Problem(obj, const).solve()
w = w.value
print(w)
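The problem solved above is a soft-margin linear classifier with slack variables $u$ and $v$ for the two classes ($\gamma$ = g in the code):
$$\min_{\omega,\,u,\,v}\; \lVert \omega \rVert_2 + \gamma\left(\mathbf{1}^\top u + \mathbf{1}^\top v\right) \quad \text{subject to} \quad X_0\,\omega \ge 1 - u, \quad X_1\,\omega \le -(1 - v), \quad u \ge 0, \quad v \ge 0$$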
## your code here for the plot
#
xp = np.linspace(-1, 1)
yp = - w[1,0]/w[2,0]*xp - w[0,0]/w[2,0]
plt.figure(figsize = (6, 4))
plt.plot(X[:,0], X[:,1], '.', label = 'Lithium')
plt.plot(Y[:,0], Y[:,1], '.', label = 'Aluminum')
plt.plot(xp,yp, 'r')
plt.xlabel('feature 0', fontsize = 12)
plt.ylabel('feature 1', fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.legend(fontsize = 12)
plt.show()
## your code here for non-linear regression algorithm on the lithium battery data
#
X_values1 = X[:, 0]
Y_values1 = X[:, 1]
X_values1 = np.reshape(X_values1,(-1, 1))
Y_values1 = np.reshape(Y_values1,(-1, 1))
m = np.shape(X_values1)[0]
X_poly1 = np.hstack([np.ones([m,1]), X_values1, X_values1**2, X_values1**3, X_values1**4, X_values1**5])
y1 = np.asmatrix(Y_values1)
X_poly1 = np.asmatrix(X_poly1)
theta1 = np.ones([6, 1])
lamda = 1
alpha = 1e-3
n_iter = 100000
for i in range(n_iter):
    grad = 2*X_poly1.T*X_poly1*theta1 - 2*X_poly1.T*y1 + 2*lamda*theta1
    theta1 = theta1 - alpha*grad
print(theta1)
## your code here for the plot
#
xp = np.linspace(-1, 1, 1000)
yp1 = theta1[0,0] + theta1[1,0]*xp + theta1[2,0]*xp**2 + theta1[3,0]*xp**3 + theta1[4,0]*xp**4 + theta1[5,0]*xp**5
plt.figure(figsize = (6, 4))
plt.plot(X_values1, Y_values1, 'o', label = 'Lithium')
plt.plot(xp,yp1, 'r')
plt.xlabel('feature 0', fontsize = 12)
plt.ylabel('feature 1', fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.show()
## your code here for non-linear regression algorithm on the aluminum battery data
#
X_values2 = Y[:, 0]
Y_values2 = Y[:, 1]
X_values2 = np.reshape(X_values2,(-1,1))
Y_values2 = np.reshape(Y_values2,(-1,1))
m = np.shape(X_values2)[0]
X_poly2 = np.hstack([np.ones([m,1]), X_values2, X_values2**2, X_values2**3, X_values2**4, X_values2**5])
y2 = np.asmatrix(Y_values2)
X_poly2 = np.asmatrix(X_poly2)
theta2 = np.ones([6, 1])
lamda = 1
alpha = 1e-3
n_iter = 100000
for i in range(n_iter):
    grad = 2*X_poly2.T*X_poly2*theta2 - 2*X_poly2.T*y2 + 2*lamda*theta2
    theta2 = theta2 - alpha*grad
print(theta2)
## your code here for the plot
#
xp = np.linspace(-1, 1, 1000)
yp2 = theta2[0,0] + theta2[1,0]*xp + theta2[2,0]*xp**2 + theta2[3,0]*xp**3 + \
theta2[4,0]*xp**4 + theta2[5,0]*xp**5
plt.figure(figsize = (6, 4))
plt.plot(X_values2, Y_values2, 'o', label = 'Aluminum')
plt.plot(xp,yp2, 'r')
plt.xlabel('feature 0', fontsize = 12)
plt.ylabel('feature 1', fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.show()
## your code here for the aggregated plot
#
plt.figure(figsize = (6, 4))
plt.plot(X_values1, Y_values1, 'o', label = 'Lithium')
plt.plot(X_values2, Y_values2, 'o', label = 'Aluminum')
plt.plot(xp,yp1, 'r', linewidth = 3)
plt.plot(xp,yp2, 'g', linewidth = 3)
plt.xlabel('feature 0', fontsize = 12)
plt.ylabel('feature 1', fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.legend(fontsize = 12)
plt.show()
Take-home message: when you receive data for regression analysis, it is essential to look at the data thoroughly and check whether it can be separated into different classes. Fitting a single regression model to data that actually contains multiple classes results in poor performance, while fitting a separate model to each class performs well.
We would like to continue the above problem where Sam has a dataset (denoted as Z below) that consists of two types of battery data: lithium battery data (denoted as X below) and aluminum battery data (denoted as Y below) where $Z = X \cup Y$. However, let's consider a situation where Sam does not know the labels of each dataset. That means Sam does not know whether the given data belongs to lithium batteries or aluminum batteries. In other words, Sam is just given the data $Z$ without $X$ and $Y$ information. We will first use K-means to assign labels (class 0 & class 1) and employ supervised regression algorithms on each dataset.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
X = np.random.uniform(low = -1, high = 1, size = (100, 2))
X[:, 1] = 5*X[:, 0]**2 + np.random.normal(loc = 0, scale = 1, size = 100) + 5
Y = np.random.uniform(low = -1, high = 1, size = (100, 2))
Y[:, 1] = -8*(Y[:, 0]-0.4)**2 + np.random.normal(loc = 0, scale = 1, size = 100)
Z = np.concatenate([X, Y])
# X: lithium battery data, Y: aluminum battery data, Z: battery dataset
# X.shape = (100,2)
# Y.shape = (100,2)
# Z.shape = (200,2)
## your code here for regression algorithm
#
X_values = Z[:, 0]
Y_values = Z[:, 1]
X_values = np.reshape(X_values,(-1,1))
Y_values = np.reshape(Y_values,(-1,1))
from sklearn import neighbors
reg = neighbors.KNeighborsRegressor(n_neighbors = 30)
reg.fit(X_values, Y_values)
## your code here for the plot
#
xp = np.linspace(-1, 1, 100).reshape(-1, 1)
yp = reg.predict(xp)
plt.figure(figsize = (6, 4))
plt.title('k-Nearest Neighbor Regression', fontsize = 12)
plt.plot(X_values,Y_values, '.', label = 'Original Data')
plt.plot(xp, yp, label = 'kNN')
plt.legend(fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.show()
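For reference, the kNN regressor predicts the average target of the $k$ nearest training points:
$$\hat{y}(x) = \frac{1}{k}\sum_{i \in \mathcal{N}_k(x)} y_i$$
With $k = 30$ on the pooled data, predictions average over both classes, which is why the fit above looks poor.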
## your code here for K-means clustering algorithm
#
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters = 2, random_state = 0)
kmeans.fit(Z)
plt.figure(figsize = (6, 4))
plt.plot(Z[kmeans.labels_ == 0, 0], Z[kmeans.labels_ == 0, 1], 'g.', label = 'C0')
plt.plot(Z[kmeans.labels_ == 1, 0], Z[kmeans.labels_ == 1, 1], 'r.', label = 'C1')
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.legend(fontsize = 12)
plt.show()
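K-means partitions $Z$ into $k = 2$ clusters by minimizing the within-cluster sum of squared distances to the centroids:
$$\min_{\{\mathcal{C}_j\},\,\{\mu_j\}} \sum_{j=1}^{2} \sum_{z_i \in \mathcal{C}_j} \lVert z_i - \mu_j \rVert_2^2$$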
## your code here for regression algorithm on the data with K-means label of C0
#
X_values1 = Z[kmeans.labels_ == 0, 0]
Y_values1 = Z[kmeans.labels_ == 0, 1]
X_values1 = np.reshape(X_values1,(-1, 1))
Y_values1 = np.reshape(Y_values1,(-1, 1))
from sklearn import neighbors
reg1 = neighbors.KNeighborsRegressor(n_neighbors = 8)
reg1.fit(X_values1, Y_values1)
## your code here for the plot
#
xp = np.linspace(-1, 1, 100).reshape(-1, 1)
yp = reg1.predict(xp)
plt.figure(figsize = (6, 4))
plt.title('k-Nearest Neighbor Regression 1', fontsize = 12)
plt.plot(X_values1,Y_values1, 'g.', label = 'C0 Data')
plt.plot(xp, yp, label = 'kNN')
plt.legend(fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.show()
## your code here for regression algorithm on the data with K-means label of C1
#
X_values2 = Z[kmeans.labels_ == 1, 0]
Y_values2 = Z[kmeans.labels_ == 1, 1]
X_values2 = np.reshape(X_values2,(-1, 1))
Y_values2 = np.reshape(Y_values2,(-1, 1))
from sklearn import neighbors
reg2 = neighbors.KNeighborsRegressor(n_neighbors = 8)
reg2.fit(X_values2, Y_values2)
## your code here for the plot
#
xp = np.linspace(-1, 1, 100).reshape(-1, 1)
yp = reg2.predict(xp)
plt.figure(figsize = (6, 4))
plt.title('k-Nearest Neighbor Regression 2', fontsize = 12)
plt.plot(X_values2,Y_values2, 'r.', label = 'C1 Data')
plt.plot(xp, yp, label = 'kNN')
plt.legend(fontsize = 12)
plt.xlim([-1.3, 1.3])
plt.ylim([-20, 13])
plt.grid(alpha = 0.3)
plt.show()
In this problem we will refer to the binary classification task, which we attempt to solve with the simple linear logistic regression model
$$P(y = 1 \mid x, \omega) = \sigma\left(\omega_1 x_1 + \omega_2 x_2\right) = \frac{1}{1 + \exp\left(-(\omega_1 x_1 + \omega_2 x_2)\right)}$$
(for simplicity we do not use the bias parameter $\omega_0$, i.e., we set $\omega_0 = 0$).
import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
%matplotlib inline
x = np.load("/content/drive/MyDrive/ML_Colab/ML_data/logistic_reg_x.npy")
y = np.load("/content/drive/MyDrive/ML_Colab/ML_data/logistic_reg_y.npy")
plt.figure(figsize = (6, 4))
plt.plot(x[y==1, 0], x[y==1, 1], 'or', label = 'Class:1')
plt.plot(x[y==0, 0], x[y==0, 1], 'ob', label = 'Class:0')
plt.legend(fontsize = 12)
plt.axis('equal')
plt.xlim([-5, 5])
plt.ylim([-5,5])
plt.grid(alpha = 0.3)
plt.show()
y = np.reshape(y, (-1,1))
x = np.asmatrix(x)
y = np.asmatrix(y)
w = cvx.Variable([2,1])
obj = cvx.Maximize(y.T@x@w - cvx.sum(cvx.logistic(x@w)))
prob = cvx.Problem(obj).solve()
w = w.value
xp = np.linspace(-6,6,100)
yp = -w[0,0]/w[1,0]*xp
x = np.array(x)
y = np.array(y)
plt.figure(figsize = (6, 4))
plt.plot(x[y[:,0]==1, 0], x[y[:,0]==1, 1], 'or', label = 'Class:1')
plt.plot(x[y[:,0]==0, 0], x[y[:,0]==0, 1], 'ob', label = 'Class:0')
plt.plot(xp,yp, 'k', label = 'Logistic Regression')
plt.legend(fontsize = 12)
plt.axis('equal')
plt.xlim([-5, 5])
plt.ylim([-5,5])
plt.grid(alpha = 0.3)
plt.show()
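The cvxpy objective above is the logistic log-likelihood, which is concave in $\omega$:
$$\ell(\omega) = \sum_{i=1}^{m}\left[\, y_i\, x_i^\top \omega - \log\left(1 + \exp\left(x_i^\top \omega\right)\right) \right]$$
Next, we minimize the negative log-likelihood with an L2 penalty on $\omega_2$: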
$$ \min \sum_{i=1}^{m} - \log P\left( y_i \mid x_i, \, \omega_1, \, \omega_2\right) + \lambda \,\omega_2^2 $$
x = np.asmatrix(x)
y = np.asmatrix(y)
lamda = 100
w = cvx.Variable([2, 1])
obj = cvx.Minimize(- y.T@x@w + cvx.sum(cvx.logistic(x@w)) + lamda*w[1,0]**2)
prob = cvx.Problem(obj).solve()
w = w.value
xp = np.linspace(-6,6,100)
yp = -w[0,0]/w[1,0]*xp
x = np.array(x)
y = np.array(y)
plt.figure(figsize = (6, 4))
plt.plot(x[y[:,0]==1,0], x[y[:,0]==1,1], 'or', label='Class:1')
plt.plot(x[y[:,0]==0,0], x[y[:,0]==0,1], 'ob', label='Class:0')
plt.plot(xp,yp, 'k', label = 'Logistic Regression')
plt.legend(fontsize = 12)
plt.axis('equal')
plt.xlim([-5, 5])
plt.ylim([-5,5])
plt.grid(alpha = 0.3)
plt.show()
This regularization term penalizes $\omega_2^2$, driving $\omega_2$ to be as small as possible. Next, minimize the negative log-likelihood with an L1 penalty on $\omega$ instead:
$$ \min \sum_{i=1}^{m} - \log P\left( y_i \mid x_i,\omega_1, \omega_2\right) + \lambda \,\left( \lvert \omega_1 \rvert + \lvert \omega_2 \rvert \right)$$
x = np.asmatrix(x)
y = np.asmatrix(y)
lamda = 0.1
w = cvx.Variable([2, 1])
obj = cvx.Minimize(- y.T@x@w + cvx.sum(cvx.logistic(x@w)) + lamda*cvx.norm(w,1))
prob = cvx.Problem(obj).solve()
w = w.value
xp = np.linspace(-6,6,100)
yp = -w[0,0]/w[1,0]*xp
x = np.array(x)
y = np.array(y)
plt.figure(figsize = (6, 4))
plt.plot(x[y[:,0]==1,0], x[y[:,0]==1,1], 'or', label='Class:1')
plt.plot(x[y[:,0]==0,0], x[y[:,0]==0,1], 'ob', label='Class:0')
plt.plot(xp,yp, 'k', label ='Logistic Regression')
plt.legend(fontsize = 12)
plt.axis('equal')
plt.xlim([-5, 5])
plt.ylim([-5,5])
plt.grid(alpha = 0.3)
plt.show()
This regularization term penalizes $\omega$ in the sense of the L1 norm, which makes it small and sparse at the same time. Here it results in $\omega_1 = 0$.