Optimization for Machine Learning
Table of Contents
3 key components
Procedures
In mathematical expression
$\;\;\; $where
Remarks) equivalent
$\;\;\; $where
$f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a convex function and
Feasible region $\mathcal{C}$ is a convex set
Remarks)
Key property of convex optimization:
Linear Interpolation between Two Points
Convex Function
Convex Set
Generalization for multivariate function $f:\mathbb{R}^n \rightarrow \ \mathbb{R}$
$$
x = \begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix} \quad \quad \quad \quad \nabla _x f(x) = \begin{bmatrix}
\partial f(x) \over \partial x_1 \\
\vdots\\
\partial f(x) \over \partial x_n
\end{bmatrix}
$$
Direct solution
$$
\begin{align*}
f(x) &= 2x_1^2+ x_2^2 + x_1 x_2 -6 x_1 -5 x_2\\\\
\Longrightarrow \nabla _x f(x) &= \begin{bmatrix}
4x_1+x_2-6\\
2x_2 + x_1 -5
\end{bmatrix} = \begin{bmatrix}0\\0 \end{bmatrix}\\\\
\therefore x^* &= \begin{bmatrix}
4 & 1\\
1 & 2
\end{bmatrix} ^{-1} \begin{bmatrix}
6 \\ 5\\
\end{bmatrix} = \begin{bmatrix}
1 \\ 2\\
\end{bmatrix}
\end{align*}
$$
Exampels
Note: Revisit Least-Square Solution of $J(x) = \lVert Ax - y \rVert ^2$
Iterative methods
$$ x \leftarrow x - \alpha \nabla _x f(x) \quad \quad \text{for some step size } \alpha > 0$$
$\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \qquad \qquad \bullet \, \text{Random initialization}$
$\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \qquad \qquad \bullet \, \text{Multiple trials}$
$$
\begin{align*}
\min& \quad (x_1-3)^{2} + (x_2-3)^{2}\\\\
=\min& \quad \frac{1}{2} \begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} -
\begin{bmatrix} 6 & 6 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + 18
\end{align*}
$$
import numpy as np
H = np.matrix([[2, 0],[0, 2]])
g = -np.matrix([[6],[6]])
x = np.zeros((2,1))
alpha = 0.2
for i in range(25):
df = H*x + g
x = x - alpha*df
print(x)
$$
\begin{align*} \max \; & \; 3x_1 + \frac{3}{2}x_2 \quad \quad \leftarrow \text{objective function}\\ \\
\text{subject to} \; & -1 \leq x_1 \leq 2 \quad \leftarrow \text{constraints}\\
& \quad 0 \leq x_2 \leq 3
\end{align*}
$$
Method 1: graphical approach
$$ 3 x_1 + 1.5 x_2 = C \qquad \Rightarrow \qquad x_2 = -2 x_1 + \frac{2}{3}C $$Method 2: Gradient Descent
$$
\begin{array}{Icr}\begin{align*}
\max_{x} \quad & 3x_1 + {3 \over 2}x_2 \\
\text{subject to} \quad
& -1 \leq x_1 \leq 2 \\
& \quad 0 \leq x_2 \leq 3
\end{align*}\end{array}
\quad\implies\quad
\begin{array}{I}
\quad \min_{x} \quad & - \begin{bmatrix} 3 \\ 3 / 2 \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\
\text{subject to} \quad
& \begin{bmatrix} -1 \\ 0 \end{bmatrix} \leq \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \leq \begin{bmatrix} 2 \\ 3 \\ \end{bmatrix}
\end{array}
$$
f = -np.array([[3],
[3/2]])
lb = np.array([[-1],
[0]])
ub = np.array([[2],
[3]])
x = np.zeros(shape = (2, 1))
alpha = 0.2
for i in range(25):
x = x - alpha * f
# lb constraints
lb_TF = lb < x
x = x * lb_TF + lb * ~lb_TF
# ub constraints
ub_TF = x < ub
x = x * ub_TF + ub * ~ub_TF
print(x)
Method 3: CVXPY-based solver
$$
\begin{array}{Icr}\begin{align*}
\max_{x} \quad & 3x_1 + {3 \over 2}x_2 \\
\text{subject to} \quad
& -1 \leq x_1 \leq 2 \\
& \quad 0 \leq x_2 \leq 3
\end{align*}\end{array}
\quad\implies\quad
\begin{array}{I}
\quad \min_{x} \quad & - \begin{bmatrix} 3 \\ 3 / 2 \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\
\text{subject to} \quad
& \begin{bmatrix} -1 \\ 0 \end{bmatrix} \leq \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \leq \begin{bmatrix} 2 \\ 3 \\ \end{bmatrix}
\end{array}
$$
import numpy as np
import cvxpy as cvx
f = np.array([[3], [3/2]])
lb = np.array([[-1], [0]])
ub = np.array([[2], [3]])
x = cvx.Variable([2,1])
obj = cvx.Minimize(-f.T @ x)
constraints = [lb <= x, x <= ub]
prob = cvx.Problem(obj, constraints)
result = prob.solve()
print(x.value)
print(result)
$$
\begin{array}{Icr}\begin{align*}
\min \quad &\frac{1}{2}x_1^2+3x_1+4x_2 \\
\text{subject to} \quad
& x_1+3x_2 \geq 15 \\
& 2x_1+5x_2 \leq 100 \\
& 3x_1+4x_2 \leq 80 \\
& x_1,x_2 \geq 0 \\
\end{align*}\end{array}
\quad\implies\quad
\begin{array}{I}
\min_{X} \quad & X^T HX + f^T X \\
\text{subject to} \quad
& AX \leq b \\
& A_{eq}X = b_{eq} \\
& lb \leq X \leq ub
\end{array}
$$
$$
\begin{array}{Icr}\begin{align*}
\min \quad &\frac{1}{2}x_1^2+3x_1+4x_2 \\
\text{subject to} \quad
& x_1+3x_2 \geq 15 \\
& 2x_1+5x_2 \leq 100 \\
& 3x_1+4x_2 \leq 80 \\
& x_1,x_2 \geq 0 \\
\end{align*}\end{array}
\quad\implies\quad
\begin{array}{I}
\min_{X} \quad & \begin{bmatrix}x_1 & x_2\end{bmatrix} \begin{bmatrix} {1 \over 2} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} + \begin{bmatrix} 3 & 4 \end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} \\
\text{subject to} \quad
& \begin{bmatrix} -1 & -3 \\ 2 & 5 \\ 3 & 4 \end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} \leq \begin{bmatrix} -15 \\ 100 \\ 80 \end{bmatrix} \\
& \begin{bmatrix} 0 \\ 0 \end{bmatrix} \leq \begin{bmatrix}x_1 \\ x_2\end{bmatrix}
\end{array}
$$
f = np.array([[3], [4]])
H = np.array([[1/2, 0], [0, 0]])
A = np.array([[-1, -3],
[2, 5],
[3, 4]])
b = np.array([[-15],
[100],
[80]])
lb = np.array([[0],
[0]])
x = cvx.Variable([2,1])
obj = cvx.Minimize(cvx.quad_form(x, H) + f.T@x)
constraints = [A@x <= b, lb <= x]
prob = cvx.Problem(obj, constraints)
result = prob.solve()
print(x.value)
print(result)
f = np.array([[3], [4]])
H = np.array([[1/2, 0], [0, 0]])
A = np.array([[-1, -3],
[2, 5],
[3, 4],
[-1, 0],
[0, -1]])
b = np.array([[-15],
[100],
[80],
[0],
[0]])
x = cvx.Variable([2,1])
obj = cvx.Minimize(cvx.quad_form(x, H) + f.T@x)
constraints = [A@x <= b]
prob = cvx.Problem(obj, constraints)
result = prob.solve()
print(x.value)
print(result)
$$ \min \; d_1 + d_2 = \min \left\rVert \vec{a} - \begin{bmatrix}x\\0\end{bmatrix}\right\rVert_2 + \left\rVert \vec{b} - \begin{bmatrix}x\\0\end{bmatrix}\right\rVert_2$$
a = np.array([[0], [1]])
b = np.array([[4], [2]])
Aeq = np.array([0,1])
beq = 0
x = cvx.Variable([2,1])
mu = 1
obj = cvx.Minimize(cvx.norm(a-x, 2) + mu * cvx.norm(b-x, 2))
constraints = [Aeq @ x == beq]
prob = cvx.Problem(obj, constraints)
result = prob.solve()
print(x.value)
print(result)
%%html
<center><iframe src="https://www.youtube.com/embed/QJSSWNIAPlw?rel=0"
width="420" height="315" frameborder="0" allowfullscreen></iframe></center>
%%html
<center><iframe src="https://www.youtube.com/embed/OJ397Y3fVkY?rel=0"
width="420" height="315" frameborder="0" allowfullscreen></iframe></center>
%%html
<center><iframe src="https://www.youtube.com/embed/tLxsqbLIvx4?rel=0"
width="420" height="315" frameborder="0" allowfullscreen></iframe></center>
%%html
<center><iframe src="https://www.youtube.com/embed/91DpsoKFpLw?rel=0"
width="420" height="315" frameborder="0" allowfullscreen></iframe></center>
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')