Optimization for Machine Learning
Table of Contents

- 3 key components
- Procedures
In mathematical expression

$$
\min_{x} \; f(x) \quad \text{subject to} \quad x \in \mathcal{C}
$$

$\;\;\;$where

- $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a convex function and
- the feasible region $\mathcal{C}$ is a convex set

Remarks) equivalent to the compact form $\min_{x \in \mathcal{C}} f(x)$
Remarks) Key property of convex optimization: any local minimum is also a global minimum.
Linear Interpolation between Two Points

$$ z(\lambda) = \lambda x + (1-\lambda)\, y, \quad \lambda \in [0, 1] $$

Convex Function

$$ f\big(\lambda x + (1-\lambda)\, y\big) \;\le\; \lambda f(x) + (1-\lambda) f(y) \quad \text{for all } \lambda \in [0, 1] $$

Convex Set

$$ x, y \in \mathcal{C} \;\Longrightarrow\; \lambda x + (1-\lambda)\, y \in \mathcal{C} \quad \text{for all } \lambda \in [0, 1] $$
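The convex-function definition above can be checked numerically. This is a minimal sketch using $f(x) = x^2$ (a choice made here for illustration, not from the notes): at every interpolation weight, the function value must lie on or below the chord between the two endpoints.

```python
# Numerical check of the convex-function inequality
#   f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y)
# using f(x) = x**2 as an illustrative convex function.

def f(x):
    return x ** 2

x, y = -1.0, 3.0
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    lhs = f(t * x + (1 - t) * y)        # value on the graph of f
    rhs = t * f(x) + (1 - t) * f(y)     # value on the chord (linear interpolation)
    assert lhs <= rhs + 1e-12           # inequality holds at every t
```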
Generalization for a multivariate function $f:\mathbb{R}^n \rightarrow \mathbb{R}$
$$
x = \begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix} \quad \quad \quad \quad \nabla _x f(x) = \begin{bmatrix}
\dfrac{\partial f(x)}{\partial x_1} \\
\vdots \\
\dfrac{\partial f(x)}{\partial x_n}
\end{bmatrix}
$$
Direct solution: set $\nabla_x f(x) = 0$ and solve for $x$
$$
\begin{align*}
f(x) &= 2x_1^2+ x_2^2 + x_1 x_2 -6 x_1 -5 x_2\\\\
\Longrightarrow \nabla _x f(x) &= \begin{bmatrix}
4x_1+x_2-6\\
2x_2 + x_1 -5
\end{bmatrix} = \begin{bmatrix}0\\0 \end{bmatrix}\\\\
\therefore x^* &= \begin{bmatrix}
4 & 1\\
1 & 2
\end{bmatrix} ^{-1} \begin{bmatrix}
6 \\ 5\\
\end{bmatrix} = \begin{bmatrix}
1 \\ 2\\
\end{bmatrix}
\end{align*}
$$
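The direct solution above reduces to a linear system, which can be verified numerically. A minimal sketch with NumPy, solving $\begin{bmatrix}4 & 1\\ 1 & 2\end{bmatrix} x = \begin{bmatrix}6\\5\end{bmatrix}$ from the example:

```python
import numpy as np

# Sanity check of the direct solution: nabla f(x) = 0 is the linear system
#   [[4, 1], [1, 2]] @ x = [6, 5]
A = np.array([[4.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 5.0])

x_star = np.linalg.solve(A, b)   # numerically preferable to forming the inverse
print(x_star)                    # approximately [1, 2], matching x* above
```

Using `np.linalg.solve` rather than explicitly inverting the matrix is the standard choice: it is faster and more numerically stable, and it agrees with the hand-computed $x^* = [1, 2]^\top$.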
Examples
Note: Revisit the least-squares solution of $J(x) = \lVert Ax - y \rVert ^2$: setting $\nabla_x J(x) = 2A^\top (Ax - y) = 0$ gives the normal equations $A^\top A\, x = A^\top y$, so $x^* = (A^\top A)^{-1} A^\top y$ when $A^\top A$ is invertible.
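The least-squares problem can also be solved directly by setting the gradient to zero. A minimal sketch (the matrix $A$ and vector $y$ below are random data made up for illustration) comparing the normal-equations solution against NumPy's library solver:

```python
import numpy as np

# Least squares: minimizing J(x) = ||Ax - y||^2 via the normal equations
#   A^T A x = A^T y.  A and y are synthetic data for demonstration.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
y = rng.standard_normal(10)

x_normal = np.linalg.solve(A.T @ A, A.T @ y)       # normal-equations solution
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)    # library least-squares solver

assert np.allclose(x_normal, x_lstsq)              # both give the same minimizer
```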
Iterative methods: when no closed-form solution is available, repeatedly apply the gradient-descent update
$$ x \leftarrow x - \alpha \nabla _x f(x) \quad \quad \text{for some step size } \alpha > 0$$
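The update rule can be applied to the earlier example $f(x) = 2x_1^2 + x_2^2 + x_1 x_2 - 6x_1 - 5x_2$; the step size $\alpha = 0.1$ and starting point below are choices made for this sketch:

```python
import numpy as np

# Gradient descent x <- x - alpha * grad f(x) on the example
#   f(x) = 2*x1^2 + x2^2 + x1*x2 - 6*x1 - 5*x2

def grad_f(x):
    # gradient computed earlier: [4*x1 + x2 - 6, x1 + 2*x2 - 5]
    return np.array([4 * x[0] + x[1] - 6,
                     x[0] + 2 * x[1] - 5])

x = np.zeros(2)      # initial guess (arbitrary choice)
alpha = 0.1          # step size (must be small enough to converge)
for _ in range(200):
    x = x - alpha * grad_f(x)

print(x)             # converges toward the direct solution [1, 2]
```

With this quadratic objective the iterates contract toward the unique minimizer, so after a few hundred steps the result matches the closed-form solution $x^* = [1, 2]^\top$ to high precision.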