Linear Algebra

By Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

# 1. Linear Equations¶

• Set of linear equations (two equations, two unknowns)

\begin{align*} 4x_{1} − 5x_{2} &= −13\\ −2x_{1} + 3x_{2} &= 9 \end{align*}

## 1.1. Solving Linear Equations¶

• Two linear equations

\begin{align*} 4x_{1} − 5x_{2} &= −13\\ −2x_{1} + 3x_{2} &= 9 \end{align*}

• In vector form, $Ax = b$, with

$$A = \begin{bmatrix} 4 & -5 \\ -2 & 3 \end{bmatrix} , \quad x = \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} , \quad b = \begin{bmatrix} -13 \\ 9 \end{bmatrix}$$

• Solution using inverse

\begin{align*} Ax &= b \\ A^{-1}Ax &= A^{-1}b \\ x &= A^{-1}b \end{align*}

• Won’t worry here about how to compute the inverse, but it’s very similar to the standard method for solving linear equations

• We will use NumPy to compute it

In [1]:
import numpy as np

In [2]:
A = np.array([[4, -5],[-2, 3]])
print(A)

[[ 4 -5]
[-2  3]]

In [3]:
A = np.array([[4, -5],
              [-2, 3]])
print(A)

[[ 4 -5]
[-2  3]]

In [4]:
A[0]

Out[4]:
array([ 4, -5])
In [5]:
A[1]

Out[5]:
array([-2,  3])
In [6]:
A[0,1]

Out[6]:
-5
In [7]:
A[0][1]

Out[7]:
-5
In [8]:
b = np.array([[-13],[9]])
print(b)

[[-13]
[  9]]


$A^{-1} b$: matrix multiplication

In [9]:
x = np.linalg.inv(A).dot(b)
print(x)

[[3.]
[5.]]

In [10]:
x = np.linalg.inv(A) @ b
print(x)

[[3.]
[5.]]

In [11]:
A = np.asmatrix(A)
b = np.asmatrix(b)

In [12]:
x = A.I*b
print(x)

[[3.]
[5.]]
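
Note that explicitly forming $A^{-1}$ is rarely done in practice; `np.linalg.solve` solves $Ax = b$ directly and is generally faster and more numerically stable. A minimal sketch using the same system:

```python
import numpy as np

# solve Ax = b directly, without forming the inverse
A = np.array([[4, -5],
              [-2, 3]])
b = np.array([[-13],
              [9]])

x = np.linalg.solve(A, b)
print(x)
```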


## 1.2. System of Linear Equations¶

• Consider system of linear equations

\begin{align*} y_1 &= a_{11}x_{1} + a_{12}x_{2} + \cdots + a_{1n}x_{n} \\ y_2 &= a_{21}x_{1} + a_{22}x_{2} + \cdots + a_{2n}x_{n} \\ &\, \vdots \\ y_m &= a_{m1}x_{1} + a_{m2}x_{2} + \cdots + a_{mn}x_{n} \end{align*}

• Can be written in a matrix form as $y = Ax$, where

$$y= \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{m} \end{bmatrix} \qquad A = \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n} \\ a_{21}&a_{22}&\cdots&a_{2n} \\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \\ \end{bmatrix} \qquad x= \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}$$
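
The matrix form can be checked numerically; a small sketch with a hypothetical $3 \times 2$ system (values chosen purely for illustration):

```python
import numpy as np

# hypothetical system with m = 3 equations and n = 2 unknowns
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = np.array([[1],
              [2]])

y = A @ x   # y = Ax, a vector in R^m
print(y)
```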

## 1.3. Elements of a Matrix¶

• Can write a matrix in terms of its columns

$$A = \begin{bmatrix} \mid&\mid&&\mid\\ a_{1} & a_{2} & \cdots & a_{n}\\ \mid&\mid&&\mid\\ \end{bmatrix}$$

• Careful, $a_{i}$ here corresponds to an entire vector $a_{i} \in \mathbb{R}^{m}$, not an element of a vector
• Can write a matrix in terms of rows

$$A = \begin{bmatrix} - & b_{1}^T& - \\ - & b_{2}^T& - \\ &\vdots& \\ - & b_{m}^T& - \end{bmatrix}$$

• $b_{i} \in \mathbb{R}^{n}$
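
In NumPy, the column and row views correspond to slicing; a quick sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# column a_i (a vector in R^m): fix the column index
print(A[:, 0])   # first column

# row b_i^T (a vector in R^n): fix the row index
print(A[0, :])   # first row
```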

## 1.4. Vector-Vector Products¶

• Inner product: $x, y \in \mathbb{R}^{n}$

$$x^{T}y = \sum\limits_{i=1}^{n}x_{i}\,y_{i} \quad \in \mathbb{R}$$
In [13]:
x = np.array([[1],
              [1]])

y = np.array([[2],
              [3]])

a = x.T.dot(y)

print(a)        # 2D array (1 x 1)
print(a[0])     # 1D array
print(a[0][0])  # scalar

[[5]]
[5]
5

In [14]:
print(x.T @ y)

[[5]]

In [15]:
x = np.asmatrix(x)
y = np.asmatrix(y)

print(x.T*y)

[[5]]


## 1.5. Matrix-Vector Products¶

• $A \in \mathbb{R}^{m \times n}, x \in \mathbb{R}^{n} \Longleftrightarrow Ax \in \mathbb{R}^{m}$
• Writing $A$ by rows, each entry of $Ax$ is an inner product between $x$ and a row of $A$

$$A = \begin{bmatrix} - &b_{1}^{T} & - \\ -& b_{2}^{T}&- \\ &\vdots& \\ -& b_{m}^{T}&- \end{bmatrix} ,\qquad Ax \in \mathbb{R}^{m} = \begin{bmatrix} b_{1}^{T}x \\ b_{2}^{T}x \\ \vdots \\ b_{m}^{T}x \end{bmatrix}$$

• Writing $A$ by columns, $Ax$ is a linear combination of the columns of $A$, with coefficients given by $x$

$$A = \begin{bmatrix} \mid&\mid&&\mid\\ a_{1} & a_{2} & \cdots & a_{n}\\ \mid&\mid&&\mid\\ \end{bmatrix} ,\qquad Ax \in \mathbb{R}^{m} = \sum\limits_{i=1}^{n}a_{i}x_{i}$$
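
The two views give the same result, which can be verified numerically; a sketch with an illustrative $3 \times 2$ matrix:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = np.array([2, -1])

# row view: entry i is the inner product of row i with x
row_view = np.array([A[i, :] @ x for i in range(A.shape[0])])

# column view: linear combination of the columns, weighted by x
col_view = x[0] * A[:, 0] + x[1] * A[:, 1]

print(A @ x)
print(row_view)
print(col_view)
```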

# 2. Norms (strength or distance in a linear space)¶

• A vector norm is any function $f : \mathbb{R}^{n} \rightarrow \mathbb{R}$ with

1. $f(x) \geq 0 \;$ and $\;f(x) = 0 \quad \Longleftrightarrow \quad x = 0$
2. $f(ax) = \lvert a \rvert f(x) \;$ for $\; a \in \mathbb{R}$
3. $f(x + y) \leq f(x) + f(y)$

$$\left\lVert x \right\rVert _{p} = \left(\sum\limits_{i}\vert x_{i}\vert ^p\right)^{1 \over p}$$

• $l_{2}$ norm
$$\left\lVert x \right\rVert _{2} = \sqrt{\sum\limits_{i=1}^{n}x_{i}^2}$$
• $l_{1}$ norm
$$\left\lVert x \right\rVert _{1} = \sum\limits_{i=1}^{n} \left\lvert x_{i} \right\rvert$$
• $\lVert x\rVert$ measures length of vector (from origin)
$$x = \begin{bmatrix} 4\\3 \end{bmatrix}$$
In [16]:
x = np.array([[4],
              [3]])

np.linalg.norm(x, 2)

Out[16]:
5.0
In [17]:
np.linalg.norm(x, 1)

Out[17]:
7.0

# 3. System of Linear Equations¶

1. well-determined linear systems
2. under-determined linear systems
3. over-determined linear systems

## 3.1. Well-Determined Linear Systems¶

• System of linear equations

$$\begin{aligned} 2x_1 + 3x_2 &= 7\\ x_1 + 4x_2 &= 6 \end{aligned} \quad \implies \quad \begin{aligned} x_1^{*} &= 2\\ x_2^{*} &= 1 \end{aligned}$$

• Geometric point of view

• Matrix form

$$\begin{array}{c}\ a_{11}x_1 + a_{12}x_2 = b_1\\ a_{21}x_1 + a_{22}x_2 = b_2 \end{array} \begin{array}{c}\ \quad \text{Matrix form}\\ \implies \end{array} \quad \begin{array}{l} \begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} = \begin{bmatrix} b_{1}\\ b_{2} \end{bmatrix} \end{array}$$
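
The well-determined example above can be solved numerically; with $A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$ and $b = \begin{bmatrix} 7 \\ 6 \end{bmatrix}$:

```python
import numpy as np

# the well-determined system above: unique solution x1 = 2, x2 = 1
A = np.array([[2, 3],
              [1, 4]])
b = np.array([[7],
              [6]])

x = np.linalg.solve(A, b)
print(x)
```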

## 3.2. Under-Determined Linear Systems¶

• System of linear equations

$$2x_1 + 3x_2 = 7 \quad \implies \quad \text{Many solutions}$$

• Geometric point of view

• Matrix form

$$\begin{array}{c}\ a_{11}x_1 + a_{12}x_2 = b_1 \end{array} \begin{array}{c}\ \quad \text{Matrix form}\\ \implies \end{array} \quad \begin{array}{l} \begin{bmatrix} a_{11} & a_{12} \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} = b_{1}\\ \end{array}$$

$$\large AX = B \qquad \therefore \; \text{Many solutions when $A$ is fat}$$
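
Among the infinitely many solutions, one common choice (not covered above) is the minimum-norm solution given by the pseudo-inverse; a sketch for the example $2x_1 + 3x_2 = 7$:

```python
import numpy as np

# under-determined system: one equation, two unknowns
A = np.array([[2, 3]])
b = np.array([[7]])

# np.linalg.pinv returns the pseudo-inverse; A^+ b is the
# minimum-norm solution among the infinitely many solutions
x = np.linalg.pinv(A) @ b
print(x)
print(A @ x)   # reproduces b
```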

## 3.3. Over-Determined Linear Systems¶

• System of linear equations

$$\begin{aligned} 2x_1 + 3x_2 &= 7\\ x_1 + 4x_2 &= 6\\ x_1 + x_2 &= 4 \end{aligned} \quad \implies \quad \text{No solutions}$$

• Geometric point of view

• Matrix form

$$\begin{array}{c}\ a_{11}x_1 + a_{12}x_2 = b_1\\ a_{21}x_1 + a_{22}x_2 = b_2\\ a_{31}x_1 + a_{32}x_2 = b_3\\ \end{array} \begin{array}{c}\ \quad \text{Matrix form}\\ \implies \end{array} \quad \begin{array}{l} \begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\\ a_{31} & a_{32}\\ \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} = \begin{bmatrix} b_{1}\\ b_{2}\\ b_{3} \end{bmatrix} \end{array}$$

$$\large AX = B \qquad \therefore \; \text{No solutions when $A$ is skinny}$$

## 3.4. Summary of Linear Systems¶

$$\large AX = B$$

• Square: Well-determined Linear Systems

$$\begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\\ \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} = \begin{bmatrix} b_{1}\\ b_{2}\\ \end{bmatrix}$$

• Fat: Under-determined Linear Systems

$$\begin{bmatrix} a_{11} & a_{12}\\ \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix}=b_1$$

• Skinny: Over-determined Linear Systems

$$\begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\\ a_{31} & a_{32}\\ \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} = \begin{bmatrix} b_{1}\\ b_{2}\\ b_{3}\\ \end{bmatrix}$$

# 4. Optimization Point of View¶

## 4.1. Least-Square Solution¶

• For over-determined linear system

\begin{align*} \begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2} \end{bmatrix} &\neq \begin{bmatrix} b_{1}\\ b_{2}\\ b_{3} \end{bmatrix} \quad\text{ or }\quad AX \neq B \\ \\ x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix} &\neq \begin{bmatrix} b_{1}\\ b_{2}\\ b_{3} \end{bmatrix} \end{align*}

Find $X$ that minimizes $\lVert E \rVert$ or $\lVert E \rVert^2$, where $E = AX - B$ is the error

i.e., an optimization problem

\begin{align*} \min\limits_{X}{\lVert E\rVert}^2 & = \min\limits_{X}{\lVert AX - B\rVert}^2\\ X^{*} & = \left( A^TA \right)^{-1}A^TB\\ B^{*} = AX^{*} & = A\left( A^TA \right)^{-1}A^TB \end{align*}

• Geometric point of view

• This often arises in estimation problems

## 4.2. Geometric Point of View: Projection¶

### 4.2.1. Orthogonal Projection onto a Subspace¶

• Projection of $B$ onto the subspace $U$ spanned by $A_1$ and $A_2$

• Orthogonality

$$A \perp \left( AX^{*}-B\right)$$

\begin{align*} A^T \left(AX^{*} - B \right) & = 0\\ A^TAX^{*} & = A^TB\\ X^{*} & = \left( A^TA \right)^{-1}A^TB\\ B^{*} = AX^{*} & = A\left( A^TA \right)^{-1}A^TB \end{align*}

In [18]:
import numpy as np

A = np.matrix([[1, 0],
               [0, 1],
               [0, 0]])
B = np.matrix([[1],
               [1],
               [1]])

X = (A.T*A).I*A.T*B
print(X)

[[1.]
[1.]]

In [19]:
Bstar = A*X
print(Bstar)

[[1.]
[1.]
[0.]]
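
The same least-squares solution is available via `np.linalg.lstsq`, which avoids forming $A^TA$ explicitly and is generally more numerically stable; a sketch with the same $A$ and $B$:

```python
import numpy as np

A = np.array([[1, 0],
              [0, 1],
              [0, 0]])
B = np.array([[1],
              [1],
              [1]])

# lstsq solves min ||AX - B||^2 directly
X, residuals, rank, sv = np.linalg.lstsq(A, B, rcond=None)
print(X)
```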


# 5. Video Lectures¶
