Table of Contents
Correlation of Two Random Variables
Correlation Coefficient Plot
Covariance Matrix
How?
idea: highly correlated data contains redundant features
Now consider a change of axes
Each example $x$ has 2 features $\{u_1,u_2\}$
Consider ignoring the feature $u_2$ for each example
Each 2-dimensional example $x$ now becomes 1-dimensional $x = \{u_1\}$
Are we losing much information by throwing away $u_2$ ?
No. Most of the data spread is along $u_1$ (very little variance along $u_2$)