Machine Learning

Table of Contents

1. Introduction

  • (unofficial TA) Hojin Lee (hojin12312@gmail.com)
    • Office: 112 / 604-4
    • Phone: 2728
    • Office hours: any time

1.1. Course Info

  • Machine learning
    • Linear algebra
    • Optimization
    • Statistical and probabilistic approaches
  • Python (or MATLAB) in class and assignments
    • We will use it a lot
    • All necessary .py (or .m) code will be provided for each class
  • Evaluation
    • Two exams (30% + 35%)
    • Many assignments (25%)
    • Class participation (10%)

1.2. What is Machine Learning

  • Draw a meaningful conclusion from a given set of data (observations, measurements)
  • In 1959, Arthur Samuel defined machine learning as a "Field of study that gives computers the ability to learn without being explicitly programmed"

    • Often, hand programming is not possible
    • Solution? Get the computer to program itself by showing it examples of the behavior we want! This is the learning approach to AI
    • Really, we write the structure of the program and the computer tunes many internal parameters
  • Many related terms:
    • Pattern recognition
    • Neural networks $\rightarrow$ Deep learning
    • Data mining
    • Adaptive control
    • Statistical modeling
    • Data analytics / data science
    • Artificial intelligence
    • Machine learning

(source: lecture video from The Machine Learning Summer School by Zoubin Ghahramani, Univ. of Cambridge)

1.3. Learning: the View from Different Fields

  • Engineering
    • Signal processing, system identification, adaptive and optimal control, information theory, robotics, …
  • Computer science
    • Artificial intelligence, computer vision, …
  • Statistics
    • Learning theory, data mining, learning and inference from data, …
  • Cognitive science and psychology
    • Perception, movement control, reinforcement learning, mathematical psychology, …
  • Economics
    • Decision theory, game theory, operational research, …

(source: lecture video from The Machine Learning Summer School by Zoubin Ghahramani, Univ. of Cambridge)

1.4. Course Roadmap

  • Supervised Learning
    • Regression
      • linear, nonlinear (kernel), ridge ($L_2$ norm regularization), lasso ($L_1$ norm regularization)
    • Classification
      • perceptron, logistic regression, SVM, Bayesian classifier
  • Unsupervised Learning
    • Clustering
      • k-means, Gaussian Mixture Model
      • graph partitioning (spectral clustering)
  • Required tools
    • linear algebra
      • matrix
      • $Ax = b$
      • projection
      • eigen analysis
      • SVD
    • optimization
      • least squares (a minimal numpy sketch follows this list)
      • cvx, linprog, intlinprog
    • statistics
      • Law of large numbers, central limit theorem
      • correlation
    • probability
      • Random variable, Gaussian density distribution, conditional probability,
      • maximum likelihood, maximum a posteriori (MAP), Bayesian thinking
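
As a preview of the linear algebra and optimization tools above, here is a minimal numpy sketch of solving an overdetermined $Ax = b$ in the least-squares sense; the small system below is invented purely for illustration.

```python
import numpy as np

# Overdetermined system: more equations (rows) than unknowns (columns).
# The numbers are made up for illustration only.
A = np.array([[1., 1.],
              [1., 2.],
              [1., 3.],
              [1., 4.]])
b = np.array([1.1, 1.9, 3.2, 3.9])

# Least-squares solution minimizing ||Ax - b||_2.
x_hat, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)

# Equivalent normal-equations form: (A^T A) x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

print(x_hat, x_normal)  # the two solutions agree
```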



2. What Will We Cover?

  • I will show you some examples

2.1. Data Fitting or Approximation (Regression)

  • a statistical process for estimating the relationships among variables (source: wikipedia)
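
For a first feel, here is a minimal regression sketch in numpy: fit a line $y \approx \theta_0 + \theta_1 x$ to synthetic data by least squares. The slope, intercept, and noise level are invented for illustration.

```python
import numpy as np

# Synthetic data: y = 2x + 1 plus noise (values invented for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 30)
y = 2.0 * x + 1.0 + 0.3 * rng.standard_normal(x.size)

# Design matrix with a column of ones for the intercept term.
X = np.column_stack([np.ones_like(x), x])

# Least-squares estimate of [intercept, slope].
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("intercept, slope:", theta)
```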





2.2. Classification

  • the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known (source: wikipedia)
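
As a small taste of what is coming, here is a minimal sketch of one classifier from the roadmap, the perceptron, trained on a toy two-class dataset; the cluster locations, number of epochs, and data are invented for illustration.

```python
import numpy as np

# Toy 2-D data: two linearly separable clusters (invented for illustration).
rng = np.random.default_rng(1)
X_pos = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(20, 2))
X_neg = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(20, 2))
X = np.vstack([X_pos, X_neg])
y = np.hstack([np.ones(20), -np.ones(20)])   # labels in {+1, -1}

# Perceptron: update the weights whenever a sample is misclassified.
w = np.zeros(2)
b = 0.0
for _ in range(100):                         # epochs
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:           # misclassified
            w += yi * xi
            b += yi

pred = np.sign(X @ w + b)
print("training accuracy:", np.mean(pred == y))
```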



2.3. Sudoku (part of optimization demos)

  • The objective is to fill a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 sub-grids that compose the grid contains all of the digits from 1 to 9 (source: wikipedia)
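
One standard way to pose this as an optimization (feasibility) problem that tools such as cvx or intlinprog can handle is the binary integer program sketched below; the indexing convention is our own illustration, not necessarily the formulation used in the demo. Let $x_{ijk} \in \{0, 1\}$ equal 1 exactly when cell $(i, j)$ contains digit $k$:

$$
\begin{aligned}
\sum_{k=1}^{9} x_{ijk} &= 1 && \forall\, i, j && \text{(each cell holds exactly one digit)} \\
\sum_{j=1}^{9} x_{ijk} &= 1 && \forall\, i, k && \text{(each digit appears once per row)} \\
\sum_{i=1}^{9} x_{ijk} &= 1 && \forall\, j, k && \text{(each digit appears once per column)} \\
\sum_{(i,j) \in B} x_{ijk} &= 1 && \forall\, \text{boxes } B,\ \forall\, k && \text{(each digit appears once per } 3 \times 3 \text{ box)} \\
x_{ijk} &= 1 && \text{for every given clue} && \text{(respect the initial grid)}
\end{aligned}
$$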



2.4. Gaussian Density Distribution for Probabilistic Approach

  • Conditional of a joint Gaussian is Gaussian
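
Concretely, this is the standard identity: if $x$ and $y$ are jointly Gaussian,

$$
\begin{bmatrix} x \\ y \end{bmatrix}
\sim \mathcal{N}\!\left(
\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},
\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}
\right)
\quad \Longrightarrow \quad
x \mid y \sim \mathcal{N}\!\left( \mu_x + \Sigma_{xy} \Sigma_{yy}^{-1} (y - \mu_y),\; \Sigma_{xx} - \Sigma_{xy} \Sigma_{yy}^{-1} \Sigma_{yx} \right)
$$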



2.5. Machine Learning in Image Processing

  • Low rank approximation

  • Data compression
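
A minimal numpy sketch of rank-$k$ approximation via the SVD; the matrix here is random noise standing in for a grayscale image, and both the size and $k$ are invented for illustration.

```python
import numpy as np

# Stand-in for a grayscale image (random values, for illustration only).
rng = np.random.default_rng(2)
A = rng.random((64, 64))

# Thin SVD: A = U diag(s) V^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k approximation keeps only the k largest singular values.
k = 10
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Relative Frobenius-norm error of the compressed version.
print("relative error:", np.linalg.norm(A - A_k) / np.linalg.norm(A))
```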



2.6. Handwritten Digit Recognition

  • famous classification problem



2.7. Face Recognition

  • famous classification problem



2.8. Dimension Reduction (Multiple Senses + Principal Components)

  • the process of reducing the number of random variables under consideration; it can be divided into feature selection and feature extraction
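
A minimal sketch of one feature-extraction method, principal component analysis (PCA), in numpy; the toy data matrix and the choice of two components are invented for illustration.

```python
import numpy as np

# Toy data: 100 samples in 5 dimensions (invented for illustration).
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))

# Center the data, then take the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Principal directions are the rows of Vt; project onto the top 2.
Z = Xc @ Vt[:2].T   # reduced representation, shape (100, 2)

# Fraction of total variance captured by the first 2 components.
print("explained variance ratio:", (s[:2] ** 2).sum() / (s ** 2).sum())
```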



2.9. Google PageRank

PageRank is an algorithm used by Google Search to rank websites in its search engine results. PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of website pages. According to Google:

PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.
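
A minimal power-iteration sketch of this idea on a tiny hand-made link graph; the adjacency matrix is invented for illustration, and 0.85 is the commonly cited damping factor, not necessarily what Google uses today.

```python
import numpy as np

# Tiny link graph: adj[i, j] = 1 if page i links to page j (invented example).
adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Column-stochastic transition matrix: column j spreads page j's "vote"
# evenly over the pages it links to.
out_degree = adj.sum(axis=1)
M = (adj / out_degree[:, None]).T

# Power iteration on the damped link matrix.
d = 0.85                      # damping factor (commonly cited default)
n = adj.shape[0]
r = np.ones(n) / n
for _ in range(100):
    r = d * (M @ r) + (1 - d) / n

print("PageRank scores:", r)
```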


2.10. Community Detection in Social Networks (Facebook)

  • Visualizing Polarization in Political Blogs (animation)

  • Networks



Next Class
