Regression and Classification with TensorFlow

By Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

1. Deep Learning Libraries

TensorFlow

Platform: Linux, Mac OS, Windows
Written in: C++, Python
Interface: Python, C/C++, Java, Go, R
https://www.tensorflow.org/

Keras

https://keras.io/

PyTorch

https://pytorch.org/

2. TensorFlow

TensorFlow is an open-source software library for deep learning.

It is a framework for performing computation efficiently, and it can use the GPU (Graphics Processing Unit) to speed things up even further. This makes a large difference, as we shall see shortly. TensorFlow is controlled through a simple Python API.

TensorFlow is one of the most widely used libraries for implementing machine learning and other algorithms that involve a large number of mathematical operations. It was developed by Google and is one of the most popular machine learning libraries on GitHub. Google uses TensorFlow for machine learning in many of its applications.

Tensor

TensorFlow gets its name from tensors, which are arrays of arbitrary dimensionality. A vector is a 1-d array and is known as a 1st-order tensor. A matrix is a 2-d array and a 2nd-order tensor. The "flow" part of the name refers to computation flowing through a graph. Training and inference in a neural network, for example, involves the propagation of matrix computations through many nodes in a computational graph.
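As a small illustration of tensor order, the sketch below uses NumPy arrays as a stand-in for TensorFlow tensors (an assumption made so that the example runs without TensorFlow). The variable names are illustrative only; `ndim` reports the order of each array.

```python
import numpy as np

scalar = np.array(5.0)                  # 0th-order tensor (a single number)
vector = np.array([1.0, 2.0, 3.0])      # 1st-order tensor (1-d array)
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])         # 2nd-order tensor (2-d array)
cube = np.zeros((2, 3, 4))              # 3rd-order tensor (3-d array)

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "order =", t.ndim, "shape =", t.shape)
```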

2.1. Computational Graph

tf.constant
tf.Variable
tf.placeholder

tf.constant

tf.constant creates a constant tensor specified by value, dtype, shape and so on.

import tensorflow as tf

a = tf.constant([1,2,3])
b = tf.constant(4, shape=[1,3])

A = a + b
B = a*b

The result of these lines of code is an abstract tensor in the computation graph. Contrary to what you might expect, however, the result is not actually calculated. The code only defines the model; no process has run to compute the values.

A

<tf.Tensor 'add:0' shape=(1, 3) dtype=int32>

B

<tf.Tensor 'mul:0' shape=(1, 3) dtype=int32>

To run either of the two operations defined above, we need to create a session for that graph. The session also allocates memory to store the current values of variables.

When you think of doing things in TensorFlow, you might want to think of creating tensors (like matrices), adding operations (that output other tensors), and then executing the computation (running the computational graph). In particular, it's important to realize that when you add an operation on tensors, it doesn't execute immediately. Rather, TensorFlow waits for you to define all the operations you want to perform. Then, TensorFlow optimizes the computation graph, deciding how to execute the computation, before generating the data. Because of this, a tensor in TensorFlow isn't so much holding the data as a placeholder for holding the data, waiting for the data to arrive when a computation is executed.

sess = tf.Session()
sess.run(A)

array([[5, 6, 7]])

sess.run(B)

array([[ 4,  8, 12]])
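The define-then-run behaviour described above can be imitated in a few lines of plain Python. This is only a toy sketch of deferred execution, not how TensorFlow is implemented; the `Node` class and the `constant` and `run` helpers are hypothetical names introduced for illustration.

```python
class Node:
    """A node in a toy computation graph; nothing is computed until run()."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op            # "const", "add" or "mul"
        self.inputs = inputs    # upstream nodes this node depends on
        self.value = value      # only used by constant nodes

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def constant(value):
    """Create a constant node holding a list of numbers."""
    return Node("const", value=value)


def run(node):
    """Evaluate a node by recursively evaluating its inputs first."""
    if node.op == "const":
        return node.value
    left, right = (run(n) for n in node.inputs)
    if node.op == "add":
        return [l + r for l, r in zip(left, right)]
    if node.op == "mul":
        return [l * r for l, r in zip(left, right)]
    raise ValueError("unknown op: " + node.op)


a = constant([1, 2, 3])
b = constant([4, 4, 4])
A = a + b          # builds a graph node; no arithmetic happens here
B = a * b

print(A.op)        # add
print(run(A))      # [5, 6, 7]
print(run(B))      # [4, 8, 12]
```

Here `run` plays the role that `sess.run` plays in the cells above: values exist only once the graph is evaluated.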

You can also open the session in a with block, which runs the computation, prints the output, and then closes the session automatically:

a = tf.constant([1,2,3])
b = tf.constant([4,5,6])

result = tf.multiply(a, b)

with tf.Session() as sess:
    output = sess.run(result)
    print(output)

[ 4 10 18]

tf.Variable

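tf.Variable holds a value that can change while the graph runs; in optimization it plays the role of the decision variable. Variables must be initialized (here with tf.global_variables_initializer()) before they are used.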

x1 = tf.Variable([1, 1], dtype = tf.float32)
x2 = tf.Variable([2, 2], dtype = tf.float32)
y = x1 + x2

print(y)

Tensor("add_1:0", shape=(2,), dtype=float32)

sess = tf.Session()

init = tf.global_variables_initializer()
sess.run(init)

sess.run(y)

array([3., 3.], dtype=float32)

tf.placeholder

The value of tf.placeholder must be fed using the feed_dict optional argument to Session.run().

sess = tf.Session()
x = tf.placeholder(tf.float32, shape = [2,2])

sess.run(x, feed_dict = {x : [[1,2],[3,4]]})

array([[1., 2.],
       [3., 4.]], dtype=float32)

a = tf.placeholder(tf.float32, shape = [2])
b = tf.placeholder(tf.float32, shape = [2])

total = a + b   # named total to avoid shadowing Python's built-in sum

sess.run(total, feed_dict = {a : [1,2], b : [3,4]})

array([4., 6.], dtype=float32)

2.2. Tensor Manipulation

Adding Matrices
Multiplying Matrices
Reshape

Adding Matrices

x1 = tf.constant(1, shape = [3])
x2 = tf.constant(2, shape = [3])
output = tf.add(x1, x2)

with tf.Session() as sess:
    result = sess.run(output)
    print(result)

[3 3 3]

x1 = tf.constant(1, shape = [2, 3])
x2 = tf.constant(2, shape = [2, 3])
output = tf.add(x1, x2)

with tf.Session() as sess:
    result = sess.run(output)
    print(result)

[[3 3 3]
 [3 3 3]]

Multiplying Matrices

x1 = tf.constant([[1, 2], 
                  [3, 4]])
x2 = tf.constant([[2],[3]])

output1 = tf.matmul(x1, x2)

with tf.Session() as sess:
    result = sess.run(output1)
    print(result)

[[ 8]
 [18]]

# element-wise product: x2 (shape 2x1) is broadcast across the columns of x1
output2 = x1*x2

with tf.Session() as sess:
    result = sess.run(output2)
    print(result)

[[ 2  4]
 [ 9 12]]
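The two results above differ because tf.matmul is a matrix product while * is an element-wise product with broadcasting. The sketch below reproduces both with NumPy, assuming NumPy's `@` and `*` as stand-ins for the TensorFlow operations; `x2_stretched` is an illustrative name showing what broadcasting does to the 2x1 column.

```python
import numpy as np

x1 = np.array([[1, 2],
               [3, 4]])
x2 = np.array([[2],
               [3]])

# Matrix product: (2x2) times (2x1) gives a 2x1 result.
matrix_product = x1 @ x2
print(matrix_product)        # [[ 8] [18]]

# Element-wise product: the 2x1 column is first stretched to 2x2.
x2_stretched = np.broadcast_to(x2, x1.shape)
print(x2_stretched)          # [[2 2] [3 3]]

elementwise = x1 * x2
print(elementwise)           # [[ 2  4] [ 9 12]]
```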

Reshape

x = [1, 2, 3, 4, 5, 6, 7, 8]

x_re = tf.reshape(x, [4,2])

sess = tf.Session()
sess.run(x_re)

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

x_re = tf.reshape(x, [2,-1])

sess = tf.Session()
sess.run(x_re)

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
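In the second call, -1 asks the library to infer that dimension from the total number of elements. The sketch below shows the same rule with NumPy's reshape, used here as an assumed stand-in for tf.reshape; the array names are illustrative.

```python
import numpy as np

x = np.arange(1, 9)            # [1 2 3 4 5 6 7 8], eight elements

four_by_two = x.reshape(4, 2)  # both dimensions given explicitly
two_by_any = x.reshape(2, -1)  # -1 is inferred as 8 / 2 = 4
column = x.reshape(-1, 1)      # -1 is inferred as 8 / 1 = 8

print(four_by_two.shape)       # (4, 2)
print(two_by_any.shape)        # (2, 4)
print(column.shape)            # (8, 1)

# The inferred size must be a whole number: 8 elements cannot fill 3 rows.
try:
    x.reshape(3, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)          # True
```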

2.3. TensorFlow as Optimization Solver

$$\min_{\omega}\;\;(\omega - 4)^2$$

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline

w = tf.Variable(0, dtype = tf.float32) 
cost =  w*w - 8*w +16

LR = 0.05
optm = tf.train.GradientDescentOptimizer(LR).minimize(cost)

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

cost_record = []
for _ in range(50):
    sess.run(optm)
    c = sess.run(cost)    # evaluate the cost once per iteration
    print(c)
    cost_record.append(c)
    
print("\n optimal w =", sess.run(w))

plt.figure(figsize = (10,8))
plt.plot(cost_record)
plt.xlabel('iteration', fontsize = 15)
plt.ylabel('cost', fontsize = 15)
plt.show()

12.96
10.497601
8.503056
6.887476
5.5788546
4.5188723
3.660286
2.9648323
2.401514
1.9452267
1.575634
1.2762632
1.0337734
0.83735657
0.6782589
0.54938984
0.44500542
0.36045456
0.29196835
0.23649406
0.19155979
0.15516376
0.12568283
0.101802826
0.0824604
0.06679344
0.054101944
0.043823242
0.03549671
0.028752327
0.02328968
0.018864632
0.01527977
0.012376785
0.010025024
0.008120537
0.0065774918
0.0053281784
0.0043153763
0.0034952164
0.002831459
0.0022935867
0.0018577576
0.0015048981
0.0012187958
0.0009870529
0.00080013275
0.00064754486
0.0005245209
0.00042533875

 optimal w = 3.979385
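To see what the optimizer is doing, the same update can be written by hand. The sketch below assumes plain gradient descent, $\omega \leftarrow \omega - \text{LR} \cdot 2(\omega - 4)$, with the same learning rate and starting point as the cell above; `grad` and `closed_form` are names introduced for this illustration. Because each step shrinks the distance to 4 by a factor of $1 - 2 \cdot 0.05 = 0.9$, the iterate after $k$ steps is $4 - 4 \cdot 0.9^k$.

```python
def grad(w):
    """Derivative of (w - 4)^2 with respect to w."""
    return 2.0 * (w - 4.0)


LR = 0.05
w = 0.0                      # same starting point as the TensorFlow variable
for _ in range(50):
    w = w - LR * grad(w)     # one gradient descent step

# Each step multiplies (w - 4) by 0.9, so after 50 steps:
closed_form = 4.0 - 4.0 * 0.9 ** 50

print(w)
print(closed_form)
```

Both numbers agree with the value of w reported above to about six decimal places, which is why 50 iterations stop short of exactly 4.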

3. Machine Learning with TensorFlow

3.1. Linear Regression

$$\hat{y} = \omega x + b$$

Given $x$ and $y$
Want to estimate $\omega$ and $b$

Data generation

# data points in column vector [input, output]
train_x = np.array([0.1, 0.4, 0.7, 1.2, 1.3, 1.7, 2.2, 2.8, 3.0, 4.0, 4.3, 4.4, 4.9]).reshape(-1, 1)
train_y = np.array([0.5, 0.9, 1.1, 1.5, 1.5, 2.0, 2.2, 2.8, 2.7, 3.0, 3.5, 3.7, 3.9]).reshape(-1, 1)

m = train_x.shape[0]

plt.figure(figsize = (10,8))
plt.plot(train_x, train_y, 'ko')
plt.title('Data', fontsize = 15)
plt.xlabel('X', fontsize = 15)
plt.ylabel('Y', fontsize = 15)
plt.axis('equal')
plt.grid(alpha = 0.3)
plt.xlim([0, 5])
plt.show()

Given $(x_i, y_i)$ for $i=1,\cdots, m$

$$ \min\limits_{\omega, b}\sum\limits_{i = 1}^{m} (\hat{y}_{i} - y_{i})^2 \quad \text{ where } \quad \hat{y}_{i} = \omega x_{i} + b$$

LR = 0.001                                                       
n_iter = 10000                                                     

x = tf.placeholder(tf.float32, [m, 1])
y = tf.placeholder(tf.float32, [m, 1])

w = tf.Variable([[0]], dtype = tf.float32)
b = tf.Variable([[0]], dtype = tf.float32)

#y_pred = tf.matmul(x, w) + b
y_pred = tf.add(tf.matmul(x, w), b)
loss = tf.square(y_pred - y)
loss = tf.reduce_mean(loss)

optm = tf.train.GradientDescentOptimizer(LR).minimize(loss)

sess = tf.Session()                
sess.run(tf.global_variables_initializer())

loss_record = []
for epoch in range(n_iter):                                                                         
    _, c = sess.run([optm, loss], feed_dict = {x: train_x, y: train_y})
    loss_record.append(c)
    
w_val = sess.run(w)
b_val = sess.run(b)

sess.close()

print(b_val)
print(w_val)

plt.figure(figsize = (10,8))
plt.plot(loss_record)
plt.xlabel('iteration', fontsize = 15)
plt.ylabel('loss', fontsize = 15)
plt.show()

[[0.6515197]]
[[0.67176265]]

xp = np.arange(0, 5, 0.01).reshape(-1, 1)
yp = w_val*xp + b_val

plt.figure(figsize = (10,8))
plt.plot(train_x, train_y, 'ko')
plt.plot(xp, yp, 'r')
plt.title('Data', fontsize = 15)
plt.xlabel('X', fontsize = 15)
plt.ylabel('Y', fontsize = 15)
plt.axis('equal')
plt.grid(alpha = 0.3)
plt.xlim([0, 5])
plt.show()
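The gradient descent estimates can be cross-checked against the closed-form least-squares solution. The sketch below is an independent check, not part of the original workflow: it assumes NumPy's `np.linalg.lstsq` on the same training data, and the names `A`, `w_ls` and `b_ls` are introduced here for illustration.

```python
import numpy as np

train_x = np.array([0.1, 0.4, 0.7, 1.2, 1.3, 1.7, 2.2,
                    2.8, 3.0, 4.0, 4.3, 4.4, 4.9]).reshape(-1, 1)
train_y = np.array([0.5, 0.9, 1.1, 1.5, 1.5, 2.0, 2.2,
                    2.8, 2.7, 3.0, 3.5, 3.7, 3.9]).reshape(-1, 1)

# Design matrix with a column of x values and a column of ones for the bias.
A = np.hstack([train_x, np.ones_like(train_x)])

# Solve min ||A @ theta - y||^2 for theta = [w, b].
theta, *_ = np.linalg.lstsq(A, train_y, rcond=None)
w_ls, b_ls = theta.ravel()

print("w =", w_ls)
print("b =", b_ls)
```

The least-squares values should be close to, but not identical to, the w_val and b_val printed above, since 10000 gradient steps with a small learning rate have not fully converged.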

Linear regression using a with tf.Session() as sess: block

LR = 0.001                                                       
n_iter = 10000                                                     

x = tf.placeholder(tf.float32, [m, 1])
y = tf.placeholder(tf.float32, [m, 1])

w = tf.Variable([[0]], dtype = tf.float32)
b = tf.Variable([[0]], dtype = tf.float32)

#y_pred = tf.matmul(x, w) + b
y_pred = tf.add(tf.matmul(x, w), b)
loss = tf.square(y_pred - y)
loss = tf.reduce_mean(loss)

optm = tf.train.GradientDescentOptimizer(LR).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(n_iter):                                                                         
        sess.run(optm, feed_dict = {x: train_x, y: train_y})    
    w_val = sess.run(w)
    b_val = sess.run(b)

xp = np.arange(0, 5, 0.01).reshape(-1, 1)
yp = w_val*xp + b_val

plt.figure(figsize = (10,8))
plt.plot(train_x, train_y, 'ko')
plt.plot(xp, yp, 'r')
plt.title('Data', fontsize = 15)
plt.xlabel('X', fontsize = 15)
plt.ylabel('Y', fontsize = 15)
plt.axis('equal')
plt.grid(alpha = 0.3)
plt.xlim([0, 5])
plt.show()

3.2. Logistic Regression

$$ \begin{align*} \omega &= \begin{bmatrix} \omega_0 \\ \omega_1 \\ \omega_2\end{bmatrix}, \qquad x = \begin{bmatrix} 1 \\ x_1 \\ x_2\end{bmatrix}\\ \\ X &= \begin{bmatrix} \left(x^{(1)}\right)^T \\ \left(x^{(2)}\right)^T \\ \left(x^{(3)}\right)^T \\ \vdots\end{bmatrix} = \begin{bmatrix} 1 & x_1^{(1)} & x_2^{(1)} \\ 1 & x_1^{(2)} & x_2^{(2)} \\ 1 & x_1^{(3)} & x_2^{(3)} \\ \vdots & \vdots & \vdots \\\end{bmatrix}, \quad y = \begin{bmatrix} y^{(1)}\\ y^{(2)} \\y^{(3)} \\ \vdots \end{bmatrix} \end{align*} $$

# data generation

m = 1000
true_w = np.array([[-6], [2], [1]])
train_X = np.hstack([np.ones([m,1]), 5*np.random.rand(m,1), 4*np.random.rand(m,1)])

true_w = np.asmatrix(true_w)
train_X = np.asmatrix(train_X)

train_y = 1/(1 + np.exp(-train_X*true_w)) > 0.5 

C1 = np.where(train_y == True)[0]
C0 = np.where(train_y == False)[0]

train_y = np.empty([m,1])
train_y[C1] = 1
train_y[C0] = 0

plt.figure(figsize = (10,8))
plt.plot(train_X[C1,1], train_X[C1,2], 'ro', alpha = 0.3, label='C1')
plt.plot(train_X[C0,1], train_X[C0,2], 'bo', alpha = 0.3, label='C0')
plt.xlabel(r'$x_1$', fontsize = 15)
plt.ylabel(r'$x_2$', fontsize = 15)
plt.legend(loc = 1, fontsize = 12)
plt.axis('equal')
plt.ylim([0,4])
plt.show()

$$ \begin{align*} \ell(\omega) = \log \mathscr{L}(\omega) &= \sum_{i=1}^{m} y^{(i)} \log h_{\omega} \left(x^{(i)} \right) + \left(1-y^{(i)} \right) \log \left(1-h_{\omega} \left(x^{(i)} \right) \right)\\ &\Rightarrow \frac{1}{m} \sum_{i=1}^{m} y^{(i)} \log h_{\omega} \left(x^{(i)} \right) + \left(1-y^{(i)} \right) \log \left(1-h_{\omega} \left(x^{(i)} \right) \right) \end{align*} $$
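As a sanity check on the averaged log-likelihood, the sketch below evaluates its negative (the quantity a gradient descent optimizer would minimize) with NumPy. The two-point data set `X_toy`, `y_toy` and the helper names are made up for illustration; only the weights $[-6, 2, 1]^T$ are taken from the data generation cell.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mean_neg_log_likelihood(w, X, y):
    """Negative of the averaged log-likelihood for logistic regression."""
    h = sigmoid(X @ w)                        # h_w(x) for every row of X
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))


# Two hypothetical samples in the [1, x1, x2] layout used above.
X_toy = np.array([[1.0, 0.5, 1.0],
                  [1.0, 4.0, 3.0]])
y_toy = np.array([[0.0],
                  [1.0]])

w_zero = np.zeros((3, 1))                     # predicts 0.5 everywhere
w_true = np.array([[-6.0], [2.0], [1.0]])     # weights used to generate data

loss_zero = mean_neg_log_likelihood(w_zero, X_toy, y_toy)
loss_true = mean_neg_log_likelihood(w_true, X_toy, y_toy)

print(loss_zero)    # equals log(2), about 0.693
print(loss_true)    # much smaller, since both points are classified correctly
```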

LR = 0.05
n_iter = 15000

X = tf.placeholder(tf.float32, [m, 3])
y = tf.placeholder(tf.float32, [m, 1])

w = tf.Variable([[0],[0],[0]], dtype = tf.float32)

y_pred = tf.sigmoid(tf.matmul(X,w))
loss = - y*tf.log(y_pred) - (1-y)*tf.log(1-y_pred)
loss = tf.reduce_mean(loss)

optm = tf.train.GradientDescentOptimizer(LR).minimize(loss)

loss_record = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(n_iter): 
        _, c = sess.run([optm, loss], feed_dict = {X: train_X, y: train_y})
        loss_record.append(c)
    
    w_hat = sess.run(w)

print(w_hat)    

plt.figure(figsize = (10,8))
plt.plot(loss_record)
plt.xlabel('iteration', fontsize = 15)
plt.ylabel('loss', fontsize = 15)
plt.show()

[[-11.492279 ]
 [  3.989941 ]
 [  1.8672042]]

xp = np.arange(0, 4, 0.01).reshape(-1, 1)
yp = - w_hat[1,0]/w_hat[2,0]*xp - w_hat[0,0]/w_hat[2,0]

plt.figure(figsize = (10,8))
plt.plot(train_X[C1,1], train_X[C1,2], 'ro', alpha = 0.3, label = 'C1')
plt.plot(train_X[C0,1], train_X[C0,2], 'bo', alpha = 0.3, label = 'C0')
plt.plot(xp, yp, 'g', linewidth = 3, label = 'Logistic Regression')
plt.xlabel(r'$x_1$', fontsize = 15)
plt.ylabel(r'$x_2$', fontsize = 15)
plt.legend(loc = 1, fontsize = 12)
plt.axis('equal')
plt.ylim([0,4])
plt.show()
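The line plotted above is the set of points where the predicted probability is 0.5, that is, where $\omega_0 + \omega_1 x_1 + \omega_2 x_2 = 0$, which rearranges to $x_2 = -\frac{\omega_1}{\omega_2} x_1 - \frac{\omega_0}{\omega_2}$. The sketch below evaluates that slope and intercept from the w_hat values printed earlier and compares them with the generating weights; it assumes those printed values, which will differ from run to run because the data are random.

```python
import numpy as np

# Values copied from the printed output above (they change with each run).
w_hat = np.array([[-11.492279], [3.989941], [1.8672042]])
true_w = np.array([[-6.0], [2.0], [1.0]])


def boundary(w):
    """Return (slope, intercept) of the line w0 + w1*x1 + w2*x2 = 0."""
    return -w[1, 0] / w[2, 0], -w[0, 0] / w[2, 0]


slope_hat, intercept_hat = boundary(w_hat)
slope_true, intercept_true = boundary(true_w)

print(slope_hat, intercept_hat)      # roughly -2.14 and 6.15
print(slope_true, intercept_true)    # -2.0 and 6.0

# Any point on the estimated line has a logit of zero (probability 0.5).
x1 = 2.0
x2 = slope_hat * x1 + intercept_hat
logit = w_hat[0, 0] + w_hat[1, 0] * x1 + w_hat[2, 0] * x2
print(abs(logit) < 1e-9)             # True
```

The estimated weights are roughly twice the generating ones, yet the boundary is nearly the same: scaling $\omega$ changes how sharp the sigmoid is, not where the line lies.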

TensorFlow built-in loss functions

tf.nn.sigmoid_cross_entropy_with_logits for binary classification
tf.nn.softmax_cross_entropy_with_logits for multiclass classification
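One reason to prefer the built-in binary loss is numerical stability: it takes the logits directly instead of the sigmoid output. The NumPy sketch below compares the hand-written loss with the rearranged form $\max(x, 0) - xz + \log(1 + e^{-|x|})$ for logits $x$ and labels $z$, which, as far as I can tell from the TensorFlow documentation, is the form tf.nn.sigmoid_cross_entropy_with_logits uses. The function names are illustrative, and this is a sketch of the idea rather than TensorFlow's actual code.

```python
import numpy as np


def naive_loss(logits, labels):
    """Cross-entropy written directly from the sigmoid output."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -labels * np.log(p) - (1 - labels) * np.log(1 - p)


def stable_loss(logits, labels):
    """Algebraically equal form that never takes the log of zero."""
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))


logits = np.array([-3.0, -0.5, 0.0, 2.0, 5.0])
labels = np.array([0.0, 1.0, 1.0, 0.0, 1.0])

naive = naive_loss(logits, labels)
stable = stable_loss(logits, labels)
print(np.allclose(naive, stable))    # True: the two forms agree here

# For a very confident wrong prediction the naive form would evaluate
# log(0), while the stable form stays finite.
extreme = stable_loss(np.array([100.0]), np.array([0.0]))
print(extreme)                       # [100.]
```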

LR = 0.05
n_iter = 30000

X = tf.placeholder(tf.float32, [m, 3])
y = tf.placeholder(tf.float32, [m, 1])

w = tf.Variable(tf.random_normal([3,1]), dtype = tf.float32)

y_pred = tf.matmul(X,w)
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits = y_pred, labels = y)
loss = tf.reduce_mean(loss)

optm = tf.train.GradientDescentOptimizer(LR).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(n_iter):                                                                         
        sess.run(optm, feed_dict = {X: train_X, y: train_y})   
    
    w_hat = sess.run(w)

print(w_hat)    

xp = np.arange(0, 4, 0.01).reshape(-1, 1)
yp = - w_hat[1,0]/w_hat[2,0]*xp - w_hat[0,0]/w_hat[2,0]

plt.figure(figsize=(10,8))
plt.plot(train_X[C1,1], train_X[C1,2], 'ro', alpha = 0.3, label = 'C1')
plt.plot(train_X[C0,1], train_X[C0,2], 'bo', alpha = 0.3, label = 'C0')
plt.plot(xp, yp, 'g', linewidth = 3, label = 'Logistic Regression')
plt.xlabel(r'$x_1$', fontsize = 15)
plt.ylabel(r'$x_2$', fontsize = 15)
plt.legend(loc = 1, fontsize = 12)
plt.axis('equal')
plt.ylim([0,4])
plt.show()

[[-14.93718  ]
 [  5.1005177]
 [  2.464404 ]]
