Convolutional Neural Network

By Suhyun Kim
iSystems Design Lab
http://isystems.unist.ac.kr/
UNIST


1. Convolutional Neural Networks

CNNs are simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers

  • Discrete convolution can be viewed as multiplication by a matrix
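
For illustration (not part of the original notebook), here is a minimal NumPy sketch of a 1-D discrete convolution written both directly and as a matrix-vector product; note that, like tf.nn.conv2d, np.correlate slides the kernel without flipping it:

import numpy as np

x = np.array([1., 2., 3., 4.])           # 1-D input signal
k = np.array([1., 0., -1.])              # length-3 kernel

# The same operation as a matrix-vector product: each row of K is the
# kernel shifted by one position ('valid' output length = 4 - 3 + 1 = 2)
K = np.array([[1., 0., -1., 0.],
              [0., 1., 0., -1.]])

print (np.correlate(x, k, mode='valid'))  # [-2. -2.]
print (K.dot(x))                          # [-2. -2.]  -- identical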

CNN Structure

2. Convolutional Neural Network in TensorFlow



  • MNIST example
  • An example that builds a digit-classification network using the CNN structure

2.1. Import Library

In [1]:
# Import Library
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

2.2. Load MNIST example

  • The MNIST example data can be downloaded from the link below.
In [2]:
from six.moves import cPickle

mnist = cPickle.load(open('data_files/mnist.pkl', 'rb'))
trainimgs   = mnist.train.images
trainlabels = mnist.train.labels
testimgs    = mnist.test.images
testlabels  = mnist.test.labels
ntrain      = trainimgs.shape[0]
ntest       = testimgs.shape[0]
print ("Packages loaded")
Packages loaded
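
The shapes of the loaded arrays can be checked as below (an optional sanity check; for the standard TensorFlow MNIST split one would expect 55,000 training and 10,000 test images of 784 pixels each with one-hot labels, but the exact counts depend on the pickle file):

print ("train images :", trainimgs.shape)
print ("train labels :", trainlabels.shape)
print ("test  images :", testimgs.shape)
print ("test  labels :", testlabels.shape)
print ("ntrain = {}, ntest = {}".format(ntrain, ntest))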

Design Neural Network

  • Define Variable

    • Define the variables needed for training (number of iterations, learning rate, etc.)
  • Define Network size

    • Define the variables needed to build the network (number of hidden layers, number of classes, etc.)
  • Define Weights

    • Define the variables (parameters) to be learned
    • These are the variables updated by gradient descent
      • They start from random values drawn from a normal distribution
  • Define Network

  • Define Cost

    • Set up the cost so that the difference between the neural net's output and the label y is minimized

2.3. Define Variable

  • Define the variables needed for training (number of iterations, learning rate, etc.)
In [3]:
# Define Variable
'''
batch_size    : number of training examples per gradient step
learning_rate : step size for gradient descent
n_iter        : total number of training iterations
flag          : print the cost every `flag` iterations
'''

batch_size = 50
learning_rate = 0.1
n_iter = 2500
flag = 250

2.4. Define Network size

  • Define the variables needed to build the network (number of hidden layers, number of classes, etc.)
  • For a CNN, the size of the convolution filters is defined as well
  • Convolution filter size
k1_height = 5
k1_width = 5
  • Channel
k1_channel = 3
  • Downsampling
k1_pool_height = 2
k1_pool_width = 2
In [5]:
# Define Network size

'''
input shape
convolution kernel size
channel
number of hidden units and output classes
'''
input_height = 28
input_width = 28
input_channel = 1

k1_height = 5
k1_width = 5
k1_channel = 3

k1_pool_height = 2
k1_pool_width = 2

k2_height = 5
k2_width = 5
k2_channel = 3

k2_pool_height = 2
k2_pool_width = 2

# conv_result_size is the length of the convolution output flattened into a single vector for the fully connected layer
conv_result_size = 7*7*k2_channel
n_hidden = 50
n_classes = 10
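
The value 7*7 in conv_result_size follows from the sizes above: with padding = 'SAME' and stride 1 the convolutions keep the 28x28 spatial size, and each 2x2 max pooling halves it, 28 -> 14 -> 7. A small illustrative check:

size = input_height              # 28 ('SAME' convolution keeps the spatial size)
size = size // k1_pool_height    # 14 after the first 2x2 max pooling
size = size // k2_pool_height    # 7  after the second 2x2 max pooling
print (size * size * k2_channel) # 147 == conv_result_size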

2.5. Define Weights

  • Define the variables (parameters) to be learned
  • These are the variables updated by gradient descent
    • They start from random values drawn from a normal distribution

In [6]:
# Define Weights

'''
Define the weights and biases that make up the network
'''

weights = {
    'conv1_w' : tf.Variable(tf.random_normal([k1_height, k1_width, input_channel, k1_channel], stddev = 0.1)),
    'conv2_w' : tf.Variable(tf.random_normal([k2_height, k2_width, k1_channel, k2_channel], stddev = 0.1)),
    'fc_w' : tf.Variable(tf.random_normal([conv_result_size, n_hidden], stddev = 0.1)),
    'output_w' : tf.Variable(tf.random_normal([n_hidden, n_classes], stddev = 0.1))
}
biases = {
    'conv1_b' : tf.Variable(tf.random_normal([k1_channel], stddev = 0.1)),
    'conv2_b' : tf.Variable(tf.random_normal([k2_channel], stddev = 0.1)),
    'fc_b' : tf.Variable(tf.random_normal([n_hidden], stddev = 0.1)),
    'output_b' : tf.Variable(tf.random_normal([n_classes], stddev = 0.1))
}
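
Note that tf.nn.conv2d expects its kernel in [filter_height, filter_width, in_channels, out_channels] order, so conv1_w maps the single input channel to k1_channel = 3 feature maps and conv2_w maps those 3 channels to k2_channel = 3. An optional way to check the shapes:

for name in ['conv1_w', 'conv2_w', 'fc_w', 'output_w']:
    print (name, weights[name].get_shape().as_list())
# conv1_w  [5, 5, 1, 3]
# conv2_w  [5, 5, 3, 3]
# fc_w     [147, 50]
# output_w [50, 10]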

2.6. Define Network

1) Convolution layer

  • Stride
  • Padding
conv1 = tf.nn.conv2d(x, weights['conv1_w'], 
                         strides= [1,1,1,1], 
                         padding = 'SAME')

2) Nonlinear activation function

conv1 = tf.nn.relu(conv1 + biases['conv1_b'])

3) Max pooling

  • The maximum of a rectangular neighborhood (max pooling operation)
  • Pooling with downsampling
    • reduce the representation size by a factor of 2, which reduces the computational and statistical burden on the next layer

conv1 = tf.nn.max_pool(conv1, 
                           ksize = [1, k1_pool_height, k1_pool_width, 1], 
                           strides = [1, k1_pool_height, k1_pool_width, 1], 
                           padding ='VALID')
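
As a small numeric illustration (not part of the original notebook), 2x2 max pooling with stride 2 keeps only the largest value in each non-overlapping 2x2 block, halving both spatial dimensions:

a = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [4., 5., 6., 7.]])
# group into 2x2 blocks and take the maximum of each block
pooled = a.reshape(2, 2, 2, 2).max(axis=(1, 3))   # [[4., 8.], [9., 7.]]
print (pooled)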

4) Classification

output_layer = tf.matmul(hidden, weights['output_w']) + biases['output_b']



In [13]:
# Define Network
def net(x, weights, biases):
    #first conv filter layer
    conv1 = tf.nn.conv2d(x, weights['conv1_w'], 
                         strides= [1,1,1,1], 
                         padding = 'SAME')
    #first activate layer
    conv1 = tf.nn.relu(conv1 + biases['conv1_b'])
    #first max pooling layer
    conv1 = tf.nn.max_pool(conv1, 
                           ksize = [1, k1_pool_height, k1_pool_width, 1], 
                           strides = [1, k1_pool_height, k1_pool_width, 1], 
                           padding = 'VALID'
                           )
    
    #second conv filter layer
    conv2 = tf.nn.conv2d(conv1, weights['conv2_w'], 
                         strides= [1,1,1,1], 
                         padding = 'SAME')
    #second activate layer
    conv2 = tf.nn.relu(conv2 + biases['conv2_b'])
    #second max pooling layer
    conv2 = tf.nn.max_pool(conv2, 
                           ksize = [1, k2_pool_height, k2_pool_width, 1], 
                           strides = [1, k2_pool_height, k2_pool_width, 1], 
                           padding = 'VALID'
                           )

    shape = conv2.get_shape().as_list()
    
    # fully connected layer
    conv_result = tf.reshape(conv2, [-1, shape[1]*shape[2]*shape[3]])
    hidden = tf.matmul(conv_result, weights['fc_w']) + biases['fc_b']
    hidden = tf.nn.relu(hidden)
    
    output_layer = tf.matmul(hidden, weights['output_w']) + biases['output_b']
    
    return output_layer
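
For reference, the tensor shapes flowing through net() for a mini-batch of b MNIST images, using the sizes defined above:

# x            : [b, 28, 28, 1]
# conv1 + pool : [b, 14, 14, 3]
# conv2 + pool : [b,  7,  7, 3]
# conv_result  : [b, 147]
# hidden       : [b, 50]
# output_layer : [b, 10]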

2.7. Define Cost

  • Set up the cost so that the difference between the neural net's output and the label y is minimized
In [14]:
# Define Cost

x = tf.placeholder(tf.float32, [None, input_width, input_height, input_channel])
y = tf.placeholder(tf.float32, [None, n_classes])
In [15]:
pred = net(x, weights, biases)
cost = tf.reduce_mean(tf.square(pred - y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
  • The TensorFlow graph built so far
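
As a side note, this notebook minimizes the squared error between the network output and the one-hot label y. For classification, a softmax cross-entropy loss on the logits is more common; a typical alternative using the same placeholders (not what is used below) would be:

# Alternative cost: softmax cross-entropy on the logits
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)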

2.8. Optimize

In [16]:
# Optimize 
init = tf.global_variables_initializer()   # replaces the deprecated tf.initialize_all_variables()
sess = tf.Session()

sess.run(init)

# Training cycle
for epoch in range(n_iter):
    batch_x, batch_y = mnist.train.next_batch(batch_size)
    batch_x = np.reshape(batch_x, [-1, input_width, input_height, input_channel])
    sess.run(optimizer, feed_dict={x: batch_x,  y: batch_y})
    c = sess.run(cost, feed_dict={x: batch_x,  y: batch_y})
    if epoch % flag == 0:
        print ("Iter : {}".format(epoch))
        print ("Cost : {}".format(c))
Iter : 0
Cost : 0.1009894609451294
Iter : 250
Cost : 0.06790834665298462
Iter : 500
Cost : 0.057210423052310944
Iter : 750
Cost : 0.05038468539714813
Iter : 1000
Cost : 0.03442450240254402
Iter : 1250
Cost : 0.03906258940696716
Iter : 1500
Cost : 0.034061431884765625
Iter : 1750
Cost : 0.030880838632583618
Iter : 2000
Cost : 0.03595203161239624
Iter : 2250
Cost : 0.03039967082440853

2.9. Test

In [17]:
test_x, test_y = mnist.test.next_batch(1)

plt.imshow(test_x.reshape(28, 28))
plt.show()
In [18]:
test_x = np.reshape(test_x, [-1, input_width, input_height, input_channel])
predict_x = sess.run(pred, feed_dict={x: test_x})
print (np.argmax(predict_x, 1)[0])
7
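
To evaluate more than a single image, the accuracy over the full test set can be estimated with the arrays loaded in section 2.2 (an optional sketch using the existing testimgs and testlabels):

test_imgs = np.reshape(testimgs, [-1, input_width, input_height, input_channel])
test_pred = sess.run(pred, feed_dict={x: test_imgs})
accuracy = np.mean(np.argmax(test_pred, 1) == np.argmax(testlabels, 1))
print ("Test accuracy : {:.4f}".format(accuracy))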