KAIST Industry-Academia Cooperation Open Lecture (산학협동 공개강좌)

Artificial Intelligence and Design: From Analysis Prediction to Design Optimization


Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Practice Aims & Objectives

  1. Implement convolutional neural networks (CNNs) for classification tasks
  2. Perform pre-processing on image data
In [ ]:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import cv2
%matplotlib inline

import sklearn
from sklearn.model_selection import train_test_split

# import warnings
# import os
# from os.path import join

1. Convolution

1.1 1D Convolution
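
A minimal sketch of 1D convolution with NumPy (added here for illustration; the moving-average kernel is an assumed example, not from the lecture): a kernel slides along a signal, and each output sample is a weighted sum of its neighbors.

x = np.array([0., 0., 1., 1., 1., 0., 0.])   # input signal
k = np.ones(3) / 3                           # 3-tap moving-average kernel (assumed example)

# mode='same' keeps the output the same length as the input
y = np.convolve(x, k, mode='same')
print(y)                                     # smoothed copy of x

The same idea extends to two dimensions, which is what CNNs apply to images.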


1.2 Images
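
For reference, an image is just an array of pixel intensities: a grayscale image is 2D (height x width), and a color image adds a channel axis (height x width x 3). A tiny sketch with made-up arrays:

img_gray = np.zeros((32, 32))      # grayscale: (height, width)
img_rgb  = np.zeros((32, 32, 3))   # color: (height, width, channels)
print(img_gray.shape, img_rgb.shape)

The wafer maps used below are single-channel arrays of the same kind.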


1.3 Convolution on Image (= Convolution in 2D)

Filter (or Kernel)

  • Modify or enhance an image by filtering
  • Filter images to emphasize certain features or remove others
  • Filtering includes smoothing, sharpening, and edge enhancement (see the sketch after this list)

  • Discrete convolution can be viewed as multiplication by a matrix
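
A hedged sketch of filtering with OpenCV (cv2 is imported in the first cell); the box-blur and sharpening kernels are standard textbook examples, assumed here rather than taken from the lecture:

img = np.random.rand(64, 64).astype(np.float32)        # stand-in test image

blur_kernel = np.ones((3, 3), np.float32) / 9          # box blur (smoothing)
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], np.float32)  # sharpening

# ddepth = -1 keeps the output depth equal to the input depth
# (filter2D computes correlation; for symmetric kernels this equals convolution)
blurred   = cv2.filter2D(img, -1, blur_kernel)
sharpened = cv2.filter2D(img, -1, sharpen_kernel)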


How to find the right Kernels

  • Many different kernels can be designed to produce specific effects on an image

  • Let’s take the opposite approach

  • Instead of designing the kernel by hand, we learn the kernel from data

  • A deep learning framework lets us learn the feature extractor directly from data (see the sketch below)
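
As a small illustration of "learning the kernel" (a sketch added here, not part of the original notebook), a Keras Conv2D layer stores its kernels as trainable weights that the optimizer updates from data:

conv = tf.keras.layers.Conv2D(8, (3, 3))     # 8 learnable 3x3 kernels
conv.build(input_shape = (None, 32, 32, 1))  # 1 input channel

# kernel weight tensor: (height, width, in_channels, out_channels)
print(conv.kernel.shape)                     # (3, 3, 1, 8)
print(conv.trainable)                        # True: updated by gradient descent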

2. Semiconductor Wafer Failure Maps

2.1 Data Description

  • Wafer failure-map classification for process analysis in the semiconductor field.


2.2 Load Data

In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
In [ ]:
df = pd.read_pickle('/content/drive/MyDrive/tutorials/산학협동강좌/data/wafers.pkl')

# If you downloaded the file via USB instead:
# df = pd.read_pickle('wafers.pkl')
In [ ]:
df.head()
Out[ ]:
waferMap encoded_labels
0 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 0
1 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 0
2 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 1, 2, 1,... 0
3 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 1,... 0
4 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 1, 1, 2,... 0
In [ ]:
print('Number of Center Failures: {}'.format(len(np.where(df.encoded_labels == 0)[0])))
print('Number of Edge-Local Failures: {}'.format(len(np.where(df.encoded_labels == 1)[0])))
print('Number of Edge-Ring Failures: {}'.format(len(np.where(df.encoded_labels == 2)[0])))
print('Number of Local Failures: {}'.format(len(np.where(df.encoded_labels == 3)[0])))
Number of Center Failures: 300
Number of Edge-Local Failures: 300
Number of Edge-Ring Failures: 300
Number of Local Failures: 300

2.3 Image Pre-processing

Different Image Sizes → Image Resizing

In [ ]:
# randomly sample 20 wafer maps to inspect their varying sizes
random_indices = np.random.choice(range(1200), 20, replace=False)

plt.figure(figsize = (10, 10))

for i in range(20):
    ax = plt.subplot(4, 5, i + 1)
    ax.axis('off')
    ax.imshow(df['waferMap'][random_indices[i]])
    ax.set_title('Shape: {}'.format(df['waferMap'][random_indices[i]].shape))

Resizing to a Common Size

In [ ]:
resized_images = []

# resize every wafer map to a common 32 x 32 size
for i in range(len(df)):
    img = df.iloc[i]['waferMap']
    resized_img = cv2.resize(img, (32, 32))
    resized_images.append(resized_img)

df['waferMap_resized'] = resized_images
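
One caveat worth noting (my addition, not from the lecture): the wafer maps hold discrete values (0/1/2 in the preview above), and cv2.resize defaults to bilinear interpolation, which blends them into fractional values. If keeping the labels discrete matters, nearest-neighbor interpolation is a drop-in variation:

# variation: preserve the discrete 0/1/2 values while resizing
resized_img = cv2.resize(img, (32, 32), interpolation = cv2.INTER_NEAREST)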
In [ ]:
plt.figure(figsize = (10, 10))
for i in range(20):
    ax = plt.subplot(4, 5, i + 1)
    ax.axis('off')
    ax.imshow(df['waferMap_resized'][random_indices[i]])
    ax.set_title('Shape: {}'.format(df['waferMap_resized'][random_indices[i]].shape))

Data Split: Train/Test

In [ ]:
all_x = df['waferMap_resized'].values
all_y = df['encoded_labels'].values

train_x, test_x, train_y, test_y = train_test_split(all_x, all_y, test_size = 0.2, random_state = 42)
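
With 300 samples per class, the random split above is usually fine; a stratified split (a suggested variation, not the notebook's code) additionally guarantees that the four classes stay balanced in both sets:

# variation: keep class proportions identical in train and test
train_x, test_x, train_y, test_y = train_test_split(
    all_x, all_y, test_size = 0.2, random_state = 42, stratify = all_y)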
In [ ]:
print(train_y.shape)
print(test_y.shape)
(960,)
(240,)
In [ ]:
# stack into arrays and add a channel axis: Conv2D expects (N, H, W, C)
train_x = np.stack(train_x).reshape(-1, 32, 32, 1)
train_y = np.stack(train_y)

test_x = np.stack(test_x).reshape(-1, 32, 32, 1)
test_y = np.stack(test_y)
In [ ]:
print('Train x : {}, y : {}'.format(train_x.shape, train_y.shape))
print('Test x: {}, y : {}'.format(test_x.shape, test_y.shape))
Train x : (960, 32, 32, 1), y : (960,)
Test x: (240, 32, 32, 1), y : (240,)

2.4 Training and Results

Build CNN Model


In [ ]:
model = tf.keras.models.Sequential([
    # input_shape is only required on the first layer
    tf.keras.layers.Conv2D(32, (3, 3),
                           activation = 'relu',
                           padding = 'SAME',
                           input_shape = (32, 32, 1)),
    tf.keras.layers.MaxPool2D((2, 2)),          # 32x32 -> 16x16
    tf.keras.layers.Conv2D(64, (3, 3),
                           activation = 'relu',
                           padding = 'SAME'),
    tf.keras.layers.MaxPool2D((2, 2)),          # 16x16 -> 8x8
    tf.keras.layers.Flatten(),                  # 8x8x64 -> 4096
    tf.keras.layers.Dense(32, activation = 'relu'),
    tf.keras.layers.Dense(4, activation = 'softmax')
])

model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 32, 32, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2  (None, 16, 16, 32)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 8, 8, 64)          0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 4096)              0         
                                                                 
 dense (Dense)               (None, 32)                131104    
                                                                 
 dense_1 (Dense)             (None, 4)                 132       
                                                                 
=================================================================
Total params: 150052 (586.14 KB)
Trainable params: 150052 (586.14 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
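
As a sanity check, the parameter counts in the summary follow directly from the layer shapes:

  • conv2d: 3 x 3 x 1 weights x 32 filters + 32 biases = 320
  • conv2d_1: 3 x 3 x 32 x 64 + 64 = 18,496
  • dense: 4096 x 32 + 32 = 131,104 (the flattened 8 x 8 x 64 feature map has 4096 entries)
  • dense_1: 32 x 4 + 4 = 132

which sums to the reported 150,052 trainable parameters.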

Training

In [ ]:
model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])
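
'sparse_categorical_crossentropy' fits here because the labels are integer class indices (0 to 3). If the labels were one-hot encoded instead, the equivalent setup would be (a sketch for comparison only, not used below):

# equivalent compile step for one-hot labels (illustration only)
train_y_onehot = tf.keras.utils.to_categorical(train_y, num_classes = 4)
model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])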
In [ ]:
history = model.fit(train_x, train_y,
                    epochs = 10,
                    batch_size = 32)
Epoch 1/10
30/30 [==============================] - 5s 108ms/step - loss: 1.2703 - accuracy: 0.4500
Epoch 2/10
30/30 [==============================] - 3s 103ms/step - loss: 0.8890 - accuracy: 0.6271
Epoch 3/10
30/30 [==============================] - 3s 83ms/step - loss: 0.7140 - accuracy: 0.6969
Epoch 4/10
30/30 [==============================] - 3s 85ms/step - loss: 0.5432 - accuracy: 0.7844
Epoch 5/10
30/30 [==============================] - 2s 83ms/step - loss: 0.4355 - accuracy: 0.8625
Epoch 6/10
30/30 [==============================] - 3s 99ms/step - loss: 0.3500 - accuracy: 0.8729
Epoch 7/10
30/30 [==============================] - 3s 106ms/step - loss: 0.3065 - accuracy: 0.9031
Epoch 8/10
30/30 [==============================] - 3s 84ms/step - loss: 0.2884 - accuracy: 0.8896
Epoch 9/10
30/30 [==============================] - 3s 84ms/step - loss: 0.2109 - accuracy: 0.9219
Epoch 10/10
30/30 [==============================] - 3s 93ms/step - loss: 0.1928 - accuracy: 0.9365
In [ ]:
history.history.keys()
Out[ ]:
dict_keys(['loss', 'accuracy'])
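
Only training-set metrics appear because fit was called without validation data. A variation (my suggestion, not the notebook's code) that also tracks held-out performance per epoch, adding 'val_loss' and 'val_accuracy' to the history:

# variation: hold out 20% of the training data for per-epoch validation
history = model.fit(train_x, train_y,
                    epochs = 10,
                    batch_size = 32,
                    validation_split = 0.2)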

Learning Curve

In [ ]:
# accuracy plot
plt.plot(history.history['accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()

# loss plot
plt.plot(history.history['loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()

Test Result

In [ ]:
test_loss, test_acc = model.evaluate(test_x, test_y, verbose = 2)

print('loss = {}, Accuracy = {} %'.format(test_loss, test_acc * 100))
8/8 - 0s - loss: 0.3378 - accuracy: 0.8625 - 294ms/epoch - 37ms/step
loss = 0.3377847969532013, Accuracy = 86.2500011920929 %
In [ ]:
pred_y = model.predict_on_batch(test_x[:4])

for i in range(4):
    plt.figure(figsize = (8, 4))
    plt.subplot(1, 2, 1)
    plt.imshow(test_x[i].reshape(32, 32))
    plt.axis('off')

    plt.subplot(1, 2, 2)
    plt.stem(pred_y[i])
    plt.xticks(np.arange(4), ['Center', 'Edge-Loc', 'Edge-Ring', 'Loc'])  # Set text labels.
    plt.show()
    print('Prediction : {}'.format(np.argmax(pred_y[i])))
    print('Probability : {}'.format(pred_y[i]))
Prediction : 3
Probability : [0.00114925 0.259371   0.0041999  0.73527986]
Prediction : 2
Probability : [2.2722630e-05 3.2596304e-03 9.9552864e-01 1.1890426e-03]
Prediction : 0
Probability : [9.9995279e-01 2.6666816e-05 1.3303502e-05 7.1745517e-06]
Prediction : 1
Probability : [0.00144468 0.9030464  0.02681382 0.06869499]

Confusion Matrix

In [ ]:
import seaborn as sns
from sklearn.metrics import confusion_matrix

# predict class probabilities, then take the argmax as the predicted label
pred_y = model.predict(test_x)
pred_y = np.argmax(pred_y, axis = 1)

# normalize = 'pred' normalizes each column, i.e., over the predicted class
result = confusion_matrix(test_y, pred_y, normalize = 'pred')
print(result)
8/8 [==============================] - 0s 22ms/step
[[0.92537313 0.         0.         0.        ]
 [0.         0.69512195 0.03389831 0.03125   ]
 [0.         0.03658537 0.96610169 0.        ]
 [0.07462687 0.26829268 0.         0.96875   ]]
In [ ]:
plt.figure(figsize = (6, 5))

heatmap = sns.heatmap(result, annot = True)
heatmap.yaxis.set_ticklabels(['Center', 'Edge-Loc', 'Edge-Ring', 'Loc'], fontsize = 10)
heatmap.xaxis.set_ticklabels(['Center', 'Edge-Loc', 'Edge-Ring', 'Loc'], fontsize = 10)
plt.ylabel('True label', fontsize = 13)
plt.xlabel('Predicted label', fontsize = 13)

plt.show()
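
As a complementary per-class summary (an addition, not in the original notebook), scikit-learn's classification_report lists precision, recall, and F1 for each failure type:

from sklearn.metrics import classification_report

# pred_y holds the argmax class labels computed above
print(classification_report(test_y, pred_y,
                            target_names = ['Center', 'Edge-Loc', 'Edge-Ring', 'Loc']))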