Deep Learning for Mechanical Engineering

Homework 06

Due Friday, 10/13/2023, 23:59


Instructor: Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST
  • For handwritten solutions, scan or photograph your work. Alternatively, you may write the solutions in markdown.

  • Only .ipynb files will be graded for your code.

    • Ensure that your NAME and student ID are included in your .ipynb file names, e.g., IljeokKim_20202467_HW06.ipynb
  • Compress all the files into a single .zip file.

    • In the .zip file's name, include your NAME and student ID, e.g., DogyeomPark_20202467_HW06.zip
    • Submit this .zip file on KLMS.
  • Do not submit a printed version of your code, as it will not be graded.

Problem 1: Sign Language Classification with CNN

As we have learned in the lectures, Convolutional Neural Networks (CNNs) can classify images directly, which makes them applicable to a wide range of industries. In this task, we will use a CNN to classify sign language so that people who are not familiar with sign language can communicate directly with those who are hearing impaired.

(1) Load and plot 6 random data points. You need to load a total of 4 files.

In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
# Change file paths if necessary

train_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_train_img.npy')
train_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_train_label.npy')

test_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_test_img.npy')
test_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_test_label.npy')
In [ ]:
# check the shape of the data

print(train_x.shape)
print(train_y.shape)
print(test_x.shape)
print(test_y.shape)
(900, 100, 100, 1)
(900, 6)
(337, 100, 100, 1)
(337, 6)
In [ ]:
## your code here
#
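
One possible body for the cell above, as a hedged sketch: draw six random indices from the training set and plot each image with its label, decoded from one-hot via argmax. The 2-by-3 grid layout is an assumption.

idx = np.random.choice(train_x.shape[0], 6, replace = False)

plt.figure(figsize = (12, 8))
for i, j in enumerate(idx):
    plt.subplot(2, 3, i + 1)
    plt.imshow(train_x[j].reshape(100, 100), 'gray')
    plt.title('Label: {}'.format(np.argmax(train_y[j])))   # one-hot -> class index
    plt.axis('off')
plt.show()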

(2) Design your CNN structure and train it with the training data.

  • input
  • filter size
  • pooling size
  • hidden layer
  • output
In [ ]:
## your code here
#
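
One possible architecture, as a hedged sketch: two convolution blocks followed by a dense classifier. The filter counts, 3x3 kernels, and 2x2 pooling below are assumptions; any design that meets the accuracy target in the test step is acceptable. The compile and fit cells that follow assume the model object is named model.

model = tf.keras.models.Sequential([
    # input: (100, 100, 1) grayscale sign language images
    tf.keras.layers.Conv2D(32, (3, 3), activation = 'relu', padding = 'same',
                           input_shape = (100, 100, 1)),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Conv2D(64, (3, 3), activation = 'relu', padding = 'same'),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(6, activation = 'softmax')   # 6 classes, one-hot labels
])

model.summary()
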
In [ ]:
model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
In [ ]:
model.fit(train_x, train_y, epochs = 20)
Epoch 1/20
29/29 [==============================] - 6s 157ms/step - loss: 32.3575 - accuracy: 0.2044
Epoch 2/20
29/29 [==============================] - 3s 110ms/step - loss: 4.9883 - accuracy: 0.4733
Epoch 3/20
29/29 [==============================] - 3s 95ms/step - loss: 2.0535 - accuracy: 0.6544
Epoch 4/20
29/29 [==============================] - 3s 94ms/step - loss: 1.2098 - accuracy: 0.7400
Epoch 5/20
29/29 [==============================] - 3s 101ms/step - loss: 0.8137 - accuracy: 0.8189
Epoch 6/20
29/29 [==============================] - 5s 170ms/step - loss: 0.8209 - accuracy: 0.8256
Epoch 7/20
29/29 [==============================] - 3s 94ms/step - loss: 0.5930 - accuracy: 0.8622
Epoch 8/20
29/29 [==============================] - 3s 94ms/step - loss: 0.3508 - accuracy: 0.9111
Epoch 9/20
29/29 [==============================] - 3s 95ms/step - loss: 0.3858 - accuracy: 0.8978
Epoch 10/20
29/29 [==============================] - 3s 120ms/step - loss: 0.3434 - accuracy: 0.9156
Epoch 11/20
29/29 [==============================] - 4s 143ms/step - loss: 0.2289 - accuracy: 0.9233
Epoch 12/20
29/29 [==============================] - 3s 95ms/step - loss: 0.2852 - accuracy: 0.9200
Epoch 13/20
29/29 [==============================] - 3s 94ms/step - loss: 0.2753 - accuracy: 0.9189
Epoch 14/20
29/29 [==============================] - 3s 95ms/step - loss: 0.2312 - accuracy: 0.9333
Epoch 15/20
29/29 [==============================] - 4s 147ms/step - loss: 0.1573 - accuracy: 0.9511
Epoch 16/20
29/29 [==============================] - 6s 203ms/step - loss: 0.1503 - accuracy: 0.9533
Epoch 17/20
29/29 [==============================] - 3s 98ms/step - loss: 0.2285 - accuracy: 0.9322
Epoch 18/20
29/29 [==============================] - 3s 94ms/step - loss: 0.1649 - accuracy: 0.9522
Epoch 19/20
29/29 [==============================] - 5s 160ms/step - loss: 0.0878 - accuracy: 0.9689
Epoch 20/20
29/29 [==============================] - 3s 104ms/step - loss: 0.0619 - accuracy: 0.9767
Out[ ]:
<keras.src.callbacks.History at 0x7caf404de950>

(3) Test your model. Calculate accuracy and plot a random image with its predicted and true label.

  • Note: test accuracy should be higher than 80%.
In [ ]:
## your code here
#
11/11 [==============================] - 1s 37ms/step - loss: 0.7458 - accuracy: 0.8783
loss = 0.7458071708679199, Accuracy = 87.83382773399353 %
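
A sketch of one possible body for the cell above, consistent with its printed output:

test_loss, test_acc = model.evaluate(test_x, test_y)
print('loss = {}, Accuracy = {} %'.format(test_loss, test_acc * 100))
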
In [ ]:
## your code here
#
1/1 [==============================] - 0s 105ms/step
True: 4
Predict: 4
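
One possible body for the cell above, as a hedged sketch: pick one random test image, display it, and print the true and predicted class indices.

idx = np.random.choice(test_x.shape[0], 1)
pred = model.predict(test_x[idx])

plt.figure(figsize = (4, 4))
plt.imshow(test_x[idx].reshape(100, 100), 'gray')
plt.axis('off')
plt.show()

print('True: {}'.format(np.argmax(test_y[idx])))
print('Predict: {}'.format(np.argmax(pred)))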

Problem 2: Understanding the Feature Map for Each Layer

Let's build a CNN model that classifies steel surface defects.

  • To perform classification, the CNN is trained with convolution filters.
  • Let's visualize the feature map of each convolutional layer to understand how the CNN reaches its decision.
  • We use the NEU steel surface defects dataset as an example.
  • The goal is to classify defect images into 6 classes.



Download NEU steel surface defects images and labels

(1) Load and plot 3 random images for each of the 6 classes.

In [ ]:
# Change file paths if necessary

train_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_train_imgs.npy')
train_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_train_labels.npy')

test_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_test_imgs.npy')
test_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_test_labels.npy')

print(train_x.shape)
print(train_y.shape)
print(test_x.shape)
print(test_y.shape)
(1500, 200, 200, 1)
(1500,)
(300, 200, 200, 1)
(300,)
In [ ]:
name = ['scratches', 'rolled-in scale', 'pitted surface', 'patches', 'inclusion', 'crazing']

plt.figure(figsize = (12,7))

## your code here
#
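
One possible body for the placeholder above, shown as a hedged sketch. It continues the cell, which has already created the figure, and assumes that integer label c corresponds to name[c].

for c in range(6):
    # sample 3 images of class c without replacement
    idx = np.random.choice(np.where(train_y == c)[0], 3, replace = False)
    for i, j in enumerate(idx):
        plt.subplot(3, 6, 6 * i + c + 1)   # 3 rows, one column per class
        plt.imshow(train_x[j].reshape(200, 200), 'gray')
        plt.title(name[c])
        plt.axis('off')
plt.tight_layout()
plt.show()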

(2) Design your CNN structure and train it with the training data.

  • input
  • filter size
  • pooling size
  • hidden layer
  • output

Note:

  • Check the data shape. (The CNN input is an image!)
  • Check the y-label shape. (Here the labels are integers, not one-hot vectors, so the compile cell below uses a sparse loss.)
  • Construct 5 convolution blocks (each a conv layer followed by a pooling layer).
In [ ]:
## your code here
#
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_2 (Conv2D)           (None, 200, 200, 32)      320       
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 100, 100, 32)      0         
 g2D)                                                            
                                                                 
 conv2d_3 (Conv2D)           (None, 100, 100, 64)      18496     
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 50, 50, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_4 (Conv2D)           (None, 50, 50, 128)       73856     
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 25, 25, 128)       0         
 g2D)                                                            
                                                                 
 conv2d_5 (Conv2D)           (None, 25, 25, 256)       295168    
                                                                 
 max_pooling2d_5 (MaxPoolin  (None, 12, 12, 256)       0         
 g2D)                                                            
                                                                 
 conv2d_6 (Conv2D)           (None, 12, 12, 512)       1180160   
                                                                 
 max_pooling2d_6 (MaxPoolin  (None, 6, 6, 512)         0         
 g2D)                                                            
                                                                 
 flatten_1 (Flatten)         (None, 18432)             0         
                                                                 
 dense_2 (Dense)             (None, 128)               2359424   
                                                                 
 dense_3 (Dense)             (None, 6)                 774       
                                                                 
=================================================================
Total params: 3928198 (14.98 MB)
Trainable params: 3928198 (14.98 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
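
As a hedged sketch, the following Sequential model reproduces the summary above. The 3x3 kernels and 'same' padding follow from the parameter counts and output shapes; the ReLU and softmax activations are assumptions.

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation = 'relu', padding = 'same',
                           input_shape = (200, 200, 1)),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Conv2D(64, (3, 3), activation = 'relu', padding = 'same'),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Conv2D(128, (3, 3), activation = 'relu', padding = 'same'),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Conv2D(256, (3, 3), activation = 'relu', padding = 'same'),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Conv2D(512, (3, 3), activation = 'relu', padding = 'same'),
    tf.keras.layers.MaxPool2D((2, 2)),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(6, activation = 'softmax')   # 6 defect classes
])

model.summary()
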
In [ ]:
model.compile(optimizer = tf.keras.optimizers.Adam(0.001),
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])
In [ ]:
model.fit(train_x, train_y, epochs = 20)
Epoch 1/20
47/47 [==============================] - 227s 5s/step - loss: 1.6457 - accuracy: 0.2753
Epoch 2/20
47/47 [==============================] - 217s 5s/step - loss: 0.8876 - accuracy: 0.6693
Epoch 3/20
47/47 [==============================] - 206s 4s/step - loss: 0.5415 - accuracy: 0.8087
Epoch 4/20
47/47 [==============================] - 210s 4s/step - loss: 0.2644 - accuracy: 0.9153
Epoch 5/20
47/47 [==============================] - 211s 5s/step - loss: 0.4374 - accuracy: 0.8687
Epoch 6/20
47/47 [==============================] - 206s 4s/step - loss: 0.2087 - accuracy: 0.9287
Epoch 7/20
47/47 [==============================] - 204s 4s/step - loss: 0.1840 - accuracy: 0.9373
Epoch 8/20
47/47 [==============================] - 206s 4s/step - loss: 0.1260 - accuracy: 0.9573
Epoch 9/20
47/47 [==============================] - 205s 4s/step - loss: 0.1366 - accuracy: 0.9493
Epoch 10/20
47/47 [==============================] - 204s 4s/step - loss: 0.1638 - accuracy: 0.9447
Epoch 11/20
47/47 [==============================] - 204s 4s/step - loss: 0.0993 - accuracy: 0.9693
Epoch 12/20
47/47 [==============================] - 205s 4s/step - loss: 0.0773 - accuracy: 0.9720
Epoch 13/20
47/47 [==============================] - 206s 4s/step - loss: 0.1709 - accuracy: 0.9493
Epoch 14/20
47/47 [==============================] - 203s 4s/step - loss: 0.1172 - accuracy: 0.9640
Epoch 15/20
47/47 [==============================] - 206s 4s/step - loss: 0.0822 - accuracy: 0.9740
Epoch 16/20
47/47 [==============================] - 202s 4s/step - loss: 0.0830 - accuracy: 0.9780
Epoch 17/20
47/47 [==============================] - 206s 4s/step - loss: 0.0608 - accuracy: 0.9780
Epoch 18/20
47/47 [==============================] - 204s 4s/step - loss: 0.0626 - accuracy: 0.9807
Epoch 19/20
47/47 [==============================] - 206s 4s/step - loss: 0.0641 - accuracy: 0.9773
Epoch 20/20
47/47 [==============================] - 203s 4s/step - loss: 0.0466 - accuracy: 0.9840
Out[ ]:
<keras.src.callbacks.History at 0x7caf49a47850>

(3) Test your model. Compute accuracy and plot a random image with its predicted and true label.

  • Note: test accuracy should be higher than 90%.
In [ ]:
## your code here
#
10/10 [==============================] - 19s 2s/step - loss: 0.1601 - accuracy: 0.9467
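
One possible body for the cell above (model.evaluate prints the progress line shown):

test_loss, test_acc = model.evaluate(test_x, test_y)
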
In [ ]:
## your code here
#
Prediction : inclusion
True Label : inclusion
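
One possible body for the cell above, as a hedged sketch: pick a random test image and report its predicted and true class names via the name list defined earlier.

idx = np.random.choice(test_x.shape[0], 1)
pred = model.predict(test_x[idx])

plt.figure(figsize = (4, 4))
plt.imshow(test_x[idx].reshape(200, 200), 'gray')
plt.axis('off')
plt.show()

print('Prediction : {}'.format(name[np.argmax(pred)]))
print('True Label : {}'.format(name[int(test_y[idx][0])]))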

(4) Visualize the feature maps of convolutional layers 1 through 5.

  • Use the first image from the test dataset.
  • Visualize up to 6 channels per layer.
In [ ]:
outputs = [layer.output for layer in model.layers]

outputs
Out[ ]:
[<KerasTensor: shape=(None, 200, 200, 32) dtype=float32 (created by layer 'conv2d_2')>,
 <KerasTensor: shape=(None, 100, 100, 32) dtype=float32 (created by layer 'max_pooling2d_2')>,
 <KerasTensor: shape=(None, 100, 100, 64) dtype=float32 (created by layer 'conv2d_3')>,
 <KerasTensor: shape=(None, 50, 50, 64) dtype=float32 (created by layer 'max_pooling2d_3')>,
 <KerasTensor: shape=(None, 50, 50, 128) dtype=float32 (created by layer 'conv2d_4')>,
 <KerasTensor: shape=(None, 25, 25, 128) dtype=float32 (created by layer 'max_pooling2d_4')>,
 <KerasTensor: shape=(None, 25, 25, 256) dtype=float32 (created by layer 'conv2d_5')>,
 <KerasTensor: shape=(None, 12, 12, 256) dtype=float32 (created by layer 'max_pooling2d_5')>,
 <KerasTensor: shape=(None, 12, 12, 512) dtype=float32 (created by layer 'conv2d_6')>,
 <KerasTensor: shape=(None, 6, 6, 512) dtype=float32 (created by layer 'max_pooling2d_6')>,
 <KerasTensor: shape=(None, 18432) dtype=float32 (created by layer 'flatten_1')>,
 <KerasTensor: shape=(None, 128) dtype=float32 (created by layer 'dense_2')>,
 <KerasTensor: shape=(None, 6) dtype=float32 (created by layer 'dense_3')>]
In [ ]:
test_img = test_x[0]

plt.figure(figsize = (4,4))
plt.imshow(test_img.reshape(200, 200), 'gray')
plt.axis('off')
plt.show()
In [ ]:
## your code here
#
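
A hedged sketch of one approach: wrap the stored layer outputs in a multi-output model, run the first test image through it, and plot the first 6 channels of each convolutional layer's activation. The conv layer positions (0, 2, 4, 6, 8 in the outputs list above) match this architecture; the figure layout is an assumption.

activation_model = tf.keras.models.Model(inputs = model.input, outputs = outputs)
activations = activation_model.predict(test_img.reshape(1, 200, 200, 1))

conv_idx = [0, 2, 4, 6, 8]   # positions of the 5 conv layers in outputs

for l, layer_idx in enumerate(conv_idx):
    fmap = activations[layer_idx]
    plt.figure(figsize = (12, 2))
    for c in range(6):   # visualize up to 6 channels per layer
        plt.subplot(1, 6, c + 1)
        plt.imshow(fmap[0, :, :, c], 'gray')
        plt.axis('off')
    plt.suptitle('Feature maps of conv layer {}'.format(l + 1))
    plt.show()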