AI for Mechanical Engineering

Convolutional Neural Networks (CNN)

Problem 1: Sign Language Classification with CNN

As we have learned in the lectures, Convolutional Neural Networks (CNNs) can classify images directly, which makes them applicable to a wide range of industries. In this task, we will use a CNN to classify sign language images. Such a system would let people who do not know sign language communicate directly with those who are hearing impaired.

  1. Load and plot 6 random data points. You need to load a total of 4 files.
In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
# Change file paths if necessary

train_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_train_img.npy')
train_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_train_label.npy')

test_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_test_img.npy')
test_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/sign_language_test_label.npy')
In [ ]:
# check the shape of data

print(train_x.shape)
print(train_y.shape)
print(test_x.shape)
print(test_y.shape)
(900, 100, 100, 1)
(900, 6)
(337, 100, 100, 1)
(337, 6)
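Before training, it is common (though optional) to scale pixel intensities to [0, 1]. A minimal sketch, assuming the arrays store 8-bit intensities in [0, 255]; the random array below is a stand-in for `train_x`:

```python
import numpy as np

# Stand-in for a batch of 8-bit grayscale images shaped like train_x
imgs = np.random.randint(0, 256, size=(4, 100, 100, 1)).astype(np.float32)

# Scale pixel intensities from [0, 255] to [0, 1]
imgs_scaled = imgs / 255.0

print(imgs_scaled.min() >= 0.0, imgs_scaled.max() <= 1.0)
```

The same division would be applied to `test_x` so that train and test inputs share one scale.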
In [ ]:
## your code here
#

np.random.seed(0)
random_indices = np.random.choice(train_x.shape[0], 6, replace=False)

fig, axes = plt.subplots(2, 3, figsize=(9, 6))
for i, ax in enumerate(axes.flatten()):
    ax.imshow(train_x[random_indices[i], :, :, 0], cmap='gray')
    ax.axis('off')
plt.tight_layout()
plt.show()
  2. Design your CNN structure and train it with the training data.
  • input
  • filter size
  • pooling size
  • hidden layer
  • output
In [ ]:
## your code here
#

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(6, activation='softmax')
])
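With the default `'valid'` padding, each 3×3 convolution shrinks the spatial size by 2, and each 2×2 max pooling halves it (rounding down). A quick sketch to verify the size that reaches the `Flatten` layer in the model above:

```python
def conv_out(size, kernel=3):
    # 'valid' convolution, stride 1: output = size - kernel + 1
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling, stride 2: output = floor(size / 2)
    return size // 2

size = 100                  # input is 100x100
for _ in range(3):          # three Conv2D + MaxPooling2D stages
    size = pool_out(conv_out(size))

flat = size * size * 128    # 128 channels after the last Conv2D
print(size, flat)           # → 10 12800
```

So `Flatten` feeds a 12,800-dimensional vector into the 128-unit dense layer; `model.summary()` reports the same shapes.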
In [ ]:
model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
In [ ]:
model.fit(train_x, train_y, epochs = 20)
Epoch 1/20
29/29 [==============================] - 9s 21ms/step - loss: 9.4731 - accuracy: 0.2344
Epoch 2/20
29/29 [==============================] - 0s 11ms/step - loss: 1.4592 - accuracy: 0.3633
Epoch 3/20
29/29 [==============================] - 0s 11ms/step - loss: 1.0346 - accuracy: 0.6089
Epoch 4/20
29/29 [==============================] - 0s 11ms/step - loss: 0.6508 - accuracy: 0.7500
Epoch 5/20
29/29 [==============================] - 0s 11ms/step - loss: 0.4737 - accuracy: 0.8400
Epoch 6/20
29/29 [==============================] - 0s 15ms/step - loss: 0.3264 - accuracy: 0.8711
Epoch 7/20
29/29 [==============================] - 0s 13ms/step - loss: 0.3085 - accuracy: 0.8889
Epoch 8/20
29/29 [==============================] - 0s 11ms/step - loss: 0.2252 - accuracy: 0.9344
Epoch 9/20
29/29 [==============================] - 0s 11ms/step - loss: 0.2044 - accuracy: 0.9189
Epoch 10/20
29/29 [==============================] - 0s 11ms/step - loss: 0.2147 - accuracy: 0.9178
Epoch 11/20
29/29 [==============================] - 0s 11ms/step - loss: 0.1732 - accuracy: 0.9289
Epoch 12/20
29/29 [==============================] - 0s 11ms/step - loss: 0.1674 - accuracy: 0.9544
Epoch 13/20
29/29 [==============================] - 0s 11ms/step - loss: 0.1858 - accuracy: 0.9344
Epoch 14/20
29/29 [==============================] - 0s 11ms/step - loss: 0.0663 - accuracy: 0.9800
Epoch 15/20
29/29 [==============================] - 0s 11ms/step - loss: 0.0831 - accuracy: 0.9633
Epoch 16/20
29/29 [==============================] - 0s 11ms/step - loss: 0.1703 - accuracy: 0.9400
Epoch 17/20
29/29 [==============================] - 0s 11ms/step - loss: 0.1174 - accuracy: 0.9733
Epoch 18/20
29/29 [==============================] - 0s 11ms/step - loss: 0.0940 - accuracy: 0.9656
Epoch 19/20
29/29 [==============================] - 0s 11ms/step - loss: 0.0911 - accuracy: 0.9678
Epoch 20/20
29/29 [==============================] - 0s 11ms/step - loss: 0.0454 - accuracy: 0.9833
Out[ ]:
<keras.src.callbacks.History at 0x7f4f77e74e50>
  3. Test your model. Calculate accuracy and plot a random image with its predicted and true label.
  • Note: test accuracy should be higher than 80%.
In [ ]:
## your code here
#

test_loss, test_acc = model.evaluate(test_x, test_y, verbose=2)
predictions = model.predict(test_x)
11/11 - 0s - loss: 0.2171 - accuracy: 0.9703 - 146ms/epoch - 13ms/step
11/11 [==============================] - 0s 5ms/step
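`model.evaluate` already reports the test accuracy; it can be cross-checked by hand from the softmax outputs with `argmax`. A self-contained sketch where the toy arrays stand in for `predictions` and the one-hot `test_y`:

```python
import numpy as np

# Toy stand-ins: 4 samples, 6 classes
probs = np.array([[0.1, 0.7, 0.05, 0.05, 0.05, 0.05],
                  [0.6, 0.1, 0.1, 0.1, 0.05, 0.05],
                  [0.1, 0.1, 0.1, 0.1, 0.5, 0.1],
                  [0.2, 0.2, 0.3, 0.1, 0.1, 0.1]])
onehot = np.eye(6)[[1, 0, 4, 5]]    # true labels: 1, 0, 4, 5

pred = probs.argmax(axis=1)         # predicted class per sample
true = onehot.argmax(axis=1)        # recover integer labels from one-hot
accuracy = (pred == true).mean()
print(accuracy)                     # → 0.75 (3 of 4 correct)
```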
In [ ]:
## your code here
#

predicted_classes = np.argmax(predictions, axis=1)
true_classes = np.argmax(test_y, axis=1)

random_index = np.random.randint(0, len(test_x))

plt.imshow(test_x[random_index, :, :, 0], cmap='gray')
plt.title(f"Predicted: {predicted_classes[random_index]}, True: {true_classes[random_index]}")
plt.axis('off')
plt.show()

Problem 2: Understanding the Feature Map for Each Layer

Let's build a CNN model that classifies steel surface defects.

  • The CNN performs classification by applying convolution filters learned during training.
  • Let's visualize the feature map of each convolutional layer to understand what the CNN has learned.
  • NEU steel surface defects example.
  • To classify defect images into 6 classes.


Download NEU steel surface defects images and labels

  1. Load and plot 3 random images for each of the 6 classes.
In [ ]:
# Change file paths if necessary

train_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_train_imgs.npy')
train_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_train_labels.npy')

test_x = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_test_imgs.npy')
test_y = np.load('/content/drive/MyDrive/DL_Colab/DL_data/NEU_test_labels.npy')

print(train_x.shape)
print(train_y.shape)
print(test_x.shape)
print(test_y.shape)
(1500, 200, 200, 1)
(1500,)
(300, 200, 200, 1)
(300,)
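Note that these labels are integer class indices (shape `(1500,)`), not one-hot vectors as in Problem 1. When building the classifier, either convert them to one-hot and keep `'categorical_crossentropy'`, or use `'sparse_categorical_crossentropy'` on the integers directly. A one-hot conversion sketch (the toy array stands in for `train_y`):

```python
import numpy as np

labels = np.array([0, 2, 5, 1])    # toy stand-in for integer labels
onehot = np.eye(6)[labels]         # one row of the 6x6 identity per label

print(onehot.shape)                # → (4, 6)
```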
In [ ]:
name = ['scratches', 'rolled-in scale', 'pitted surface', 'patches', 'inclusion', 'crazing']

plt.figure(figsize = (12,7))

## your code here
#

images_per_class = 3
num_classes = len(name)

for j in range(num_classes):
    # Pick 3 random images from class j (the task asks for random samples)
    class_indices = np.random.choice(np.where(train_y == j)[0],
                                     images_per_class, replace=False)
    for i in range(images_per_class):
        # Calculate position for each subplot
        plt_idx = i * num_classes + j + 1
        ax = plt.subplot(images_per_class, num_classes, plt_idx)
        ax.imshow(train_x[class_indices[i], :, :, 0], cmap='gray')
        ax.axis('off')
        ax.axis('off')
        # Set title for the first image in each column
        if i == 0:
            ax.set_title(name[j])

plt.tight_layout()
plt.show()
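In Keras, feature maps are typically read out by wrapping a trained model's layers in a sub-model, e.g. `tf.keras.Model(inputs=model.input, outputs=conv_layer.output)`, and calling it on an image. The underlying operation is just a sliding-window filter. A minimal numpy sketch of one 3×3 `'valid'` convolution producing a feature map; the image and filter values are illustrative, not taken from the trained network:

```python
import numpy as np

def feature_map(img, kernel):
    # 'valid' 2D cross-correlation (no kernel flipping, as in CNNs):
    # slide the kernel over the image, dot product at each position.
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.zeros((8, 8))
img[:, 4:] = 1.0                    # image with a vertical edge
sobel_x = np.array([[-1., 0., 1.],  # horizontal-gradient filter
                    [-2., 0., 2.],
                    [-1., 0., 1.]])

fmap = feature_map(img, sobel_x)
print(fmap.shape)                   # → (6, 6)
```

The map responds only where the window straddles the edge, which is exactly the kind of structure the learned convolutional filters highlight when visualized layer by layer.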