Machine Learning for Mechanical Engineering

Final Exam: Part II

06/10/2024, 8:00 PM to 10:00 PM (120 minutes)


Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Problem 01

(a) Suppose you have a dataset as depicted in the figure below. Can the perceptron algorithm classify this dataset?

  • If yes, explain how the perceptron algorithm can be applied to solve this problem.
  • If not, explain the reasons why the perceptron algorithm cannot be used for this problem.
In [ ]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline
In [ ]:
# data generation (provided)

N = 250
n_class = 3

X, Y = [], []
r = np.linspace(0, 1, N)

for i in range(n_class):
    theta = np.linspace(4*i, 4*(i+1), N) + 0.2*np.random.randn(N)
    xp = np.array([r*np.sin(theta), r*np.cos(theta)])
    yp = i*np.ones(N)
    X.append(xp)
    Y.append(yp)

train_x = np.concatenate(X, axis = 1).T
train_y = np.concatenate(Y)

plt.figure(figsize = (6, 6))
plt.plot(train_x[train_y == 0,0], train_x[train_y == 0,1], 'r.', alpha = 0.4, label = 'Class 0')
plt.plot(train_x[train_y == 1,0], train_x[train_y == 1,1], 'g.', alpha = 0.4, label = 'Class 1')
plt.plot(train_x[train_y == 2,0], train_x[train_y == 2,1], 'b.', alpha = 0.4, label = 'Class 2')
plt.title('Training Data')
plt.legend(loc = 'lower right')
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.grid(alpha = 0.3)
plt.show()
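
Regarding part (a), one optional empirical check (a sketch, not part of the provided exam code) is to fit a purely linear multiclass classifier, a single softmax layer, as a stand-in for the perceptron. Its training accuracy stays well below 100%, consistent with the spirals not being linearly separable; `linear_model` is a name introduced here for illustration.

In [ ]:
# Optional sketch: a single softmax layer is a purely linear multiclass
# classifier. Low training accuracy suggests that no linear decision
# boundary (and hence no perceptron) can separate the three spirals.
linear_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units = 3, activation = 'softmax', input_shape = (2,))
])
linear_model.compile(optimizer = tf.keras.optimizers.Adam(0.01),
                     loss = 'sparse_categorical_crossentropy',
                     metrics = ['accuracy'])
linear_model.fit(train_x, train_y, epochs = 100, verbose = 0)
print(linear_model.evaluate(train_x, train_y, verbose = 0))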

(b) In this problem, we develop an ANN model for this multiclass, non-linear classification problem.

In [ ]:
print(train_x.shape)
print(train_y.shape)
(750, 2)
(750,)
In [ ]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units = 100, activation = 'relu', input_shape = (2,)),
    tf.keras.layers.Dense(units = 100, activation = 'relu'),
    tf.keras.layers.Dense(units = 3, activation = 'softmax')
])
In [ ]:
model.compile(optimizer = tf.keras.optimizers.Adam(0.001),
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])
In [ ]:
model.fit(train_x, train_y, epochs = 50, verbose = 0)

Let's evaluate the accuracy on a new test dataset.

In [ ]:
## generate a new test dataset (provided)

m = 100
n_class = 3

X, Y = [], []
r = np.linspace(0, 1, m)

for i in range(n_class):
    theta = np.linspace(4*i, 4*(i+1), m) + 0.2*np.random.randn(m)
    xp = np.array([r*np.sin(theta), r*np.cos(theta)])
    yp = i*np.ones(m)
    X.append(xp)
    Y.append(yp)

test_x = np.concatenate(X, axis = 1).T
test_y = np.concatenate(Y)

plt.figure(figsize = (6, 6))
plt.title('Test Data')
plt.plot(test_x[test_y == 0,0], test_x[test_y == 0,1], 'b.')
plt.plot(test_x[test_y == 1,0], test_x[test_y == 1,1], 'b.')
plt.plot(test_x[test_y == 2,0], test_x[test_y == 2,1], 'b.')
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.grid(alpha = 0.3)
plt.show()

(c) Calculate the accuracy on the test dataset and plot the test points with their predicted classes.

In [ ]:
test_loss, test_acc = model.evaluate(test_x, test_y)
10/10 [==============================] - 1s 6ms/step - loss: 0.0805 - accuracy: 0.9867
In [ ]:
predict = model.predict(test_x)
my_pred = np.argmax(predict, axis = 1)

plt.figure(figsize = (6, 6))
plt.plot(test_x[my_pred == 0,0], test_x[my_pred == 0,1], 'r.', alpha = 0.4, label = 'Class 0')
plt.plot(test_x[my_pred == 1,0], test_x[my_pred == 1,1], 'g.', alpha = 0.4, label = 'Class 1')
plt.plot(test_x[my_pred == 2,0], test_x[my_pred == 2,1], 'b.', alpha = 0.4, label = 'Class 2')
plt.legend(loc = 'lower right')
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.grid(alpha = 0.3)
plt.show()
10/10 [==============================] - 0s 3ms/step
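
To see what the trained network has learned, one can optionally visualize its decision regions over a grid of inputs (a sketch; the grid resolution and plot styling are choices made here, not specified by the problem):

In [ ]:
# Optional sketch: evaluate the trained model on a dense grid to
# visualize the (non-linear) decision regions it has learned.
x1 = np.linspace(-1.1, 1.1, 200)
x2 = np.linspace(-1.1, 1.1, 200)
X1, X2 = np.meshgrid(x1, x2)
grid = np.hstack([X1.reshape(-1, 1), X2.reshape(-1, 1)])
grid_pred = np.argmax(model.predict(grid, verbose = 0), axis = 1)

plt.figure(figsize = (6, 6))
plt.contourf(X1, X2, grid_pred.reshape(X1.shape), alpha = 0.2)
plt.plot(test_x[test_y == 0,0], test_x[test_y == 0,1], 'r.', alpha = 0.4)
plt.plot(test_x[test_y == 1,0], test_x[test_y == 1,1], 'g.', alpha = 0.4)
plt.plot(test_x[test_y == 2,0], test_x[test_y == 2,1], 'b.', alpha = 0.4)
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.grid(alpha = 0.3)
plt.show()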

Problem 02

Anomaly detection is a technique used to identify unusual patterns, called outliers, that do not conform to expected behavior. Typically, this is treated as an unsupervised learning problem: the anomalous samples are not known a priori, and the majority of the training dataset is assumed to consist of "normal" data.

In the context of the MNIST dataset, let's consider digit 7 as representative of "normal" data. We will exclusively train an autoencoder using this "normal" data to enable the model to capture the features inherent to this normal data. As a result, it is reasonable to expect that when anomalous data is input into this trained autoencoder, the reconstruction performance will not be ideal. In other words, the reconstruction error, often referred to as the anomaly score, for anomalous data will likely be greater than that for normal data.
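
Concretely, with the mean squared error as the anomaly score (as specified later in this problem), an input image $x \in \mathbb{R}^{784}$ with reconstruction $\hat{x}$ receives the score

$$\text{score}(x) = \frac{1}{784} \sum_{j=1}^{784} \left( x_j - \hat{x}_j \right)^2,$$

and a score above a chosen threshold flags the image as anomalous.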

In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
In [ ]:
# Load Data

(train_imgs, train_labels), (test_imgs, test_labels) = tf.keras.datasets.mnist.load_data()
train_imgs, test_imgs = train_imgs.reshape(-1, 784)/255.0, test_imgs.reshape(-1, 784)/255.0
  1. Designate digit 7 as representative of "normal" data. Extract all occurrences of digit 7 from train_imgs and test_imgs, creating new datasets named normal_train_imgs and normal_test_imgs, respectively.
In [ ]:
normal_train_imgs = train_imgs[np.where(train_labels == 7)]
normal_test_imgs = test_imgs[np.where(test_labels == 7)]
  2. Design an autoencoder model and train it using the normal_train_imgs dataset.
In [ ]:
# Define Structure

# Encoder Structure
encoder = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units = 500, activation = 'relu', input_shape = (784,)),
    tf.keras.layers.Dense(units = 300, activation = 'relu'),
    tf.keras.layers.Dense(units = 2, activation = None)
    ])

# Decoder Structure
decoder = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units = 300, activation = 'relu', input_shape = (2,)),
    tf.keras.layers.Dense(units = 500, activation = 'relu'),
    tf.keras.layers.Dense(units = 28*28, activation = None)
    ])

# Autoencoder = Encoder + Decoder
autoencoder = tf.keras.models.Sequential([encoder, decoder])
In [ ]:
autoencoder.compile(optimizer = tf.keras.optimizers.Adam(0.001),
                    loss = 'mean_squared_error')
In [ ]:
# Train Model & Evaluate Test Data

training = autoencoder.fit(normal_train_imgs, normal_train_imgs, batch_size = 50, epochs = 10)
Epoch 1/10
126/126 [==============================] - 11s 8ms/step - loss: 0.0385
Epoch 2/10
126/126 [==============================] - 1s 7ms/step - loss: 0.0321
Epoch 3/10
126/126 [==============================] - 1s 8ms/step - loss: 0.0308
Epoch 4/10
126/126 [==============================] - 1s 6ms/step - loss: 0.0292
Epoch 5/10
126/126 [==============================] - 1s 7ms/step - loss: 0.0281
Epoch 6/10
126/126 [==============================] - 1s 11ms/step - loss: 0.0276
Epoch 7/10
126/126 [==============================] - 1s 11ms/step - loss: 0.0270
Epoch 8/10
126/126 [==============================] - 1s 11ms/step - loss: 0.0268
Epoch 9/10
126/126 [==============================] - 1s 8ms/step - loss: 0.0265
Epoch 10/10
126/126 [==============================] - 1s 8ms/step - loss: 0.0264
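
Since the return value of fit was stored in training, the training loss curve can optionally be inspected (a quick sketch using the Keras History object):

In [ ]:
# Optional: plot the training loss recorded in the History object
# returned by fit (stored above as `training`).
plt.figure(figsize = (6, 4))
plt.plot(training.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Training loss (MSE)')
plt.grid(alpha = 0.3)
plt.show()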
  3. Randomly select an image from normal_test_imgs and display its reconstructed image.
In [ ]:
idx = np.random.randint(normal_test_imgs.shape[0])
test_img = normal_test_imgs[idx]

reconst_img = autoencoder.predict(test_img.reshape(1, 28*28))

plt.figure(figsize = (8, 4))
plt.subplot(1,2,1)
plt.imshow(test_img.reshape(28,28), 'gray')
plt.title('Input Image', fontsize = 12)
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(reconst_img.reshape(28,28), 'gray')
plt.title('Reconstructed Image', fontsize = 12)
plt.axis('off')
plt.show()
1/1 [==============================] - 0s 375ms/step
  4. Digit 5 will be considered representative of "abnormal" data. Extract all occurrences of digit 5 from test_imgs, creating a new dataset named abnormal_test_imgs.
In [ ]:
# Use digit 5 (treated as abnormal data)

abnormal_test_imgs = test_imgs[np.where(test_labels == 5)]
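
One optional way to anticipate why reconstructions of digit 5 degrade is to inspect the 2-dimensional latent codes produced by encoder for normal versus abnormal test images (a sketch; the axis labels $Z_1$, $Z_2$ are chosen here):

In [ ]:
# Optional sketch: project normal and abnormal test images into the
# 2-D latent space learned by the encoder.
normal_latent = encoder.predict(normal_test_imgs, verbose = 0)
abnormal_latent = encoder.predict(abnormal_test_imgs, verbose = 0)

plt.figure(figsize = (6, 6))
plt.plot(normal_latent[:,0], normal_latent[:,1], 'b.', alpha = 0.3, label = 'Normal (7)')
plt.plot(abnormal_latent[:,0], abnormal_latent[:,1], 'r.', alpha = 0.3, label = 'Abnormal (5)')
plt.xlabel('$Z_1$')
plt.ylabel('$Z_2$')
plt.legend()
plt.grid(alpha = 0.3)
plt.show()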
  5. Randomly select one image from the abnormal_test_imgs dataset and display its reconstructed image.
  • Note that the reconstruction quality is worse than for normal data. This is expected: the autoencoder was trained exclusively on normal data, so its ability to reconstruct abnormal data is limited.
In [ ]:
idx = np.random.randint(abnormal_test_imgs.shape[0])
abnormal_test_img = abnormal_test_imgs[idx]

reconst_img = autoencoder.predict(abnormal_test_img.reshape(1, 28*28))

plt.figure(figsize = (8, 4))
plt.subplot(1,2,1)
plt.imshow(abnormal_test_img.reshape(28,28), 'gray')
plt.title('Input Image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(reconst_img.reshape(28,28), 'gray')
plt.title('Reconstructed Image')
plt.axis('off')
plt.show()
1/1 [==============================] - 0s 100ms/step
  6. Compute and plot the anomaly scores for the first 200 normal test images and the first 200 abnormal test images: calculate the reconstruction error (anomaly score) for each image, then plot the scores to visually identify anomalies among the test images.
  • Use the mean squared error as the anomaly score
  • After computing the anomaly scores for both normal and abnormal test images, you can set a threshold to differentiate between normal and abnormal images based on the anomaly scores. The threshold will determine what level of anomaly score is considered "anomalous."
  • Select a threshold value based on your visual inspection.
In [ ]:
# Per-image reconstruction error (anomaly score): calling model.evaluate
# with the image as both input and target returns the MSE loss for it.

normal_err = []
abnormal_err = []

for i in range(200):
    normal_img = normal_test_imgs[i].reshape(-1,784)
    normal_err.append(autoencoder.evaluate(normal_img, normal_img, verbose = 0))

    abnormal_img = abnormal_test_imgs[i].reshape(-1,784)
    abnormal_err.append(autoencoder.evaluate(abnormal_img, abnormal_img, verbose = 0))
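
For reference, the same scores can be computed without the per-image evaluate loop: reconstruct all 200 images at once with predict and average the squared error per image (an equivalent vectorized sketch):

In [ ]:
# Equivalent vectorized computation of the anomaly scores.
normal_recon = autoencoder.predict(normal_test_imgs[:200], verbose = 0)
abnormal_recon = autoencoder.predict(abnormal_test_imgs[:200], verbose = 0)

normal_err = np.mean((normal_test_imgs[:200] - normal_recon)**2, axis = 1)
abnormal_err = np.mean((abnormal_test_imgs[:200] - abnormal_recon)**2, axis = 1)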
In [ ]:
threshold = 0.05

plt.figure(figsize = (6, 4))
plt.plot(normal_err, '.', label = 'Normal')
plt.plot(abnormal_err, '.', label = 'Abnormal')
plt.xlabel('Data point index')
plt.ylabel('Reconstruction error')
plt.axhline(y = threshold, color = 'r', linestyle='--')
plt.legend()
plt.show()
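
With the visually chosen threshold, a quick numeric check (a sketch; a score above the threshold is taken to mean "abnormal") shows how many images on each side are handled correctly:

In [ ]:
# Fraction of normal images accepted and abnormal images flagged
# at the chosen threshold.
normal_err = np.array(normal_err)
abnormal_err = np.array(abnormal_err)

print('Normal correctly accepted :', np.mean(normal_err <= threshold))
print('Abnormal correctly flagged:', np.mean(abnormal_err > threshold))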

Problem 03

We intend to perform multiple tasks with the Fashion MNIST dataset. For more details, please refer to the description provided at the following link: https://keras.io/api/datasets/fashion_mnist/
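
As a starting point, the dataset can be loaded with the standard Keras loader described at the link above (a sketch; the variable names mirror Problem 02, and any further preprocessing depends on the tasks):

In [ ]:
# Load Fashion MNIST (60,000 training and 10,000 test images, 28x28,
# 10 classes) and scale pixel values to [0, 1].
(train_imgs, train_labels), (test_imgs, test_labels) = tf.keras.datasets.fashion_mnist.load_data()
train_imgs, test_imgs = train_imgs/255.0, test_imgs/255.0

print(train_imgs.shape, test_imgs.shape)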