AI for Mechanical Engineering

Recurrent Neural Networks (RNN)

Problem 1: LSTM with TensorFlow

  • In this problem, you will build an LSTM model that predicts one half of an MNIST image from the other half.

  • You will split each MNIST image into 28 row pieces, each treated as one time step.

  • Each MNIST image is 28 x 28 pixels. The model predicts one 1 x 28 row at a time.

  • First, the top 14 x 28 half is fed into the model as a time series; the model then predicts the bottom 14 x 28 half row by row, recursively (see the sketch below).
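The short NumPy sketch below (an illustration added here, not part of the assignment code) shows the sliding-window idea: a 14-row window is the model input, the predicted row is appended, and the window slides forward one row at a time. The function fake_row_predictor is a stand-in for the trained LSTM.

In [ ]:
import numpy as np

# fake_row_predictor is a placeholder for the trained LSTM:
# it maps a (14, 28) window to the next (1, 28) row
def fake_row_predictor(window):
    return window[-1:]                # dummy rule: just repeat the last row

image = np.random.rand(28, 28)        # stands in for one MNIST image
window = image[:14]                   # top half: 14 time steps of 28 pixels

generated = []
for _ in range(14):
    next_row = fake_row_predictor(window)        # predict one 1 x 28 row
    generated.append(next_row)
    window = np.vstack([window[1:], next_row])   # slide the window forward

bottom_half = np.vstack(generated)    # predicted 14 x 28 half
print(bottom_half.shape)              # (14, 28)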



In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
  1. Load MNIST Data
  • Download MNIST data with tf.keras.datasets and scale the pixel values to [0, 1]
In [ ]:
(train_imgs, train_labels), (test_imgs, test_labels) = tf.keras.datasets.mnist.load_data()

train_imgs = train_imgs/255.0
test_imgs = test_imgs/255.0

print('train_x: ', train_imgs.shape)
print('test_x: ', test_imgs.shape)
  2. Plot a randomly selected image with its label
In [ ]:
def batchmaker(x, y, n_batch):
    idx = np.random.randint(len(x), size = n_batch)
    return x[idx], y[idx]
In [ ]:
train_x, train_y = batchmaker(train_imgs, train_labels, 1)
img = train_x[0].reshape(28, 28)

plt.figure(figsize = (5, 3))
plt.imshow(img,'gray')
plt.title("Label : {}".format(train_y[0]))
plt.xticks([])
plt.yticks([])
plt.show()
  3. Define LSTM Structure


In [ ]:
n_step = 14
n_input = 28

## LSTM shape
n_lstm1 = 10
n_lstm2 = 10

## Fully connected
n_hidden = 100
n_output = 28
In [ ]:
lstm_network = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape = (n_step, n_input)),
    tf.keras.layers.LSTM(n_lstm1, return_sequences = True),
    tf.keras.layers.LSTM(n_lstm2),
    tf.keras.layers.Dense(n_hidden, activation = 'relu'),
    tf.keras.layers.Dense(n_output),
])

lstm_network.summary()
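As a quick optional sanity check (an addition to the original notebook), a dummy batch can be passed through the network to confirm that a 14 x 28 input sequence maps to a single 1 x 28 predicted row.

In [ ]:
# Optional shape check: one 14 x 28 input sequence -> one 1 x 28 predicted row
dummy = np.zeros((1, n_step, n_input), dtype = np.float32)
print(lstm_network(dummy).shape)   # expected: (1, 28)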
  4. Define Loss, Initializer, and Optimizer
  • Regression: squared loss (a numerical check follows the compile cell below)

$$ \frac{1}{N} \sum_{i=1}^{N} (\hat{y}^{(i)} - y^{(i)})^2$$

  • Initializer

    • Initialize all trainable variables (Keras does this automatically when the model is built)
  • Optimizer

    • Adam: one of the most widely used optimizers
In [ ]:
lstm_network.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.0005),
                     loss = 'mean_squared_error')
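The short check below (an optional addition with made-up arrays y_true and y_pred) confirms that Keras' 'mean_squared_error' matches the squared-loss formula above: it averages the squared differences over all elements.

In [ ]:
# Optional check: Keras MSE equals the mean of squared errors over all elements
y_true = np.random.rand(4, 28).astype(np.float32)
y_pred = np.random.rand(4, 28).astype(np.float32)

manual_mse = np.mean((y_pred - y_true)**2)
keras_mse = tf.keras.losses.MeanSquaredError()(y_true, y_pred).numpy()
print(manual_mse, keras_mse)   # the two values should agree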
  5. Define the optimization configuration, then optimize
  • It is highly recommended to use a GPU or Colab.
In [ ]:
n_iter = 5000
n_prt = 500
batch_size = 50

for i in range(n_iter+1):
    train_x, train_y = batchmaker(train_imgs, train_labels, batch_size)

    # rows j..j+13 are the input sequence, row j+14 is the regression target
    for j in range(n_step):
        loss = lstm_network.train_on_batch(train_x[:, j:j+n_step, :], train_x[:, j+n_step, :])

    if i % n_prt == 0:
        print('iteration: {}, loss: {:.4f}'.format(i, float(loss)))
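As an alternative to the batch-by-batch loop above, all input/target windows can be pre-built once and passed to model.fit. This is only a sketch under assumed settings (a 10,000-image subset, batch size 128, one epoch) chosen to keep memory use and runtime modest; it is not the prescribed training procedure.

In [ ]:
# Alternative sketch: pre-build every (14-row input, next-row target) pair
# for a subset of the training images and train with model.fit
subset = train_imgs[:10000].astype(np.float32)

X = np.concatenate([subset[:, j:j+n_step, :] for j in range(n_step)], axis = 0)
Y = np.concatenate([subset[:, j+n_step, :] for j in range(n_step)], axis = 0)

lstm_network.fit(X, Y, batch_size = 128, epochs = 1)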
  6. Test or Evaluate
  • Predict the missing half of the MNIST image
  • Each MNIST image is 28 x 28 pixels. The model predicts one 1 x 28 row at a time.
  • First, the top 14 x 28 half is fed into the model; the model then predicts the bottom 14 x 28 half row by row, recursively.
In [ ]:
test_x, test_y = batchmaker(test_imgs, test_labels, 5)

for idx in range(5):
    gen_img = []

    sample = test_x[idx, 0:14, :]
    input_img = sample.copy()

    feeding_img = test_x[idx, 0:0+n_step, :]

    for i in range(n_step):
        test_pred = lstm_network.predict(feeding_img.reshape(1, 14, 28))
        feeding_img = np.delete(feeding_img, 0, 0)
        feeding_img = np.vstack([feeding_img, test_pred])
        gen_img.append(test_pred)

    for i in range(n_step):
        sample = np.vstack([sample, gen_img[i]])

    plt.figure(figsize = (8, 20))
    plt.subplot(1,3,1)
    plt.imshow(test_x[idx], 'gray')
    plt.title('Original Img')
    plt.xticks([])
    plt.yticks([])

    plt.subplot(1,3,2)
    plt.imshow(input_img, 'gray')
    plt.title('Input')
    plt.xticks([])
    plt.yticks([])

    plt.subplot(1,3,3)
    plt.imshow(sample, 'gray')
    plt.title('Generated Img')
    plt.xticks([])
    plt.yticks([])
    plt.show()

Problem 2

  • In this problem, we have bearing data with 3 classes (healthy, inner fault, outer fault).

  • The objective is to classify the given data using state-of-the-art deep learning models.

  • You will build and test a deep learning model.

Dataset Description

The bearing data were collected by a sensing system with two channels: vibration and rotational speed. Refer to the paper for the detailed specifications. The experimental setup is shown in the figure below. The dataset contains 36 files covering 3 classes, 2 sensor positions, and 4 speed-varying conditions. Each record is sampled at 200,000 Hz for 10 seconds. For simplicity, we will use only the increasing-speed condition and channel 1 (the vibration data).
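As a starting point only (not the required solution), the sketch below shows the general shape of a 3-class classifier for windowed vibration segments. The segment length, layer sizes, and the random arrays standing in for the actual bearing signals are all assumptions; in practice the 10-second, 200,000 Hz records would first be segmented (and possibly downsampled or converted to features such as spectrograms) before training.

In [ ]:
import tensorflow as tf
import numpy as np

n_classes = 3              # healthy, inner fault, outer fault
seg_len, n_ch = 400, 1     # assumed segment length after windowing; channel 1 only

# placeholder arrays standing in for windowed channel-1 vibration segments
x_demo = np.random.randn(64, seg_len, n_ch).astype(np.float32)
y_demo = np.random.randint(0, n_classes, size = 64)

clf = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape = (seg_len, n_ch)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(64, activation = 'relu'),
    tf.keras.layers.Dense(n_classes, activation = 'softmax'),
])

clf.compile(optimizer = 'adam',
            loss = 'sparse_categorical_crossentropy',
            metrics = ['accuracy'])

clf.fit(x_demo, y_demo, epochs = 1, batch_size = 16)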