AI for Mechanical Engineering

Recurrent Neural Networks (RNN)

Problem 1: LSTM with TensorFlow

  • In this problem, you will build an LSTM model that predicts the second half of an MNIST image from the first half.

  • You will split each MNIST image into 28 pieces.

  • An MNIST image is 28 x 28. The model predicts one piece, a 1 x 28 image, at a time.

  • First, the first 14 x 28 half of the image is fed into the model as a time series; the model then predicts the remaining 14 x 28 half recursively, one piece at a time.
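
One possible way to turn a single image into training pairs is sketched below. It assumes each piece is one row of the image and that the model sees 14 consecutive rows and is asked for the row that follows; the helper name make_row_windows and the stand-in image are illustrative only and are not part of the assignment.

```python
import numpy as np

def make_row_windows(img, n_steps=14):
    """Return (inputs, targets): each input is n_steps consecutive rows,
    each target is the single row that immediately follows them."""
    n_rows = img.shape[0]
    n_windows = n_rows - n_steps
    inputs = np.stack([img[i:i + n_steps] for i in range(n_windows)])
    targets = np.stack([img[i + n_steps] for i in range(n_windows)])
    return inputs, targets

# Stand-in 28 x 28 "image" (assumption: real code would use train_imgs[k])
img = np.arange(28 * 28, dtype=np.float32).reshape(28, 28) / (28 * 28)

inputs, targets = make_row_windows(img)
print(inputs.shape, targets.shape)  # (14, 14, 28) (14, 28)
```

Each image yields 14 such pairs, because rows 14 to 27 each serve once as a target.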



In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
  1. Load MNIST Data
  • Download the MNIST data using tf.keras.datasets, as in the TensorFlow tutorial example
In [ ]:
(train_imgs, train_labels), (test_imgs, test_labels) = tf.keras.datasets.mnist.load_data()

train_imgs = train_imgs/255.0
test_imgs = test_imgs/255.0

print('train_x: ', train_imgs.shape)
print('test_x: ', test_imgs.shape)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 0s 0us/step
train_x:  (60000, 28, 28)
test_x:  (10000, 28, 28)
  1. Plot a randomly selected image with its label
In [ ]:
def batchmaker(x, y, n_batch):
    idx = np.random.randint(len(x), size = n_batch)
    return x[idx], y[idx]
In [ ]:
train_x, train_y = batchmaker(train_imgs, train_labels, 1)
img = train_x[0].reshape(28, 28)

plt.figure(figsize = (5, 3))
plt.imshow(img,'gray')
plt.title("Label : {}".format(train_y[0]))
plt.xticks([])
plt.yticks([])
plt.show()
  1. Define LSTM Structure


In [ ]:
## write your code here
#
  1. Define the Loss, Initializer, and Optimizer
  • Regression loss: mean squared error

$$ \frac{1}{N} \sum_{i=1}^{N} (\hat{y}^{(i)} - y^{(i)})^2$$

  • Initializer

    • Initialize all the empty variables
  • Optimizer

    • AdamOptimizer (Adam): a widely used adaptive optimizer
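
As a quick check of the loss formula above, here is a minimal NumPy sketch. The function name and the toy values are illustrative assumptions; in the actual model you would normally use the loss provided by the framework.

```python
import numpy as np

def mean_squared_error(y_hat, y):
    """Average of the squared differences between prediction and target."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((y_hat - y) ** 2)

y = np.array([0.0, 0.5, 1.0])       # toy targets
y_hat = np.array([0.0, 1.0, 0.0])   # toy predictions

# Squared errors are 0, 0.25 and 1, so the mean is 1.25 / 3
print(mean_squared_error(y_hat, y))
```

When the target is a 28-pixel row, np.mean averages over pixels as well as samples, which differs from the per-sample sum only by a constant factor.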
In [ ]:
## write your code here
#
  1. Define optimization configuration and then optimize
  • Using a GPU or Google Colab is highly recommended.
In [ ]:
## write your code here
#
  1. Test or Evaluate
  • Predict the MNIST image
  • An MNIST image is 28 x 28. The model predicts one piece, a 1 x 28 image, at a time.
  • First, the first 14 x 28 half of the image is fed into the model; the model then predicts the remaining 14 x 28 half recursively.
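
The recursive prediction described above can be sketched as a sliding-window loop. This is only an illustration under stated assumptions: predict_next_row is a hypothetical callable standing in for a trained model (with Keras it could wrap model.predict on a batch of one), and repeat_last_row is a dummy predictor used so the sketch runs on its own.

```python
import numpy as np

def rollout(predict_next_row, first_half, n_future=14):
    """Recursively predict n_future rows, feeding each prediction back in."""
    window = first_half.copy()               # the 14 most recent rows, shape (14, 28)
    predicted = []
    for _ in range(n_future):
        next_row = predict_next_row(window)  # shape (28,)
        predicted.append(next_row)
        # Slide the window: drop the oldest row, append the new prediction
        window = np.vstack([window[1:], next_row])
    return np.vstack([first_half, np.stack(predicted)])

# Hypothetical stand-in for a trained model: repeats the last row it sees
def repeat_last_row(window):
    return window[-1]

first_half = np.random.rand(14, 28)
full = rollout(repeat_last_row, first_half)
print(full.shape)  # (28, 28)
```

After the first step the window contains predicted rows, so prediction errors can accumulate toward the bottom of the image.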
In [ ]:
## write your code here
#

Problem 2

  • In this problem, we have bearing data with 3 classes (healthy, inner fault, outer fault).

  • The objective is to classify the given data using state-of-the-art deep learning models.

  • You will build and test a deep learning model.

Dataset Description

The bearing data were collected by a sensor system with 2 channels: vibration and rotational speed. Refer to the paper for the detailed specifications. The experimental setup is shown in the figure below. The dataset contains 36 files covering 3 classes, 2 sensor positions, and 4 varying-speed conditions. Each recording is sampled at 200,000 Hz for 10 seconds. For simplicity, we will use only the increasing-speed condition and channel 1 (vibration data).



Download & Load Data

The data have already been prepared for you and can be downloaded in .npy format. Three files are provided for the training set and three for the test set.

Plot each dataset, using its filename as the title.

In [16]:
import tensorflow as tf
import numpy as np
import os
import matplotlib.pyplot as plt
%matplotlib inline
In [15]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [17]:
Healthy_train = np.load('/content/drive/MyDrive/DL_Colab/DL_data/Bearing_Healthy_train.npy')
InnerFault_train = np.load('/content/drive/MyDrive/DL_Colab/DL_data/Bearing_InnerFault_train.npy')
OuterFault_train = np.load('/content/drive/MyDrive/DL_Colab/DL_data/Bearing_OuterFault_train.npy')

Healthy_test = np.load('/content/drive/MyDrive/DL_Colab/DL_data/Bearing_Healthy_test.npy')
InnerFault_test = np.load('/content/drive/MyDrive/DL_Colab/DL_data/Bearing_InnerFault_test.npy')
OuterFault_test = np.load('/content/drive/MyDrive/DL_Colab/DL_data/Bearing_OuterFault_test.npy')
In [18]:
## Your code here
#

In the following subproblems, you will be introduced to deep learning models used for time-series classification. You are asked to build a model with the given architecture and evaluate its performance.

Various deep learning models for time-series data classification

Many deep learning models have been proposed for time-series classification. Compared to conventional machine learning algorithms, recent deep neural network models show much higher accuracy, as you can see in the table below. In this problem, we will build the CNN-LSTM model, which is one of the most successful and whose building blocks are covered in this course.

"Deep Learning Algorithms for Bearing Fault Diagnostics – A Comprehensive Review", 2020, S Zhang et al.

Implementation of CNN-LSTM model

The CNN-LSTM model is introduced in "An Improved Bearing Fault Diagnosis Method using One-Dimensional CNN and LSTM". The authors successfully combined a 1D CNN with an LSTM, and the resulting model performs well in terms of both computation time and accuracy. The following configuration shows part of the model structure. The model takes segmented data as input. A data segment with a length of 1,600 is randomly cropped from the original time-series data.





Build the model based on the above information and print the structure.

You are free to choose any parameters that are not specified in the above configuration in order to achieve better performance. (You do not need to refer to the original paper for those.) Please refer to the summary of the model structure.

You will use the 1D convolution layer for this problem. We have learned 2D convolution in class, and 1D convolution is simply 2D convolution with a height of 1.
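
The equivalence between 1D convolution and 2D convolution with a height of 1 can be checked numerically for a single channel. The sketch below uses plain NumPy; conv2d_valid is an illustrative helper, not a library function, and like deep learning layers it computes cross-correlation (no kernel flip).

```python
import numpy as np

def conv2d_valid(x, w):
    """Plain 2D cross-correlation with 'valid' padding (no kernel flip)."""
    out_h = x.shape[0] - w.shape[0] + 1
    out_w = x.shape[1] - w.shape[1] + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + w.shape[0], j:j + w.shape[1]] * w)
    return out

signal = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
kernel = np.array([1.0, 0.0, -1.0])

# 1D result, and the same data treated as a 1 x 5 image with a 1 x 3 kernel
out_1d = np.correlate(signal, kernel, mode="valid")
out_2d = conv2d_valid(signal[None, :], kernel[None, :])

print(out_1d)     # [ -3.  -6. -12.]
print(out_2d[0])  # [ -3.  -6. -12.]
```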

Input_shape of data = (4000, 2000, 1)

2000 $\rightarrow$ 20 $\times$ 100 $\rightarrow$ CNN layers $\rightarrow$ LSTM layer

In [21]:
## Your code here
#
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 reshape (Reshape)           (None, 20, 100)           0         
                                                                 
 conv1d (Conv1D)             (None, 20, 32)            204832    
                                                                 
 max_pooling1d (MaxPooling1  (None, 10, 32)            0         
 D)                                                              
                                                                 
 lstm_2 (LSTM)               (None, 128)               82432     
                                                                 
 dense_2 (Dense)             (None, 3)                 387       
                                                                 
=================================================================
Total params: 287651 (1.10 MB)
Trainable params: 287651 (1.10 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
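
The parameter counts in the summary can be reproduced by hand, which helps when choosing the unspecified hyperparameters. The sketch below is an inference from the printed numbers, not a stated configuration: it assumes the Conv1D layer uses 32 filters with a kernel size of 64 and a bias, and that the LSTM has 128 units with the standard four gates.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One kernel_size x in_channels weight block per filter, plus one bias each
    return kernel_size * in_channels * filters + filters

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    return input_dim * units + units

conv = conv1d_params(kernel_size=64, in_channels=100, filters=32)  # assumed kernel size
lstm = lstm_params(input_dim=32, units=128)
dense = dense_params(input_dim=128, units=3)

print(conv, lstm, dense, conv + lstm + dense)  # 204832 82432 387 287651
```

The unchanged sequence length of 20 after the convolution is also consistent with "same" padding, which is again an inference from the summary.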

Create trainset and testset

Create a trainset and a testset to feed to the model you designed. Set the number of segments to 4,000 for the trainset and 1,000 for the testset (for each class).
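
A minimal sketch of random cropping is shown below. It assumes each loaded recording is a 1D array, uses random stand-in signals instead of the real .npy files, and uses a small number of segments per class to keep the example light (the assignment asks for 4,000 and 1,000). The helper name random_segments and the label order are illustrative assumptions.

```python
import numpy as np

def random_segments(signal, n_segments, seg_len, rng):
    """Crop n_segments windows of length seg_len at random start positions."""
    starts = rng.integers(0, len(signal) - seg_len + 1, size=n_segments)
    idx = starts[:, None] + np.arange(seg_len)[None, :]
    return signal[idx]

rng = np.random.default_rng(0)

# Stand-ins for the three recordings (healthy, inner fault, outer fault)
signals = [rng.standard_normal(50_000) for _ in range(3)]
n_per_class, seg_len = 40, 2000   # use 4000 / 1000 per class in the assignment

x = np.concatenate([random_segments(s, n_per_class, seg_len, rng) for s in signals])
y = np.repeat(np.arange(3), n_per_class)   # integer labels 0, 1, 2
x = x[..., None]                           # add a trailing channel axis

print(x.shape, y.shape)  # (120, 2000, 1) (120,)
```

Cropping the test segments from the separate test recordings keeps the two sets from sharing samples.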

In [22]:
## Your code here
#

Train the model and print the training procedure.

In [24]:
## Your code here
#
Epoch 1/10
240/240 [==============================] - 6s 19ms/step - loss: 0.2681 - accuracy: 0.8704
Epoch 2/10
240/240 [==============================] - 5s 19ms/step - loss: 0.0325 - accuracy: 0.9886
Epoch 3/10
240/240 [==============================] - 5s 19ms/step - loss: 0.0198 - accuracy: 0.9923
Epoch 4/10
240/240 [==============================] - 5s 19ms/step - loss: 0.0274 - accuracy: 0.9921
Epoch 5/10
240/240 [==============================] - 5s 19ms/step - loss: 0.0096 - accuracy: 0.9967
Epoch 6/10
240/240 [==============================] - 4s 19ms/step - loss: 0.0151 - accuracy: 0.9944
Epoch 7/10
240/240 [==============================] - 5s 19ms/step - loss: 0.0116 - accuracy: 0.9961
Epoch 8/10
240/240 [==============================] - 5s 19ms/step - loss: 0.0035 - accuracy: 0.9984
Epoch 9/10
240/240 [==============================] - 4s 19ms/step - loss: 4.5720e-05 - accuracy: 1.0000
Epoch 10/10
240/240 [==============================] - 5s 19ms/step - loss: 1.7450e-05 - accuracy: 1.0000

Evaluate the model's accuracy on the testset.
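
Accuracy is the fraction of segments whose predicted class (the index of the largest output probability) matches the label. The NumPy sketch below uses made-up probabilities purely for illustration; a framework's built-in evaluation reports the same quantity.

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of rows whose arg-max class equals the integer label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

# Toy class probabilities for 4 samples and 3 classes (illustrative values)
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])

print(accuracy(probs, labels))  # 0.75
```

With 1,000 test segments per class (3,000 in total), an accuracy of 0.9990 corresponds to 3 misclassified segments.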

In [25]:
## Your code here
#
60/60 [==============================] - 1s 8ms/step - loss: 0.0077 - accuracy: 0.9990
Out[25]:
[0.007681192364543676, 0.9990000128746033]

Problem 3

  • In this problem, we are going to predict sequence data using an LSTM model.

  • You will build and test a deep learning model.

The data are recorded every minute and include seven variables: combustor power, compressor power, compressor pressure, blade tip temperature, blade tip acceleration, blade tip velocity, and relative pressure. We will create a deep learning model that exploits the sequential structure of the data. The combustor power at time $t$ will be predicted from the data at time $t-1$. In other words, the 7 input variables at time $t-1$ are used to predict the combustor power at time $t$.
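
The one-step-ahead setup can be sketched by shifting the data by one time step. The helper name make_lagged_pairs and the parameter target_col are hypothetical, and which column holds the target variable is an assumption here (column 0 is used only as a default).

```python
import numpy as np

def make_lagged_pairs(data, target_col=0):
    """Pair all variables at time t-1 with the target variable at time t."""
    x = data[:-1]               # rows 0 .. T-2: inputs at time t-1
    y = data[1:, target_col]    # rows 1 .. T-1: target at time t
    # Keras LSTM layers expect (samples, timesteps, features); here timesteps = 1
    return x[:, None, :], y

# Toy series with 5 time steps and 7 variables
data = np.arange(5 * 7, dtype=float).reshape(5, 7)

x, y = make_lagged_pairs(data)
print(x.shape, y.shape)  # (4, 1, 7) (4,)
```

One sample is lost to the shift, so a series of length T yields T - 1 input/target pairs.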

Download & Load Data

We've already prepared the data. You can download the data in .txt format.

In [ ]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

import keras
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
In [ ]:
df = pd.read_csv('./data.txt', sep=';',
                 parse_dates={'dt' : ['Date', 'Time']}, infer_datetime_format=True,
                 low_memory=False, na_values=['nan','?'], index_col='dt')

# fill NaN values in each column with that column's mean
for j in range(0,7):
    df.iloc[:,j]=df.iloc[:,j].fillna(df.iloc[:,j].mean())

df

To reduce the computation time, the data are resampled from minute to hourly resolution. This reduces the number of samples from 2,075,259 to 34,589 while keeping the overall structure of the data, as shown below.

In [ ]:
df_resample = df.resample('h').mean()
df_resample

To train the deep learning model, we convert the data from a DataFrame to a NumPy array.

In [ ]:
dataset = df_resample.values
print(dataset.shape)

Data visualization

Plot each variable, using its name as the title.

In [ ]:
# Your code here

Preprocessing

Each variable has a different scale. Normalize the dataset.
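
Min-max scaling maps every column to the range [0, 1], which is also what the imported MinMaxScaler does with its default settings. The NumPy sketch below shows the arithmetic on toy data; it assumes no column is constant (otherwise the denominator would be zero).

```python
import numpy as np

def min_max_scale(data):
    """Scale each column to [0, 1]; also return the statistics for later inversion."""
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)
    scaled = (data - col_min) / (col_max - col_min)
    return scaled, col_min, col_max

# Toy data: two variables with very different scales
data = np.array([[1.0, 100.0],
                 [2.0, 300.0],
                 [3.0, 200.0]])

scaled, col_min, col_max = min_max_scale(data)
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Keeping col_min and col_max makes it possible to map predictions back to the original units with scaled * (col_max - col_min) + col_min.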

In [ ]:
# Your code here
In [ ]:
print('min value:{:.1f}, max value:{:.1f}'.format(np.min(norm_dataset), np.max(norm_dataset)))

Split dataset

(2.3.1) Referring to the image below, create the train and target data.



In [ ]:
# Your code here

(2.3.2) Use the first 3 years of data as the trainset and the remaining data as the testset.
Hint: Since the dataset is recorded hourly, one year corresponds to 365*24 samples.
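
The split follows directly from the hint. The sketch below uses zero-filled stand-in arrays and assumes the inputs have shape (samples, 1, 7) with one sample lost to a one-step shift; the array names are illustrative.

```python
import numpy as np

HOURS_PER_YEAR = 365 * 24
n_train = 3 * HOURS_PER_YEAR   # 26280 hourly samples

# Stand-in arrays: 34,589 hourly rows minus one sample lost to the one-step shift
n_samples = 34589 - 1
x = np.zeros((n_samples, 1, 7), dtype=np.float32)
y = np.zeros(n_samples, dtype=np.float32)

# Chronological split: no shuffling, so the testset lies entirely after the trainset
train_x, test_x = x[:n_train], x[n_train:]
train_y, test_y = y[:n_train], y[n_train:]

print(train_x.shape, test_x.shape)  # (26280, 1, 7) (8308, 1, 7)
```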

In [ ]:
# Your code here

Create model

Create a model by referring to the result below. Use mean squared error as the loss function.

In [ ]:
# Your code here

Train the model

In [ ]:
# Your code here

Plot the result

Run the model on your test dataset and compare the predicted values with the target values.

In [ ]:
# Your code here