1. Recurrent Neural Network (RNN)¶

RNNs are a family of neural networks for processing sequential data

1.1. Feedforward Network and Sequential Data¶

A feedforward network applied to a sequence needs separate parameters for each value of the time index

It therefore cannot share statistical strength across different time indices
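To make the parameter-sharing point concrete, the sketch below counts weights for one dense layer per time step versus a single recurrent layer reused at every step. The sizes (`n_input`, `n_hidden`, `n_steps`) are illustrative assumptions chosen for this example, not values taken from the lecture.

```python
# Illustrative sizes (assumptions for this sketch)
n_input, n_hidden, n_steps = 100, 100, 25

# Feedforward: a separate weight matrix and bias for every time index
ff_params = n_steps * (n_input * n_hidden + n_hidden)

# Recurrent: one input matrix, one state matrix, one bias, shared by all steps
rnn_params = n_input * n_hidden + n_hidden * n_hidden + n_hidden

print(ff_params, rnn_params)   # 252500 20100
```

The feedforward count grows linearly with the sequence length, while the recurrent count does not depend on it.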

1.2. Representation Shortcut¶

Input at each time is a vector
Each layer has many neurons
Output layer too may have many neurons
But we will represent everything as simple boxes
Each box actually represents an entire layer with many units

1.3. An Alternate Model for Infinite Response Systems¶

The state-space model

$$ \begin{align*} h_t &= f(x_t, h_{t-1})\\ y_t &= g(h_t) \end{align*} $$

This is a recurrent neural network
State summarizes information about the entire past
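The state-space recursion above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming `f` is a `tanh` of an affine map and `g` is a linear readout; the weight names (`W_xh`, `W_hh`, `W_hy`) and the zero initial state are choices made for this sketch, not part of the original notes.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h, W_hy, b_y):
    """Run h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h), y_t = W_hy h_t + b_y."""
    h = np.zeros(W_hh.shape[0])                   # h_0: initial state (assumed zero)
    ys = []
    for x in xs:                                  # same weights reused at every step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)    # f(x_t, h_{t-1})
        ys.append(W_hy @ h + b_y)                 # g(h_t)
    return np.array(ys), h

rng = np.random.default_rng(0)
n_in, n_hid, n_out, T = 3, 4, 2, 5
xs = rng.normal(size=(T, n_in))
ys, h_T = rnn_forward(
    xs,
    rng.normal(size=(n_hid, n_in)),
    rng.normal(size=(n_hid, n_hid)),
    np.zeros(n_hid),
    rng.normal(size=(n_out, n_hid)),
    np.zeros(n_out),
)
print(ys.shape, h_T.shape)   # (5, 2) (4,)
```

The final state `h_T` depends on every input in `xs`, which is the sense in which the state summarizes the past.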

Single Hidden Layer RNN (Simplest State-Space Model)

Multiple Recurrent Layer RNN

Recurrent Neural Network
- Simplified models often drawn
- The loops imply recurrence

2. LSTM Networks¶

2.1. Long-Term Dependencies¶

Gradients propagated over many stages tend to either vanish or explode
Difficulty with long-term dependencies arises from the exponentially smaller weights given to long-term interactions
Introduce a memory state that runs through only linear operators
Use gating units to control the updates of the state
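The vanishing and exploding behavior can be reproduced with a toy linear recurrence. The sketch below is a simplified illustration that ignores the nonlinearity and assumes the recurrent matrix is a scaled orthogonal matrix, so the gradient norm after `t` steps is exactly `scale**t`.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # orthogonal matrix: preserves norms

def grad_norm_after(scale, steps):
    """Norm of a gradient pushed back `steps` times through W = scale * Q."""
    W = scale * Q
    g = np.ones(4) / 2.0             # unit-norm starting gradient
    for _ in range(steps):
        g = W.T @ g                  # one step of backpropagation through time
    return np.linalg.norm(g)

for scale in (0.9, 1.0, 1.1):
    print(scale, grad_norm_after(scale, 50))
```

With a scale of 0.9 the norm shrinks to roughly 0.005 after 50 steps, and with 1.1 it grows past 100, which is why a state path made only of near-identity linear operations is attractive.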

Example: "I grew up in France… I speak fluent French."

2.2. Long Short-Term Memory (LSTM)¶

Consists of a memory cell and a set of gating units
- Memory cell is the context that carries over
- Forget gate controls erase operation
- Input gate controls write operation
- Output gate controls the read operation

Connect LSTM cells in a recurrent manner
Train parameters in LSTM cells
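The roles of the memory cell and the three gates can be sketched as a single NumPy step function. This is a minimal version of the commonly used LSTM formulation; the stacked weight layout and the gate ordering inside `W` are assumptions made for this sketch and do not necessarily match how Keras stores its weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*n_hid, n_in + n_hid), b has shape (4*n_hid,)."""
    n_hid = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0*n_hid:1*n_hid])    # forget gate: how much of c_prev to keep
    i = sigmoid(z[1*n_hid:2*n_hid])    # input gate: how much new content to write
    o = sigmoid(z[2*n_hid:3*n_hid])    # output gate: how much of the cell to read out
    g = np.tanh(z[3*n_hid:4*n_hid])    # candidate content
    c = f * c_prev + i * g             # memory cell: elementwise scale-and-add only
    h = o * np.tanh(c)                 # hidden state passed to the next step/layer
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(size=(4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):   # connect the cell recurrently over 5 steps
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)   # (4,) (4,)
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state is carried over almost unchanged, which is how the cell preserves long-term context.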

2.2.1. LSTM for Classification¶

2.2.2. LSTM for Prediction¶

3. LSTM with TensorFlow¶

An example of predicting the next segment of a time signal
Regression problem
Acceleration signal from rotating machinery

Time series data and RNN

3.1. Import Library¶

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from six.moves import cPickle

3.2. Load Time Signal Data¶

Import the acceleration signal of a rotating machine from rnn_time_signal.pkl

from google.colab import drive
drive.mount('/content/drive')


# Open the pickle file in a context manager so the handle is closed after loading
with open('/content/drive/MyDrive/DL_Colab/DL_data/rnn_time_signal.pkl', 'rb') as f:
    data = cPickle.load(f)

plt.figure(figsize = (8, 4))
plt.title('Time signal for RNN')
plt.plot(data[0:2000])
plt.xlim(0,2000)
plt.show()

3.3. LSTM Model Training¶

n_step = 25
n_input = 100

# LSTM shape
n_lstm1 = 100
n_lstm2 = 100

# fully connected
n_hidden = 100
n_output = 100

lstm_network = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape = (n_step, n_input)),
    tf.keras.layers.LSTM(n_lstm1, return_sequences = True),
    tf.keras.layers.LSTM(n_lstm2),
    tf.keras.layers.Dense(n_hidden),
    tf.keras.layers.Dense(n_output),
])

lstm_network.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm (LSTM)                 (None, 25, 100)           80400     
                                                                 
 lstm_1 (LSTM)               (None, 100)               80400     
                                                                 
 dense (Dense)               (None, 100)               10100     
                                                                 
 dense_1 (Dense)             (None, 100)               10100     
                                                                 
=================================================================
Total params: 181,000
Trainable params: 181,000
Non-trainable params: 0
_________________________________________________________________
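The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four blocks (three gates plus the candidate), each with input weights, recurrent weights, and a bias; the helper names `lstm_params` and `dense_params` are introduced here only for illustration.

```python
def lstm_params(n_in, n_units):
    # 4 blocks (3 gates + candidate), each with input weights,
    # recurrent weights, and a bias
    return 4 * (n_in * n_units + n_units * n_units + n_units)

def dense_params(n_in, n_units):
    # weight matrix plus bias
    return n_in * n_units + n_units

counts = [
    lstm_params(100, 100),    # lstm:    100 inputs -> 100 units
    lstm_params(100, 100),    # lstm_1:  100 inputs -> 100 units
    dense_params(100, 100),   # dense
    dense_params(100, 100),   # dense_1
]
print(counts, sum(counts))    # [80400, 80400, 10100, 10100] 181000
```

Note that the counts do not depend on `n_step`, since the same LSTM weights are applied at every time step.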

lstm_network.compile(optimizer = 'adam',
                     loss = 'mean_squared_error',
                     metrics = ['mse'])

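def dataset(data, n_samples, n_step = n_step, dim_input = n_input, dim_output = n_output, stride = 5):
    # Build sliding-window training pairs: each input is n_step consecutive
    # chunks of dim_input points, and the label is the dim_output points that follow.

    train_x_list = []
    train_y_list = []
    for i in range(n_samples):
        start = i*stride                    # window i begins stride points after window i-1
        end = start + n_step*dim_input

        train_x = data[start:end]
        train_x = train_x.reshape(n_step, dim_input)
        train_x_list.append(train_x)

        train_y = data[end:end + dim_output]
        train_y_list.append(train_y)

    train_data = np.array(train_x_list)     # (n_samples, n_step, dim_input)
    train_label = np.array(train_y_list)    # (n_samples, dim_output)

    # Single test window starting at index 10000. With n_samples = 5000 and
    # stride = 5, the training windows also cover this range, so it is not
    # a held-out segment.
    test_data = data[10000:10000 + n_step*dim_input]
    test_data = test_data.reshape(1, n_step, dim_input)

    return train_data, train_label, test_data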

train_data, train_label, test_data = dataset(data, 5000)
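To see what the windowing produces, the same slicing can be run on a tiny array where the values are their own indices. The helper `make_windows` and the small sizes are hypothetical, introduced only for this sketch; they mirror the slicing in `dataset` but are not part of the original notebook.

```python
import numpy as np

def make_windows(signal, n_samples, n_step, dim, stride):
    """Mirror of the slicing used in `dataset`, on small illustrative sizes."""
    xs, ys = [], []
    for i in range(n_samples):
        start = i * stride
        end = start + n_step * dim
        xs.append(signal[start:end].reshape(n_step, dim))   # n_step chunks of length dim
        ys.append(signal[end:end + dim])                     # the chunk that follows
    return np.array(xs), np.array(ys)

toy = np.arange(20)
x, y = make_windows(toy, n_samples=3, n_step=2, dim=3, stride=2)
print(x.shape, y.shape)   # (3, 2, 3) (3, 3)
print(x[1].tolist())      # [[2, 3, 4], [5, 6, 7]]
print(y[1].tolist())      # [8, 9, 10]
```

With the notebook's settings the same logic yields inputs of shape `(5000, 25, 100)` and labels of shape `(5000, 100)`.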

lstm_network.fit(train_data, train_label, epochs = 3)

Epoch 1/3
157/157 [==============================] - 40s 144ms/step - loss: 0.0332 - mse: 0.0332
Epoch 2/3
157/157 [==============================] - 9s 57ms/step - loss: 0.0069 - mse: 0.0069
Epoch 3/3
157/157 [==============================] - 14s 87ms/step - loss: 0.0052 - mse: 0.0052


3.4. Testing or Evaluating¶

Predict future time signal

test_pred = lstm_network.predict(test_data).ravel()
test_label = data[10000:10000 + n_step*n_input + n_input]

plt.figure(figsize = (8, 4))
plt.plot(np.arange(0, n_step*n_input + n_input), test_label, 'b', label = 'Ground truth')
plt.plot(np.arange(n_step*n_input, n_step*n_input + n_input), test_pred, 'r', label = 'Prediction')
plt.vlines(n_step*n_input, -1, 1, colors = 'r', linestyles = 'dashed')
plt.legend(fontsize = 15, loc = 'upper left')
plt.xlim(0, len(test_label))
plt.show()

1/1 [==============================] - 2s 2s/step

gen_signal = []

for i in range(n_step):
    test_pred = lstm_network.predict(test_data, verbose = 0)
    gen_signal.append(test_pred.ravel())
    test_pred = test_pred[:, np.newaxis, :]

    test_data = test_data[:, 1:, :]
    test_data = np.concatenate([test_data, test_pred], axis = 1)

gen_signal = np.concatenate(gen_signal)

test_label = data[10000:10000 + n_step*n_input + n_step*n_input]

plt.figure(figsize = (8, 4))
plt.plot(np.arange(0, n_step*n_input + n_step*n_input), test_label, 'b', label = 'Ground truth')
plt.plot(np.arange(n_step*n_input,  n_step*n_input + n_step*n_input), gen_signal, 'r', label = 'Prediction')
plt.vlines(n_step*n_input, -1, 1, colors = 'r', linestyles = 'dashed')
plt.legend(fontsize=15, loc = 'upper left')
plt.xlim(0, len(test_label))
plt.show()
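The generation loop above feeds each prediction back into the input window. The sketch below isolates that rolling-window logic with a stand-in predictor so it can be followed by hand; `rollout` and `toy_predict` are hypothetical names for this illustration, and `toy_predict` simply replaces `lstm_network.predict` with "last step plus one".

```python
import numpy as np

def rollout(predict_fn, window, n_rounds):
    """Feed each prediction back in as the newest step of the input window."""
    generated = []
    for _ in range(n_rounds):
        pred = predict_fn(window)                   # shape (1, dim)
        generated.append(pred.ravel())
        window = np.concatenate(                    # drop oldest step, append prediction
            [window[:, 1:, :], pred[:, np.newaxis, :]], axis=1
        )
    return np.concatenate(generated), window

# Hypothetical stand-in for lstm_network.predict: last step plus one
def toy_predict(window):
    return window[:, -1, :] + 1.0

window = np.zeros((1, 3, 2))                        # (batch, n_step, dim)
signal, final_window = rollout(toy_predict, window, n_rounds=4)
print(signal.tolist())      # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
print(final_window.shape)   # (1, 3, 2)
```

After `n_step` rounds the window contains only generated values, so any prediction error is fed back in and can accumulate over the generated signal.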
