Recurrent Neural Network

By Prof. Seungchul Lee
Industrial AI Lab at POSTECH

1. Recurrent Neural Network (RNN)

  • RNNs are a family of neural networks for processing sequential data

1.1. Feedforward Network and Sequential Data

  • A feedforward network applied to a sequence needs separate parameters for each value of the time index
  • It cannot share statistical strength across different time indices
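
To make the cost of per-time-index parameters concrete, here is a minimal counting sketch. The function names and the layer sizes (8 inputs, 16 hidden units) are hypothetical and chosen only for illustration; the shared case assumes one input-to-hidden matrix, one hidden-to-hidden matrix, and one bias reused at every time index.

```python
def feedforward_param_count(seq_len, input_dim, hidden_dim):
    """Weights + biases when every time index gets its own input-to-hidden layer."""
    per_step = input_dim * hidden_dim + hidden_dim
    return seq_len * per_step


def shared_param_count(input_dim, hidden_dim):
    """Weights + biases when one set of parameters is reused at every time index
    (input-to-hidden matrix, hidden-to-hidden matrix, bias)."""
    return input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim


# Hypothetical sizes: 8 input features, 16 hidden units.
for seq_len in (10, 100, 1000):
    print(seq_len,
          feedforward_param_count(seq_len, 8, 16),
          shared_param_count(8, 16))
```

The first count grows linearly with the sequence length, while the shared count does not depend on it. With shared parameters, what is learned at one time index is reused at all others.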

1.2. Representation Shortcut

  • Input at each time is a vector
  • Each layer has many neurons
  • The output layer may also have many neurons
  • But we will represent everything as simple boxes
  • Each box actually represents an entire layer with many units

1.3. An Alternate Model for Infinite Response Systems

  • The state-space model
$$ \begin{align*} h_t &= f(x_t, h_{t-1})\\ y_t &= g(h_t) \end{align*} $$
  • This is a recurrent neural network
  • State summarizes information about the entire past
  • Single Hidden Layer RNN (Simplest State-Space Model)
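
The state-space recurrence above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the notes leave $f$ and $g$ unspecified, so here $f$ is taken to be $\tanh(W_x x_t + W_h h_{t-1} + b)$ and $g$ an affine map $W_y h_t + c$. The weight names and the toy values are hypothetical.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """h_t = f(x_t, h_{t-1}), with f assumed to be tanh(W_x x_t + W_h h_{t-1} + b)."""
    a = matvec(W_x, x_t)
    r = matvec(W_h, h_prev)
    return [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, b)]


def rnn_forward(xs, h0, W_x, W_h, b, W_y, c):
    """Run the recurrence over a sequence; y_t = g(h_t), with g assumed affine."""
    h, ys = h0, []
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)  # the same parameters at every time index
        ys.append([yi + ci for yi, ci in zip(matvec(W_y, h), c)])
    return ys, h


# Hypothetical toy weights: 1 input, 2 hidden units, 1 output.
W_x = [[0.5], [-0.3]]
W_h = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.0]
W_y = [[1.0, -1.0]]
c = [0.0]

# Only the first input is nonzero; later states still depend on it.
xs = [[1.0], [0.0], [0.0]]
ys, h_last = rnn_forward(xs, [0.0, 0.0], W_x, W_h, b, W_y, c)
print(ys, h_last)
```

Although the second and third inputs are zero, the final state is nonzero: the state carries information about the first input forward in time.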

  • Multiple Recurrent Layer RNN
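
One common way to write a model with multiple recurrent layers, given here as a sketch, extends the state-space equations with a layer index. The superscript $(l)$, the number of layers $L$, and the per-layer functions $f^{(l)}$ are notation assumed for this illustration and do not appear in the original notes. Each layer receives the state of the layer below at the same time step together with its own previous state:

$$ \begin{align*} h_t^{(1)} &= f^{(1)}\left(x_t, h_{t-1}^{(1)}\right)\\ h_t^{(l)} &= f^{(l)}\left(h_t^{(l-1)}, h_{t-1}^{(l)}\right), \quad l = 2, \dots, L\\ y_t &= g\left(h_t^{(L)}\right) \end{align*} $$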

  • Recurrent Neural Network
    • Simplified models are often drawn
    • The loops imply recurrence
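
Unrolling the loop for three time steps makes the recurrence explicit. This is a sketch using the state-space notation from above, with an initial state $h_0$ assumed here (the notes do not define one):

$$ \begin{align*} h_3 &= f(x_3, h_2) = f\big(x_3, f(x_2, f(x_1, h_0))\big)\\ y_3 &= g(h_3) \end{align*} $$

The output $y_3$ therefore depends on all of $x_1, x_2, x_3$ through the state, and the same function $f$ is applied at every step.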

2. LSTM Networks

2.1. Long-Term Dependencies

  • Gradients propagated over many stages tend to either vanish or explode
  • Difficulty with long-term dependencies arises from the exponentially smaller weights given to long-term interactions compared to short-term ones
  • Introduce a memory state that runs through only linear operators
  • Use gating units to control the updates of the state
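
Why gradients vanish or explode can be seen in the simplest possible case. The sketch below assumes a scalar linear recurrence $h_t = w\,h_{t-1}$, chosen only for illustration. By the chain rule, the gradient of $h_T$ with respect to $h_0$ is the product of $T$ identical factors $w$. The function name is hypothetical.

```python
def gradient_through_time(w, steps):
    """Return d h_T / d h_0 for the scalar linear recurrence h_t = w * h_{t-1}.

    Each step multiplies the gradient by the same factor w (chain rule),
    so the result equals w ** steps.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w
    return grad


for w in (0.9, 1.0, 1.1):
    print(w, [gradient_through_time(w, T) for T in (1, 10, 50, 100)])
```

For $|w| < 1$ the gradient shrinks exponentially with the number of steps, and for $|w| > 1$ it grows exponentially. Only $w = 1$ keeps it constant, which is the motivation for a memory state that is updated through linear operations only.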

Example: "I grew up in France… I speak fluent French."
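
The example sentence needs information from an early word ("France") to survive until a much later one ("French"). The sketch below contrasts a plain scalar tanh recurrence with a linearly updated, gated memory state. It is a toy illustration, not an LSTM implementation: the gates are fixed constants (in an LSTM they would be learned functions of the input and the previous hidden state), and all names and values are hypothetical.

```python
import math


def plain_rnn_trace(first_input, steps, w=0.5):
    """Scalar tanh RNN h_t = tanh(w * h_{t-1} + x_t), with input only at t = 0."""
    h = math.tanh(first_input)
    for _ in range(steps):
        h = math.tanh(w * h)  # x_t = 0 on all later steps
    return h


def gated_memory_trace(first_input, steps, forget_gate=1.0, input_gate=0.0):
    """Scalar memory state c_t = forget_gate * c_{t-1} + input_gate * x_t.

    The state is updated through linear operations only; the gates decide
    how much of the old state to keep and how much new input to write.
    """
    c = first_input  # the early context is written once at t = 0
    for _ in range(steps):
        x_t = 0.0    # no new relevant input on later steps
        c = forget_gate * c + input_gate * x_t
    return c


print(plain_rnn_trace(1.0, 50))     # signal from t = 0 has almost vanished
print(gated_memory_trace(1.0, 50))  # signal from t = 0 is preserved
```

With the forget gate held at 1 and the input gate at 0, the stored value passes through 50 steps unchanged, whereas the plain recurrence shrinks it by at least half at every step.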