Anomaly Detection


By Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Table of Contents


1. Anomaly

  • Anomalies and outliers are essentially the same thing
    • Objects that are different from most other objects
    • Something that deviates from what is standard or expected (one definition)


Causes of Anomalies

  • Data from a different class of object or underlying mechanism

    • disease vs. non-disease
    • fraud vs. not fraud
  • Data measurement and collection errors

  • Natural variation

    • tails on a Gaussian distribution

$$p_Y(y) = \frac{1}{\sqrt{2\pi}} \exp \left(-\frac{1}{2} y^2 \right)$$


Anomaly Detection

  • Finding outliers


Applications of Anomaly Detection

  • Security & Surveillance


  • Biomedical Applications


  • Industrial Damage Detection


  • Machinery Defects Diagnostics
    • Diagnosis of machinery conditions
    • Early alarm of malfunctioning


Difficulties with Anomaly Detection

  • Scarcity of Anomalies

    • Anomaly data are hard to collect because anomalies rarely happen
    • Overfitting becomes an issue when only a small number of samples is available
  • Diverse Types of Anomalies

    • Anomalies arise from many different causes
    • At the training stage of a neural network, we cannot include all possible anomalies in the input data


Use of Data Labels in Anomaly Detection

  • Supervised Anomaly Detection

    • Labels available for both normal data and anomalies
    • Similar to classification with high class imbalance
  • Semi-supervised Anomaly Detection

    • Labels available only for normal data
  • Unsupervised Anomaly Detection

    • No labels assumed
    • Based on the assumption that anomalies are very rare compared to normal data

Output of Anomaly Detection

  • Label

    • Each test instance is given a normal or anomaly label
    • Same as the typical output of classification-based approaches
  • Score

    • Each test instance is assigned an anomaly score
    • Allows outputs to be ranked in the order of anomaly scores
    • Requires an additional threshold parameter

Variants of Anomaly Detection Problem

  • Given a dataset $D$, find all the data points $x \in D$ with anomaly scores greater than some threshold $t$

  • Given a dataset $D$, find all the data points $x \in D$ having the top-n largest anomaly scores

  • Given a dataset $D$, containing mostly normal data points, and a test point $x$, compute the anomaly score of $x$ with respect to $D$
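
These variants can be sketched in a few lines of NumPy (a minimal illustration; the scores array is hypothetical and would come from any of the detectors discussed below):

import numpy as np

scores = np.random.rand(1000)           # hypothetical anomaly scores for a dataset D
t = 0.99                                # threshold
n = 10

above_t = np.where(scores > t)[0]       # variant 1: all points with score > t
top_n = np.argsort(scores)[::-1][:n]    # variant 2: indices of the top-n largest scores

Variant 3 is simply the score of a single test point, which the sections below compute with statistical and learned models.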

2. Statistical Anomaly Detection

  • Anomalies (outliers) are objects that are fit poorly by a statistical model

  • Estimate a parametric model describing the distribution of the data

  • Apply a statistical test that depends on

    • Properties of test instance
    • Parameters of model (e.g., mean, variance)
    • Confidence limit (related to number of expected outliers)

Univariate Gaussian Distribution

  • Outlier defined by Z-score $>$ threshold

$$z_i = \frac{x_i-\bar x}{s}$$

$\quad \;$ where $\bar x$ is the sample mean and $s$ is the sample standard deviation.
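
As a minimal sketch with hypothetical data, the z-score test takes a few lines of NumPy:

import numpy as np

x = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 15.0])   # hypothetical measurements

z = (x - x.mean()) / x.std(ddof = 1)               # z-scores using the sample standard deviation
outliers = x[np.abs(z) > 2]                        # the threshold (e.g., 2 or 3) is a design choice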




Multivariate Gaussian Distribution

  • Outlier defined by Mahalanobis distance $>$ threshold
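
For a sample mean vector $\bar x$ and sample covariance matrix $S$, the Mahalanobis distance of a point $x$ is

$$D_M(x) = \sqrt{(x-\bar x)^T S^{-1} (x-\bar x)}$$

A minimal NumPy sketch with hypothetical 2-D data:

import numpy as np

X = np.random.randn(500, 2)                        # hypothetical data
x_bar = X.mean(axis = 0)
S_inv = np.linalg.inv(np.cov(X, rowvar = False))

d = X - x_bar
D_M = np.sqrt(np.sum((d @ S_inv)*d, axis = 1))     # Mahalanobis distance of each point
outliers = X[D_M > 3]                              # the threshold is a design choice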


Pros and Cons

  • Pros

    • Statistical tests are well-understood and well-validated
    • Quantitative measure of degree to which object is an outlier
  • Cons

    • Data may be hard to model parametrically
      • multiple modes
      • variable density
    • In high dimensions, data may be insufficient to estimate the true distribution

3. Deep Learning-based Anomaly Detection

  • Train autoencoders only with normal data

    • Trained autoencoders will only capture features of normal data
  • Test with (normal + anomaly) data


Convolutional Autoencoder (CAE)

  • CAE is trained to compress/decompress normal images to/from the latent space
  • CAE compresses the input image into a few latent variables
  • CAE decompresses (reconstructs) an image that contains some error compared to the original
  • For anomalous data, the reconstruction error (anomaly score) will be greater than that of normal images


  • Training using normal data

  • Anomaly detection with test data

    • Use the CAE that was trained with normal data


Anomaly Score

  • Reconstruction Error

  • Root mean squared error (RMSE)
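
For an input $x$ and its reconstruction $\hat x$ with $n$ pixels,

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_i - \hat x_i\right)^2}$$

(The implementation below reports the mean squared error, the squared counterpart, via its loss.)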





4. Anomaly Detection with CAE in TensorFlow

In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import random

Load MNIST Data

In [ ]:
(train_imgs, train_labels), (test_imgs, test_labels) = tf.keras.datasets.mnist.load_data()
train_imgs, test_imgs = train_imgs/255.0, test_imgs/255.0
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 2s 0us/step
In [ ]:
print('shape of train_imgs:', train_imgs.shape)
print('shape of train_labels:', train_labels.shape)
print('shape of test_imgs:', test_imgs.shape)
print('shape of test_labels:', test_labels.shape)
shape of train_imgs: (60000, 28, 28)
shape of train_labels: (60000,)
shape of test_imgs: (10000, 28, 28)
shape of test_labels: (10000,)

Separate Normal and Abnormal Data

In [ ]:
normal_train_index = np.where(train_labels == 7)[0]
normal_test_index = np.where(test_labels == 7)[0]
abnormal_test_index = np.where(test_labels == 5)[0]
In [ ]:
normal_train_x = train_imgs[normal_train_index].reshape(-1,28,28,1)
normal_train_y = train_labels[normal_train_index]

normal_test_x = test_imgs[normal_test_index].reshape(-1,28,28,1)
normal_test_y = test_labels[normal_test_index]

abnormal_test_x = test_imgs[abnormal_test_index].reshape(-1,28,28,1)
abnormal_test_y = test_labels[abnormal_test_index]
In [ ]:
print('shape of normal_train_x:', normal_train_x.shape)
print('shape of normal_test_x:', normal_test_x.shape)
print('shape of abnormal_test_x:', abnormal_test_x.shape)
shape of normal_train_x: (6265, 28, 28, 1)
shape of normal_test_x: (1028, 28, 28, 1)
shape of abnormal_test_x: (892, 28, 28, 1)

Plot Normal and Abnormal Data

In [ ]:
random.seed(6)
idx = random.sample(range(normal_train_x.shape[0]), 4)
In [ ]:
plt.figure(figsize = (8, 3))

for i in range(4):
    plt.subplot(1,4,i+1)
    plt.imshow(normal_train_x[idx[i]].squeeze(), 'gray')
    plt.title('Normal')
    plt.axis('off')

plt.tight_layout()
plt.show()
In [ ]:
random.seed(11)
idx = random.sample(range(abnormal_test_x.shape[0]), 4)

plt.figure(figsize = (8, 3))

for i in range(4):
    plt.subplot(1,4,i+1)
    plt.imshow(abnormal_test_x[idx[i]].squeeze(), 'gray')
    plt.title('Abnormal')
    plt.axis('off')

plt.tight_layout()
plt.show()

Build a Model

In [ ]:
# Encoder

encoder = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters = 32,
                           kernel_size = (3, 3),
                           strides = (2, 2),
                           activation = 'relu',
                           padding = 'SAME',
                           input_shape = (28, 28, 1)),

    tf.keras.layers.Conv2D(filters = 64,
                           kernel_size = (3, 3),
                           strides = (2, 2),
                           activation = 'relu',
                           padding = 'SAME',
                           input_shape = (14, 14, 32)),

    tf.keras.layers.Conv2D(filters = 2,
                           kernel_size = (7, 7),
                           padding = 'VALID',
                           input_shape = (7, 7, 64))
])

encoder.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 14, 14, 32)        320       
                                                                 
 conv2d_1 (Conv2D)           (None, 7, 7, 64)          18496     
                                                                 
 conv2d_2 (Conv2D)           (None, 1, 1, 2)           6274      
                                                                 
=================================================================
Total params: 25090 (98.01 KB)
Trainable params: 25090 (98.01 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [ ]:
# Decoder

decoder = tf.keras.models.Sequential([
    tf.keras.layers.Conv2DTranspose(filters = 64,
                                    kernel_size = (7, 7),
                                    strides = (1, 1),
                                    activation = 'relu',
                                    padding = 'VALID',
                                    input_shape = (1, 1, 2)),

    tf.keras.layers.Conv2DTranspose(filters = 32,
                                    kernel_size = (3, 3),
                                    strides = (2, 2),
                                    activation = 'relu',
                                    padding = 'SAME',
                                    input_shape = (7, 7, 64)),

    tf.keras.layers.Conv2DTranspose(filters = 1,
                                    kernel_size = (7, 7),
                                    strides = (2, 2),
                                    padding = 'SAME',
                                    input_shape = (14,14,32))
])

decoder.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_transpose (Conv2DTr  (None, 7, 7, 64)          6336      
 anspose)                                                        
                                                                 
 conv2d_transpose_1 (Conv2D  (None, 14, 14, 32)        18464     
 Transpose)                                                      
                                                                 
 conv2d_transpose_2 (Conv2D  (None, 28, 28, 1)         1569      
 Transpose)                                                      
                                                                 
=================================================================
Total params: 26369 (103.00 KB)
Trainable params: 26369 (103.00 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [ ]:
latent = encoder.output
result = decoder(latent)
In [ ]:
cae_model = tf.keras.Model(inputs = encoder.input, outputs = result)
In [ ]:
cae_model.compile(optimizer = 'adam',
                  loss = 'mean_squared_error')
In [ ]:
cae_model.fit(normal_train_x, normal_train_x, epochs = 10)
Epoch 1/10
196/196 [==============================] - 12s 6ms/step - loss: 0.0441
Epoch 2/10
196/196 [==============================] - 1s 7ms/step - loss: 0.0327
Epoch 3/10
196/196 [==============================] - 1s 5ms/step - loss: 0.0322
Epoch 4/10
196/196 [==============================] - 1s 5ms/step - loss: 0.0318
Epoch 5/10
196/196 [==============================] - 1s 4ms/step - loss: 0.0307
Epoch 6/10
196/196 [==============================] - 1s 5ms/step - loss: 0.0292
Epoch 7/10
196/196 [==============================] - 1s 5ms/step - loss: 0.0283
Epoch 8/10
196/196 [==============================] - 1s 5ms/step - loss: 0.0279
Epoch 9/10
196/196 [==============================] - 1s 4ms/step - loss: 0.0275
Epoch 10/10
196/196 [==============================] - 1s 4ms/step - loss: 0.0273
Out[ ]:
<keras.src.callbacks.History at 0x7808eaac5bd0>

Look at Latent Space

In [ ]:
np.random.seed(2)
idx_n = np.random.choice(normal_test_x.shape[0], 1000)
idx_a = np.random.choice(abnormal_test_x.shape[0], 50)
In [ ]:
test_x_n, test_y_n = normal_test_x[idx_n], normal_test_y[idx_n]
test_x_a, test_y_a = abnormal_test_x[idx_a], abnormal_test_y[idx_a]
In [ ]:
normal_latent = encoder.predict(test_x_n)
normal_latent = normal_latent.reshape(-1,2)
abnormal_latent = encoder.predict(test_x_a)
abnormal_latent = abnormal_latent.reshape(-1,2)
32/32 [==============================] - 0s 3ms/step
2/2 [==============================] - 0s 63ms/step
In [ ]:
plt.figure(figsize = (6, 6))
plt.scatter(normal_latent[test_y_n == 7, 0], normal_latent[test_y_n == 7, 1], label = 'Normal 7')

plt.scatter(abnormal_latent[:, 0], abnormal_latent[:, 1], label = 'Abnormal')
plt.title('Latent Space', fontsize = 15)
plt.xlabel('Z1', fontsize = 15)
plt.ylabel('Z2', fontsize = 15)
plt.legend(fontsize = 15)
plt.show()

Test

In [ ]:
# Normal
normal_input = normal_test_x[0].reshape(-1,28,28,1)
normal_recon = cae_model.predict(normal_input)
n_recon_err = cae_model.evaluate(normal_input, normal_input)
1/1 [==============================] - 0s 190ms/step
1/1 [==============================] - 0s 162ms/step - loss: 0.0174
In [ ]:
plt.figure(figsize = (8, 4))
plt.subplot(1,2,1)
plt.imshow(normal_input[0].squeeze(), 'gray')
plt.title('Input image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(normal_recon[0].squeeze(), 'gray')
plt.title('Reconstructed image')
plt.axis('off')
plt.show()
print('Reconstruction error: ', n_recon_err)
Reconstruction error:  0.01736810803413391
In [ ]:
# Abnormal
abnormal_input = abnormal_test_x[0].reshape(-1,28,28,1)
abnormal_recon = cae_model.predict(abnormal_input)
ab_recon_err = cae_model.evaluate(abnormal_input, abnormal_input)
1/1 [==============================] - 0s 18ms/step
1/1 [==============================] - 0s 23ms/step - loss: 0.1113
In [ ]:
plt.figure(figsize = (8, 4))
plt.subplot(1,2,1)
plt.imshow(abnormal_input[0].squeeze(), 'gray')
plt.title('Input image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(abnormal_recon[0].squeeze(), 'gray')
plt.title('Reconstructed image')
plt.axis('off')
plt.show()
print('Reconstruction error: ', ab_recon_err)
Reconstruction error:  0.1113334372639656

Anomaly Detection

In [ ]:
normal_err = []
abnormal_err = []

for i in range(200):
    img = normal_test_x[i].reshape(-1,28,28,1)
    normal_err.append(cae_model.evaluate(img, img, verbose = 0))

for j in range(200):
    img = abnormal_test_x[j].reshape(-1,28,28,1)
    abnormal_err.append(cae_model.evaluate(img, img, verbose = 0))
In [ ]:
threshold = 0.05

plt.figure(figsize = (6, 4))
plt.plot(normal_err, '.', label = 'Normal')
plt.plot(abnormal_err, '.', label = 'Abnormal')
plt.xlabel('Data point index')
plt.ylabel('Reconstruction error')
plt.axhline(y = threshold, color = 'r', linestyle = '-')
plt.legend()
plt.show()
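
As a follow-up sketch using the normal_err, abnormal_err, and threshold computed above, the quality of this threshold can be summarized with simple counts:

normal_err = np.array(normal_err)
abnormal_err = np.array(abnormal_err)

detection_rate = np.mean(abnormal_err > threshold)    # fraction of anomalies flagged
false_alarm_rate = np.mean(normal_err > threshold)    # fraction of normal data flagged

print('Detection rate: ', detection_rate)
print('False alarm rate:', false_alarm_rate)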

5. Anomaly Detection with GAN

5.1. AnoGAN (Anomaly Detection with GAN)

  • Train with only normal (healthy) data; no abnormal data is used

  • How can we find an anomaly with a ‘generative model’?

    • Recall the probability density estimation problem
  • Starting from randomly generated data, iteratively optimize it to be as similar as possible to the target data

    • If the target data differs from the learned (normal) data, it will not be generated well $\rightarrow$ higher anomaly score


  • Train with only normal (healthy) data; no abnormal data is used

  • Feed unseen input data into the well-trained GAN model → compare anomaly scores to classify them as normal or abnormal
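
A minimal sketch of AnoGAN's iterative latent search (illustrative only, assuming a trained generator such as the one built in Section 5.3; the function name and parameters are hypothetical, not the original implementation):

import tensorflow as tf

def anogan_score(generator, query_img, n_steps = 500, lr = 0.01):
    # start from a random latent vector and optimize it to reproduce the query image
    z = tf.Variable(tf.random.normal([1, 100]))
    opt = tf.keras.optimizers.Adam(learning_rate = lr)

    for _ in range(n_steps):
        with tf.GradientTape() as tape:
            recon = generator(z, training = False)
            loss = tf.reduce_mean(tf.square(query_img - recon))   # residual loss
        opt.apply_gradients([(tape.gradient(loss, z), z)])

    return float(loss)   # final residual serves as the anomaly score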



5.2. f-AnoGAN

  • AnoGAN requires an iterative procedure to find the latent $z$ that generates the target data.

    • Key idea: Let’s make this process faster
  • Train with only normal (healthy) data; no abnormal data is used

  • Train an additional encoder model to predict the latent $z$ from images

  • The generator is fixed

    • Pixel reconstruction loss (a discriminator feature loss can also be used)


  • Query data is regenerated directly through the encoder and generator

    • If the data is normal (i.e., similar to the training data), it will be regenerated well

    • Otherwise, the anomaly score will be high
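
With encoder $E$ and generator $G$, the pixel-level anomaly score computed in the implementation below is the mean squared residual between the query image $x$ and its regeneration:

$$A(x) = \frac{1}{n} \left\lVert x - G(E(x)) \right\rVert^2$$

where $n$ is the number of pixels ($28^2$ for MNIST).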



5.3. f-AnoGAN Implementation

In [ ]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import random

Data Load

Train Dataset: digit 2 only (normal images)



Test Dataset: digit 2 and digit 6 (normal + anomaly images)



In [ ]:
(train_imgs, train_labels), (test_imgs, test_labels) = tf.keras.datasets.mnist.load_data()
train_imgs, test_imgs = train_imgs/127.5 - 1.0, test_imgs/127.5 - 1.0

normal_train_index = np.where(train_labels == 2)[0]
normal_test_index = np.where(test_labels == 2)[0]
abnormal_test_index = np.where(test_labels == 6)[0]
In [ ]:
normal_train_x = train_imgs[normal_train_index].reshape(-1,28,28,1)
normal_train_y = train_labels[normal_train_index]

normal_test_x = test_imgs[normal_test_index].reshape(-1,28,28,1)
normal_test_y = test_labels[normal_test_index]

abnormal_test_x = test_imgs[abnormal_test_index].reshape(-1,28,28,1)
abnormal_test_y = test_labels[abnormal_test_index]
In [ ]:
print('shape of normal_train_x:', normal_train_x.shape)
print('shape of normal_test_x:', normal_test_x.shape)
print('shape of abnormal_test_x:', abnormal_test_x.shape)
shape of normal_train_x: (5958, 28, 28, 1)
shape of normal_test_x: (1032, 28, 28, 1)
shape of abnormal_test_x: (958, 28, 28, 1)
In [ ]:
idx = random.sample(range(normal_train_x.shape[0]), 4)

plt.figure(figsize = (8, 3))

for i in range(4):
    plt.subplot(1,4,i+1)
    plt.imshow(normal_train_x[idx[i]].squeeze(), 'gray')
    plt.title('Normal')
    plt.axis('off')

plt.tight_layout()
plt.show()
In [ ]:
idx = random.sample(range(abnormal_test_x.shape[0]), 4)

plt.figure(figsize = (8, 3))

for i in range(4):
    plt.subplot(1,4,i+1)
    plt.imshow(abnormal_test_x[idx[i]].squeeze(), 'gray')
    plt.title('Abnormal')
    plt.axis('off')

plt.tight_layout()
plt.show()

Build GAN Model

Generator

  • Train with only normal data (digit 2 only)


In [ ]:
generator = tf.keras.models.Sequential([
    tf.keras.layers.Dense(7*7*256,
                          use_bias = False,
                          input_shape = (100,)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LeakyReLU(),

    tf.keras.layers.Reshape((7, 7, 256)),

    tf.keras.layers.Conv2DTranspose(128,
                                    kernel_size = 5,
                                    strides = 1,
                                    padding = 'same',
                                    use_bias = False),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LeakyReLU(),

    tf.keras.layers.Conv2DTranspose(64,
                                    kernel_size = 5,
                                    strides = 2,
                                    padding = 'same',
                                    use_bias = False),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LeakyReLU(),

    tf.keras.layers.Conv2DTranspose(1,
                                    kernel_size = 5,
                                    strides = 2,
                                    padding = 'same',
                                    use_bias = False,
                                    activation = 'tanh')
])

generator.summary()
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_2 (Dense)             (None, 12544)             1254400   
                                                                 
 batch_normalization_3 (Bat  (None, 12544)             50176     
 chNormalization)                                                
                                                                 
 leaky_re_lu_5 (LeakyReLU)   (None, 12544)             0         
                                                                 
 reshape_1 (Reshape)         (None, 7, 7, 256)         0         
                                                                 
 conv2d_transpose_6 (Conv2D  (None, 7, 7, 128)         819200    
 Transpose)                                                      
                                                                 
 batch_normalization_4 (Bat  (None, 7, 7, 128)         512       
 chNormalization)                                                
                                                                 
 leaky_re_lu_6 (LeakyReLU)   (None, 7, 7, 128)         0         
                                                                 
 conv2d_transpose_7 (Conv2D  (None, 14, 14, 64)        204800    
 Transpose)                                                      
                                                                 
 batch_normalization_5 (Bat  (None, 14, 14, 64)        256       
 chNormalization)                                                
                                                                 
 leaky_re_lu_7 (LeakyReLU)   (None, 14, 14, 64)        0         
                                                                 
 conv2d_transpose_8 (Conv2D  (None, 28, 28, 1)         1600      
 Transpose)                                                      
                                                                 
=================================================================
Total params: 2330944 (8.89 MB)
Trainable params: 2305472 (8.79 MB)
Non-trainable params: 25472 (99.50 KB)
_________________________________________________________________

Discriminator

  • Train with only normal data (digit 2 only)


In [ ]:
discriminator = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64,
                           kernel_size = 5,
                           strides = 2,
                           padding = 'same',
                           input_shape = (28, 28, 1)),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dropout(0.3),

    tf.keras.layers.Conv2D(128,
                           kernel_size = 5,
                           strides = 2,
                           padding = 'same'),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dropout(0.3),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation = 'sigmoid')
])

discriminator.summary()
Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_5 (Conv2D)           (None, 14, 14, 64)        1664      
                                                                 
 leaky_re_lu_8 (LeakyReLU)   (None, 14, 14, 64)        0         
                                                                 
 dropout_2 (Dropout)         (None, 14, 14, 64)        0         
                                                                 
 conv2d_6 (Conv2D)           (None, 7, 7, 128)         204928    
                                                                 
 leaky_re_lu_9 (LeakyReLU)   (None, 7, 7, 128)         0         
                                                                 
 dropout_3 (Dropout)         (None, 7, 7, 128)         0         
                                                                 
 flatten_1 (Flatten)         (None, 6272)              0         
                                                                 
 dense_3 (Dense)             (None, 1)                 6273      
                                                                 
=================================================================
Total params: 212865 (831.50 KB)
Trainable params: 212865 (831.50 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Model (Generator + Discriminator) Compile

In [ ]:
discriminator.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.0001),
                      loss = 'binary_crossentropy')
In [ ]:
combined_input = tf.keras.layers.Input(shape = (100,))
generated = generator(combined_input)
discriminator.trainable = False
combined_output = discriminator(generated)

combined = tf.keras.models.Model(inputs = combined_input, outputs = combined_output)
In [ ]:
combined.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.0001),
                 loss = 'binary_crossentropy')

Train GAN

In [ ]:
def make_noise(samples):
    return np.random.normal(0, 1, [samples, 100])
In [ ]:
def plot_generated_images(generator, samples = 3):

    noise = make_noise(samples)

    generated_images = generator.predict(noise)
    generated_images = generated_images.reshape(samples, 28, 28)

    for i in range(samples):
        plt.subplot(1, samples, i+1)
        plt.imshow(generated_images[i], 'gray', interpolation = 'nearest')
        plt.axis('off')
        plt.tight_layout()

    plt.show()
In [ ]:
n_iter = 20000
batch_size = 256

fake = np.zeros(batch_size)
real = np.ones(batch_size)

for i in range(n_iter+1):

    # Train Discriminator
    noise = make_noise(batch_size)
    generated_images = generator.predict(noise, verbose = 0)

    idx = np.random.randint(0, normal_train_x.shape[0], batch_size)
    real_images = normal_train_x[idx]

    D_loss_real = discriminator.train_on_batch(real_images, real)
    D_loss_fake = discriminator.train_on_batch(generated_images, fake)
    D_loss = D_loss_real + D_loss_fake

    # Train Generator
    noise = make_noise(batch_size)
    G_loss = combined.train_on_batch(noise, real)

    if i % 2000 == 0:
        print('Discriminator Loss: ', D_loss)
        print('Generator Loss: ', G_loss)

        plot_generated_images(generator)
Discriminator Loss:  1.3417497277259827
Generator Loss:  0.6580652594566345
1/1 [==============================] - 0s 125ms/step
Discriminator Loss:  1.7134812474250793
Generator Loss:  0.7989770174026489
1/1 [==============================] - 0s 16ms/step
Discriminator Loss:  1.5435508489608765
Generator Loss:  0.7616167664527893
1/1 [==============================] - 0s 19ms/step
Discriminator Loss:  1.401880919933319
Generator Loss:  0.7737089395523071
1/1 [==============================] - 0s 16ms/step
Discriminator Loss:  1.351719617843628
Generator Loss:  0.7830795645713806
1/1 [==============================] - 0s 23ms/step
Discriminator Loss:  1.4500216841697693
Generator Loss:  0.7219251394271851
1/1 [==============================] - 0s 24ms/step
Discriminator Loss:  1.281995952129364
Generator Loss:  0.8453808426856995
1/1 [==============================] - 0s 15ms/step
Discriminator Loss:  1.3021539449691772
Generator Loss:  0.7710041999816895
1/1 [==============================] - 0s 16ms/step
Discriminator Loss:  1.360207736492157
Generator Loss:  0.7714222073554993
1/1 [==============================] - 0s 17ms/step
Discriminator Loss:  1.2689515352249146
Generator Loss:  0.892619252204895
1/1 [==============================] - 0s 26ms/step
Discriminator Loss:  1.4173975586891174
Generator Loss:  0.7410640120506287
1/1 [==============================] - 0s 19ms/step

Build Encoder for fast-AnoGAN

Encoder

  • To predict latent $z$ from image


In [ ]:
Encoder = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32,
                           kernel_size = 4,
                           strides = 2,
                           padding = 'same',
                           input_shape = (28, 28, 1)),
    tf.keras.layers.LeakyReLU(),

    tf.keras.layers.Conv2D(64,
                           kernel_size = 4,
                           strides = 2,
                           padding = 'same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LeakyReLU(),

    tf.keras.layers.Conv2D(128,
                           kernel_size = 4,
                           strides = 2,
                           padding = 'same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LeakyReLU(),

    tf.keras.layers.Conv2D(100,
                           kernel_size = 4,
                           strides = 1,
                           padding = 'valid'),
    tf.keras.layers.Flatten()
])

Encoder.summary()
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_7 (Conv2D)           (None, 14, 14, 32)        544       
                                                                 
 leaky_re_lu_10 (LeakyReLU)  (None, 14, 14, 32)        0         
                                                                 
 conv2d_8 (Conv2D)           (None, 7, 7, 64)          32832     
                                                                 
 batch_normalization_6 (Bat  (None, 7, 7, 64)          256       
 chNormalization)                                                
                                                                 
 leaky_re_lu_11 (LeakyReLU)  (None, 7, 7, 64)          0         
                                                                 
 conv2d_9 (Conv2D)           (None, 4, 4, 128)         131200    
                                                                 
 batch_normalization_7 (Bat  (None, 4, 4, 128)         512       
 chNormalization)                                                
                                                                 
 leaky_re_lu_12 (LeakyReLU)  (None, 4, 4, 128)         0         
                                                                 
 conv2d_10 (Conv2D)          (None, 1, 1, 100)         204900    
                                                                 
 flatten_2 (Flatten)         (None, 100)               0         
                                                                 
=================================================================
Total params: 370244 (1.41 MB)
Trainable params: 369860 (1.41 MB)
Non-trainable params: 384 (1.50 KB)
_________________________________________________________________

Model (Encoder + Generator) Compile

  • Set the generator's parameters to be untrainable
In [ ]:
encoder_combined_input = tf.keras.layers.Input(shape = (28, 28, 1))
latentz = Encoder(encoder_combined_input)
generator.trainable = False
regenerated_output = generator(latentz)

e_g_combined = tf.keras.models.Model(inputs = encoder_combined_input, outputs = regenerated_output)
In [ ]:
e_g_combined.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.0001),
                     loss = 'mean_squared_error')

Train Encoder

In [ ]:
n_iter = 20000
batch_size = 32

e_losses = []

for i in range(n_iter+1):

    idx = np.random.randint(0, normal_train_x.shape[0], batch_size)
    real_images = normal_train_x[idx]

    recon_loss = e_g_combined.train_on_batch(real_images, real_images)

    if i % 100 == 0:
        e_losses.append(recon_loss)

    if i % 2000 == 0:
        print('recon_loss: ', recon_loss)
recon_loss:  0.521608829498291
recon_loss:  0.050642628222703934
recon_loss:  0.04527051001787186
recon_loss:  0.04172786325216293
recon_loss:  0.030090728774666786
recon_loss:  0.03498825803399086
recon_loss:  0.03760073706507683
recon_loss:  0.030637653544545174
recon_loss:  0.03451306372880936
recon_loss:  0.026821859180927277
recon_loss:  0.0322415865957737

Calculate Anomaly Score

In [ ]:
def compare_images(cls, real_img, generated_img, score, threshold=50):
    # map pixel values from [-1, 1] back to [0, 255]
    real_img = ((real_img + 1.0)*127.5).squeeze()
    generated_img = ((generated_img + 1.0)*127.5).squeeze()

    diff_img = real_img - generated_img

    # keep only residuals above the threshold
    diff_img[diff_img <= threshold] = 0

    anomaly_img = np.zeros(shape=(28, 28, 3))
    anomaly_img[:, :, 0] = real_img - diff_img
    anomaly_img[:, :, 1] = real_img - diff_img
    anomaly_img[:, :, 2] = real_img - diff_img
    anomaly_img[:, :, 0] = anomaly_img[:,:,0] + diff_img
    anomaly_img = anomaly_img.astype(np.uint8)

    fig, plots = plt.subplots(1, 4)
    fig.suptitle(f'Class: {cls} - (anomaly score: {score:.4})')

    fig.set_figwidth(9)
    fig.set_tight_layout(True)
    plots = plots.reshape(-1)
    plots[0].imshow(real_img, cmap='gray', label='real')
    plots[1].imshow(generated_img, cmap='gray')
    plots[2].imshow(diff_img, cmap='gray')
    plots[3].imshow(anomaly_img)

    plots[0].set_title('real')
    plots[1].set_title('generated')
    plots[2].set_title('difference')
    plots[3].set_title('Anomaly Detection')
    plt.show()
In [ ]:
def calculate_anomaly_score(test_image, sample_num, plot_options = True):
    generator.trainable = False
    Encoder.trainable = False

    anomaly_score_list = []

    for i in range(sample_num):
        idx = np.random.randint(0, test_image.shape[0], 1)
        real_img = test_image[idx]

        real_z = Encoder(real_img)
        fake_img = generator(real_z)

        img_difference = np.sum((real_img - fake_img)**2)/(28**2)
        anomaly_score = img_difference
        anomaly_score_list.append(anomaly_score)

        if not plot_options:
            continue

        if anomaly_score >= 0.05:
            cls = 'Abnormal'
        else:
            cls = 'Normal'

        compare_images(cls, real_img, fake_img.numpy(), anomaly_score, threshold = 50)

    if not plot_options:
        return np.array(anomaly_score_list)

Anomaly Score of Normal Data

In [ ]:
calculate_anomaly_score(normal_test_x, 2)

Anomaly Score of Abnormal Data

In [ ]:
calculate_anomaly_score(abnormal_test_x, 2)

Plot Anomaly Scores

  • Results can become more accurate with longer f-AnoGAN training.
In [ ]:
normal_scores = calculate_anomaly_score(normal_test_x, 100, plot_options = False)
abnormal_scores = calculate_anomaly_score(abnormal_test_x, 100, plot_options = False)
In [ ]:
plt.figure(figsize = (6, 4))
plt.plot(normal_scores, '.', label = 'Normal')
plt.plot(abnormal_scores, '.', label = 'Abnormal')
plt.xlabel('Data point index')
plt.ylabel('Anomaly score')
plt.axhline(y = 0.05, color = 'r', linestyle = '-')
plt.legend()
plt.show()

6. Anomaly Detection with LSTM

  • LSTM-based anomaly detection utilizes ‘prediction’


  • Predict and calculate anomaly score
    • Train with normal data
    • Good performance on periodic signals
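
The anomaly score used later in this section is the absolute difference between the measured signal $x_t$ and the predicted signal $\hat x_t$ at each time step:

$$\text{score}_t = \left| x_t - \hat x_t \right|$$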



6.1 Python Implementation

NASA Bearing Dataset

In [ ]:
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import random
In [ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [ ]:
AD_bearing = np.load('/content/drive/MyDrive/DL_Colab/DL_data/AD_bearing.npy')
print("Shape of total data: ", AD_bearing.shape)
Shape of total data:  (6324, 4)
In [ ]:
plt.figure(figsize = (8, 6))
plt.plot(AD_bearing[:,0], label = 'Bearing 1', color = 'b', linewidth = 2)
plt.plot(AD_bearing[:,1], label = 'Bearing 2', color = 'r', linewidth = 2)
plt.plot(AD_bearing[:,2], label = 'Bearing 3', color = 'g', linewidth = 2)
plt.plot(AD_bearing[:,3], label = 'Bearing 4', color = 'k', linewidth = 2)
plt.legend(loc = 'upper left')
plt.title('Bearing Sensor Training Data')
plt.show()
  • We will use 'Bearing 3' only (i.e., AD_bearing[:,2])
  • Use the first 4000 data points of the original data as the training data
  • Use the remaining data points as the test data
In [ ]:
bearing_3 = AD_bearing[:,2]
train = bearing_3[0:4000].reshape(-1, 1)
test = bearing_3[4000:].reshape(-1, 1)

print("Training dataset shape:", train.shape)
print("Test dataset shape:", test.shape)
Training dataset shape: (4000, 1)
Test dataset shape: (2324, 1)
In [ ]:
plt.figure(figsize = (8, 6))
plt.plot(np.arange(0, train.shape[0]), train, label = 'Bearing 3_train', linewidth = 2)
plt.plot(np.arange(4000, 6324), test, label = 'Bearing 3_test', linewidth = 2)
plt.legend(loc = 'upper left', fontsize = 16)
plt.title('Bearing Sensor Train and Test Data', fontsize = 16)
plt.xlabel('Data points')
plt.show()
  • At the end of the test-to-failure experiment, outer race failure occurred in bearing 3.

LSTM Model


In [ ]:
n_step = 20
n_input = 50

# LSTM shape
n_lstm1 = 300
n_lstm2 = 300
n_lstm3 = 300

# fully connected
n_hidden = 300
n_output = 50
In [ ]:
lstm_network = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape = (n_step, n_input)),
    tf.keras.layers.LSTM(n_lstm1, return_sequences = True),
    tf.keras.layers.LSTM(n_lstm2, return_sequences = True),
    tf.keras.layers.LSTM(n_lstm3),
    tf.keras.layers.Dense(n_hidden, activation = 'relu'),
    tf.keras.layers.Dense(n_output),
])

lstm_network.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm (LSTM)                 (None, 20, 300)           421200    
                                                                 
 lstm_1 (LSTM)               (None, 20, 300)           721200    
                                                                 
 lstm_2 (LSTM)               (None, 300)               721200    
                                                                 
 dense (Dense)               (None, 300)               90300     
                                                                 
 dense_1 (Dense)             (None, 50)                15050     
                                                                 
=================================================================
Total params: 1968950 (7.51 MB)
Trainable params: 1968950 (7.51 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [ ]:
lstm_network.compile(optimizer = 'adam',
                     loss = 'mean_squared_error',
                     metrics = ['mse'])

Train/Test Data Split

In [ ]:
def dataset(train, test, n_samples, n_step = n_step, n_input = n_input, n_output = n_output):

    train_x_list = []
    train_y_list = []

    n_data = train.shape[0]
    random.seed(0)
    start_point = random.sample(list(np.arange(0, n_data-(n_step+1)*n_input)), n_samples)

    for i in start_point:
        train_x_list.append(train[i:i + n_step*n_input].reshape(n_step, n_input))
        train_y_list.append(train[i + n_step*n_input:i + n_step*n_input + n_output])

    train_data = np.array(train_x_list)
    train_label = np.array(train_y_list)

    test_data = test[0:n_step*n_input]
    test_data = test_data.reshape(1, n_step, n_input)
    test_label = test[n_step*n_input:n_step*n_input+n_output].ravel()

    return train_data, train_label, test_data, test_label
In [ ]:
train_data, train_label, test_data, test_label = dataset(train, test, 2000)
print('Train data shape:', train_data.shape)
Train data shape: (2000, 20, 50)

Model Training

In [ ]:
lstm_network.fit(train_data, train_label, epochs = 15)
Epoch 1/15
63/63 [==============================] - 12s 14ms/step - loss: 1.1619e-04 - mse: 1.1619e-04
Epoch 2/15
63/63 [==============================] - 1s 12ms/step - loss: 1.8777e-06 - mse: 1.8777e-06
Epoch 3/15
63/63 [==============================] - 1s 12ms/step - loss: 2.0915e-06 - mse: 2.0915e-06
Epoch 4/15
63/63 [==============================] - 1s 9ms/step - loss: 2.0915e-06 - mse: 2.0915e-06
Epoch 5/15
63/63 [==============================] - 1s 9ms/step - loss: 2.0193e-06 - mse: 2.0193e-06
Epoch 6/15
63/63 [==============================] - 1s 9ms/step - loss: 1.7700e-06 - mse: 1.7700e-06
Epoch 7/15
63/63 [==============================] - 1s 9ms/step - loss: 1.8708e-06 - mse: 1.8708e-06
Epoch 8/15
63/63 [==============================] - 1s 9ms/step - loss: 2.1130e-06 - mse: 2.1130e-06
Epoch 9/15
63/63 [==============================] - 1s 9ms/step - loss: 1.7795e-06 - mse: 1.7795e-06
Epoch 10/15
63/63 [==============================] - 1s 9ms/step - loss: 2.0281e-06 - mse: 2.0281e-06
Epoch 11/15
63/63 [==============================] - 1s 9ms/step - loss: 2.3448e-06 - mse: 2.3448e-06
Epoch 12/15
63/63 [==============================] - 1s 9ms/step - loss: 1.8821e-06 - mse: 1.8821e-06
Epoch 13/15
63/63 [==============================] - 1s 9ms/step - loss: 1.8542e-06 - mse: 1.8542e-06
Epoch 14/15
63/63 [==============================] - 1s 9ms/step - loss: 1.8461e-06 - mse: 1.8461e-06
Epoch 15/15
63/63 [==============================] - 1s 9ms/step - loss: 2.0473e-06 - mse: 2.0473e-06
Out[ ]:
<keras.src.callbacks.History at 0x7d3500182230>

Results

In [ ]:
test_pred = lstm_network.predict(train_data[0:1]).ravel()

plt.figure(figsize = (8, 6))
plt.plot(np.arange(0, n_step*n_input + n_output), np.hstack([train_data[0:1].ravel(), train_label[0:1].ravel()]), 'b', label = 'Ground truth')
plt.plot(np.arange(n_step*n_input, n_step*n_input + n_output), test_pred, 'r', label = 'Prediction')
plt.vlines(n_step*n_input, 0.05, 0.06, colors = 'r', linestyles = 'dashed')
plt.ylim([0.04, 0.07])
plt.legend(fontsize = 13, loc = 'upper left')
plt.xlabel('Data points')
plt.show()
1/1 [==============================] - 0s 23ms/step

Difference Between Predicted and Measured Signal

In [ ]:
gen_signal = []

for i in range((test.shape[0]-n_step*n_input)//n_output):
    test_pred = lstm_network.predict(test_data, verbose = 0)
    gen_signal.append(test_pred.ravel())
    test_pred = test_pred[:, np.newaxis, :]

    test_data = test_data[:, 1:, :]
    test_data = np.concatenate([test_data, test_pred], axis = 1)

gen_signal = np.concatenate(gen_signal)
test_label = test[n_step*n_input:n_step*n_input+n_output*(i+1)]

plt.figure(figsize = (8, 6))
plt.plot(test_label, 'b', label = 'Measured signal')
plt.plot(gen_signal, 'r', label = 'Prediction')
plt.legend(fontsize = 15, loc = 'upper left')
plt.xlabel('Data points')
plt.show()
In [ ]:
plt.figure(figsize = (8, 6))
plt.plot(np.abs(test_label.reshape(-1) - gen_signal), label = 'Anomaly score')
plt.legend(fontsize = 15, loc = 'upper left')
plt.hlines(0.005, 0, 1300, colors = 'r', linestyles = 'dashed')
plt.xlabel('Data points')
plt.ylabel('Anomaly score (difference)')
plt.show()